The Randomness Problem: How Lava Lamps Protect the Internet

SciShow is supported by Brilliant.org. Go to Brilliant.org/SciShow to get 20% off
of an annual Premium subscription. [♩INTRO] You jog onto the field for another NFL football
game, or at least your team’s video game avatars
do, which, I mean, let’s face it, is about as
close as you’ll get. The virtual ref tosses his virtual coin to
see who gets dibs on the ball, and…you lose! Once again, you’re stuck letting the computer
receive first. It happens so often it seems unfair. Come to think of it, what if it is? After all, the computer’s following a fixed
program. How could it ever give you a fair, random
coin toss? The whole point of computers is that they’re
predictable! And if those coin tosses weren’t random,
how would you ever know? These questions matter for more than just
games. Getting computers to behave randomly is crucial
for simulations, scientific studies, and cybersecurity. But for all its importance, randomness is
surprisingly hard to define, and even harder to produce. Computers just aren’t that good at being
random. In fact, most computerized randomness is basically
faked; it’s totally predictable. For real, honest-to-goodness randomness, engineers turn to some pretty odd tricks. It turns out a lot of the internet is kept
secure by lava lamps, Geiger counters, and the occasional rooftop
microphone! There are lots of uses for digital randomness. Think of a pollster who wants to call a representative
sample of people, or medical researchers deciding who in their
study gets which treatment. For the statistics to work, they have to choose
the phone numbers or treatment groups randomly. Another example is predicting the behavior
of very complex systems, like weather forecasts or traffic patterns. These systems may not have simple, neat equations that tell you how they’ll
develop. Instead, simulations often model the world as a collection of small randomized components. So a traffic simulator might assume that cars
randomly show up at a highway entrance at an average rate of
10 per minute, then watch how the simulation proceeds. Randomness is also a critical part of basic
network technology. If your laptop sends a packet of data over
the wifi network at the same moment as your roommate’s, the packets collide: they garble each other, and each computer
has to resend. But it wouldn’t help anything if each computer,
following the same deterministic program, tried to retransmit at exactly the
same time. You’d just get collisions over and over! Instead, they wait a random amount of time
after a collision. This type of symmetry breaking is impossible without some form of randomness. But the most common use for randomness is
both high-stakes and very necessary: encryption, which requires
secret data that an adversary can’t predict. For example, every time you connect to a secure
website like an online banking portal, your computer
and the server running the site have to agree on encryption
keys. The keys are like a more sophisticated version
of a secret decoder ring: each computer can scramble its messages so
that only someone with the same key can decode it. Obviously, no one should be able to guess
the secret key, or they’ll be able to see your bank account information or even
steal your money. So your computer and the server base the keys
on random numbers to make them nearly impossible for your online nemesis
to guess. Randomness is so valuable that when the RAND
Corporation published a 600-page book with a million random digits
in 1955, it was considered a landmark contribution
to science. Now that computers have gotten faster, the RAND book is mostly useful as a source
of hilarious Amazon reviews. But getting computers to generate numbers
that are actually random is still much harder than you might think. The first problem to solve is how to even
define randomness, because our intuitions about what’s random
aren’t always great. Take this demonstration from paleontologist
and author Stephen Jay Gould. In which image would you guess the dots were
placed totally at random? If you guessed the one on the left, you’re
not alone. Most people see the clumps in the right image
as patterns. But in fact, it’s just the opposite! It’s the even spacing on the left that’s
the result of a pattern: the program that generated the dots was modified to forbid
them from being too close together. Here’s another brain twist: tossing a coin
twenty times is just as likely to produce ten heads and then ten tails as it
is to mix them up. Neither sequence is inherently less random. To define randomness properly, the trick is
to think not just about a particular string of numbers, but about an infinite sequence
of them. If a so-called random coin flipper kept producing
clumps of heads and clumps of tails forever, then we could definitively
say it’s not random. Mathematicians have used the concept of infinite
sequences to come up with a few different definitions for randomness,
but the simplest one describes randomness as unpredictability. If you were betting on future digits in the
sequence, you couldn’t find an algorithm (a mathematical procedure for a computer to
follow) whose predictions would make money in the
long run. The other definitions all turn out to be equivalent
to this: if a sequence is random, you can’t describe
it using some sort of algorithm. Maybe you can start to see the problem here:
if by definition, you can’t use a program to describe a random sequence, how are you supposed to write a program to
generate one? Not only that, but if you try to build a random
number generator, how can you tell if you’ve succeeded? You can’t check a whole infinite sequence
for randomness, or check all possible betting algorithms to
see if any make money. When it comes to generating random numbers,
most of the time, computers make do with pseudo-random number
generators, or PRNGs. The numbers they generate aren’t truly random
… but they’re close. The algorithms take a seed, some value to
start with, generate a number from it, update the seed,
and repeat. For instance, mathematician John von Neumann
suggested the middle square method: you take the seed,
square it, and grab the middle few digits as the next
random number. That number also serves as the next seed. Unfortunately, the middle square method tends
to start looping through the same cycle of numbers. But computer scientists have designed many
more sophisticated algorithms that do better, some with awesome names like
“Fortuna” and “the Mersenne Twister.” To check if a pseudo-random number generator is a good approximation of randomness, you can use a combination of theoretical analysis
and empirical testing. Theoretical analysis looks not at a particular
sequence of numbers, but at the process that produced them. Even if the process wasn’t truly random,
mathematicians can sometimes prove statements like “This algorithm will generate
at least 14 gazillion digits before the sequence wraps around and starts
repeating.” For empirical testing, you just generate some
random numbers and see if the statistics look right, things like whether evens and odds are equally
common. The stats will always be a bit off, so you
can never say for sure that the sequence was randomly generated. But you can at least estimate how likely the
stats you’re seeing should be. Under most statistical tests, good pseudo-random
numbers are indistinguishable from true randomness… if you don’t know the initial seed. That’s good enough for most purposes. I mean, who cares if your virtual football
game isn’t 100% unpredictable? It’s not a big deal for things like medical
research or predicting traffic patterns, either, as long as the numbers produce the
same statistics as randomness. In fact, sometimes pseudo-randomness has real
advantages. A randomized program is easier to troubleshoot
if you can make it do all the same things again by providing the same seed. And in some uses of PRNGs, determinism is
a core design feature. For example, a remote car key fob works by
running a PRNG with a seed known only to it and your car. When you click the unlock button, the fob
radios its next random number to the car, which checks that number against the
next outputs of the PRNG. To a thief, the numbers look unpredictable. But the system only works because the key
and the car produce the same series of numbers from their shared seed. Sometimes, though, PRNGs aren’t enough. Even if the algorithm is flawless, an attacker
who somehow figures out the seed can predict all the numbers. Ironically, the website Hacker News once became a poster child for this problem. Back in 2009, a security researcher realized
that the site’s random number generator, which assigned IDs to logged-in
users, was seeded with the time when the server started up. By triggering a server restart, the attacker
was able to narrow down the possible seeds to a small set of numbers. That let him predict other users’ IDs and
impersonate them. A similar problem a decade earlier allowed another team of researchers to cheat at online
poker. A leaked seed could be especially catastrophic
for a PRNG that was used for encryption: if an attacker uncovered the seed,
they could guess the encryption keys, then go back and decrypt all the previous
messages. So when the stakes are high, people get creative looking for sources of
true randomness. The most common place to look is external
noise factors that someone could in principle influence, but which the software can’t predict. For example, the website Random.org has a
rooftop microphone that captures atmospheric noise, unpredictable radio waves coming mostly from
lightning and space. Most computers use more readily available
sources of noise, like the timing between mouse clicks or incoming
network messages. Some software will even ask you to scoot your
mouse around while it’s generating big encryption keys. For the highest-stakes uses, though, you want
intrinsic randomness, something that’s guaranteed unpredictable by the laws of physics no matter what the
attacker knows or does. The Internet giant Cloudflare, which provides
encryption for about 10% of all Internet traffic, points
a video camera at a wall of lava lamps in the lobby of its
San Francisco office. Meanwhile, in its London office, a camera watches three pendulums hanging from
pendulums. Both are examples of chaotic systems: the tiniest inaccuracy in your knowledge of
their configuration will send your future predictions wildly off-base. So those camera streams are effectively random. Intrinsic randomness can also come from smaller-scale
physics. Some hardware random number generators record
tiny fluctuations in the movements of electrons in the circuitry. And Geiger counters trained on decaying hunks
of uranium will click at a rate determined by quantum mechanics, which is
inherently probabilistic. So, by the strict definition, it’s more
likely than not that your football game isn’t truly random. It’s probably flipping a coin based on a
simple PRNG seeded with the time you started playing. Even so, losing a bunch of coin flips is probably
just a brief unlucky streak; you’d be hard-pressed to show any bias with
larger statistical tests. For high-stakes contexts like security, though,
rock-solid randomness is essential, and there’s a whole cottage
industry for supplying it. Lava lamps may not be the height of cool anymore,
but as it turns out, they are still useful for security. Or maybe programmers just really miss the
70s. If you miss 1970s bedroom decor, or even if
you don’t, you can still strengthen your programming
skills at Brilliant.org/SciShow with the Computer Science Fundamentals course. Even if you know nothing about computer science, the lessons and quizzes build on each other. So it’s impossible to finish a lesson without
improving. And if you are a confident computer programmer, the quizzes might still bend your brain a
little bit, and you can always work on problems posed by other Brilliant.org
premium members. Brilliant is offering the first 200 SciShow
viewers to go to Brilliant.org/SciShow a 20% discount off of an annual premium subscription. So check it out! You’ll definitely learn and have fun, and you’ll be helping to support SciShow
while you do! So, thanks! [♩OUTRO]
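
The middle square method described in the transcript is simple enough to sketch in a few lines. The Python below is only an illustration: the function names, the four-digit width, and the example seeds are assumptions chosen for the demo, not anything shown in the video.

```python
def middle_square(seed: int, digits: int = 4):
    """Yield numbers using the middle square method.

    Assumes `digits` is even and `seed` has at most `digits` digits.
    Each output also becomes the next seed.
    """
    if digits % 2 != 0:
        raise ValueError("digits must be even")
    state = seed
    while True:
        # Square the seed and left-pad with zeros to exactly 2 * digits characters.
        squared = str(state * state).zfill(2 * digits)
        # Keep the middle `digits` characters as the next number.
        start = digits // 2
        state = int(squared[start:start + digits])
        yield state


def first_values(seed: int, count: int, digits: int = 4) -> list:
    """Return the first `count` outputs for a given seed."""
    gen = middle_square(seed, digits)
    return [next(gen) for _ in range(count)]


print(first_values(1234, 4))  # [5227, 3215, 3362, 3030]
print(first_values(2500, 3))  # [2500, 2500, 2500]
```

The second call shows the looping problem in its most extreme form: 2500 squared is 6250000, whose middle four digits are 2500 again, so that seed gets stuck right away.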

Danny Hutson

100 thoughts on “The Randomness Problem: How Lava Lamps Protect the Internet”

  1. Note: At 3:55, the comparison between different sets of coin flips was between two specific sequences, like 10 heads and then 10 tails versus alternating heads and tails, not between half heads/half tails and any other mixed-up order. Thanks to those who pointed out the potential for confusion!

  2. Computers (generally) resolve packet collisions with a system known as Binary Exponential Backoff, which eschews the requirement for random numbers.
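
For reference, binary exponential backoff as it is usually described for classic Ethernet still draws a random number: after the n-th collision in a row, a station waits a random number of slot times chosen from 0 to 2^n - 1, with the exponent capped (at 10 in the Ethernet scheme). A minimal Python sketch, where the function name is hypothetical and Python's general-purpose PRNG stands in for whatever a real network interface uses:

```python
import random


def backoff_slots(collisions: int, rng: random.Random, max_exponent: int = 10) -> int:
    """Pick how many slot times to wait after `collisions` consecutive collisions.

    The window doubles with each collision (up to a cap), and the wait is
    drawn uniformly at random from that window.
    """
    exponent = min(collisions, max_exponent)
    return rng.randrange(2 ** exponent)  # uniform over 0 .. 2**exponent - 1


# Two stations with different random states will usually pick different delays,
# which is what breaks the symmetry between them.
station_a = random.Random()
station_b = random.Random()
for n in range(1, 4):
    print(n, backoff_slots(n, station_a), backoff_slots(n, station_b))
```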

  3. Some sources of randomness: any radio tuned between two stations, an analog TV set on any station (bonus points if the station is unused), a mic on a busy street, a feed from some radio telescope, …

  4. What about TRNGs (True Random Number Generators)? Are those not used, or are they just not used enough to warrant mention?

  5. Actually, lava lamps streamed over the internet are deterministic, because you only need the start time of the stream and, of course, the stream itself. So they're also not really random for protecting the internet 😉

  6. My mind wanders off sometimes. This is something I was thinking about in my car the other day. Glad I finally got an answer

  7. I've always wanted a truly random shuffle option on my music players. I want to be able to select a large number of songs or even my entire library at once and play on shuffle… But it seems that once you go over several hundred songs, or thousand song playlists…Every music player starts doing the same pattern… It plays the same songs on repeat..anyone else have this same issue?

  8. This is a great video. Good job! You should do one about the mathematical limitations of computers beyond just RNG. i.e. how a computer "cheats" to do anything but addition 😀

  9. Yup. Familiar with this problem.
    Hence all sorts of secondary problems associated with people forgetting that binary computers cannot engender true randomness.
    Fractals come to mind.

  10. Too bad the Lava Lamps do nothing to protect us from unhinged political activists banning people who they disagree with from social media services and from funding sources and also manipulating search engine results to influence public opinion for political reasons.

  11. You're a little off on how SSL/TLS works. SSL works more like this: You come up with a super secret random hash and a public hash. The Public key is a lock, the secret hash is a key.

    When you want to talk to someone, you ask for their public key, then you use that key to encrypt your message to them. Now only they can decrypt it, because only they have their private key, whereas when they talk to you they use your public key to encrypt the message to you.

  12. Another thing that I think should be stated even more so is that all of these methods are equally as useful as each other in their own niches.
    PRNGs are great for stuff where you need a guarantee of certain amounts of randomness, for example. Predictable outcomes in gambling, games. Literally controllable randomness. It's grafting difficulty levels on to an RNG. Great for procedural generation as well.
    TRNGs are great for security, for certain sciences.

    Also, I laughed at the dot diagram. I know that diagram all too well.
    I actually wanted to avoid the true random outcome for some cases, but want it for others.
    Both for the same code, in fact. Just made it a flag that can get set depending on the need.

    Fuzzy logic processor units are seemingly making a bigger appearance in processors these days.
    Hopefully they become more popular because they are great for saving power too. You use things that don't require precise math, you can run it over a fuzzy logic system, then correct it after so many steps. A good example is animation of a game, or animating a window moving, scrolling a page fast where detail isn't going to be needed because you couldn't hope to read anything anyway, etc.
    They're great for low power systems.
    BUT where they shine is randomness for AI behaviours. Fuzzy-weighted averages are great for that stuff! Our brain does a slightly similar thing when it does decision-making, but at an even more fundamental level over millions of more nodes compared to the beefy-but-limited nodes in fuzzy logic processors.

  13. I know this is a very late comment, but if anyone wants to look at the lava lamps mentioned in this video, you can go here: https://youtu.be/1cUUfMeOijg Watch Tom Scott talk about the whole process.

  14. True Randomness = Impossible…

    Artificial/Natural Randomness = Barely predictable and easy to break down on how that type randomness really work…

    Infinite Randomness = ERROR

    Personal Randomness = YOU AND ME… Start the *Personal Judgement Sport*!

  15. Hmm… "We can make religion out of this?".

    (Come on i can't be the first one to directly think of this quote after hearing this info?)

  16. The example of the lava lamps still does not provide a secure, truly random solution. This example is the point where the understanding of randomness and security in this video breaks down.

    Predictability, and the resulting security of random numbers, is not so much about the ability to guess one of a set of numbers; the bigger problem would be the ability to statistically influence the outcome.

    Consider the lava lamps. While their "motion" seems very random indeed, the mechanism that produces it, namely heat, is quite easy to manipulate. There could even be an inherent seasonal bias, but if you have access to the building's AC/heating system or information about it, the procedure would already be compromised.

    The question then simply is how much of a heat variance is necessary to nudge the output to, say, favor more 1's than 0's. If no heat produces only 0's, or some pattern that is clearly biased, then there must by definition be a threshold at which point you can influence and predict outcomes.

  17. Hey Streaky, since you asked I recently did a pretty thorough evaluation of the elliptical filters on the market. Maybe you'll find it interesting: http://ianstewartmusic.us/blog/center-that-sub

  18. Compare the point about any specific permutation having the same probability (coins) with the point being made with the black-on-white dots. What justification do we have to guess between the two sets if they were generated randomly or not? The answer is there are more ways for the placement of dots to look "messy" than "neat"
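
The counting behind that answer can be checked directly for the coin example. The short Python sketch below (variable names are just for illustration) compares the probability of one specific sequence of 20 flips with the probability of getting ten heads in any order:

```python
from math import comb

FLIPS = 20

# Every specific sequence of 20 fair flips is equally likely.
total_sequences = 2 ** FLIPS
p_specific_sequence = 1 / total_sequences

# "Ten heads then ten tails" is exactly one of those sequences,
# but "ten heads in some order" covers many of them.
ten_heads_any_order = comb(FLIPS, 10)
p_ten_heads_any_order = ten_heads_any_order / total_sequences

print(total_sequences)        # 1048576
print(ten_heads_any_order)    # 184756
print(p_specific_sequence)
print(p_ten_heads_any_order)
```

So a particular "neat" sequence is no less likely than any particular "messy" one; there are simply far more messy-looking sequences than neat-looking ones.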

  19. There is a website for d20 rolls which measures the spin of multiple individual photons to come up with a truly random roll of the dice.

  20. Anyone who is tempted to roll his own broken random number generator (RNG) should at least XOR its output with /dev/urandom of any non-ancient linux machine.

    Things like the video stream of lava-lamps is just marketing, it is snakeoil, nothing more!

    Behind /dev/urandom (and /dev/random) of Linux machines is a cryptographically secure pseudo random number generator (CSPRNG) that has been audited by many people, which constantly adds entropy from many different hardware events to its seed. The /dev/random variant is the same, it just blocks if for some time no entropy has been added to the seed.

    If people would just read their randomness from /dev/urandom instead of implementing their own broken random number generator (or setting the seed to time()), then many of these security problems would have been avoided.

    If you don't trust some single source of randomness, like I wouldn't trust any weird implementation or like many people think that /dev/urandom is bad just because /dev/random exists (it isn't true, urandom is OK!), then XOR the outputs of RNGs together. Even if one of the RNGs is so broken that it only produces nothing but ones or nothing but zeros, the XOR operation with a functioning CSPRNG makes sure that the output is cryptographically secure.

    And always remember, the less complexity a system has, the more secure it is. A function that just reads from /dev/urandom is less complex than some convoluted procedure to extract randomness from a video of lava-lamps.
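
The XOR suggestion in that comment can be sketched in Python. This is an illustration under assumptions, not a vetted design: `os.urandom` is used as the interface to the operating system's CSPRNG, while `xor_bytes`, `combined_random`, and `stuck_source` are hypothetical names, and the argument only holds if the second source is independent of the first.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def combined_random(n: int, other_source) -> bytes:
    """Return n bytes: the OS CSPRNG output XORed with another source's output.

    `other_source` is any callable taking a byte count and returning that
    many bytes. It is assumed to be independent of the OS generator.
    """
    return xor_bytes(os.urandom(n), other_source(n))


def stuck_source(n: int) -> bytes:
    """A deliberately broken source that only ever produces ones."""
    return b"\xff" * n


# Even with the broken source, the result is just the OS output with every
# bit flipped, which is as unpredictable as the OS output itself.
print(combined_random(16, stuck_source).hex())
```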

  21. Gear drops in WoW dungeons, before the system was changed to personal loot, had seeds based partly on who was the first person in the instance. I think time of day may have been a factor too. Love watching a 1 in 8 chance of a drop taking 20 chances, and repeating the other drops day after day. Before the change, I would reroll the seed if I was not getting what I wanted from the instance.

  22. Complete randomness is impossible since one instant of time is fully dependant on the previous and the next one. Not even probability in quantum physics is random, since it follows a pattern. It makes sense from a deterministic point of view, events that are guided by probability are just points that you come across as you travel in time but they were determined in the same instant as the universe and its whole history was created. No less real before, during, or after they have happened. Like driving along a countryside road counting cattle, you don't know where the next cow is gonna be but its not random, the cow is there before you pass it and will not disappear because you consider it to be history. After driving for a while you can say that with a certain probability, there is going to be a certain number of cows within a certain time, just like probability in quantum physics, and if you use this information to create a cow crypto, computer models will be able to break it by starting with the most likely key and continue with less and less likely ones until it finds the right one.

  23. My go-to source for truly random numbers is anu.edu.au. They base their results on quantum vacuum:
    https://qrng.anu.edu.au/

  24. Or, they could have used the hardware based True Random Number Generator (TRNG) that has been available in pretty much every PC and server since at least as far back as 2006, either on the motherboard chip-set or on the CPU itself.

    There are also PCIE cards available that generate high speed true random number sequences specifically for network encryption purposes, some of which are based upon random quantum phenomenon (qStream TRNG for example).

    Using images of lava lamps is just silly.

  25. I know that early video games used player input as the basis of their randomness. They would check frame by frame what the player was doing. This does make any scenario reliably replicable in a frame-by-frame input editor, but to a normal player over the course of an entire game it would seem truly random.

    Pokemon Red and Blue are a great example of this, though the RNG calculations start when you reload the game, not from every input from the entire game (in other words, if you turn off the game the randomness resets).

    In summary, humans are a great source of randomness in programming.

  26. Eh, 30 years ago I thought about this too and used a series of webcams pointed outside at traffic for generating random numbers. It was just an experiment though, no production use. Otherwise I would have had to address things like nighttime, where you don't get nearly as much color, just random gradients.

  27. On the random dots, I thought the one on the right was random, because the one on the left has even spacing.
    Which is opposite to most people.

  28. The day we figure out the universe's 'encryption key' (Singularity/the state of the universe just before the big expansion), is the day we will predict the future.

  29. "… lava lamps may not be the height of coolness …", sacrilege! Otherwise, an excellent, clear discussion and, as far as I am competent to comment (undergraduate-level maths completed, with an emphasis in probability and statistics, but far from an "expert"), also very accurate.

  30. Computers predictable… ha ha ha ha ha

    The more complex our software gets, the more unpredictable computers behave. Windows itself is unpredictable from identical machine to identical machine.

  31. So in short, it's super easy to build a circuit or device that uses the environment to generate true randomness. Easy solutions include tiny fluctuations in the air or the physical orientation of your phone, or like on that graphics card, variations in electron flow. We only use the time because it's cheap and easy when pseudorandomness is perfectly fine.

  32. This video scarred me for life. Whenever I'm playing a game I'm looking for patterns now, I'll never believe loot boxes "rng"

  33. If true randomness exists in the real world, does that mean that physics will never be able to explain everything through formulas?

  34. 661083078164559602620747296673926729160100283928628268373718736535154142893937363682825281638
    Easy

  35. in older pokemon games, if you use a savestate before a battle or wild pokemon encounter, and reload that savestate, the wild pokemon will often be the same and the trainer will use the same moves given you use the same moves as well. you can change it by starting the battle at a different time or going into menus and whatnot(i think). there are certain glitches in games that let you manipulate the rng or be able to time your movements so the rng is where it needs to be. if im not mistaken, the glitch that gets you straight from the deku tree to the final boss fight in loz oot works like this.
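
The savestate behavior described there is what any seeded PRNG does: restore the same internal state and you get the same "random" outcomes. A small Python sketch of the idea, using the standard library's generator as a stand-in (the function name, the seed values, and the 1 to 100 roll range are made up for illustration and are not how any particular game works):

```python
import random


def encounter_rolls(seed: int, count: int = 5) -> list:
    """Return `count` pseudo-random rolls from a generator started at `seed`."""
    rng = random.Random(seed)  # same seed means same internal state
    return [rng.randint(1, 100) for _ in range(count)]


# "Reloading the savestate": the same seed replays the same rolls.
first_attempt = encounter_rolls(seed=2009)
second_attempt = encounter_rolls(seed=2009)
print(first_attempt == second_attempt)  # True

# Starting from a different state (for example, a different frame count)
# gives a different series of rolls to compare against.
print(encounter_rolls(seed=2010))
```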

  36. Stefan needs some personality. Like… what does he think about this stuff? Is he interested? Is he just reading a script? He's a great presenter but could be greater!
