Curiosities in Pi: Why 222444 and 666111 Appear in the First 20,000 Digits

Why is there a 222444 and a 666111 close to each other in the first 20,000 digits of pi?

Have you ever noticed that certain digit sequences turn up unexpectedly in the decimal expansion of pi? For instance, why do 222444 and 666111 appear near each other in the first 20,000 digits? Here we look at the probabilities behind such patterns and why they arise.

Digits of Pi and Uniform Distribution

The decimal expansion of pi (π) is non-terminating and non-repeating, a fact that fascinates mathematicians and enthusiasts alike. Irrationality alone guarantees only that the digits never settle into a repeating cycle. The stronger belief, still unproven, is that the digits behave like a random sequence, so that no digit or block of digits is more likely to appear than any other of the same length.

When we tally the first 20,000 digits of pi, each digit appears close to the 2,000 times that a uniform distribution would predict:

1: 1,997 times
2: 1,986 times
3: 1,987 times
4: 2,043 times
5: 2,082 times
6: 2,017 times
7: 1,953 times
8: 1,961 times
9: 2,020 times
0: 1,954 times

It's important to note that while these counts are close to uniform, a proof that every digit, and every block of digits of a given length, occurs in pi with the expected frequency is currently out of reach. This property is called normality, and whether pi has it remains an open problem in mathematics.
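The tally above can be reproduced with a short script. The following is a minimal sketch, not the method used to produce the article's figures: it assumes Machin's formula with integer arithmetic to generate the digits, and the helper names pi_decimals and digit_counts are made up for this illustration. It counts digits after the decimal point only, so the "3" row may differ by one from a tally that includes the leading 3.

```python
from collections import Counter


def pi_decimals(n, guard=10):
    """Return the first n digits of pi after the decimal point as a string.

    Uses Machin's formula, pi = 16*arctan(1/5) - 4*arctan(1/239), in integer
    arithmetic scaled by 10**(n + guard). The guard digits absorb the
    truncation error of the integer divisions.
    """
    scale = 10 ** (n + guard)

    def arctan_inv(x):
        # scale * arctan(1/x) = scale * (1/x - 1/(3x^3) + 1/(5x^5) - ...)
        power = scale // x          # scale / x^(2k+1), starting with k = 0
        total = power
        x2 = x * x
        k = 1
        sign = -1
        while power:
            power //= x2
            k += 2
            total += sign * (power // k)
            sign = -sign
        return total

    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    # pi_scaled reads "3" followed by the decimals; drop the leading 3.
    return str(pi_scaled)[1:n + 1]


def digit_counts(digits):
    """Count how often each digit character occurs in the string."""
    return Counter(digits)


if __name__ == "__main__":
    counts = digit_counts(pi_decimals(20_000))
    for d in "1234567890":
        print(f"{d}: {counts[d]:,} times")
```

Run as a script, it prints one line per digit in the same order as the list above, which makes a side-by-side comparison easy.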

Recognizing Patterns in Random Data

Human beings often recognize patterns in data due to cognitive biases and predispositions. Our brains are wired to find meaning in sequences, even in random data. For example, in the first 20,000 digits of pi, we see several patterns that might strike us as peculiar:

14 15 33 11 32823 44 55 111

314 159265358979323846264 33 83279502884197169399375105820974944592307816406286208 99 8628034825342 11 706798214808651 32823 066470938 44 609 55 058223172535940812848 111 745028410270193852

Patterns like these are not statistically significant. In random digits, a short repeat of this kind is expected to turn up about as often as any other specific sequence of the same length, so finding one is no evidence of hidden structure.

Empirical Evidence and Randomness

To better understand these patterns, let's consider a different approach. Instead of analyzing pi, we can generate a random string of 20,000 digits and observe whether similar patterns emerge. In a truly random sequence you will often find substrings of the form aaabbb: one digit three times, followed by a different digit three times, as in 222444 and 666111. They look remarkable, yet they are common in random data.

In fact, there is roughly an 85% chance that a random 20,000-digit string will contain at least one substring of the form aaabbb. This raises the question: Is the appearance of 222444 and 666111 in the first 20,000 digits of pi a coincidence or a noteworthy pattern?
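A figure of this size can be sanity-checked in two ways. The sketch below is an illustration under stated assumptions, not the calculation behind the article's 85%: it takes aaabbb to mean two different digits (so 111111 does not count), uses a Poisson approximation that treats the overlapping six-digit windows as independent, and then runs a small Monte Carlo simulation. The function names are made up for this example.

```python
import math
import random
import re

# aaabbb with a != b: one digit three times, then a different digit three times.
AAABBB = re.compile(r"(\d)\1\1(?!\1)(\d)\2\2")


def poisson_estimate(n_digits=20_000):
    """Approximate P(at least one aaabbb) in n_digits random digits."""
    windows = n_digits - 5       # number of six-digit windows
    p_window = 90 / 10**6        # 10 choices for a, 9 for b, out of 10^6 strings
    expected = windows * p_window
    # Poisson approximation: P(no occurrence) is about exp(-expected).
    return 1 - math.exp(-expected)


def simulated_estimate(n_digits=20_000, trials=500, seed=0):
    """Fraction of random digit strings that contain at least one aaabbb."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = "".join(rng.choices("0123456789", k=n_digits))
        if AAABBB.search(s):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Poisson approximation: {poisson_estimate():.3f}")
    print(f"Simulation:            {simulated_estimate():.3f}")
```

The Poisson approximation comes out at about 0.83, in the same range as the figure quoted above. The simulated value fluctuates by a few percentage points from seed to seed at this number of trials.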

Avoiding Retrospective Bias

The key to avoiding misleading interpretations is to define clearly what patterns you are looking for before examining the data. If you go looking for "interesting" patterns without a hypothesis fixed in advance, you can adjust your criteria to fit whatever you happen to find. This kind of post hoc reasoning is often called the Texas sharpshooter fallacy: the target is drawn around the hits after the shots have been fired.

Remember, in a long enough random sequence, many curious patterns will appear, and it is essential to have a clear and explicit hypothesis before you start your analysis. Otherwise, you might mistakenly believe that these patterns are more significant than they actually are.
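The cost of choosing the pattern after looking can be made concrete. In the sketch below, the three pattern families are illustrative choices made for this demo, not taken from the article, each with 90 matching six-digit strings. It compares how often a random 20,000-digit string contains one family named in advance with how often it contains at least one of several families that an after-the-fact search might accept as striking.

```python
import random
import re

# Illustrative six-digit "curiosities" (assumed for this demo). In each
# family the two letters stand for different digits.
FAMILIES = {
    "aaabbb": re.compile(r"(\d)\1\1(?!\1)(\d)\2\2"),
    "ababab": re.compile(r"(\d)(?!\1)(\d)\1\2\1\2"),
    "abbbba": re.compile(r"(\d)(?!\1)(\d)\2\2\2\1"),
}


def families_present(digits):
    """Return the names of the families that occur somewhere in digits."""
    return {name for name, rx in FAMILIES.items() if rx.search(digits)}


def compare(n_digits=20_000, trials=300, seed=1):
    """Return (rate for the pre-chosen family, rate for any family)."""
    rng = random.Random(seed)
    hits_fixed = 0
    hits_any = 0
    for _ in range(trials):
        s = "".join(rng.choices("0123456789", k=n_digits))
        found = families_present(s)
        hits_fixed += "aaabbb" in found   # hypothesis fixed in advance
        hits_any += bool(found)           # anything striking, chosen afterwards
    return hits_fixed / trials, hits_any / trials


if __name__ == "__main__":
    fixed, any_family = compare()
    print(f"aaabbb only (pre-specified): {fixed:.2f}")
    print(f"any of {len(FAMILIES)} families (post hoc): {any_family:.2f}")
```

With only three candidate families, the "any family" rate is already close to 1, and a real after-the-fact search has far more than three kinds of pattern to choose from.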

In conclusion, the appearance of sequences like 222444 and 666111 in the first 20,000 digits of pi is just what we would expect if the digits behave like a random sequence. If you want to explore these patterns further, set your criteria up front to avoid falling into the trap of retrospective bias.