Prime Frequency
- George Fane
- Feb 3, 2020
While creating my fourth version of a Prime Number Finder, I didn't start by searching for primes under massive numbers. I started small with primes under 99, then 999, then 9999, and so on. There were 25, 168, and 1,229 primes under those bounds respectively, so the proportion of numbers that are prime kept decreasing: roughly 25%, 17%, and 12%. I wanted to look further into the frequency of prime numbers.
Frequency with C++
My first run at studying prime frequency, about a month ago, used the same prime-finding algorithm as PNF v4, but output each prime's position divided by the prime itself (its frequency) instead of only the prime. Copying the console output over to a spreadsheet proved much more tedious than I had expected. As it turned out, I couldn't paste the 10,000 frequencies all at once; instead, I had to copy them roughly 1,000 at a time. This meant printing only the frequencies within a certain range of prime positions, like 1 to 1,000, copying that, pasting it into a spreadsheet, adjusting the range to 1,001 to 2,000, and repeating the cycle ten times in total.
Thankfully, this pain in the neck yielded fruit:
With a power series regression, I found a curve of best fit (the red line) with a fairly good R-squared value. The open question was whether this regression would hold up for much larger prime positions.
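For readers who want to reproduce a fit like this outside a spreadsheet, here is a minimal sketch in Java. It assumes the trendline has the form y = a * x^b and fits it by ordinary least squares on the logarithms of both axes, which is a common way to compute a power fit; whether the spreadsheet does exactly this is an assumption. The class name `PowerFit` and its methods are hypothetical and do not come from the linked programs.

```java
// Hypothetical helper, not the spreadsheet's own trendline code.
// Fits y = a * x^b by linear least squares on (ln x, ln y).
class PowerFit {
    final double a; // coefficient
    final double b; // exponent

    PowerFit(double a, double b) {
        this.a = a;
        this.b = b;
    }

    /** Fits y = a * x^b. Every x and y must be positive, and the x values must not all be equal. */
    static PowerFit fit(double[] x, double[] y) {
        int n = x.length;
        double sumLx = 0, sumLy = 0, sumLxLx = 0, sumLxLy = 0;
        for (int i = 0; i < n; i++) {
            double lx = Math.log(x[i]);
            double ly = Math.log(y[i]);
            sumLx += lx;
            sumLy += ly;
            sumLxLx += lx * lx;
            sumLxLy += lx * ly;
        }
        // Slope and intercept of the straight line ln y = ln a + b * ln x.
        double b = (n * sumLxLy - sumLx * sumLy) / (n * sumLxLx - sumLx * sumLx);
        double lnA = (sumLy - b * sumLx) / n;
        return new PowerFit(Math.exp(lnA), b);
    }

    /** Projects the fitted curve to a new x, e.g. a much larger prime position. */
    double predict(double x) {
        return a * Math.pow(x, b);
    }
}
```

With x as prime position and y as frequency, `PowerFit.fit(x, y).predict(500000)` would give the projected frequency at the 500,000th prime, which can then be compared against the measured value.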
Frequency with Java (Sieve)
After writing my article on Better Prime Finding Methods two days ago, I began to think again about ways to bring my primes over to a spreadsheet, especially from my Sieve of Eratosthenes program in Java, which could quickly output all primes below ten million. I recalled an assignment from my C++ class two years ago, for which we wrote our output to a file instead of the console as usual. Thankfully, learning how to do this in Java was a quick Google search away. I modified my Java Sieve program to print prime frequencies to a .txt file:
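The actual program is linked as Program 2 at the end of this article. As a minimal sketch of the same idea, and not the exact code, the version below sieves all primes below a limit and writes one tab-separated line per prime. The class name, method names, column layout, and the file name `frequencies.txt` are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and output format are illustrative only.
class PrimeFrequencyWriter {

    /** Sieve of Eratosthenes: returns every prime strictly below limit, in order. */
    static List<Integer> primesBelow(int limit) {
        List<Integer> primes = new ArrayList<>();
        if (limit <= 2) {
            return primes; // there are no primes below 2
        }
        boolean[] composite = new boolean[limit];
        for (int i = 2; i < limit; i++) {
            if (composite[i]) {
                continue;
            }
            primes.add(i);
            // Cross off multiples starting at i*i; long avoids int overflow.
            for (long j = (long) i * i; j < limit; j += i) {
                composite[(int) j] = true;
            }
        }
        return primes;
    }

    /** Writes "position<TAB>prime<TAB>frequency" per line, where frequency = position / prime. */
    static void writeFrequencies(List<Integer> primes, String path) throws IOException {
        try (PrintWriter out = new PrintWriter(path)) {
            for (int i = 0; i < primes.size(); i++) {
                int position = i + 1; // positions are 1-based: 2 is the 1st prime
                int prime = primes.get(i);
                out.println(position + "\t" + prime + "\t" + (double) position / prime);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        writeFrequencies(primesBelow(10_000_000), "frequencies.txt");
    }
}
```

Tab-separated columns are a convenient choice here because a spreadsheet splits them into separate columns when the text is pasted in.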
From the file, I was able to copy 664,579 frequencies to an Excel spreadsheet all at once, a nice improvement over 1,000 at a time.

Because Excel allows a maximum of 250 data points per chart series, I wrote a formula to include one point per 3,000 prime positions:
With the formula below, I found the prime corresponding to each of my chosen prime positions:

XLOOKUP is a new Excel function that can supplant VLOOKUP, HLOOKUP, and my usual tool for this sort of thing, INDEX MATCH. XLOOKUP's first parameter is the value to find, the second is the range in which to search for that value, and the third is the range from which to return a value. The returned value sits at the same position in the third range as the lookup value does in the second.
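To make the three parameters concrete, here is a rough Java sketch of what an exact-match XLOOKUP does. It is an illustration of the behavior described above, not Excel's implementation, and the class and method names are hypothetical. It returns `null` where Excel would show #N/A.

```java
// Hypothetical illustration of exact-match XLOOKUP semantics.
class Lookup {

    /**
     * Finds the first index where lookupArray equals lookupValue and returns
     * the entry at that same index in returnArray, or null when nothing matches.
     */
    static Double xlookup(int lookupValue, int[] lookupArray, double[] returnArray) {
        for (int i = 0; i < lookupArray.length; i++) {
            if (lookupArray[i] == lookupValue) {
                return returnArray[i];
            }
        }
        return null; // no match
    }

    public static void main(String[] args) {
        int[] positions = {1, 2, 3, 4, 5};
        double[] primes = {2, 3, 5, 7, 11};
        // The 4th prime is 7, so this prints 7.0.
        System.out.println(xlookup(4, positions, primes));
    }
}
```

Passing a column of frequencies instead of primes as the third argument gives the frequency at a chosen position in the same way.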
I used a very similar formula to select frequencies:

Finally, I projected the prime frequency by position using Chart 1's regression formula:

Then I realized that Google Sheets makes it easier to create charts with multiple series, so I copied Columns E through H to a Sheets file. There, I checked whether my projection based on the first 10,000 primes might hold into the hundreds of thousands of primes:
It did not. Perhaps if primes were so easy to predict, number theory would not be so interesting a field.
Through this article, I learned a new Excel function, XLOOKUP, that would've made my internship last semester a little easier, when I used INDEX MATCH to see whether names and emails existed in a database. I also learned how to print to a text file rather than the console, enabling me to study the output more easily. This helped me especially because I generated and worked with many more primes than ever before, all of those below ten million.
Thank you for reading!
George Fane
Included Links
Program 1 - PNFv4 in C++: https://onlinegdb.com/ByNUpoAhH
Chart 1 - Frequency in Google Sheets from PNFv4 in C++: https://docs.google.com/spreadsheets/d/1iz15E3KW8n5oljrOOrjOgP9J2Z7onYrwP_qOE_W_1KI/edit?usp=sharing
Program 2 - Prime Sieve in Java: https://onlinegdb.com/r1f8Jz1TS
Chart 2 - Frequency in OneDrive Excel from Prime Sieve in Java: https://northvilleschools-my.sharepoint.com/:x:/g/personal/fanege_northvilleschools_net/ES4lzzz1MlFAnKX4QbNYexMBZCpVDhgjrb_iB59lKURQGQ?e=qtpMwy
Chart 3 - Chart 2's Selected Frequencies copied into Google Sheets: https://docs.google.com/spreadsheets/d/1H7c_MTjXOnb98VZ3ymQnfVnjil3vsJ5dxU11rVEaNtY/edit?usp=sharing