Benford's law states that in many naturally occurring collections of numbers, the leading digit is disproportionately likely to be small. The digit 1 appears as the first digit about 30.1% of the time, while 9 appears only 4.6% of the time, far from the 11.1% each digit would claim if leading digits were uniformly distributed 3. The law has been observed in electricity bills, street addresses, stock prices, river lengths, populations of countries, physical constants, and molecular weights 2.
The probability that a number's first digit is d follows the formula P(d) = log₁₀(1 + 1/d), which yields this distribution 3:

| Leading digit | Predicted frequency |
|---|---|
| 1 | 30.1% |
| 2 | 17.6% |
| 3 | 12.5% |
| 4 | 9.7% |
| 5 | 7.9% |
| 6 | 6.7% |
| 7 | 5.8% |
| 8 | 5.1% |
| 9 | 4.6% |

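
The table can be reproduced directly from the formula. The following is a minimal sketch; the function name `benford_probability` is chosen here for illustration and does not come from any library.

```python
import math

def benford_probability(d: int) -> float:
    """Predicted probability that the leading digit equals d (1 through 9)."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Print one row per digit, formatted as a percentage with one decimal place.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine probabilities sum to 1 because the sum telescopes: log₁₀(2/1) + log₁₀(3/2) + ... + log₁₀(10/9) = log₁₀(10).
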
Canadian-American astronomer Simon Newcomb first described the pattern in 1881 after noticing that earlier pages of logarithm tables — the ones covering numbers beginning with 1 — were far more worn than later pages 1. He published a two-page note in the American Journal of Mathematics proposing that the mantissae (fractional parts) of the logarithms of naturally occurring numbers are uniformly distributed, which directly implies the first-digit law 16.
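
The step from uniform mantissae to the first-digit law can be written out in a few lines. In this sketch, x is a positive number, {·} denotes the fractional part, and the symbols U (the mantissa) and D (the leading digit) are notation introduced here for illustration, not Newcomb's own.

```latex
\begin{align*}
U &= \{\log_{10} x\} \sim \mathrm{Uniform}[0, 1) \\
P(D = d) &= P\bigl(\log_{10} d \le U < \log_{10}(d + 1)\bigr) \\
         &= \log_{10}(d + 1) - \log_{10} d \\
         &= \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9.
\end{align*}
```

The second line holds because x has leading digit d exactly when d·10^k ≤ x < (d + 1)·10^k for some integer k, and taking logarithms removes the integer k from the fractional part.
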
The observation lay dormant for 57 years until physicist Frank Benford independently made the same discovery in 1938, also prompted by worn logarithm tables 2. Where Newcomb had offered a brief theoretical note, Benford compiled 20,229 observations across 20 different data sets: surface areas of 335 rivers, populations of 3,259 U.S. cities, 104 physical constants, 1,800 molecular weights, 5,000 entries from a mathematical handbook, 308 numbers from an issue of Reader's Digest, street addresses of the first 342 people listed in American Men of Science, and 418 death rates, among others 23. The law now carries Benford's name — itself an instance of Stigler's law, which holds that no scientific discovery is named after its original discoverer 3.
The law tends to hold when data spans several orders of magnitude. An intuitive explanation comes from exponential growth 4. Consider a population starting at 1,000 that doubles annually. Growing from 1,000 to 2,000 requires the population to double, so the leading digit remains 1 for a full doubling period. Growing from 8,000 to 9,000 requires only a 12.5% increase, so the leading digit 8 passes quickly. This asymmetry resets at each new order of magnitude, producing a persistent excess of small leading digits 4.
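
The doubling example can be checked numerically. The sketch below assumes the population is recorded once per year for 1,000 years (an arbitrary horizon chosen for illustration) and compares the observed share of each leading digit with the predicted one.

```python
import math
from collections import Counter

def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

start = 1_000   # initial population, as in the example above
years = 1_000   # assumed number of yearly observations

# Population after `year` doublings is start * 2**year.
counts = Counter(leading_digit(start * 2**year) for year in range(years))

for d in range(1, 10):
    observed = counts[d] / years
    predicted = math.log10(1 + 1 / d)
    print(f"{d}: observed {observed:.3f}, predicted {predicted:.3f}")
```

Python integers have arbitrary precision, so the leading digits are exact even after hundreds of doublings.
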
The law is also the only first-digit distribution that is invariant under changes of unit. River lengths follow it whether measured in miles, meters, or furlongs; multiplying every value by a constant (the conversion factor) leaves the distribution of leading digits unchanged 45. In 1995, mathematician Theodore Hill proved that this scale invariance, generalized to base invariance, uniquely characterizes Benford's law. His paper in the Proceedings of the American Mathematical Society showed that Benford's law is the only probability distribution of significant digits that stays the same when every value is multiplied by a positive constant 5.
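
Scale invariance is easy to see in a simulation. The sketch below uses synthetic data, not real river lengths: the "lengths in miles" are drawn log-uniformly over four orders of magnitude (an assumption that makes them conform to Benford's law), then converted to kilometers. The helper names are hypothetical.

```python
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])

def digit_frequencies(values):
    """Share of values starting with each digit 1 through 9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic "lengths in miles", log-uniform between 1 and 10,000.
lengths_miles = [10 ** rng.uniform(0, 4) for _ in range(100_000)]

MILES_TO_KM = 1.609344
lengths_km = [x * MILES_TO_KM for x in lengths_miles]

freq_miles = digit_frequencies(lengths_miles)
freq_km = digit_frequencies(lengths_km)
for d in range(1, 10):
    print(f"{d}: miles {freq_miles[d]:.3f}, km {freq_km[d]:.3f}")
```

Individual values change their leading digit under the conversion, but the overall shares for the two units should agree to within sampling noise.
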
Hill also proved a result about mixed distributions: if you draw numbers from many different probability distributions and pool them, the combined set tends toward Benford's law 34. This explains why collections of unrelated numbers — such as all the figures printed in a newspaper — conform to the pattern even though no single generating process produces them.
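
A rough simulation illustrates the pooling effect. This is a sketch under stated assumptions rather than a rendering of Hill's theorem: the three distribution families, the range of scales (six orders of magnitude), and the sample sizes are all choices made here for illustration.

```python
import math
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])

rng = random.Random(42)  # fixed seed so the run is repeatable

def draw_from_random_distribution(n: int) -> list[float]:
    """Pick a distribution at random, then draw n values from it."""
    scale = 10 ** rng.uniform(0, 6)  # assumed: scales span six decades
    family = rng.choice(["uniform", "exponential", "lognormal"])
    if family == "uniform":
        return [rng.uniform(0, scale) for _ in range(n)]
    if family == "exponential":
        return [rng.expovariate(1 / scale) for _ in range(n)]
    return [rng.lognormvariate(math.log(scale), 1.0) for _ in range(n)]

# Pool small samples from many different distributions.
pooled = []
for _ in range(5_000):
    pooled.extend(x for x in draw_from_random_distribution(20) if x > 0)

counts = Counter(leading_digit(x) for x in pooled)
pooled_freq = {d: counts[d] / len(pooled) for d in range(1, 10)}
for d in range(1, 10):
    print(f"{d}: pooled {pooled_freq[d]:.3f}, predicted {math.log10(1 + 1 / d):.3f}")
```

No single distribution in the pool needs to follow the law. A uniform distribution on 0 to 500, for instance, produces leading digits 1 through 4 about equally often and 5 through 9 rarely.
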
When people fabricate financial data, they tend to distribute leading digits uniformly or follow their own intuitive biases, producing digit patterns that deviate from Benford's law 3. In 1992, Mark Nigrini's doctoral dissertation at the University of Cincinnati demonstrated that Benford's law could be applied systematically to detect accounting fraud, pioneering the field of "digital analysis" in auditing 7.
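
One simple way to quantify such a deviation is a chi-square goodness-of-fit test on the leading digits. The sketch below is an illustration of that general idea, not Nigrini's procedure; the function names and both sample data sets are made up, and 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CHI2_CRITICAL_DF8_05 = 15.507  # chi-square critical value, 8 degrees of freedom, 5% level

def leading_digit(x: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values) -> float:
    """Chi-square statistic comparing observed leading digits with Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

# Made-up "fabricated" amounts whose leading digits are spread evenly over 1 to 9.
fabricated = [d * 100 + 7 for d in range(1, 10)] * 50
# Made-up Benford-like data: successive powers of 2.
genuine_like = [2**k for k in range(1, 301)]

for name, data in [("fabricated", fabricated), ("genuine-like", genuine_like)]:
    stat = benford_chi_square(data)
    verdict = "deviates" if stat > CHI2_CRITICAL_DF8_05 else "consistent"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A large statistic only flags data for closer inspection. It is not evidence of fraud on its own, since legitimate data can also deviate from the law.
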
Financial adviser Wesley Rhodes was convicted of investor fraud after prosecutors showed that the leading digits in his fabricated documents deviated from Benford's distribution 4. Computer scientist Jennifer Golbeck applied the law to social networks, finding that the follower counts of a typical user's followers conform to Benford's law, but those of bot networks do not — a technique she used to identify Russian bot accounts on Twitter 84.
Researchers have also applied Benford's law to macroeconomic data. Analysis of Greek government statistics revealed digit patterns inconsistent with the law, lending support to claims that Greece had manipulated economic figures in its application to join the eurozone 4. The 2009 Iranian presidential election drew similar scrutiny when the vote tallies in certain provinces showed anomalous digit distributions 4.
Not all data sets follow Benford's law. Adult human heights measured in feet almost always begin with 4, 5, or 6. Telephone numbers and lottery results do not follow it either 3. The law applies most reliably to data that spans multiple orders of magnitude and arises from multiplicative processes or the aggregation of diverse sources 3. Data sets with narrow ranges or artificially constrained values, such as prices set to end in .99, will deviate 3.
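
A quick screen for the "multiple orders of magnitude" condition is to measure how many decades a data set covers. The sketch below is one possible check, with a hypothetical helper name and made-up sample values; it does not test the other conditions mentioned above.

```python
import math

def orders_of_magnitude(values) -> float:
    """Number of decades between the smallest and largest positive value."""
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("need at least one positive value")
    return math.log10(max(positive) / min(positive))

heights_ft = [4.9, 5.2, 5.5, 5.8, 6.1]                        # made-up adult heights in feet
town_populations = [320, 4_800, 51_000, 730_000, 8_400_000]   # made-up population counts

print(f"heights span {orders_of_magnitude(heights_ft):.2f} decades")
print(f"populations span {orders_of_magnitude(town_populations):.2f} decades")
```

The heights cover well under one decade, so their leading digits cannot spread across 1 through 9, while the population figures cover more than four.
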