Monday, July 26, 2010

Design Detection Heuristics

Benford's law provides a useful heuristic for detecting data that has been produced by a person rather than by a natural process. This makes it handy for spotting fraud, tampering, vote rigging and other activities where an investigator needs a little help. It appears, though, that applying Benford's law is more of an art than a science: rather than being the smoking gun one would like, it serves as the starting point for an investigation or a trigger for caution.

I've developed a Splunk App that adds a new command, benford, to the Splunk search language. The command calculates the first-digit distribution of the field of interest, which can then be graphed.

* | benford field=price | table digit price benford

Other digits can be selected as follows:

* | benford field=price digit=2 | table digit price benford
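
For readers who want to see what a non-leading digit should look like, here is a minimal Python sketch. It assumes that digit=2 means the second significant digit; the function names benford_expected and nth_digit are hypothetical and are not part of the Splunk App. The expected probabilities come from the standard generalization of Benford's law to later digit positions.

```python
import math

def benford_expected(position, d):
    """Expected probability that the digit at `position` (1 = leading) equals d."""
    if position == 1:
        # The leading digit can never be zero.
        return math.log10(1 + 1 / d) if d >= 1 else 0.0
    # Sum over every possible prefix of (position - 1) leading digits.
    lo = 10 ** (position - 2)
    hi = 10 ** (position - 1)
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(lo, hi))

def nth_digit(value, position):
    """Return the significant digit at `position`, or None for zero.

    Assumes a finite number and a position between 1 and 16.
    """
    mantissa = f"{abs(value):.15e}".split("e")[0].replace(".", "")
    if mantissa[0] == "0":
        return None
    return int(mantissa[position - 1])
```

For example, nth_digit(1234.5, 2) picks out the 2, and benford_expected(2, 0) is roughly 0.12. The second-digit distribution is much flatter than the first-digit one, so deviations are harder to see by eye.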

Here are some sample transactions I generated:

The benford command will calculate the distribution of the first digit and produce a table, which can be graphed.
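
The calculation behind such a table can be sketched in a few lines of Python. This is an illustration of the idea only, not the App's actual code; the names first_digit and benford_table are hypothetical, and the input is assumed to be a list of finite numbers. The expected column uses Benford's formula, P(d) = log10(1 + 1/d).

```python
import math
from collections import Counter

def first_digit(value):
    """Return the leading non-zero digit of a finite number, or None for zero."""
    # Scientific notation puts the leading digit first: 'd.ddd...e+XX'.
    d = int(f"{abs(value):.15e}"[0])
    return d if d != 0 else None

def benford_table(values):
    """Return rows of (digit, observed share, Benford expected share)."""
    digits = [d for d in map(first_digit, values) if d is not None]
    counts = Counter(digits)
    total = len(digits)
    rows = []
    for d in range(1, 10):
        observed = counts[d] / total if total else 0.0
        expected = math.log10(1 + 1 / d)
        rows.append((d, observed, expected))
    return rows
```

Calling benford_table on a list of prices yields nine rows, one per digit, which is the shape needed to plot observed against expected side by side.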

The following graph compares the observed digit distribution with the Benford distribution.

The following graph was created using real transactional data.
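
Graphs like these are judged by eye, but the gap between observed and expected can also be put into a single number. The sketch below uses Pearson's chi-square statistic, which is one common choice and not something the App is known to compute. The function name benford_chi_square is hypothetical, and the input is assumed to be a dict of first-digit counts.

```python
import math

def benford_chi_square(counts):
    """Chi-square statistic of first-digit counts against Benford's law.

    `counts` maps each digit 1-9 to the number of times it was observed.
    """
    total = sum(counts.get(d, 0) for d in range(1, 10))
    if total == 0:
        raise ValueError("no observations")
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat
```

With nine digits there are 8 degrees of freedom, so a statistic above about 15.5 would be unusual at the 5% level if the data really followed Benford's law. In keeping with the caution above, a large value is a reason to look closer, not proof of tampering.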


  1. You can download a copy of the App from

  2. Even more interesting
    "Benford's Law And A Theory of Everything"

    A new relationship between Benford's Law and the statistics of fundamental physics may hint at a deeper theory of everything.