Monday, July 26, 2010

Design Detection Heuristics


Benford's law provides a useful heuristic for detecting data that has been produced by a person. This makes it handy for spotting fraud, tampering, vote rigging and other activities where an investigator needs a little help. It appears, though, that applying Benford's law is more of an art than a science: rather than being the smoking gun one would like, it serves as the starting point for an investigation or a trigger for caution.
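For reference, Benford's law says that a leading digit d should show up with probability log10(1 + 1/d), so about 30% of values start with 1 and fewer than 5% start with 9. Here is a minimal Python sketch of that expected distribution. The function name is my own for illustration and is not part of the Splunk App.

```python
import math


def benford_first_digit_probs():
    """Return the expected share of each leading digit 1-9 under Benford's law."""
    # P(d) = log10(1 + 1/d) for d = 1..9
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


if __name__ == "__main__":
    for digit, p in benford_first_digit_probs().items():
        print(f"{digit}: {p:.3f}")
```

Data whose leading digits stray far from these shares is a candidate for a closer look, not proof of tampering.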

I've developed a Splunk App that adds a new command, benford, to the Splunk search language. It calculates the first-digit distribution of a field, and the result can then be graphed.

* | benford field=price | table digit price benford

Other digit positions can be selected as follows:

* | benford field=price digit=2 | table digit price benford
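Benford's law also predicts a distribution for later digit positions, which is what a comparison on the second digit would be measured against. The sketch below computes the expected shares for a given position, assuming digit=2 means the second significant digit. The function name and its position parameter are hypothetical and only mirror the option above; this is not the App's code.

```python
import math


def benford_digit_probs(position):
    """Expected distribution of the digit at `position` (1 = leading digit)."""
    if position < 1:
        raise ValueError("position must be >= 1")
    if position == 1:
        # The leading digit can never be 0.
        return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    # Sum over every possible prefix k of (position - 1) digits.
    lo, hi = 10 ** (position - 2), 10 ** (position - 1)
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(lo, hi))
        for d in range(10)
    }


if __name__ == "__main__":
    for digit, p in benford_digit_probs(2).items():
        print(f"{digit}: {p:.4f}")
```

The second-digit distribution is much flatter than the first, so deviations there are harder to spot by eye.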


Here are some sample transactions I generated:

The benford command will calculate the distribution of the first digit and produce a table, which can be graphed.
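To give a feel for the kind of calculation involved, here is a rough Python sketch that builds a similar table of digit, observed share and Benford share from a list of values. It is not the App's implementation; the function names and the sample prices are made up for illustration.

```python
import math
from collections import Counter


def leading_digit(value):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit up front,
    # e.g. 0.0042 -> "4.200000000000000e-03".
    return int(f"{abs(value):.15e}"[0])


def digit_table(values):
    """Build (digit, observed share, Benford share) rows for digits 1-9."""
    nonzero = [v for v in values if v != 0]  # zero has no leading digit
    counts = Counter(leading_digit(v) for v in nonzero)
    total = len(nonzero)
    return [
        (d, counts[d] / total if total else 0.0, math.log10(1 + 1 / d))
        for d in range(1, 10)
    ]


if __name__ == "__main__":
    prices = [123, 19.5, 0.0042, 950, 1]  # made-up values
    for digit, observed, expected in digit_table(prices):
        print(f"{digit}  {observed:.3f}  {expected:.3f}")
```

With only a handful of values the observed column is too noisy to mean anything; a comparison like this needs a reasonably large set of transactions.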

The following graph illustrates the observed digit distribution compared to the Benford distribution.

The following graph was created using real transactional data.

2 comments:

  1. You can download a copy of the App from http://bit.ly/9JBoPm

  2. Even more interesting: "Benford's Law And A Theory of Everything"

    A new relationship between Benford's Law and the statistics of fundamental physics may hint at a deeper theory of everything.

    http://pubsub.com/Benfords-Law-And-A-Theory-of-Everything_Tech-Physics-4DWoeoTi3oA,7nezUbfaxqEE
