
Big Data 101: Using Large-Scale Data Mining to Find Fraud

Posted on March 10, 2015

You may have heard the term “big data” or “data mining,” but what do those terms mean? Today’s WatchBlog sheds light on how GAO analyzes large amounts of data to identify instances of potential improper payments or fraud.

What Is Data Mining and How Does GAO Use It?
Data mining allows us to quickly identify relevant patterns in large databases, typically compiled from multiple sources. We’ve used this technique multiple times to identify potential improper payments or fraud on a large scale. For example, we’ve used data mining to

  • Identify outliers or other particular patterns. For example, we found that about 83,000 Department of Defense employees and contractors who held or were determined eligible for secret, top secret, or other clearances had more than $730 million in unpaid federal tax debt as of June 30, 2012; and
  • Create maps that allow us to easily determine whether there are suspect patterns. For example, the map below shows an expected pattern: greater numbers of people receiving relief funds following Hurricane Sandy lived along the coast.


(Excerpted from interactive graphic in GAO-15-15)
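The first example above, matching clearance holders against unpaid tax debt, comes down to joining records from two sources on a shared identifier. The sketch below is only an illustration of that idea, not GAO's actual method: the record layouts, the `person_id` key, the function name, and all of the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical clearance records: (person_id, clearance_level)
clearance_records = [
    ("P001", "secret"),
    ("P002", "top secret"),
    ("P003", "secret"),
]

# Hypothetical tax debt records: (person_id, unpaid_debt_in_dollars)
tax_debt_records = [
    ("P002", 12_500),
    ("P003", 4_000),
    ("P003", 1_500),   # one person can have several debt records
    ("P999", 80_000),  # has debt but no clearance, so not a match
]

def match_clearance_to_debt(clearances, debts):
    """Return {person_id: total_debt} for people found in both datasets."""
    total_debt = defaultdict(int)
    for person_id, amount in debts:
        total_debt[person_id] += amount
    cleared_ids = {person_id for person_id, _ in clearances}
    return {
        person_id: amount
        for person_id, amount in total_debt.items()
        if person_id in cleared_ids and amount > 0
    }

matches = match_clearance_to_debt(clearance_records, tax_debt_records)
print(f"{len(matches)} matched people, ${sum(matches.values()):,} in unpaid debt")
```

In practice the hard part is usually agreeing on a reliable shared identifier across sources; the join itself is simple once that exists.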

What’s Next for Big Data?
In January 2013, we hosted a forum on using data analytics for oversight and law enforcement agencies. Participants discussed how data mining can also be used to help prevent fraud. For example, predictive analytics—a type of sophisticated data mining—can identify fraudulent claims before they are paid. This may help end the “pay-and-chase” model, where agencies and law enforcement must track down and try to recover fraudulently obtained funds after the money has already gotten into fraudsters’ hands.
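To make the idea of screening claims before payment concrete, here is a minimal sketch. It does not use a trained predictive model; it swaps in a simple outlier rule (a modified z-score based on the median absolute deviation) as a stand-in, and holds unusual claims for review instead of paying them. The function name, the 3.5 cutoff (a common convention for this score), and all claim data are assumptions for illustration.

```python
from statistics import median

def flag_claims_for_review(history, pending, threshold=3.5):
    """Return IDs of pending claims whose amount is far from the historical norm.

    Uses a modified z-score: 0.6745 * |amount - median| / MAD.
    This is a simple outlier rule, not a trained predictive model.
    """
    center = median(history)
    mad = median([abs(amount - center) for amount in history])
    if mad == 0:
        # No spread in the history: anything different from the norm is unusual.
        return [claim_id for claim_id, amount in pending if amount != center]
    flagged = []
    for claim_id, amount in pending:
        score = 0.6745 * abs(amount - center) / mad
        if score > threshold:
            flagged.append(claim_id)
    return flagged

# Hypothetical amounts from previously paid, legitimate claims.
historical_amounts = [900, 1000, 1100, 950, 1050, 1000, 980, 1020]

# Hypothetical claims awaiting payment: (claim_id, amount)
pending_claims = [("C1", 1010), ("C2", 9500), ("C3", 940)]

held = flag_claims_for_review(historical_amounts, pending_claims)
print("Hold for review before payment:", held)
```

The design point is the ordering: the check runs before money goes out, so a flagged claim is reviewed first instead of being paid and chased later.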

Interested in learning more about big data and data mining? Check out our Government Data Sharing Community of Practice where you can find the minutes from prior meetings, register for upcoming events, and sign up to receive e-mails.




About WatchBlog

GAO's mission is to provide Congress with fact-based, nonpartisan information that can help improve federal government performance and ensure accountability for the benefit of the American people. GAO launched its WatchBlog in January 2014 as part of its continuing effort to reach its audiences, Congress and the American people, where they are currently looking for information.

The blog format allows GAO to provide a little more context about its work than it can offer on its other social media platforms. Posts will tie GAO work to current events and the news; show how GAO’s work is affecting agencies or legislation; highlight reports, testimonies, and issue areas where GAO does work; and provide information about GAO itself, among other things.
