Statistics

Sometimes even the simplest analytics are extremely useful.  They often tell you what is happening in the detailed transactions without you having to make assumptions about the data.  STATISTICS is one of those commands: with a couple clicks of the mouse you can get:

  • Min and Max
  • Highest and Lowest ‘X’ values
  • Number of Positive, Negative and Zero valued transactions
  • Average
  • Median, Mode and Q25 and Q75

But like any analysis, you need to know why you are running the analysis and you need to be able to interpret and understand the results.

Why run STATISTICS?  I run Statistics to get an overview of my data.  It helps me to understand the nature of the transactions, to verify my scope, to highlight anomalies, and much much more.  Here are a couple of examples:

  • Count
    • I use this to verify the control totals. In one case I found that my Total for a critical field was out by a significant amount.  It turned out that the numeric fields had 3 decimal points (implied) and not two as I was told.  Without verify my control total before doing additional analysis, I would have wasted a lot of time.
  • Min/Max values:
    • All transactions should be in the previous fiscal year. The Min value should be less than or equal to the start of the fiscal year and the max should be greater than or equal to the end of the fiscal year.  I found billing transactions that were in the wrong fiscal year.
    • All credit card (p-card) transactions should be less than $500.00. Not only did I find amount greater than $500, but I also found numerous credits being processed – which led to a fraud discovery.
  • Highest/Lowest values:
    • The Maximum value was significantly higher than the second highest value. It ended up being an error where the invoice number was entered into the amount field – making the amount much larger than it should have been.
  • Number of Positives/Negatives and Zero values
    • Found hundreds of thousands of transactions with zero values which led to a training issue. A/P clerks could use bulk entry forms for simple invoices.  The form automatically generated a blank invoice entry form with 100 lines.  If the clerk did not type “Cntl –“ before saving the entered invoice details, the blank (unused) lines remained and were processed as invoices with zero value.  This was causing the system to run slow and generating exception reports.
    • Found several hundred Negative “quantity received” amounts. The Receipting clerk at a warehouse was entering negative receipts after entering the correct receipt amount.  Invoice was paid based on the correct receipt amount; but the negative receipt entry (without an invoice number) reduced the quantity in inventory and allowed her to steal the items.
  • Mode value
    • This identified a fraud where an employee was entering meal charges when not on travel status. Amounts under $10.00 did not require approval.  The employee entered hundreds of meal claims at $9.99 which was pickup by the mode option.
    • The most often occurring date was just before bonuses were determined. Salespersons were entering a large number of fictitious sales on the day before performance bonuses were calculated.  The sales were reversed at later dates.
    • The most occurring amount for contracts was just under the “must go competitive” value. Managers were establishing contracts at $49,900 – which was below the $50,000 limit for sole-sourced contracts.

In general, STATISTICS lets me see what the data looks like from a high-level perspective.  However, it also can quickly identify anomalies (e.g. negative when all transactions should be positive; values outside the expected range; unusual maximum values, etc.).  .  This helps with determining risk, sampling strategies, identification of control weaknesses, materiality amounts, etc.  When combined with STRATIFY, I can get an overview of the ranges (or strata) of the data which provides even more useful information.

Leave a comment