Duplicates Invoices – Root Cause Analysis

Cost recovery firms make millions of dollars identifying and recovering duplicate payments.  They often have well developed analytics that can identify duplicate payments while reducing the number of false positives.  You will pay 25-50% but you are getting money back, so it feels like a win-win.  However, there are two things to keep in mind: 1) they go after the low hanging fruit and the largest possible duplicates; and 2) they don’t identify the root cause of the duplicates – so you bring them back next year. A preferrable solution is to develop your own duplicate analytics and use this to not only identify duplicates, but also to support a root cause analysis. Even if you don’t have the expertise to develop the analytics, a consultant can provide you with a turnkey solution that you will be able to run year after year – for less than the 25-50% you are giving a cost recovery firm for just one year.

The identification of duplicates ultimately comes down to finding a balance between the False Positives, False Negatives and True Duplicates.  The ideal situation is one where the number of False Positive (records identified as duplicates which are not duplicates) is Low; the number of False Negatives (records identified as not duplicate when they are duplicates) is Low; and the number of True Duplicates (correctly identified duplicates) is High.  This balance is directly affected by the criteria you select for the duplicates analysis.  If you tighten your criteria for what constitutes a duplicate (e.g., using “Vendor Name and Address” instead of “Vendor Name”), you will reduce the number of False Positives, but you will also increase the number of False Negatives.  Conversely if you loosen the criteria (e.g., using “Vendor Name” instead of “Vendor Name and Address”) you will get less False Negatives, but more False Positives.  So, the selection of the criteria directly affects the identification of duplicates, and the number of False Positive and False Negatives.  Where the balance point exists for your organization will depend on the objective of your test for duplicates (why are you trying to identify duplicates); the impact of having duplicates; the risk tolerance level for duplicates; and the amount of time and effort you are willing to spend on identifying and validating the duplicate test results.

The following outlines some of the steps you can take to reduce both False Positive and False Negatives when identifying potential duplicate invoice payments.

False Positives: invoice payments identified as potential duplicates when they are not duplicates.

  • Reversed invoices.  A key step to reducing the number of False Positives is to determine the process for reversing an invoice transaction.  This could be a reversal flag, the transaction id of the reversing entry, the use of a specific type of clearing transaction (e.g., clearing transactions starting with “17” are reversals), etc.  
  • Recurring or Monthly payments.  Some organization use the same invoice for recurring or monthly payments.  Since the Invoice number, vendor number and amount or the same each month, these could be identified as potential duplicates.  Does the system have a ‘recurring entry’ flag, or do you want to include a criterion that the duplicates must be with, for example, 20 days of each other.
  • Generic invoice numbers. This is an issue with the accounts payable process whereby, unusually non-vendor invoices, generic invoice numbers are used.  For example, the reimbursement of ravel expenses for ten employees that attended the same conference might be recorded with the invoice number “Reimburse_Jul2022” for each employee.  Personally, I would report this as a business practice that reduces the effectiveness of the duplicate invoice payment controls, and the A/P manager should be notified of the issue and encouraged to develop a procedure for ensuring invoice numbers are unique.
  • Duplicate Criteria too Loose: having duplicate criteria that casts too wide a net, for example, same invoice number and amount.  Two valid companies could easily have invoices with invoice number “2023-01” and amount $500.  This would be identified as a potential duplicate (a False Positive). 

False Negatives: invoice payments that are not identified as duplicates when they are duplicates.

  • Normalize data for consistency.  Each field used as part of the criteria to determine if an invoice payment if a duplicate or not should be normalized.  For example, if the duplicate criteria are same Invoice Number and Same Vendor Name, then at a minimum you should convert these fields to uppercase and remove all special characters (e.g., invoice “#A123”, “A-123”, “A 123”, and “a123” all become normalized as “A123”).  Similar technics should be used to normalize Vendor Names (see https://caats.ca/2019/01/04/identifying-potential-duplicates/ for more info on cleaning/normalizing data).
  • Duplicate Criteria too Strict: having duplicate criteria that is too restrictive, for example, requiring invoices to have the same: Invoice number, vendor number, invoice date and invoice amount.  If you have duplicates in the vendor table and the invoice was paid to the same vendor with different vendor numbers, then it would not be identified as a duplicate (a False negative).

Root Cause

Once you have run your duplicate analysis and have identified the potential duplicates, you can run additional analysis to determine the root cause – the controls that are allowing the invoices to be paid twice and not preventing or detecting them. The first step is to identify the existing IT controls; for SAP, a duplicate is identified if two documents have the same Company Code, Invoice Number, Vendor Number, Invoice Date and Amount. If any one of these is not the same, the system will not identify the second document as a potential duplicate.

Aside: SAP can be configured such that potential duplicates

  1. are not identified – no warning is issued to the A/P clerk
  2. warning is at the bottom of the screen – appears when the clerk enters the document with option to cancel entry or accept it. Since it is hard to notice, the clerk may simply press enter again which accepts the document without realizing that it was flagged as a potential duplicate.
  3. Warning box appears in the center of the screen with option to cancel entry or accept it.

So, there are five field that can affect the identification of duplicates. Knowing this makes it easy to determine the controls that are not working. Given that you have found duplicates, the criteria applied is likely not the same as the IT controls, although you could still find duplicates with the exact criteria as the system, because the clerk can always accept the entry (i.e., ignore or override the duplicate warning).  I have had cases where the A/P clerk told me that they thought the system was wrong because they had an original invoice in their possession so it could not be a duplicate. I have also had cases where the A/P clerk entered a comma at the end of the invoice number to address the system warning about a possible duplicate.

  • Root cause #1, clerks do not understand, see, or believe the system duplicate warning.

While I run several different duplicate criteria, my favorites are same invoice number, vendor name, and amount, and same invoice number, vendor number, and amount. I should mention that I clean the invoice number and the vendor name. I also add a criterion that the duplicates must be paid with ‘20’ days of each other; and, of course, have not been reversed. Note: I am not trying to find all potential duplicate. I am trying to reduce the False Positives and identify the control weaknesses to prevent future duplicates. I also want to have at least some criteria that do not include the system’s criteria (e.g., I run at least one test with the criterion same vendor number and another without criterion same invoice date).

The different duplicate analysis results are combined into a single file (Dups_All) with an indicator of which criteria identified the duplicate (many duplicates will be identified by more than one test (e.g., a duplicate identified by same invoice number, same vendor number and amount will also be flagged as a duplicate by the criteria same invoice number and amount). Using a total duplicate rank will allow you to focus on the highest risk duplicates.

Looking at the system’s criteria for duplicates individually:

  • Same company code – should be a system entry based on the user logon. This means that the user would have to process an invoice for “0001” company code while logged on to “0002”.
  • Same Vendor number – this can be compromised by duplicates in the vendor table or entry of the wrong vendor number. A follow-up test would be to run an analysis on the vendor table to identify potential duplicate vendors.
  • Same invoice number – this can be compromised by inconsistent invoice number entry by the clerk.
  • Same invoice date – this can be compromised by the intentional or accidental entry of the incorrect date; but, in my experience with SAP accounts payable, it is more often the fact that the invoice date defaults to the entry date and is not changed by the clerk.
  • Same amount – harder to identify with duplicate criteria. I usually use anomaly detection based on vendor and amount and vendor and invoice number sequencing.

So, all the criteria can be compromised by the A/P clerk when the invoice is entered. Therefore, these are the controls we should be testing.

Using same invoice number, same vendor name and same amount results we can perform additional analysis of the duplicates results file

  • Invoice date – run duplicate analysis on same invoice number, vendor name and amount but different invoice date.
    • Root cause #2 – invoice date is being entered incorrectly or (more like) being allowed to default to the entry date. Invoice entered today and the same invoice entered tomorrow will have different invoice dates.
  • Invoice number – run duplicate analysis on same (clean) invoice number, vendor name and amount but different (original) invoice number.
    • Root cause #3 – inconsistent entry of invoice numbers by A/P clerks.
  • Vendor number – run duplicate on same invoice number, vendor name and amount but different vendor number
    • Root cause #4 – duplicates in the vendor table.

Now you can provide recommendations that address the root cause:

  1. Clerks not understanding, seeing, or believing duplicate warning – A/P clerk training, and IT configuration of warning.
  2. Clerks allowing invoice date to default to entry date – A/P clerk training and (possibly) IT invoice date entry control.
  3. Entry of invoice number – A/P clerk training, and procedure or how to enter invoice number.
  4. Duplicate vendor numbers – Authorization controls over vendor creation, and A/P clerk training on how to search for existing vendor before creating a new vendor.

The duplicate analysis may not have identified all possible duplicates, but by addressing the control weaknesses – improving the preventive and detective controls – the number of duplicates in subsequent periods will be reduced.

Dave Coderrre


Leave a Reply

Your email address will not be published. Required fields are marked *