Value of Peer Community (formerly the ACL Forum)

Recently there was a discussion on the ACL peer community between an ACL expert user and myself over when certain commands were included in the basic set of ACL commands (e.g. Cross Tab).  I had built a cross tab script before the command existed in ACL, but was wrong on both the date I created the script and when it was included as part of the ACL command set.  This got me to thinking about other scripts that had been written by me or many others that expand the analysis capabilities of ACL.  The functionality of some of these scripts in now included in ACL, but they were ahead of their time when built.  In some cases, the functionality still only exists as a script.

I am sure that many of these script are now in the script hub and strongly encourage you to check out the script hub whenever you have what you think is a unique problem.  It is more than likely that someone has had the problem before; the script was posted on the peer community board; and then included in the script hub.

I wish I could attribute each of the following scripts to the original owners, but my memory is not what it used to be, and often it was a collaborative effort with input/improvements from various users.  I am listing these scripts to highlight the power of ACL scripts and the value of reading the peer community and script hub often.

One_Byte – the first time I saw this approach (more of an approach than a script) was to remove or convert unwanted characters in the middle of a record.  In my case, I wanted to remove CRLF in a description field.  Note:  written before the ability to remove unwanted characters from a delimited file was added to the ACL data definition functionality.

Unzip – uses the ACL EXECUTE command and a VB program to extract zip files (unzip).

Transpose – takes a file where the first column becomes the field names and the other columns become data (rows).

My_Benford – compares the first digit(s) distribution to your “audited” first digit(s) distribution.  Used when your data does not follow Benford’s Law distribution.  Note: this was written before Benford was added as a command.

Create_Workspace – creates a workspace from a file which contains the workspace conditions and values.

Levenschtein_Distance – compares field values for duplicates but allows the values to be close but not exact.  Note: written before ACL added fuzzy duplicates as a command.

Total_Within_X_Days – for example, identifies all credit card numbers where there total charges are more than $10K within any 10 day period.

Wildcard_Search – allows you to search for values and include a wildcard (e.g. D*ve – where “*” can be any character). Note: written before REGEX and REGFIND were added to ACL’s functionality.

Vendor_Clean – cleans the Vendor Name and Address – normalizes the vendor name (e.g. “Corp” and “Corporation”) and the address (e.g. “St” and “Street”).

AlphaNum_Dups – converts a string to alphabetic/numeric order and then searches for duplicates (e.g. “Mr. Smith 126” and “Smith 126 Mr” both become “HIMMRST126”)

QGrams_Dup – looks for duplicates based on grouping of the selected field.

Regression_Anal – perform regression analysis.

Project_info – extracts project items (e.g. scripts, table layouts) from an ACL project to a separate file (e.g. Bat or Fmt file).

Keyword_Search – compares a file with keywords in another file.  This functionality has been developed by numerous users using different approaches based on the size of the files, number of keywords, etc.

Import_All_XLSX – imports all excel files within a physical directory.

Sequential_invoices – looks for sequential invoices from the same vendor (possible fraud).

Invoice_Out_of_Sequence – invoice is not part of a vendors usual invoice sequence (possible fraud).

Make_Dir – creates a physical directory.

These are just a few of the hundreds of scripts that I have built or borrowed from others.  But all of them made their way to the peer community and on to the script hub.  I hope that these examples have shown you that ACL is powerful (even before the ability to run R and Python) and is only limited by your imagination.  I have yet to come across an analysis that ACL could not handle.

This article has 7 Comments

  1. Something like this – using RECOFFSET()
    DEFINE FIELD Dup_Within_X_Days COMPUTED (Co_Code = RECOFFSET(Co_Code,1) AND Invno = RECOFFSET(Invno,1) AND Amount = RECOFFSET(Amount,1) AND ABS(Doc_Dte – RECOFFSET(Doc_Dte,1)) < %v_within_X_Days%) OR (Co_Code = RECOFFSET(Co_Code,-1) AND Invno = RECOFFSET(Invno,-1) AND Amount = RECOFFSET(Amount,-1) AND ABS( Doc_Dte - RECOFFSET(Doc_Dte,-1)) < %v_within_X_Days%)

  2. One command/technique I noticed isn’t covered very well (I attended the in-person ACL courses years ago) was looping. Do you happen to have a scripting primer for that? I feel many of my scripts could be improved if I understood this better.

    1. Hi Rob – long time…
      I put together a couple of example for you to consider.

      LOOP Examples

      The first example illustrates how LOOP can be used to separate a string (e.g. Keywords = ‘fraud; hide; avoid; cheat’) into separate records. Each keyword (or holiday dat) would be writtent to a serparate record.

      SET LOOP 0

      v_loop_ctr = OCCURS(“%v_Keywords%”, “;”)
      v_ctr = 1
      GROUP WHILE RECNO() = 1
      LOOP WHILE v_ctr <= v_loop_ctr EXTRACT ALLTRIM(SPLIT(v_Keywords,',',v_ctr)) AS "Keyword" v_ctr AS "Count" to temp_keyword v_ctr = v_ctr + 1 END END The second example shows how you could use the string (keywords) without extracting to separate records. v_loop_ctr = OCCURS("%v_Keywords%", ";") GROUP LOOP WHILE v_ctr <= v_loop_ctr AND v_found = F v_found = T IF FIND(SPLIT(v_Keyword_list, ";", v_ctr), Merchant) EXTRACT FIELDS ALL SPLIT(v_Keyword_list, ";", v_ctr) AS “Match_Code” IF v_Found = T to PCard_Results v_ctr = v_ctr + 1 END v_ctr = 1 v_found = F END In the third example, I am creating an alphabetically(and numerical) sort version of the letters and number in a vendor name (e.g. BCA123 LTD becomes "123ABCDLT"; this helps to identify identical duplicates such as "JOHN SMITH" and "SMITH, JOHN" which both become "HHIJMNOST". ASCII 57 = "A"; 58 = "B" Should I use CHR() and LOOP to INCLUDE only "A", then "B", etc and the same with ASCII 90 = "0" concatenating each INCLUDE() result. SET LOOP 0 v_ctr=48 v_name = BLANKS(50) GROUP LOOP WHILE V_CTR<=57 v_Name=ALLTRIM(v_name) + INCLUDE(Clean_VenName,CHR(v_ctr)) v_ctr=v_ctr+1 END v_ctr=65 LOOP while v_ctr<=90 v_Name=ALLTRIM(v_name) + INCLUDE(Clean_VenName,CHR(v_ctr)) v_ctr=v_ctr+1 END EXTRACT Vend_No Vendor Clean_VenName as 'Name' v_name AS "AlphaNum_Name" Vendor_Address Vendor_Group Vendor_Created_by TO Temp_temp v_ctr=48 v_name = BLANKS(50) END I don't have a example handy, but I use LOOP with OFFET(). If I wanted to Total the results of a crosstab that had 10 columns each of length 12. (Writtten free-hand not testsed - for example only). v_total = 0.00 v_ctr = 0 GROUP LOOP WHILE v_CTR <= 9 v_total = OFFSET(Amount, v_Ctr * 12) + v_total v_ctr = v_ctr + 1 END EXTRACT FIELDS ALL v_Total AS "TOTAL" to XTab_Results v_Ctr = 0 v_total = 0.00 END

  3. Ok, I’m coming late to the comments on this one, but I had to chime in. I also built a cross tab command before it existed in ACL, and was murky on the details. So I performed an archeological expedition. I opened the closet in my home office and ventured in. After searching through some stacks of CDs and a few 3.5″ floppies, I continued further into the dusty recesses, needing to shut the door behind me to get to the box that was probably contained the source of my search. And there it was! The CD from the “5th International ACL User Challenge.”

    My entry read like a personal ad for some sort of data analytics dating site. 🙂

    • Computer specialist, Angela Watterworth, a member of the CAATS team at OIG, enjoys automating processes and developing batches to reduce redundant, manual work. Consequently, she applied her “creative logic” to optimize ACL functionality.

    Using ACL™ for Windows, Angela created three applications.
    * Subtotal Reporting: to produce a Microsoft Excel spreadsheet as output with subtotals already in place. She wrote a batch to automatically create subtotals, and then export the results to a delimited file, which in turn, she imports to Excel.
    * Category Extract: to automatically break a file into specific categories based on one key field. For example, auditors can select an input file definition such as “Contracts” from a drop-down list. Then, from another drop-down list, they can select a key field such as “City” to export all records for each city to a separate file.
    * Unique Sampling: to automatically select samples that let you look at certain characteristics of the data. For example, you might want to review at least one record from each US state or each type of contract from your data.

    1. Angela – I knew you were a power user, but this is fantastic. I also built a crosstab script – one of top ten scripts I built so I know how hard this was to accomplish (unless I took a harder route – which I often do).

Leave a Reply

Your email address will not be published.