AUDITMAP.AI TEXT ANALYSIS PLATFORM

Table of Contents
- Abstract: Why AI for Internal Audit and Risk Management?
- Introduction
- Contemporary Internal Audit Challenges
- AuditMap.ai: A Platform for Audit Enhancement
- Limitations and the Way Forward
- References
1. Abstract: Why AI for Internal Audit and Risk Management?
Internal audit tasks within large organizations are slowed by the volume of documentation. Slow audit response time, sampling-based audit planning, and reliance on keyword searches are all indicators that automation is required to accelerate internal audit tasks. Audit quality also suffers when relevant gaps or risks are not disclosed to stakeholders in a timely manner. This work outlines a workflow automation solution called [AuditMap.ai](http://auditmap.ai). The solution contains several Artificial Intelligence models that read in thousands of audit reports in various languages to continuously identify and organize the relevant text within. Rather than replacing the auditor, AuditMap.ai assists in the human-centered audit planning and execution process.
2. Introduction
The internal audit function of organizations is under pressure to deliver results to meet the assurance demands of stakeholders and protect the organization from emerging threats. Keeping the big picture in perspective is difficult given the volume of reports that exist within an organization. This work outlines a solution that reads thousands of audit reports to categorize and organize the relevant text within the reports.
Internal auditors provide a line of defense against preventable errors and omissions that may lower quality, tarnish a firm’s reputation and trustworthiness, miss opportunities, or lead to direct financial losses. Internal audit’s main role is the detection of inefficiencies and noncompliance, and the prevention of losses. These activities are delivered by means of risk-based assessment and communication with board committees, whereas financial audit is focused on the detection of potentially material issues in recordings of transactions, and at times, corrections. In other words, losses detected in a financial audit have already occurred, while the results of internal audit’s work highlight gaps in compliance, quality, and other areas. To identify these gaps, internal audit operations require the tracking of outcomes in each risk area that the organization’s programs are exposed to. This tracking requires contextualization and fast results, rather than an annual summary report to the audit committee.
Internal audit holds a unique assurance position within the corporate structure. It has historical roots in financial auditing and has since evolved to provide a much greater range of assurance. Internal audit is quite different in practice from financial auditing. Whereas modern financial auditing assesses transactions and their recordings in support of financial statement accuracy (at times using double-entry accounting software), internal audit’s performance-based assessments serve the goal of reporting to the audit committee and senior management on the state of their organization’s governance, processes, procedures, risks, controls, case reports, and much more. Furthermore, at public corporations, and within specific sectors such as healthcare and finance, internal audit activities and risk disclosures are required by law [1] [2].
3. Contemporary Internal Audit Challenges
Internal reporting has crossed into the era of big data, and this is leading to information overload across the corporate landscape. As a result of the high volumes of report data, a Data Rich Information Poor situation is creeping across the audit landscape [3]. This situation is characterized by an organization tracking many indicators under the assumption that they assure quality, but missing the emerging risks because the information at hand was not measured by any of the tracked indicators. The answer must be better technology for converting data into actionable information. Technology is a critical driver of efficiency and productivity in the internal audit function [4]. Quoting from the 2017 International Standards for the Professional Practice of Internal Auditing: "Internal auditors must have sufficient knowledge of key information technology risks and controls and available technology-based audit techniques to perform their assigned work" [5]. As outlined in the 2019 Brydon report [6], the landscape within the auditing field is also shifting toward increased enforcement of the separation between auditors and their clients, driving companies to rotate their assurance providers at increased rates. Furthermore, the report identifies that "There appears to be a widespread consensus that automating existing data related audit tasks is underway and its extension inevitable."

Vast private textual datasets of internal reports have overwhelmed the traditional role of auditors. For example, an organization with 100,000 employees in a highly regulated field (e.g., aircraft manufacturing) can generate millions of documents over a 10-year period. Forming this carefully recorded report data into theories and assessment plans is time-consuming, and yet the results of these assessments can be extremely time-sensitive. There is a dire need for management to know where the risks are on a Tuesday, but the audit function tends to produce reports quarterly and annually [7]. Furthermore, risk management is under ever more pressure to look outwards for emerging risks. In addition to the pressure to deliver quickly, the assessments discussed within reports are routinely sample-based, resulting in a lack of full coverage. It is common to select reports to sample using keywords, which can lead to missing critical documents that state similar concepts without using the specified keywords. It is also common to miss the connection between reports across time, such as repeated risks, growing numbers of controls, and other time-based phenomena.
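To make the keyword limitation concrete, here is a minimal sketch in Python. The sentences, the keyword list, and the `keyword_sample` helper are hypothetical illustrations, not part of AuditMap.ai. Both sentences describe the same exposure, but only one contains the search term.

```python
def keyword_sample(sentences, keywords):
    """Select sentences containing any keyword (case-insensitive substring match)."""
    lowered = [k.lower() for k in keywords]
    return [s for s in sentences if any(k in s.lower() for k in lowered)]

# Hypothetical sentences; both point at the same underlying exposure.
sentences = [
    "There is a risk that backups cannot be restored after an outage.",
    "The department has never performed a full database recovery.",
]

selected = keyword_sample(sentences, ["risk"])
missed = [s for s in sentences if s not in selected]
print(missed)  # the second sentence never mentions the keyword
```

A sample drawn this way would leave out the second sentence entirely, which is the kind of gap that similarity-based retrieval is meant to close.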
In addition to the need for automation when assessing large datasets, the human factor also calls for additional automation of audit processes. Human auditors experience pressure to understate material weaknesses [8]. The integration of algorithms into the analysis process may insulate human auditors from these pressures to some degree. In addition, humans are limited in their capacity to aggregate and make sense of large datasets, and team-based work product built on spreadsheets and word processing programs inherits that limit. Text-based work product and reporting frustrate the creation of governance metrics and the delivery of planning activities. Planning a contemporary internal audit is, at its root, managing information overload. Reviewing large enterprise control environments is expensive and time-consuming. Even though reports are mostly digitized, the task remains daunting for any human team to read and understand in full. Program coverage tends to thin out in audit areas that are less tightly tied to perceived financial risk. Furthermore, non-obvious connections between topics can be overlooked, as low-risk areas receive fewer audit resources. Internal audit and risk management functions find themselves over-evaluating some areas of operations while missing other rarely audited ones. In response to these challenges, significant artificial intelligence and data analytics adoption initiatives have been undertaken by the major audit organizations in recent years [9] [10] [11]. The goal of obtaining a quantifiable overview of core governance programs remains out of reach for most enterprises, as artificial intelligence technologies have not yet reached significant adoption within audit firms. Several applications of artificial intelligence within audit processes are proposed in the literature, but few are applied by the major audit firms [12] [13].
Predictive models such as [14] have been put forward in the financial auditing academic literature, and this work leads naturally to similar predictive and automated innovations within audit workflows. Artificial intelligence is coming to internal audit and risk management functions and will present new opportunities for the transformation of corporate governance.
Public disclosure is an area where correct and timely identification of risks is critical, and often mandated by law. In a public relations crisis, identifying relevant information in reports for subsequent public disclosure is important and time-sensitive. Often this information is not tracked within a risk register or quality management system, as the risk in question may be new or unexpected. Regulatory risk disclosures can also be time-critical, as filing dates can at times be inflexible. Risk disclosures in corporate filings increase risk perceptions among investors [15], and so, perhaps unsurprisingly, useful risk disclosures in corporate filings are rare. The legal requirements to disclose risks are subjective and therefore not difficult to circumvent with generic statements [15] [16] [17]. However, research reveals a relationship between annual filings and SEC comment letters [18], whereby corporations are more likely to disclose risks if (1) they perceive that non-disclosure may lead to a finding by the SEC, or (2) an SEC finding has already been issued to the corporation. Given the importance of risk disclosure in quarterly and annual filings, it is clear that there is a strong need for a solution that can detect risks in a timely manner to facilitate the disclosure process, especially in time-critical situations. More generally, assessing the strength of quality management is an important capability for internal auditors to have [19] [20].
Prior to the release of AuditMap.ai, machine learning had been applied to risk disclosure documents for various applications such as annual report analysis to assess similarity [21], internal financial controls [22], and IT security [23] [24]. These and similar applications of machine learning within the audit sphere represent initiatives moving toward a larger goal of digital transformation and predictive auditing. The lag in adoption of natural language processing and machine learning in internal audit, relative to other fields such as law and accounting, could be explained by institutional inertia, a lack of training datasets, the reimbursement model for consultants, the requirement to understand documents in multiple languages, and differing standards for reporting. These various factors holding back the field are now shifting, leading to a major opportunity for audit automation with machine learning [11]. Assessing these factors in more detail, the hourly pay structure for professional services firms may discourage innovations that reduce billable hours. In addition, the data required to model audit processes is a closely guarded corporate secret, and so report data must be painstakingly collected and labeled by subject matter experts. The type of text data in the reports varies widely between teams within an organization. Assessments such as country risk can focus entirely on external documentation, while internal controls can focus entirely on internal documentation. Another factor holding back the adoption of artificial intelligence in audit is the lack of data in multiple languages such as English, French, German, and Arabic. Furthermore, the features of audit reports are unusual as compared to standard text corpora such as news reports and books. Specifically, audit reports express a higher language level than typical documents, because of their requirement to abstract complex problem patterns.
In addition, a number of internal audit standards and risk management frameworks exist, and their adoption varies by geography. For example, ISO 31000 [25] is more prevalent in Europe, whereas COSO [26] is more prevalent in the United States. Other important frameworks include COBIT [27], TSC [28], and NIST [29]. Internal auditors use these frameworks to ensure best practices, and these frameworks are key to the reproducibility of high-quality audits.
Audit quality, speed, and competition for efficiency are drivers of artificial intelligence adoption. For example, the need for timely identification of gaps or risks and their disclosure to stakeholders has a tight connection with corporate performance. Adoption will also involve auditor education and scientific testing regimes for monitoring artificial intelligence performance. Using publicly disclosed corporate reports, benchmarks for this performance evaluation should be developed in future work, in multiple languages. The recall, precision, and bias in each model should be tested with these benchmarks.
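As a rough illustration of what such a benchmark would measure, the sketch below computes precision and recall for a binary "risk / not risk" classifier, and breaks recall down by language as one simple way to surface bias. The labels, the language tags, and both helper functions are hypothetical; this is not a description of how AuditMap.ai is evaluated.

```python
from collections import defaultdict

def precision_recall(predicted, actual):
    """Return (precision, recall) for parallel lists of boolean labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def recall_by_group(predicted, actual, groups):
    """Recall computed separately for each group (here: document language)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, a, g in zip(predicted, actual, groups):
        if a:
            totals[g] += 1
            if p:
                hits[g] += 1
    return {g: hits[g] / totals[g] for g in totals}

# Hypothetical benchmark of six sentences with expert labels.
actual    = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]
languages = ["en", "en", "fr", "en", "fr", "fr"]

precision, recall = precision_recall(predicted, actual)
per_language = recall_by_group(predicted, actual, languages)
print(precision, recall, per_language)
```

A large gap between groups in `per_language` would suggest that the model serves one language worse than another, which is the kind of finding a multilingual benchmark is meant to expose.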
There is not broad agreement in the assurance industry or the academic literature regarding the nature of the coming changes. Some assessments conclude that auditors will be replaced by artificial intelligence innovations [10]. This is likely incorrect. Instead, the future is likely one where auditors work with artificial intelligence in the same way that they have adopted spreadsheets and word processing to enhance their workflows with digital automation. This work takes the position that incremental improvement through the application of many specialized models will provide the initial boost of automation to audit teams. On a longer-term basis, the broad replacement of auditing with technology is very unlikely.
Auditors are likely to keep their existing processes in place while working to execute them more often, and with higher coverage as a result of artificial intelligence solutions. The future is more assistive than prescriptive. In this view, artificial intelligence does not replace auditor decision making, judgment, or assessment interviews. Instead, innovations accelerate planning and execution activities related to corrective and preventive actions. The key outcomes should be increased audit quality and speed, moving in the direction of continuous auditing.
4. AuditMap.ai: A Platform for Audit Enhancement
AuditMap.ai is a solution for audit teams that can help them to make sense of large amounts of documentation. It can also be used by risk managers to discover emerging risks. The solution enables audit teams to quickly retrieve and action text within uploaded documents. The activities performed with AuditMap.ai are required as part of the strategic and tactical planning activities of the internal audit function. The solution automatically performs activities in support of the internal auditors’ information-intensive tasks. Figure 1 below summarizes the process through which auditors make use of the solution.

The audit team begins using the platform by defining business goals. They then proceed to define the organization’s preferred audit topics. The team also uploads their audit reports and other documents to the platform via manual upload or an Extract, Transform, and Load (ETL) task (Fig 1 (a)). The platform includes a dataset concept for managing document sets across clients. During ingestion, a machine learning model within the platform classifies the uploaded documents against the defined audit topics. The model architectures are based upon state-of-the-art machine learning models [30] [31] [32], trained on proprietary training datasets. Additional machine learning models perform automated extraction of linguistic entities, extraction of entity relationships, cross-document analysis of statement similarity, and classification of key statements: those indicative of corporate risk, those indicative of mitigations, and those indicative of key insights. Further processing is performed in order to assess the relevance of document segments to generally accepted enterprise risk management frameworks (Fig 1 (b)). The findings resulting from document ingestion and automated analysis are made available through the system’s user portal, a web application allowing auditors to perform a technology-assisted review of the contents with role-based access control. When exploring the results of the machine learning processes, auditors can observe trends over time within programs or topics, and can flag specific risks or controls for deeper analysis and paragraph or document-level context, or for relabeling (Fig 1 (c)). Lastly, the solution includes an interactive workbench for the rapid creation and export of working papers.
The platform provides auditors and risk managers with a simplified, self-directed capacity to manually include information discovered during research by reducing the steps between the identification of information in a dataset of documents, and its addition to work artifacts (Fig 1 (d)). The delivery of work items to stakeholders is accomplished via export (Fig 1 (e)).

Figure 2 shows some of the user interface components used by auditors. The platform enables the narrowing down of focus. For example, in a selected dataset with 17,571 sentences from 35 reports, only 418 sentences were highlighted as being indicative of risk. Some were not "real" sentences, as they may be sentence fragments such as table of contents entries, or table data. With that in mind, AuditMap.ai was able to provide a 97.6% reduction in data to be analyzed. 9,800 entities were identified. Some examples of interesting sentences identified in publicly available reports as indicative of risk are the following (numbers in parentheses indicate classification confidence):
- (98.4%) "We noted that no prioritization exercise was documented to determine which JHAs were to be conducted first nor did we see evidence that priority was given to the development of JHAs based on recent events incidents or operational risks." [33]
- (52.8%) "Given that similar findings have been identified in past audits, we would suggest that [Entity] require all regional SCCs to use printer codes to retrieve printed [Identification Data] letters from shared network printers." [34]
- (99.3%) "Based on interviews conducted, it was found that [Department1] used backups in the past to selectively fix problems with regards to the three application systems; however, it has not had to perform a full database recovery." [35]
- (86.1%) "Les dossiers relatifs aux installations et à l’administration de l’approvisionnement ne semblaient pas être appuyés par des documents étayant une piste d’audit uniforme" [36]
- (73.7%) "Die EFK vermisst in den Abläufen und bei den Kontrollhandlungen im Prozess die angemessene Nachvollziehbarkeit und Transparenz" [37]
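The 97.6% figure follows directly from the sentence counts above, and the confidence values suggest a simple way to triage flagged sentences. The short Python sketch below reproduces the arithmetic and sorts the five examples into a manual review queue. The 0.80 threshold is a hypothetical value chosen for illustration, not a setting documented for AuditMap.ai.

```python
# Counts reported for the selected dataset.
total_sentences = 17_571
risk_sentences = 418

reduction = 1 - risk_sentences / total_sentences
print(f"{reduction:.1%}")  # 97.6%

# Confidence and reference number for the five example sentences above.
flagged = [(0.984, "[33]"), (0.528, "[34]"), (0.993, "[35]"),
           (0.861, "[36]"), (0.737, "[37]")]

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off for a closer manual look

# Lowest-confidence items first, so the least certain calls are reviewed first.
needs_review = sorted((c, ref) for c, ref in flagged if c < REVIEW_THRESHOLD)
print(needs_review)
```

With this threshold, the sentences from [34] and [37] would be queued for a closer look, while the remaining three would be accepted provisionally.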
5. Limitations and the Way Forward
Adoption of AuditMap.ai artificial intelligence into the audit and risk management industry is likely to change outcomes. It is likely to change the nature of assurance itself. However, artificial intelligence adoption within audit must be paired with a quantitative assessment of the limitations of the technology, and staff training that emphasizes the limitations of the technology. Blind adoption could lead to reputational risk in the event of artificial intelligence failures. It is therefore prudent to be aware of the functional limitations of machine learning in relation to assurance and assess the acceptability of these limitations.
The two types of machine learning applied in AuditMap.ai are supervised learning for classification, and unsupervised learning for contextual representation and similarity assessment. Supervised learning models applied to proprietary client data are unlikely to have perfect recall and precision. This means that some risks and controls will be missed by the algorithm, and some statements will be incorrectly classified. It is critical for the human auditor to understand these limitations, and to have easy access to a corrective capability within the workflow that can relabel statements on the fly. AuditMap.ai does have this capability.
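One way to picture such a corrective capability is a record that keeps the model's label and the auditor's override side by side. The following Python sketch is a hypothetical data structure for illustration only. The `Statement` class, its field names, and the `relabel` helper are assumptions and do not describe AuditMap.ai's internal design.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Statement:
    text: str
    model_label: str                      # label assigned by the classifier
    auditor_label: Optional[str] = None   # set only when an auditor overrides it

    @property
    def label(self) -> str:
        """Effective label: the auditor's correction wins when present."""
        if self.auditor_label is not None:
            return self.auditor_label
        return self.model_label

def relabel(statement: Statement, new_label: str) -> Statement:
    """Record an auditor correction without discarding the model's output."""
    statement.auditor_label = new_label
    return statement

# Hypothetical statement the classifier missed as a risk.
s = Statement("No full database recovery has ever been performed.", "other")
relabel(s, "risk")
print(s.label, s.model_label)
```

Keeping the original model label alongside the correction means the corrections can later be counted to estimate how often the classifier is wrong in a given environment.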
Supervised learning is also susceptible to learning bias if it is trained on arbitrary client data. To address this issue, AuditMap.ai models are trained on a proprietary primary dataset prior to deployment into an auditor’s environment. Although bias may be addressed in the initial deployment, it is an issue that needs to be measured and assessed, especially when model retraining takes place.
Unsupervised learning is similarly limited to the contexts it has been exposed to. The technology is susceptible to errors when faced with a radically new context. In some cases, a supervised model relies on the representation created using unsupervised learning, and changing the distribution that the unsupervised model was trained on can ruin the predictive power of the supervised model. For example, the models in AuditMap.ai are trained to classify text from audit reports, and have never been exposed to email messages or text messages. Feeding such data into the models results in poor similarity understanding because their writing style and vocabulary are radically different from the training data. It is therefore important to consider the scope of the data that is included in the technology adoption prior to deployment.
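A crude way to check whether new material falls inside the scope the models were trained on is to measure how much of its vocabulary was never seen in training. The Python sketch below computes an out-of-vocabulary rate. It is a simple stand-in technique chosen for illustration, with hypothetical example texts, and is not the method used inside AuditMap.ai.

```python
import re

def tokens(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def oov_rate(training_texts, new_text):
    """Fraction of tokens in new_text that never appear in training_texts."""
    vocab = {t for doc in training_texts for t in tokens(doc)}
    new_tokens = tokens(new_text)
    if not new_tokens:
        return 0.0
    unseen = sum(1 for t in new_tokens if t not in vocab)
    return unseen / len(new_tokens)

# Hypothetical training snippets written in audit-report style.
training = [
    "The audit found that controls were not documented.",
    "Management has not performed a full database recovery.",
]
report_like = "The audit found that recovery controls were not performed."
chat_like = "hey can u send me the file asap thx"

print(oov_rate(training, report_like))
print(oov_rate(training, chat_like))
```

A high rate for chat-style text compared with report-style text is a warning sign that the data lies outside the training distribution and should be excluded or evaluated separately before deployment.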
Missing information is another key issue to consider. There is often information that is outside the dataset of audit reports and working papers, that is only obtainable by going out into the real world and collecting data through the process of internal audit. The assumption that information extracted from an internal audit dataset (e.g., relationship graphs, risks, mitigations, insights) fully covers the state of the organization is surely false. Auditors need to remain curious and ask tough questions about missing risks, missing procedures, and generally understand where internal audit has poor coverage in terms of internal assessments. AuditMap.ai helps the audit team to identify where information is likely missing. However, the initiative to fill in the blanks still remains with the human internal audit team. Having access to the big-picture view enables the audit team to think about what information may be missing by topic, or through time.
Auditors have to ask whether adoption of this imperfect and approximate technology is better than the status quo, and if it improves the quality and speed of audits. Auditors should evaluate AuditMap.ai adoption quantitatively and dispassionately as they consider adopting the technology. We are running a series of webinars to engage with audit and risk management professionals, demonstrate the platform, and line up pilots.
The website will soon post a link to the upcoming webinar. If you liked this article, then have a look at some of my past articles in AI for internal audit, "How [AuditMap.ai](http://AuditMap.ai) Improves Internal Audit" and "Better Internal Audits with Artificial Intelligence." I also want to thank professor Miodrag Bolic from the University of Ottawa for his feedback on this work. Have you noticed that AuditMap.ai has a new website? And hey, join the newsletter via the site!
Until next time!
-Daniel
6. References
[1] United States Public Law: Quality System Regulation. 21 CFR part 820 (1996)
[2] United States Public Law: Prospectus summary, risk factors, and ratio of earnings to fixed charges (Item 503). 17 CFR part 229.503 (2011)
[3] Goodwin, S.: Data rich, information poor (drip) syndrome: is there a treatment? Radiology management 18(3) (1996) 45–49
[4] Eulerich, M., Masli, A.: The use of technology based audit techniques in the internal audit function–is there an improvement in efficiency and effectiveness? Available at SSRN 3444119 (2019)
[5] Institute of Internal Auditors: International standards for the professional practice of internal auditing. Institute of Internal Auditors (2017)
[6] Sir Donald Brydon, CBE: Assess, Assure And Inform: Improving Audit Quality And Effectiveness; Report Of The Independent Review Into The Quality And Effectiveness Of Audit. The Crown (2019) Accessed on Jan 2, 2020 https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment%5Fdata/file/852960/brydon-review-final-report.pdf.
[7] Chan, D.Y., Vasarhelyi, M.A.: Innovation and practice of continuous auditing. International Journal of Accounting Information Systems 12(2) (2011) 152–160
[8] Cowle, E., Rowe, S.P.: Don’t make me look bad: How the audit market penalizes auditors for doing their job. (September 2019) Available at SSRN: https://ssrn.com/abstract=3228321.
[9] Kokina, J., Davenport, T.H.: The emergence of artificial intelligence: How automation is changing auditing. Journal of Emerging Technologies in Accounting 14(1) (2017) 115–122
[10] Alina, C.M., Cerasela, S.E., Gabriela, G., et al.: Internal audit role in artificial intelligence. Ovidius University Annals, Economic Sciences Series 18(1) (2018) 441–445
[11] Sun, T., Vasarhelyi, M.A., et al.: Embracing textual data analytics in auditing with deep learning. (2018) Universidad de Huelva.
[12] Sun, T., Vasarhelyi, M.A.: Deep learning and the future of auditing: How an evolving technology could transform analysis and improve judgment. CPA Journal 87(6) (2017)
[13] Appelbaum, D.A., Kogan, A., Vasarhelyi, M.A.: Analytical procedures in external auditing: A comprehensive literature survey and framework for external audit analytics. Journal of Accounting Literature 40 (2018) 83–101
[14] Kuenkaikaew, S., Vasarhelyi, M.A.: The predictive audit framework. The International Journal of Digital Accounting Research 13(19) (2013) 37–71
[15] Kravet, T., Muslu, V.: Textual risk disclosures and investors’ risk perceptions. Review of Accounting Studies 18(4) (2013) 1088–1122
[16] Schrand, C.M., Elliott, J.A.: Risk and financial reporting: A summary of the discussion at the 1997 aaa/fasb conference. Accounting Horizons 12(3) (1998) 271
[17] Jorgensen, B.N., Kirschenheiter, M.T.: Discretionary risk disclosures. The Accounting Review 78(2) (2003) 449–469
[18] Brown, S.V., Tian, X., Wu Tucker, J.: The spillover effect of sec comment letters on qualitative corporate disclosure: Evidence from the risk factor disclosure. Contemporary Accounting Research 35(2) (2018) 622–656
[19] Bhattacharya, U., Rahut, A., De, S.: Audit maturity model. Computer Science Information Technology 4 (12 2013)
[20] Thabit, T.: Determining the effectiveness of internal controls in enterprise risk management based on COSO recommendations. In: International Conference on Accounting, Business Economics and Politics. (2019)
[21] Fan, J., Cohen, K., Shekhtman, L.M., Liu, S., Meng, J., Louzoun, Y., Havlin, S.: A combined network and machine learning approaches for product market forecasting. arXiv preprint arXiv:1811.10273 (2018)
[22] Boskou, G., Kirkos, E., Spathis, C.: Assessing internal audit with text mining. Journal of Information & Knowledge Management 17(02) (2018) 1850020
[23] Boxwala, A.A., Kim, J., Grillo, J.M., Ohno-Machado, L.: Using statistical and machine learning to help institutions detect suspicious access to electronic health records. Journal of the American Medical Informatics Association 18(4) (2011) 498–505
[24] Endler, D.: Intrusion detection. applying machine learning to Solaris audit data. In: Proceedings 14th Annual Computer Security Applications Conference (Cat. №98EX217), IEEE (1998) 268–279
[25] International Organization for Standardization: Risk management – Guidelines. Standard, ISO 31000:2018, Geneva, CH (February 2018)
[26] Committee of Sponsoring Organizations of the Treadway Commission and others: Internal Control – Integrated Framework. (2013)
[27] Information Systems Audit and Control Association: Cobit 5: Implementation. ISACA (2012)
[28] American Institute of Certified Public Accountants: Trust Services Criteria. AICPA (2017) Accessed on Jan 15, 2020 https://www.aicpa.org/content/dam/aicpa/interestareas/frc/assuranceadvisoryservices/downloadabledocuments/trust-services-criteria.pdf.
[29] Bowen, P., Hash, J., Wilson, M.: Information security handbook: a guide for managers. In: NIST Special Publication 800–100, National Institute of Standards and Technology. (2007)
[30] Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018)
[31] Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., Le, Q.V.: Xlnet: Generalized autoregressive pretraining for language understanding. arXiv preprint arXiv:1906.08237 (2019)
[32] Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
[33] Internal Audit and Program Evaluation Directorate: Audit of Occupational Health and Safety, March 2019. Technical report, Canada Border Services Agency, Ottawa, CA (March 2019)
[34] Internal Audit Services Branch: Audit of the Management and Delivery of the Social Insurance Number Program, December 2015. Technical report, Employment and Social Development Canada, Ottawa, CA (December 2015)
[35] Internal Audit Services Branch: Audit of the Departmental Information System and Technology Controls – Phase 1 – Application Controls, 2014. Technical report, Employment and Social Development Canada, Ottawa, CA (November 2014)
[36] Audit interne: Achats et marchés, Novembre 2018 Rapport d’audit interne. Technical report, Bureau du surintendant des institutions financières, Ottawa, CA (November 2018)
[37] Swiss Federal Audit Office: Prüfung der IT-Plattform NOVA für den öffentlichen Verkehr – Schweizerische Bundesbahnen. Technical report, Switzerland, Bern, Switzerland (July 2019)