
How We Won Our First Government AI Project

The Story of Delivering Canada's Precursor Engagement to the Canadian AI Supplier List

Photo by Tetyana Kovyrina on pexels.com. Also, a nice view from the Alexandra Bridge when I go cycling.

Overview

Back in early 2018, we participated in the Canadian Federal Government’s innovative procurement vehicle with a simple goal: to find innovations that would help modernize all acts and regulations. This new innovation purchasing vehicle had many objectives:

  • Identify outdated, burdensome, or simply non-applicable laws;
  • Compare the laws to other countries and regions around the world to see how domain-specific regulations are applied, such as the health or energy sectors;
  • Retrieve third-party references inside the stock of regulations.

Lucky for us, we won the bid to help modernize Canadian regulations through the use of a custom NLP platform. However, everything that happened leading up to this project ended up affecting the project in some way.

This is a story of government procurement, AI adoption, and using technology to solve real-world problems.


Keeping Laws and Regulations Up To Date

Every government has a requirement to ensure that laws are not only equitable to all citizens but also applicable. For centuries, philosophers have debated the relationship between the individual and society, and the concepts of fairness and equality are generally a main driving force in democratic populations.

As we’ve seen with government polarization, laws can be very slow to get adopted. Usually, elected officials pass a law to assign a set of responsibilities to an agency or a department. This responsible body can then update the regulations as it sees fit for the duration of its mandate.

No laws are set in stone, but they are assumed to be fairly fixed. With technology, however, innovation typically moves faster than standard legal processes. Should a government only intervene when technology starts hurting its population? Or should there be a system to react faster to social issues?

An important tool used at all levels of regulation to remain dynamic is incorporating a technical reference within a regulation. This incorporation allows a regulation to stay relevant by deferring to an external source of information, giving the responsible body the power to update the regulation by pointing to a more modern standard.

The Mechanics of Incorporating a Document in a Regulation

The simple obligation transitivity looks like this:

  • an Act will say, ‘follow the Regulation’ and authorize a responsible body (such as a department) to manage and update said Regulation;
  • that Regulation will say, ‘apply the Standard’;
  • the Standard will then contain all of the prescriptive activities required for the citizens and organizations.
The relationship between laws, regulations, and technical references. Image by author.

Incorporations by Reference: a Double-Edged Sword

How can regulators specialize and design laws and regulations in every industrial and technological sector? The simple answer is: they can’t. Technical details change too quickly for experts to list all obligations that participants should follow. And so, with the increased availability of quality work from various Standards Development Organizations (SDOs) such as the International Organization for Standardization (ISO), it makes sense to include those expert guidelines as a means to reduce review time and keep laws relevant and applicable.

The flip side of this process, however, is a fairly nasty one: the deferral of expertise to external agents may constitute an illegal abdication of democracy. Simply put, non-elected officials end up prescribing directives within regulations.

How does a democracy then maintain agency over its laws if the responsibility of technical oversight is pushed to SDOs? By reviewing, updating, and managing which standards are to be included during every regulatory review.

What quickly happens, however, is that too many references start appearing in these regulations, and the cognitive burden of reviewing a regulation to find the scattering of references shoots up to thousands of hours per review.

In fact, the true motivation for this project was the cost of this manual effort. All of these complexities put together meant a tremendous amount of human effort to review and update these regulations. The KPI that was shared with us: 1,500 person-hours are required for every single review.

Interested in playing at home? Try to find all references in the Canada Occupational Health and Safety Regulations! Hint: Some of them start with "CSA", but not all!

The Need for Automation

Why was this a machine-learning problem? Logically, one would assume that the list of incorporations by reference (IBRs) was available somewhere. And why couldn’t we simply download the lists from a few SDOs and string-match them?

Well, we tried that. We tried all of that. Very quickly, we confirmed the issues that had been raised by Justice Canada and various departments. The master lists were more legacy knowledge than systematic records, and many team members had left with all of the reference locations in their heads.

Let’s take a standard as an example – ISO 13485. (My first career was in medical devices, so this standard was always top of mind.) The official title of that standard is "ISO 13485:2016 Medical devices – Quality management systems – Requirements for regulatory purposes". The whole thing. With a title this complex, many things can go wrong with string matching. Some issues that we found were:

  • Incorrect characters. Many standards use en dashes rather than hyphens in their titles ("–" vs "-").
  • Official vs interpreted names. Sometimes the colon was not in the correct place, and additional characters (spaces and punctuation) were added incorrectly.
  • Short names. After a document has been incorporated with its full name, the shorthand version (e.g., "ISO 13485") is used thereafter.
  • Geographic Names. National SDOs (such as NIST or CSA) re-interpret a standard to be slightly more applicable in the country, so the title changes ever so slightly (with "CAN/CSA" as a prefix).

Going back to the ISO 13485 example, here is one of the references from the Medical Devices Regulations: "[…] (f) a copy of the quality management system certificate certifying that the quality management system under which the device is manufactured meets the requirements set out in the National Standard of Canada CAN/CSA-ISO 13485, Medical devices – Quality management systems – Requirements for regulatory purposes, as amended from time to time.[…]"

Not the same. This issue multiplies across all standards. Image by author.

We did find some luck with early string matching to find a few examples and to start building our dataset, but fundamentally it was not going to be a reliable approach, and certainly not one that would provide any level of assurance.
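To give a flavor of why this fails, here is a minimal sketch (not the project’s code; the titles and the normalization rules are illustrative) showing exact matching missing two of three variants, and a best-effort normalization recovering only the full-title ones:

```python
import re

# Hypothetical illustration: the same standard as it appears across regulations.
variants = [
    "ISO 13485:2016 Medical devices – Quality management systems – "
    "Requirements for regulatory purposes",
    "CAN/CSA-ISO 13485, Medical devices - Quality management systems - "
    "Requirements for regulatory purposes",
    "ISO 13485",
]

master_title = ("ISO 13485:2016 Medical devices – Quality management systems – "
                "Requirements for regulatory purposes")

# Naive exact matching only catches the first variant.
print([v == master_title for v in variants])  # [True, False, False]

def normalize(title: str) -> str:
    """Best-effort normalization: unify dashes, drop national prefixes,
    drop version years, collapse punctuation/whitespace, lowercase."""
    t = title.lower()
    t = re.sub(r"[–—-]", "-", t)                       # unify dash characters
    t = re.sub(r"\b(can/csa|csa|nist)[-/ ]*", "", t)   # drop national prefixes
    t = re.sub(r":\d{4}\b", "", t)                     # drop ":2016"-style years
    t = re.sub(r"[,\s]+", " ", t).strip()              # collapse separators
    return t

# Normalization recovers the full-title variants, but still misses the
# shorthand "ISO 13485" -- which is why we ultimately needed a model.
print([normalize(v) == normalize(master_title) for v in variants])
# [True, True, False]
```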

On top of the difficulties reported by Justice Canada, the language used had some additional issues that required this project to resolve:

  • True incorporation vs mere reference. Just because a document is mentioned does not mean it is legally binding. Therefore, a distinction had to be made regarding how the document is mentioned.
  • Static vs ambulatory references. Is a reference pointing to a specific version of a standard or a document, or is it pointing to the latest version of that document? Could that document be updated without the responsible body knowing?
  • Outdated standards. Is the document still applicable? Is the document still retrievable from the SDO? Can the government fundamentally enforce a regulation if its referenced standards have all been sunset?

Therefore, a tool that could automate all of this was required.

Sidenote: My favorite incorporation is still from Mushuau Innu First Nation Band Order (SOR/2002–415):

In this Order, "adoption" includes adoption in accordance with Innu custom.


Defining a New Procurement Process

Pushing for government innovation, by any measure, is never a small feat. In this particular case, the timing could not have been worse. Many public reprimands had made departments unwilling to be associated with the process, leaving the Canada School of Public Service (CSPS, a non-political entity that helps to improve government function) to take on the burden of responsibility.

To overcome these challenges, procurement officers led the charge in defining a new procurement process more closely aligned with procurement in the tech sector, where a pool of vendors could be selected on competency, sub-selected for a project on willingness to bid, and then a handful of vendors would be invited to submit a bid. This regulatory innovation list was a prototype for what is today the AI Supplier List.

Here are some of the summary factors that led to this list taking place:

  • March 2016: The Standing Joint Committee for the Scrutiny of Regulations issues a series of recommendations addressing issues related to the practice of incorporations by reference. Initial efforts are made to address Recommendation 4: "[…] That the Statutory Instruments Act be amended to establish a central repository for incorporated materials and require regulation-making authorities to provide, on an annual basis, a list of all incorporated documents." This followed a series of lawsuits claiming that any and all documents referenced within the regulatory stock should be made available, free of charge. There was therefore a need to identify all available references in order to evaluate exactly what the fiscal burden is to participate in a given industry.
  • Spring 2018: The vitriolic Auditor General’s 2018 Report comes out regarding Canada’s largest IT migration project, and boy, it was not gentle. Citing numerous oversights pertaining to the Phoenix payroll overhaul project, the Auditor General calls it "an incomprehensible failure". Things had to change in the IT procurement process, jeopardizing the initial AI procurement efforts and the IBR project.
  • May 2018: Before jumping into any risky venture, the Treasury Board Secretariat (described in a previous article) decided to invite industry participants to gain better knowledge about what AI could potentially do for navigating the stock of regulations. During the Artificial Intelligence Industry Day, "[…] TBS [was] looking for industry partners and academic researchers to help apply Artificial Intelligence methods such as advanced data analytics (ADA) and machine learning (ML) to regulations of varying type, scope and complexity."
  • June-Sept 2018: Feeling confident about the state of the art, but worried about another IT procurement fiasco, TBS asks the Canada School of Public Service (as apolitical a government organization as it gets) to lead the procurement process for creating a list of capable AI companies. The total contract size for the winners? $1.00.
  • Nov 2018: With our consulting partner MNP, we get invited to bid on CSPS-RFP-18LL-1593: Demonstration Project to Pilot Application of Artificial Intelligence Methods to Regulations that Use Incorporation by Reference, a pre-qualified supplier-only project. All suppliers were selected from the Demo Day qualification process.
  • This process, having been successful (and our consortium having won it), allowed the Government to push ahead with this new vehicle. "PSPC is working with the Canada School of Public Service (CSPS) on the first procurement to use the AI source list. The solicitation for the CSPS interactive regulatory evaluation platform was issued on BuyandSell.gc.ca on Feb 28, 2019." (source)

For a prototype of what is now the Federal AI Supplier List, the Canada School of Public Service was a prime project owner. This department focuses on the improvement of the public service workforce through training, education, and awareness, and it is a refreshingly non-partisan function: everybody likes having a more effective government.


The AI Engine

Let’s contextualize this project.

  • This is an entity recognition problem, but most of the entities were not retrievable from a central list (one of the purposes of this project was actually to generate this list).
  • We had to account for many potential out-of-vocabulary (OOV) issues, since we did not want to run the chance of missing a forgotten SDO.
  • The actual contract scope was defined back in 2017, so BERT hadn’t even been published yet. Transformers would have been lovely.

The approach that we took was based on the Chiu and Nichols (2016) paper entitled Named Entity Recognition with Bidirectional LSTM-CNNs. Kudos to my team for trudging through all of the potential NER papers. At the time, this paper not only had best-in-class scores for NER tasks but also showed the highest rates of success with never-before-seen entities, something quite important here.

Justice Canada made our lives a bit easier by providing the entire stock of Canadian regulations in a machine-readable format. However, there was no training data available and no starting examples of entities, only horror stories of people losing their critical Post-It notes.

We had weeks of interviews talking to the employees about which standards they were aware of and received a lot of support in identifying the heuristics that could indicate that a reference was present. "… in accordance with X", "as amended from time to time", and a few other terms helped us in sifting through regulation after regulation to spot these sightings in the wild.
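As a flavor of those heuristics, here is a sketch of the kind of trigger-phrase search we used to surface candidate passages for labeling; the exact phrase list and the crude sentence splitter are illustrative, not the project’s actual code:

```python
import re

# Trigger phrases that often signal an incorporation by reference.
TRIGGERS = [
    r"in accordance with",
    r"as amended from time to time",
    r"incorporated by reference",
    r"set out in",
]
pattern = re.compile("|".join(TRIGGERS), flags=re.IGNORECASE)

def candidate_sentences(regulation_text: str):
    """Yield sentences that contain at least one trigger phrase."""
    # Crude sentence split -- good enough for surfacing candidates.
    for sentence in re.split(r"(?<=[.;])\s+", regulation_text):
        if pattern.search(sentence):
            yield sentence

sample = ("(f) a copy of the quality management system certificate ... meets the "
          "requirements set out in the National Standard of Canada CAN/CSA-ISO "
          "13485, ... as amended from time to time.")
print(list(candidate_sentences(sample)))
```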

We even tried deploying a custom labeling tool, but the results were somehow still very poor. We resorted to collecting the base dataset ourselves by searching through the heuristics provided.

Why this model?

What we really liked about this paper is that it encodes the same heuristics that a human uses to identify an external reference – especially one that is more a code than a word. The model looks at the following features (see the sketch after this list):

  • changes in word sequence patterns;
  • changes in character sequences; and
  • changes in capitalization.
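As a toy illustration of why those character-level signals matter, here is a sketch (our own shorthand, not the paper’s code) mapping each character to a coarse type class; a reference like "CAN/CSA-ISO 13485" produces a pattern that ordinary prose never does:

```python
# Map each character to a coarse class: uppercase, lowercase, digit, punctuation.
def char_types(token: str) -> str:
    classes = []
    for ch in token:
        if ch.isupper():
            classes.append("U")
        elif ch.islower():
            classes.append("l")
        elif ch.isdigit():
            classes.append("d")
        else:
            classes.append("p")
    return "".join(classes)

for token in ["accordance", "CAN/CSA-ISO", "13485"]:
    print(f"{token:12s} -> {char_types(token)}")
# accordance   -> llllllllll
# CAN/CSA-ISO  -> UUUpUUUpUUU
# 13485        -> ddddd
```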

From the paper:

Named entity recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance. In this paper, we present a novel neural network architecture that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.

The reason we committed to this particular model was because of its strength in identifying never-before-seen entities, especially in the context of third-party standards. Additionally, the exact mechanism for identifying the start and stop of an entity is almost exactly how an individual within Justice Canada would do it: by looking at trigger words, changes in capitalization, and changes in alphanumeric sequences.

Image by author.

Here is an in-depth article about the original paper.

What was truly innovative about this model was the Frankenstein approach to reusing the prepared features for both an LSTM focusing on words and a CNN focusing on characters. Instead of picking the best approach, you stick everything in a blender and let fate decide.

Here’s the LSTM side:

"[...] The (unrolled) BLSTM for tagging named entities. Multiple tables look up word-level feature vectors. The CNN extracts a fixed-length feature vector from character-level features. For each word, these vectors are concatenated and fed to the BLSTM network and then to the output layers." (From the paper.)
"[…] The (unrolled) BLSTM for tagging named entities. Multiple tables look up word-level feature vectors. The CNN extracts a fixed-length feature vector from character-level features. For each word, these vectors are concatenated and fed to the BLSTM network and then to the output layers." (From the paper.)

And here’s the CNN side:

"[...] The convolutional neural network extracts character features from each word. The character embedding and (optionally) the character type feature vector are computed through lookup tables. Then, they are concatenated and passed into the CNN." (From the paper.)
"[…] The convolutional neural network extracts character features from each word. The character embedding and (optionally) the character type feature vector are computed through lookup tables. Then, they are concatenated and passed into the CNN." (From the paper.)

For clarity, here’s a walkthrough of the model-building code used in the project:
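What follows is a minimal Keras sketch of a Chiu-and-Nichols-style BLSTM-CNN; every dimension, vocabulary size, and layer width is an illustrative assumption, not the project’s actual configuration (which, per the note below, we can’t publish yet):

```python
from tensorflow.keras import layers, models

# Illustrative hyperparameters -- not the project's actual values.
MAX_SENT_LEN = 60   # tokens per sentence
MAX_WORD_LEN = 20   # characters per token
WORD_VOCAB = 20000  # size of the word vocabulary
CHAR_VOCAB = 100    # size of the character vocabulary
N_TAGS = 3          # O, B-ref, I-ref

# Character branch (CNN): learns sub-word patterns such as "CAN/CSA-ISO".
char_in = layers.Input(shape=(MAX_SENT_LEN, MAX_WORD_LEN), name="chars")
char_emb = layers.Embedding(CHAR_VOCAB, 25)(char_in)
char_conv = layers.TimeDistributed(
    layers.Conv1D(filters=30, kernel_size=3, padding="same",
                  activation="relu"))(char_emb)
# One fixed-length feature vector per word, as in the paper's CNN figure.
char_feats = layers.TimeDistributed(layers.GlobalMaxPooling1D())(char_conv)

# Word branch: word embeddings (pretrained in the paper; random here).
word_in = layers.Input(shape=(MAX_SENT_LEN,), name="words")
word_emb = layers.Embedding(WORD_VOCAB, 100)(word_in)

# Concatenate word- and character-level features, then tag with a BLSTM.
concat = layers.concatenate([word_emb, char_feats])
blstm = layers.Bidirectional(
    layers.LSTM(units=100, return_sequences=True))(concat)
out = layers.TimeDistributed(
    layers.Dense(N_TAGS, activation="softmax"))(blstm)

model = models.Model(inputs=[word_in, char_in], outputs=out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```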

Note: the rest of the code is under some weird licensing conversation with the client, so we’ll open-source it once we know what’s going on. Then again, just use transformers.

The Results

There were two categories of results we focused on:

  • The general model performance; and
  • The usability of the tool for our client.

The model results were acceptable given the context.

The overall F1-score of the model was 0.726 with the raw structure above. (For fun, a basic LSTM on the same dataset had an F1-score of 0.277, so an improvement for sure.)

Diving deeper into the utility of the model, we looked at 1) whether or not a reference was present ("O"), 2) whether we could accurately predict the beginning of a reference ("B-ref"), and 3) whether we could detect that we were inside a reference ("I-ref"). This meant that we were closer to how an operator would improve their work, by being shown where a reference is present, rather than optimizing for the exact start and stop of the identified segment. These results were much more promising:

Also, for the keen observers noting that some of the false positives were higher: if you look at the resulting model performance, these could be described as true positives in the regulations. For instance, the model will highlight "the Code", which refers back to a previous mention of an IBR.
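For readers less familiar with sequence labeling, here is a hypothetical example of how the O / B-ref / I-ref scheme marks up a passage; the tokenization and span boundaries are illustrative:

```python
# "O" marks tokens outside any reference, "B-ref" the first token of a
# reference, and "I-ref" tokens inside one.
tokens = ["set", "out", "in", "CAN/CSA-ISO", "13485", ",", "Medical", "devices"]
tags   = ["O",   "O",   "O",  "B-ref",       "I-ref", "I-ref", "I-ref", "I-ref"]

for token, tag in zip(tokens, tags):
    print(f"{tag:6s} {token}")
```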

Navigating the Regulatory Stock

After the model pushed through all the regulations, it was time to display the results in a way that let users search them and identify references in context.

While skipping over the details of accessibility and platform design (we used a Laravel frontend with a Flask backend – it was 2018, after all), we built a simple platform that could ingest the regulatory stock, search for regulations, and identify in context the specific incorporations by reference that existed inside the regulations.
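As a rough idea of the backend’s shape, here is a minimal Flask sketch; the route, the identifiers, and the in-memory store are illustrative assumptions, not the platform’s actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Illustrative in-memory store; the real platform ingested the full
# machine-readable regulatory stock provided by Justice Canada.
REGULATIONS = {
    "SOR-86-304": "Canada Occupational Health and Safety Regulations",
    "SOR-98-282": "Medical Devices Regulations",
}

@app.route("/regulations")
def search_regulations():
    """Return regulations whose title matches the ?q= query string."""
    query = request.args.get("q", "").lower()
    hits = {rid: title for rid, title in REGULATIONS.items()
            if query in title.lower()}
    return jsonify(hits)

if __name__ == "__main__":
    app.run()
```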

This frontend was where a lot of requirements started getting clarified and adjusted as the clients saw what the tool could do.

A view of the SA/IBR portal, looking at the Regulations search page. Image by author.
A view of the SA/IBR portal, looking at the Medical Device Regulations page. Image by author.
A detailed view of the highlighting performed. This is in Section 9 of SOR-86–304: Canada Occupational Health and Safety Regulations. Image by author.

Validation and Usability

As we were closing off many of the features in the contract, we started noticing the limits of the content in the regulatory stock: certain ancillary features could not be achieved due to insufficient candidate examples in the data. (For example, there was a line item tied to static vs. ambulatory references – usually signalled by wording like "[…] the latest version of […]" – but our initial search found only 5 examples of ambulatory references.)

Sometimes, in AI consulting, project delivery requires clarification once the data has been evaluated and the models built. In this case, many conversations were had about the utility of the tool against contract expectations (based on the reality of the data), which allowed us to whittle away at the platform and ensure that the code delivered actually addressed the regulation drafters’ core concerns.

In validating the core model (catching third-party references), a key question was simply: does it work? That question had many technical sublayers (with the beginning and inner metrics listed above), but the key business case was further clarified: does the tool allow a reviewer to identify all of the third-party references in a regulation?

The Justice Canada team performed multiple reviews of the results. After a few weeks of discussion, they confirmed that our tool had not missed a single incorporation by reference. We kept up post-project quality control to ensure no outstanding issues, but our work here was done. ❤️


Disclaimer: This article is also about the prototype procurement list that was the precursor to the now-famous AI Supplier List of the Canadian Federal Government. The first actual project on the Canadian AI Supplier List was won by both KPMG and Lixar (now BDO); in no way are we claiming in this article that we won that particular project, or that we were the first-ever AI project within the public service. However, there was a prototype procurement vehicle for the adoption of AI technologies that came out before the AI Supplier List, where we delivered a very interesting approach. This is the story of that project.


If you have additional questions about this article or our AI consulting, feel free to reach out via LinkedIn or by email.

-Matt.

