Community Spotlight

Data Science Experiments in Government

Shaping data infrastructure and lawmaking at the German Ministry of Health

TDS Editors
Towards Data Science
7 min read · Nov 17, 2021


In the Community Spotlight series, TDS Editors chat with members of the data science community about the exciting initiatives that help push the field forward. Today, we’re thrilled to share Elliot Gunn’s conversation with Lars Roemheld, Director of AI & Data at the health innovation hub, a digitalization task force at the German Ministry of Health (Bundesministerium für Gesundheit).

hih team, with Germany’s minister of health, Jens Spahn (center). Lars Roemheld is third from right. (Image courtesy of hih)

Lars Roemheld is Director of AI & Data at the German Ministry of Health’s digitalization task force. He previously held senior positions at the AI specialist QuantCo, where he used machine learning to develop anti-fraud and pricing solutions for financial, retail, and healthcare organizations in the US and Europe. A data scientist and philosopher, he holds degrees from Stanford University and the University of Heidelberg. Lars Roemheld is an expert in causal inference, machine learning, and, increasingly, the wondrous workings of government. You can follow him on Twitter and on Medium.

The health innovation hub (hih) has an intriguing mission statement: “The hih serves as a think tank, sparring partner, and implementation supporter for the Federal Ministry of Health.” Could you please share more about what the data science team does at hih? In what ways does it serve as a “sparring partner”?

The health innovation hub (hih) was set up as an experiment in government: intending to build a “digital task force,” the Ministry of Health in 2019 assembled a small group of experts who work full time for the Ministry, yet act as an independent think tank. This setup has proven effective: designed as a timeboxed project with market-rate compensation, hih brought together a diverse group of professionals for a “policy stint.”

Over the last few years, our work with the Ministry was characterized by a flexible distance: our offices are only two city blocks away from the Ministry, and have a much more startup-y vibe. At times, this meant we sat day and night with our Ministry colleagues. At other times, we were able to maintain a more “free-thinking” air outside of governmental norms.

Within data science, our work broadly fell into three buckets: policy support, education for decision makers, and hands-on projects. AI in healthcare is an exciting field. But as it is relatively new, deep understanding of its technologies is still scarce in government. We helped shape thinking and lawmaking around topics such as data access and data sharing, interoperability of data standards, medical device testing, responsibility, and fairness.

What really makes work at hih unique is the cross-functional, interdisciplinary team: we bring together medical doctors, pharmacists, IT managers, journalists, data scientists, lawyers, and economists. Creating project-pods with such different real-world backgrounds allowed us to quickly come up with pragmatic solutions that would be difficult to see for any one discipline alone.

How does a government think tank identify promising projects? Is there something unique about what a think tank can accomplish when embedded within the government, versus a standalone think tank?

Being squarely placed within the Ministry made all the difference for us: we are not a lobby organization, and our output is not measured in policy papers. Instead, we were able to directly support the work of our colleagues in the Ministry.

Governments tend to work in shockingly siloed ways: even within the same Ministry, one department often has no idea what the other is working on. Being an outsider in these structures, we were privileged to “naively” skip a lot of the norms, and connect people across functions.

What health-related projects do you feel are best addressed by governmental institutions like hih rather than for-profit efforts in AI and DS?

The actual doing of deep technical work is probably best left to industry professionals. However, in healthcare and elsewhere, those professionals need infrastructure to do their job well. Governments have to play a role in providing such infrastructure in a fair and safe way. Some examples:

  • Enabling access to representative training data and, especially, test data, while safeguarding citizens’ privacy rights. This is typically much more involved than just publishing a CSV file on the internet: anonymization, federated learning, and query-based systems are some of the methods used. One day, we might see quantitative checks for representativeness of test data.
  • Deep learning-based medical applications in particular often have a certain black-box flavor to them. For the regulator tasked with keeping patients safe from malpractice, digital or otherwise, it is not straightforward to decide which algorithms may enter the market for medical devices, and which ones may not. We need reliable requirements for safe and responsible development of medical AI, to keep hyperbolic marketing claims in check. Conversely, trustworthy requirements can help compliant software makers address skepticism from the established healthcare world.
  • Virtually everywhere, quasi-governmental bodies design the economic systems that mandate “reimbursement” — who gets paid how much in healthcare, and for what services and products. This is inherently difficult, because the payer (insurance) is typically different from the consumer (patient), and incentives can be misaligned. This is not new; but a new generation of AI-enabled medical devices opens new opportunities and challenges for cost-saving as well as waste.

I imagine that delivering ML projects in healthcare is a challenging endeavor. Does hih collaborate with other stakeholders? How does it scale itself?

As part of our mission statement, we work with a broad range of partners: from government agencies to startups and academia.

At a high level, I don’t think that federal policy has a scaling issue — contributing good ideas to the details of lawmaking allows for fairly fast “scaling” of ideas across the country. That being said, the process of lawmaking can sometimes feel sluggish. But overall I was surprised how much “agile legislation” was possible in the Ministry in the last four years, thanks to the right leadership.

Could you tell us about a project that the team is particularly proud of?

I want to highlight three very different projects:

1. Germany last year created the first fast-track pathway for digital health apps to be reimbursed by health insurers. More than two dozen so-called “DiGA” can now be prescribed by doctors throughout Germany, downloaded onto smartphones, and paid for by insurance. From data privacy to patient safety and costs, the hih team helped overcome both legal and practical challenges.

2. Covid-19 was a special, and of course unexpected, challenge for the team. Very early on, we hosted a group of academics and software makers to brainstorm potential digital solutions. Out of this group, the idea of using Bluetooth for contact tracing was born. We supported the Corona-Warn-App (CWA) project throughout its development, starting at proof-of-concept. Ultimately, the German CWA was one of the first privacy-aware contact tracing apps globally, released as a government open-source project; all available evidence suggests that the app contributes to slowing down Covid-19 infections.

3. Germany’s public health insurance system serves approximately 75 million patients. This makes the dataset of German healthcare claims one of the largest and most representative in the world. Starting next year, the dataset will be made available to academic research through the so-called “Forschungsdatenzentrum,” effectively a query-based system at BfArM.

What kind of writing in DS/ML do you enjoy, and what would you like to see more of?

I enjoy reports that have a “real” feeling to them — too many blog posts start off from Kaggle-style clean data without a history or a “data generating process.” Real data science does not work like that: you have people creating data, you have measurement error, and if you look really closely you can still see those rows that the intern accidentally dropped in 2016. I think really outstanding projects understand the real world well, and are able to map it to the right data science tools. I often find a well-reasoned choice of loss function, or a clever data selection pattern, or an out-of-the-box idea for transfer learning more insightful than how to achieve the last 3% of accuracy.

What are your hopes for the DS/ML community in the next few months or couple of years?

I would expect reduced hype around technologies, and more empathy for data-generating processes, whether that means understanding doctors, nurses, and processes in hospitals, or biases in financial data, or misrepresentation in photo databases. Personally, I am very excited about the field of causal inference: partially because of the ideas, technologies, and applications, but perhaps more importantly because a causal worldview forces us as data scientists to consider what exact problem we are trying to solve in the first place.

Curious to learn more about data science at the health innovation hub? Follow them on LinkedIn, Twitter, and YouTube. Here are other articles that share case studies of projects that utilize machine learning for social good.

