Towards Social Data Science

With Great Quantity Comes Great Responsibility

Alex Moltzau
Towards Data Science
6 min read · Jul 19, 2019

Combining social science and data science is not a new approach, yet after several revelations (and sizeable fines) large technology companies are waking up to where they are situated in society. Research institutes, particularly in Europe, seem happy to facilitate this shift. This article offers (1) a broad definition of data science; (2) a rapid look at social data science; and (3) a surface look at how new, in relative terms, the discipline of social data science is at this moment.

1. Data Science Broadly Defined

Let us first consider what data science is and then proceed to why adding ‘social’ to the term is useful. As a short disclaimer, I am not claiming that data science is ignorant of social issues, is not social or lacks important insights. Rather, it is a particular field of research that can be complemented by or mixed with other disciplines.

Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.

A data scientist can be defined more specifically as someone who has a strong skillset from three complementary roles:

  • Expert analyst
  • Machine learning engineer
  • Statistician

However, the titles research engineer and data scientist could both be used. It seems the Alan Turing Institute straddles both models.

Earlier this year a few writers in Towards Data Science raised critical questions about vagueness and interpretation in data science. Thomas Nield argued that the field needed to be broken up into specialised disciplines. Cassie Kozyrkov argued there was a psychological trap in data analytics causing issues for interpretation. I was fascinated to see that both articles used the same image of the Rorschach test, the inkblot test of perception.

Two of my favourite authors writing about ambiguity in data science

Cassie’s point was particularly striking: “Are you sure your latest data epiphany isn’t an apophany in disguise?” Apophenia has come to imply a universal human tendency to seek patterns in random information. She argued that the mind does with data what it does with inkblots, and that once you gain an ‘insight’ you will struggle to unsee it.
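To make the idea of apophenia concrete, here is a minimal sketch in Python. It is my own illustration, not something taken from either article: it generates a random target and many random ‘features’, then reports the strongest correlation found among them. The function names, the number of features and the sample size are all assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_features=200, n_samples=30, seed=0):
    """Return the strongest correlation between a noise target and noise features.

    Every series is independent Gaussian noise, so any correlation
    found here is a coincidence rather than a real relationship.
    The default sizes are hypothetical, picked only for illustration.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson(feature, target)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation found in pure noise: {r:+.2f}")
```

The more random features you search through, the stronger the best coincidental correlation tends to look, which is one way an ‘insight’ can appear in data that contains none.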

The Rorschach test is often criticised as pseudoscientific, and for a reason. Taking a neat picture, bringing it to different people and using it to diagnose a problem seems in hindsight a rather grave mistake. Over the years it has been used in a variety of settings, from courtrooms to mental health institutions, at times with widespread negative consequences. We like neat pictures, yet they do not in all cases give the right impression.

2. Social Data Science

Let me state the obvious: data science has a different focus from social science. There is a vast existing field of research that most computer scientists will not learn or focus on. Building the skillset of a computer engineer or programmer takes time and a lot of effort, and there are various subfields within this area; however, its traditions may differ from those of the social sciences.

Social science is a category of academic disciplines, concerned with society and the relationships among individuals within a society. Social science as a whole has many branches. These social sciences include, but are not limited to: anthropology, archeology, communication studies, economics, history, musicology, human geography, law, linguistics, political science, psychology, public health and sociology.

Cassie again: in another article, published on the 26th of July and called Top 10 Roles in AI and Data Science, she lists social scientist as number 8:

“We don’t realize how valuable social scientists are. They’re usually better equipped than data scientists to translate the intuitions and intentions of a decision-maker into concrete metrics.”

Cassie Kozyrkov, Chief Decision Intelligence Engineer at Google

Her view, however, seems based on a social scientist with little or no programming knowledge, brought in as a way to augment the engineering team. Yet this brings us to the combination of social science and computer science.

The Alan Turing Institute (ATI), founded in 2015, is the UK’s national institute for data science and artificial intelligence. It is named after Alan Turing, the British mathematician and computing pioneer often considered one of the founders of computer science. Computer science is the study of processes that interact with data and that can be represented as data in the form of programs.

One research area at ATI is social data science. It seeks to address the challenges associated with large quantities of data through two themes: (1) developing foundational theories of human behaviour at diverse social and temporal scales; and (2) identifying methodological challenges and solutions to enable social data science to deliver robust and credible results in key application domains.

Its aims are to:

  • Create a critical mass of social scientists, data scientists and social data scientists
  • Build relationships with data infrastructure and training investments and policy makers through regular meetings involving academic, commercial, NGO, and government stakeholders

There are MSc courses at three universities that I know of so far: the University of Oxford, the London School of Economics and Political Science, and the University of Copenhagen. The description of social data science on the website of the University of Copenhagen is the following:

Social data science is a new discipline combining the social sciences and computer science in which the analysis of big data is linked to social scientific theory and analysis.

Fields of study in relation to social data science are numerous. They can be based on both digital data collected from e.g. the social media, register data, customer data or on other types of digital traces that people are leaving, for instance by their personal use of the internet, their use of smart phones and of other digital services. These enormous sets of data can also be combined with qualitative data collected through anthropological field work etc.
University of Copenhagen, Social Science Faculty, retrieved 19th of July

3. How new is the discipline of social data science?

Well, there is no page on the topic on Wikipedia as of the 19th of July 2019.

At the time of writing there is only one article on Medium mentioning social data science that I could spot with this search. This article was written last year, and there may be far more on this topic now than I presume.

Picture taken the 19th of July 2019

The thought of combining data science with theory from the social sciences is not necessarily a new approach. Yet the combination of computer science and social science in this manner can perhaps be said to represent something new, interesting and worthy of exploration.

This is day 47 of #500daysofAI

What is #500daysofAI?
I am challenging myself to write and think about the topic of artificial intelligence for the next 500 days. Learning together is the greatest joy so please give me feedback if you feel an article resonates with you. Thank you for reading!
