
How Social Platforms Can Prevent Mental Illnesses Using Deep Learning

Recent university research out of Korea suggests that they can. But should they?

Photo by Josh Riemer on Unsplash

Imagine this: it’s a rainy day, and you’ve not been out for months due to lockdown. You’ve been feeling lethargic and generally not in the mood to do anything but complain for the past few months. You write a post on Facebook to your friends, venting about life once more. Suddenly, a Facebook notification pops up on your phone, recommending that you seek advice from a nearby therapist who specializes in depression and Cognitive Behavioural Therapy and offers a free consultation.

This world could be one that we live in rather soon, where every social media post you upload is pre-screened to check in on your mental health: scientifically proven models and algorithms that can predict whether you are at risk of suffering from a mental illness. Is this a world that you think is possible? Would this be a world you’d like to live in?

The importance of mental health cannot be overstated. Suicide is the second leading cause of death among teenagers in the United States, according to the Centers for Disease Control and Prevention (CDC). A separate CDC study also found that teen suicide jumped 56% from 2007 to 2017; this rise coincides with the launch and growing adoption of many of the social media platforms that we know and love today.

As modern-day vulnerability and transparency become more common, social media users are increasingly sharing detailed feelings and emotional states in their posts. These millions of posts are already being used for capitalist purposes such as online advertising, but they could equally be used to support the health of the people who write them.

The Research

A recent study conducted by Kim, Lee, Park and Han (Kim et al., 2020), researchers from Sungkyunkwan University in Korea and Carnegie Mellon University, demonstrated a deep learning model that can identify a person’s mental state based on their posted information. This research extends previous work by Gkotsis et al. (2017) that used deep learning models to automatically recognize and classify mental illness-related posts.

By analyzing and learning posting information written by users, the proposed model by Kim et al. (2020) could accurately identify whether a user’s post belongs to a specific mental disorder, including depression, anxiety, bipolar, borderline personality disorder, schizophrenia, and autism. The model provides a foundation for detecting whether a user, based on their post, is at risk of suffering from a mental disorder.

The research prompts us to ask fundamental questions to our social platforms of choice:

  • Is it possible to create a pervasive Deep Learning model to help identify potential sufferers of mental illness?
  • Should social media platforms monitor the mental health of their users?
  • What responsibility do social media platforms have if their data suggests that a user is at severe risk of mental illness?
  • What kinds of intervention would be useful?

Before attempting to answer these, let’s take a closer look at the research.

Photo by Michael Longmire on Unsplash

But first, what is Deep Learning?

There are various posts already that cover the basics of Deep Learning. If you want a short primer on it, I recommend reading this excellent Medium post by Radu Raicea. I learned a lot from it, and I’m sure those who gave the 40K+ claps agree!

In short, Deep Learning is a machine learning method that allows us to train an Artificial Intelligence (AI) to predict outputs, given a set of inputs.

More specifically, the Deep Learning method uses a Neural Network to imitate animal intelligence, passing data through three types of neuron layers: an Input Layer, one or more Hidden Layers, and an Output Layer.
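
To make those layers concrete, here is a minimal, purely illustrative Keras sketch of a network with one input layer, one hidden layer, and one output layer; the ten input features and the layer sizes are arbitrary assumptions, not taken from the research.

```python
# A minimal, purely illustrative Keras network with the three layer types.
# The ten input features and the layer sizes are arbitrary assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(10,)),             # Input Layer: 10 features per example
    layers.Dense(32, activation="relu"),   # Hidden Layer: learns intermediate representations
    layers.Dense(1, activation="sigmoid"), # Output Layer: a single prediction (e.g. a probability)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```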

Inputs within a Deep Learning method can be either:

  • Supervised – giving the model inputs and telling it the expected output. Radu uses the example of a weather-predicting AI. It learns to predict the weather using historical data, where the model has training data inputs (pressure, humidity, wind speed) and outputs (temperature).
  • Unsupervised – using data sets with no specified structure and letting the AI make logical classifications of the data. Radu uses the example of a behaviour-predicting AI for an e-commerce website. It won’t learn from a labelled data set of inputs and outputs; instead, it creates its own classification of the input data and tells you which kinds of users are most likely to buy different products.

Deep Learning combines the best of both worlds by training the AI on both supervised and unsupervised data. However, enormous datasets and computational power are needed for the models to produce meaningful results and predictions, because of the multiple hidden layers of calculations required in between.
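
As a toy contrast between the two settings, the sketch below fits a supervised regression model on Radu’s weather example and an unsupervised clustering model on the same inputs; all the numbers are invented for illustration.

```python
# Toy contrast between supervised and unsupervised learning (all numbers invented).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

# Supervised: inputs (pressure, humidity, wind speed) paired with known temperatures.
X = np.array([[1012, 0.60, 12], [1008, 0.80, 20], [1020, 0.40, 5], [1015, 0.55, 9]])
y = np.array([21.0, 17.5, 25.0, 22.5])
weather_model = LinearRegression().fit(X, y)
print(weather_model.predict([[1010, 0.70, 15]]))  # predicted temperature for a new day

# Unsupervised: the same inputs with no labels; the model groups similar days itself.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)  # cluster assignment for each day
```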

Creating the Deep Learning Model

Kim et al. (2020) had one key research question to address:

Research question: Can we identify whether a user’s post belongs to mental illnesses on social media?

They collected users’ posts from Reddit, a popular social media platform that hosts numerous mental-health-related communities (or so-called ‘subreddits’), such as r/depression, r/bipolar, and r/schizophrenia.

Their method is detailed in-depth in their research paper, but in short, they employed the following:

Photo by Markus Spiske on Unsplash

Data Collection

Kim et al. (2020) collected post data from the following six mental-health-related subreddits, each of which is reported to be associated with a specific disorder: r/depression, r/Anxiety, r/bipolar, r/BPD, r/schizophrenia, and r/autism.

They also collected post data from r/mentalhealth, to analyze posts containing general mental health information. From each subreddit, they collected all the user IDs that had at least one post related to mental health.
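
The paper describes its own collection pipeline; purely as an illustration of how posts can be pulled from these subreddits, here is a sketch using the PRAW library. The credentials and post limit are placeholders, and this is not necessarily how the authors gathered their data.

```python
# Illustrative only: pulling recent posts from mental-health-related subreddits with PRAW.
# Credentials and limits are placeholders; the authors' actual pipeline may differ.
import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="mental-health-research-demo",
)

subreddits = ["depression", "Anxiety", "bipolar", "BPD", "schizophrenia", "autism", "mentalhealth"]
posts = []
for name in subreddits:
    for submission in reddit.subreddit(name).new(limit=100):  # small limit for the sketch
        posts.append({
            "subreddit": name,
            "author": str(submission.author),
            "title": submission.title,
            "text": submission.selftext,
        })

print(f"Collected {len(posts)} posts")
```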

Data Pre-processing

Data cleansing and formatting were required so that the model could process the data effectively, such as tokenizing users’ posts (splitting sentences into individual words) and filtering out frequently occurring words (stop words). After this process, they had 488,472 posts across 228,060 users for analysis.
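
A minimal sketch of this kind of pre-processing might look like the following; the exact tokenizer and stop-word list used by Kim et al. (2020) may well differ.

```python
# A minimal pre-processing sketch: lowercase, tokenize, and drop stop words.
# The exact tokenizer and stop-word list used by Kim et al. (2020) may differ.
import re
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

def preprocess(post: str) -> list[str]:
    tokens = re.findall(r"[a-z']+", post.lower())               # split the post into lowercase word tokens
    return [t for t in tokens if t not in ENGLISH_STOP_WORDS]   # filter out frequently occurring words

print(preprocess("I have not been feeling like myself for months."))
# e.g. ['feeling', 'like', 'months'], depending on the stop-word list used
```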

Data Classification

They created six binary classification models, each of which categorizes a user’s post into one of the following subreddits: r/depression, r/Anxiety, r/bipolar, r/BPD, r/schizophrenia, and r/autism.

Kim et al. (2020) found that a prior study (Gkotsis et al., 2017) trained a model on the posts of users who have multiple symptoms, with results suffering from noisy data. By developing six independent models, each using data from users who suffer from only one particular mental disorder, they were able to identify a user’s potential mental state accurately.
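
Conceptually, the setup looks like the sketch below: one independent binary classifier per disorder. A simple bag-of-words model with logistic regression stands in for the paper’s actual CNN and XGBoost pipeline, and the posts are invented examples.

```python
# Conceptual sketch: one independent binary classifier per disorder.
# A bag-of-words model with logistic regression stands in for the paper's
# CNN / XGBoost pipeline, and `posts` is a tiny invented example in the
# same shape as the collection sketch above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    {"text": "I feel hopeless and can't get out of bed", "subreddit": "depression"},
    {"text": "My heart races and I can't stop worrying", "subreddit": "Anxiety"},
    {"text": "Some days I'm unstoppable, then I crash completely", "subreddit": "bipolar"},
    {"text": "I just want to share some coping strategies", "subreddit": "mentalhealth"},
]

disorders = ["depression", "Anxiety", "bipolar"]  # shortened list for the toy example
models = {}
for disorder in disorders:
    texts = [p["text"] for p in posts]
    labels = [1 if p["subreddit"] == disorder else 0 for p in posts]  # binary target for this disorder
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    models[disorder] = clf.fit(texts, labels)

# Each model now scores a new post for its own disorder independently.
new_post = ["Everything feels hopeless lately and I can't sleep"]
print({d: m.predict_proba(new_post)[0, 1] for d, m in models.items()})
```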

Setting up the Modelling Architecture

This part is rather complicated, so if you’re interested in the inner workings, I recommend you dive directly into Kim et al. (2020)’s research. For the Data Science techies out there who want to know what principles were applied to the model, here are the main headlines:

  • They took the data above and split it: 80% for training and 20% for testing
  • They used two of the most popular Machine Learning methods: the convolutional neural network (CNN) and XGBoost
  • They applied at least four different computational layers: an embedding layer, a convolutional layer, a max-pooling layer, and a dense layer (a hedged sketch of such an architecture follows this list)
  • They found that the model showed positive signs of accuracy (output): accuracy for all classes was in the range of 70% to 95%, but the F1-Score (often considered a better measure than the accuracy metric when dealing with incorrectly classified cases) was in the range of 45% to 55%.
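
For the curious, here is a hedged Keras sketch of what an architecture with those four layer types and an 80/20 split might look like; the hyperparameters and the randomly generated data are illustrative stand-ins, not the values or data used in the paper.

```python
# Illustrative only: a CNN text classifier with embedding, convolutional,
# max-pooling, and dense layers, plus an 80/20 train/test split.
# Hyperparameters and the random toy data are stand-ins, not the paper's values.
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN = 20_000, 200  # assumed vocabulary size and padded post length

# Toy stand-ins for the real data: random token-ID sequences and binary labels.
X = np.random.randint(1, VOCAB_SIZE, size=(1000, MAX_LEN))
y = np.random.randint(0, 2, size=(1000,))

model = models.Sequential([
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=128),      # embedding layer
    layers.Conv1D(filters=64, kernel_size=5, activation="relu"), # convolutional layer
    layers.GlobalMaxPooling1D(),                                 # max-pooling layer
    layers.Dense(64, activation="relu"),                         # dense layer
    layers.Dense(1, activation="sigmoid"),                       # output: belongs to this subreddit or not
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 80% of the data for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=64)
```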

Note that the F1-Score ranges from 0 to 1 and is the harmonic mean of precision and recall: the closer it is to 1, the better the classifier balances precision (how many of its positive predictions are correct) and recall (how many of the actual positives it finds), and the more robust its predictions are.
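
As a quick illustration of how precision, recall, and F1 relate, here is a toy example using scikit-learn with invented labels.

```python
# Toy illustration of precision, recall, and F1 (invented labels, not the paper's data).
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # actual labels
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # model predictions

print(precision_score(y_true, y_pred))  # 2 of 3 predicted positives are correct -> ~0.67
print(recall_score(y_true, y_pred))     # 2 of 4 actual positives were found -> 0.50
print(f1_score(y_true, y_pred))         # harmonic mean of the two -> ~0.57
```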

Results and Limitations

All in all, the model Kim et al. (2020) proposed shows a fair amount of promise in detecting users who may be at risk of psychological disorders.

Some limitations of the study did exist, including class imbalance, the influence of socio-demographic and regional differences on the data, and the fact that the data came from Reddit, whose users may be more inclined to express their emotions than users of other social networks. The researchers acknowledged they would need to re-apply the model to other social media platforms such as Facebook and Twitter to validate it further.

Photo by William Iven on Unsplash

Questions and Implications

Data science is showing that we can predict when a user is at risk of suffering from a mental illness. But we now face a crossroads on ethics and the use of data for detecting and treating Mental Health conditions. Let’s revisit the questions posed at the beginning of this story:

Is it possible to create a pervasive Deep Learning Model to help identify potential sufferers of mental illness?

As discussed above, an accurate Deep Learning Model is technically feasible, and with the right amount of data, a powerful prediction engine could be created. The research by Kim et al. (2020) and Gkotsis et al. (2017) showed great potential but was limited to Reddit data. Still, the models could be validated and extended to other social platform data sources, including:

  • Facebook search terms, posts, and comments, whether on a user’s own wall, their friends’ walls, or elsewhere on the platform.
  • Twitter search terms, posts and threads. Retweets may also provide some indication, depending on whether a user states an opinion or adds a comment in their retweet.
  • Instagram photos and comments, potentially including some optical character recognition (OCR)-based classification, to check whether an image posted by the user is positive or negative in sentiment.

The data referred to above must be used in a way that protects the privacy of the users, but more on this later.

Should social media platforms monitor the mental health of their users?

Advocates for "yes" would probably say that these platforms should monitor our posts, so long as the user provides consent to the platform to conduct health checks along the way in an ethical non-abusive manner. After all, many users sign terms and conditions for the use of their data for capitalist marketing purposes, so using data to help save someone’s life (e.g. in the case of identifying someone in severe depression at risk of suicide) seems like a good social outcome.

Advocates for "no" would probably point to data privacy laws, and to the risk of the data and insights being re-sold to advertisers who then sell private health solutions for mental health problems. This risk is no better demonstrated than by Cambridge Analytica, a political data firm hired by President Trump’s 2016 election campaign to influence user behaviour as well as political views during that time. That said, Facebook has gone a long way towards rebuilding its reputation and addressing data misuse, as discussed later.

What responsibility do social media platforms have if their data suggests that a user is at severe risk of mental illness?

There are some responsibilities that Social Media platforms should probably pay attention to, including:

  • Providing data to researchers and health practitioners while protecting privacy: In 2018, Facebook announced an initiative to help independent researchers analyze the platform’s influence on elections. Social media platforms can do the same as Facebook, but with data related to users’ mental health rather than elections data. Mental health data can be classified using the models suggested by the researchers above, and access to the data can be securely provided to researchers in a privacy-law-compliant manner using technologies such as differential privacy (a brief sketch follows this list).
  • Partnering with academia to solicit independent research that uses the data for social good: Facebook’s initiative unveiled in 2018 formed a commission to develop a research agenda about the impact of social media on society – starting with elections. The commission developed requests for research proposals and selected which grantees (researchers) would deserve access and funding to continue their research.
  • Updating product roadmaps based on the findings of validated research: Simply making the data available to academia is not enough – social media platforms should carve out dedicated product roadmap capacity to improve the mental health of their users. The tough part, assuming researchers provide valid proposals over time, will be prioritizing which initiative to build first. A prioritization framework for assessing social impact is one topic I’d love to explore in future.
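
As a rough illustration of the differential privacy idea mentioned in the first point, aggregate statistics can have calibrated noise added before release; the sketch below shows a simplified Laplace mechanism and is not any platform’s actual implementation.

```python
# Simplified Laplace mechanism: add calibrated noise to an aggregate count
# before releasing it to researchers. Not any platform's actual implementation.
import numpy as np

def dp_count(true_count: int, epsilon: float = 1.0, sensitivity: float = 1.0) -> float:
    """Release a count with epsilon-differential privacy via Laplace noise."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# e.g. the number of users in a region whose posts were flagged as depression-related
print(dp_count(true_count=1432, epsilon=0.5))
```
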
Photo by Marcelo Leal on Unsplash

What kinds of intervention would be useful?

From a product perspective, I can see various features being useful, but they hinge upon different hypotheses needing to be validated:

  1. Notifications pushed to users with details on nearby mental health care, when a specific set of conditions are triggered: Let’s say someone is posting in a depressive state: an alert or popup can be shown on the screen of the social media platform, advising the user of some mental health prevention tips or potential treatment centres (e.g., therapy practices or phone hotlines available to the user)
  2. Data or events sent to the users’ health applications with mental health risk score changes for the app to action upon: For example, once a user logs into a meditation app (e.g. Headspace, Calm) using a social media platform’s access (login) credentials, the app could receive notification from a social media platform that certain events have been triggered, and the app may use this data to suggest treatments or remedies tailored to the user.
  3. Preventions in place so that the user cannot use the social media platform, either partially or in full, for a short period of time. This one needs validation, as there is a risk that the lack of a social media outlet for expressing thoughts may also have a negative impact on the at-risk user. Suspending a user may also be hard to digest from the social media platform’s perspective, as it results in less platform engagement and adoption.
  4. Advertisements replaced with public mental health announcements. Ads should probably be replaced with mental health announcements as a public service. The alternative is to use the ad-space for private-sector mental health treatment products; however, these companies could take advantage and sell an ineffective product to an unsophisticated user by simply bidding a higher amount than its competitors.

Whatever the intervention is, a great deal of experimentation and safe control testing is needed to ensure the efficacy of any mental health intervention measures.

Closing thoughts

I’m keen to know whether you think social media platforms do indeed have a role to play in preventing mental health issues from occurring. When such problems do arise, I advocate for a more active role by the platforms in treating or helping the user understand what resources they have at their disposal to help themselves.

Do you have any ideas on what interventions (perhaps features) could be delivered to prevent or treat mental illnesses? Keen to hear them in the comments below!


References

  1. Kim, J., Lee, J., Park, E. et al. A deep learning model for detecting mental illness from user content on social media. Sci Rep 10, 11846 (2020). https://doi.org/10.1038/s41598-020-68764-y
  2. Gkotsis, G., Oellrich, A., Velupillai, S. et al. Characterisation of mental health conditions in social media using Informed Deep Learning. Sci Rep 7, 45141 (2017). https://doi.org/10.1038/srep45141
