How to Make Sense of Social Media Using Machine Learning

Sophia Brooke
Towards Data Science
5 min read · Jan 9, 2018

Social media started as an environment for individuals, but brands quickly took notice of the opportunity for personal interaction. Today, the top social media platforms are also important marketing channels, sometimes entirely replacing traditional choices like TV commercials or flyers. Every minute, about 3.3 million new posts appear on Facebook and almost half a million on Twitter. What if you wanted to keep track of every time your brand was mentioned?

To leverage the power of social media, brands need to create a marketing plan and allocate budgets accordingly. To do that, they want clear insights into the impact of their actions, their clients’ preferences, and even negative reviews. Yet, given the volume of information, gathering these insights manually is impossible.

Overcoming the information overload

Enter machine learning (ML), a family of algorithms that enable computers to identify patterns in data and group or classify it accordingly. This is well suited to unstructured data such as social media posts, which follow no fixed format and are usually a mix of text, images, sound, and video.

The results of such an analysis can give actionable insights about the selected users. Natural language processing offers valuable clues about the age, gender, location, and preferences of the authors of posts on social media. The data coming from an NLP API can help with customer segmentation based on real-world data instead of statistics or educated guesses.

Why use machine learning

There are several reasons to deploy ML in social media analysis, and they follow from the three Vs of Big Data: volume, velocity, and variety.

Scalable

The sheer volume of social media activity requires automated processing tools. Even with a dedicated social media team, it is impossible to keep track of all channels and brand mentions. Instead, web scraping tools gather all the posts that may be associated with the brand and put them in a data lake. From there, the posts are fed into algorithms that slice and dice them into relevant pieces.
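As a rough illustration of the filtering step that follows scraping, here is a minimal sketch in Python. It assumes the scraped posts are already available as a list of dictionaries with a `text` field; the brand name "Acme", the hashtag, and the function name `find_brand_mentions` are all hypothetical and used only for the example.

```python
import re

def find_brand_mentions(posts, brand, hashtag=None):
    """Return the posts that mention the brand name or the campaign hashtag."""
    # Match the brand as a whole word, ignoring case,
    # so that "Acme" does not match inside "acmeism".
    patterns = [r"\b" + re.escape(brand) + r"\b"]
    if hashtag:
        patterns.append(re.escape(hashtag) + r"\b")
    matcher = re.compile("|".join(patterns), re.IGNORECASE)
    return [post for post in posts if matcher.search(post["text"])]

# Hypothetical sample of scraped posts.
posts = [
    {"id": 1, "text": "Loving my new Acme blender!"},
    {"id": 2, "text": "Nothing to see here."},
    {"id": 3, "text": "Smoothie time #AcmeSummer"},
    {"id": 4, "text": "Reading about acmeism in poetry class."},
]

mentions = find_brand_mentions(posts, "Acme", hashtag="#AcmeSummer")
print([post["id"] for post in mentions])
```

In a real pipeline, the list of posts would be read in batches from the data lake rather than held in memory, but the matching logic could stay much the same.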

Text vs. context

The scraping phase relies on a keyword such as the brand or product name. For dedicated campaigns, the search could be done using a hashtag, but this is just the beginning.

With Big Data we can achieve more than with simple statistical tools designed for structured data. Those would have just counted how many times the keyword appeared in conversations and added more filtering levels like geolocation and gender. Now we can create graphs that show how mentions are linked and give them meaning.

It is important to analyze not only the focus word (the text) but also the context in which it is placed. Through sentiment analysis powered by NLP, a company can learn how happy its clients are with the product and which words are associated with positive and negative feelings. This is similar to the way humans understand each other from tone of voice, or how friends read each other’s mood in instant messages.
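To make the idea of text versus context more concrete, the sketch below labels each post with a sentiment and then counts which other words appear alongside positive and negative posts. It is a deliberately simple, lexicon-based stand-in: the word lists, the sample posts, and the function names are assumptions made for this example, and a production system would more likely rely on a trained NLP model.

```python
import re
from collections import Counter

# Tiny hand-made lexicon and stopword list, for illustration only.
POSITIVE = {"love", "great", "happy", "fast"}
NEGATIVE = {"broken", "slow", "hate", "refund"}
STOPWORDS = {"i", "the", "my", "is", "was", "so", "and", "with", "a"}
IGNORED = POSITIVE | NEGATIVE | STOPWORDS

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def sentiment(text):
    """Label a post by counting positive hits minus negative hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def context_words(posts):
    """Count the non-lexicon words that co-occur with each sentiment label."""
    context = {"positive": Counter(), "negative": Counter(), "neutral": Counter()}
    for text in posts:
        label = sentiment(text)
        for token in tokenize(text):
            if token not in IGNORED:
                context[label][token] += 1
    return context

# Hypothetical posts mentioning a made-up brand.
sample_posts = [
    "I love the Acme blender, delivery was so fast",
    "Acme delivery is slow and the blender arrived broken",
    "Happy with my Acme blender",
]

context = context_words(sample_posts)
print(context["positive"].most_common(3))
print(context["negative"].most_common(3))
```

Here the context counts hint at what clients are happy or unhappy about (the product itself, the delivery), which is more informative than a raw count of brand mentions.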

Relevance & authority

In social media, it is important to identify influencers, whether they are individuals or agencies. These are central nodes in the network, and partnering with them can produce viral content that boosts marketing.

A piece by Stanford explains how it is possible to trace back the links and see where each bit of information comes from and even track changes to initial posts using graphs. The most relevant items have many references, while the content generators with the highest authority create relevant posts consistently.
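The sketch below illustrates these two notions under simplifying assumptions; it is not the method from the Stanford piece. Relevance is taken to be the number of references a post receives, and an author's authority is taken to be the average relevance of that author's posts. The post IDs, author names, and reference pairs are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical data: post ID -> author.
post_authors = {"p1": "ana", "p2": "ana", "p3": "ben", "p4": "cy", "p5": "cy"}

# Each pair is (citing_post, cited_post), e.g. a share, quote, or link.
references = [
    ("p3", "p1"), ("p4", "p1"), ("p5", "p1"),
    ("p4", "p2"), ("p5", "p2"),
    ("p5", "p3"),
]

# Relevance: how many times each post is referenced by others.
relevance = Counter(cited for _, cited in references)

# Authority: average relevance across all of an author's posts,
# so one viral post counts for less than consistently referenced content.
scores_by_author = defaultdict(list)
for post_id, author in post_authors.items():
    scores_by_author[author].append(relevance[post_id])  # 0 if never cited

authority = {
    author: sum(scores) / len(scores)
    for author, scores in scores_by_author.items()
}

top_author = max(authority, key=authority.get)
print(relevance.most_common(1), authority, top_author)
```

A graph library with a proper centrality measure would be a natural next step for larger networks; the averages above are only meant to show the distinction between a relevant item and an authoritative author.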

Speak their language

Just ten years ago, marketing research was done through surveys and focus groups. Machine learning not only improves the accuracy, speed, and reliability of the answers, but can also combine different sets of pre-existing information to answer new questions. This can help narrow down options or chart a new course of action after initial testing, so that a decision is reached iteratively.

By looking at social media insights, marketers can learn about new ways that clients are using the product, how they feel when they purchase it and even new business opportunities.

Previous client segmentation techniques could not create user personas. Now, through clustering, a company can find out not only that its typical customer is in her early twenties, college educated, and an ecologist, but it can also generate posts that sound like her own, reaching a very personal level of targeted marketing.
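As a minimal sketch of how clustering can surface such a persona, the example below runs a bare-bones k-means on two made-up features per customer: age and the number of ecology-related posts per month. The article does not name a specific algorithm, so k-means is an assumption here, as are the data, the starting centroids, and the function name `kmeans`.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (age, ecology-related posts per month).
# Real features should be scaled first so that no single one dominates the distance.
customers = [(21, 9), (23, 8), (22, 10), (45, 1), (50, 2), (48, 0)]

# Fixed starting centroids keep the example deterministic.
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Each resulting centroid can be read as a rough persona, for instance a customer in her early twenties who posts frequently about environmental topics. Generating posts in that persona's voice is a separate step that this sketch does not cover.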

Speaking of language, since these machine learning algorithms work on patterns in the data rather than on hand-written grammar rules, they can be applied to different languages without modifying the underlying code. This makes them a good fit for social media, an environment where users sometimes mix more than one language, especially in the case of non-native English speakers. For example, a post can be written in the user’s mother tongue, include emoticons, which are universal, and carry trendy hashtags in English, creating a richer message that connects with global users.

Computers don’t understand

It is important to understand that computers don’t process information the same way humans do, although this is the ultimate goal of AI. Currently, they just create rules and apply them, giving the impression of reasoning. Yet, this is not an argument against using machine learning, just a reminder of a program’s capabilities and a way to set realistic expectations.

The possible drawback of this limitation is that analyzing social media posts calls for particular attention during the calibration phase, especially regarding metacommunication such as emoticons, irony, and sarcasm. While a human can detect these easily, a machine could classify such a post in the wrong bin and overlook a dissatisfied client.
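The sketch below shows what such a calibration check might look like in its simplest form: a naive word-counting classifier is compared against a handful of human-labelled posts, and the disagreements are listed for review. The lexicon, the posts, and their labels are invented for the example, and the classifier is intentionally crude so that the sarcasm failure is easy to see.

```python
import re

# Tiny illustrative lexicon; not taken from any real sentiment resource.
POSITIVE = {"great", "thanks", "love"}
NEGATIVE = {"broken", "terrible", "hate"}

def naive_label(text):
    """Label a post purely by counting positive and negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical calibration set: (post, label assigned by a human reviewer).
labelled_posts = [
    ("I love this blender", "positive"),
    ("Arrived broken, terrible service", "negative"),
    ("Great, it stopped working after a day. Thanks a lot!", "negative"),  # sarcasm
]

# Collect every post where the machine disagrees with the human label.
misses = [
    (text, human, naive_label(text))
    for text, human in labelled_posts
    if naive_label(text) != human
]
accuracy = 1 - len(misses) / len(labelled_posts)

for text, human, machine in misses:
    print(f"human={human} machine={machine}: {text}")
print(f"accuracy: {accuracy:.2f}")
```

The sarcastic post is scored as positive because of "great" and "thanks", which is exactly the kind of miss that would hide a dissatisfied client. Reviewing such disagreements on a labelled sample is one way to decide whether the model needs more training data or additional signals such as emoticons.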

From user to influencer

Before social media, the number of people who could influence others was limited and usually consisted of high-profile, highly visible individuals like movie stars, athletes, doctors, or experts. Content creation was also limited to publishing houses and media channels. With the democratization of smartphones, each of us is a content creator, and the entry barrier to becoming an influencer has been lowered, allowing anyone to create an account and post their thoughts. In this deregulated environment, companies no longer control their image. Right now, they can only watch the show, identify the behaviors that stimulate the market positively, and encourage them.