
Why Pure Sentiment Analysis Does Not Work in Today’s Industries

A complete introduction to the next level of sentiment analysis.


Image by author

Sentiment analysis has been widely used across many industries for decades. Not only can it produce helpful insights, it also saves time and energy by leveraging Machine Learning instead of manually gathering and analyzing information from large amounts of data. In its basic form, it simply classifies whether an input (usually a sentence or a document) expresses a positive or a negative opinion. The simple example below clearly has a negative sentiment.

It seems quite easy, right? The problems arise when the data get longer and more complex. Let’s say we have a document of more than 500 words, or… just about 50 words, like the example below taken from Luminoso. It may talk about more than one topic. If the final output for the document is only positive or negative, does it really represent the ideas of the whole document? And what if the document contains both positive and negative sentiment? What will the output of sentiment analysis be?
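
To make the problem concrete, here is a minimal sketch in Python. The tiny word lists and the review text are made up for illustration (they do not come from Luminoso or from any real lexicon), and real systems use trained models rather than word counting. The only point is that a single document-level label can hide a mixed opinion.

```python
import re

# Toy lexicon, invented for this illustration only.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"terrible", "slow", "hate"}


def sentence_score(sentence: str) -> int:
    """Count positive words minus negative words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score: int) -> str:
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical review that talks about two topics.
review = "The screen is great and I love the keyboard. The battery is terrible."

# Split after sentence-ending punctuation, then label each sentence and the whole text.
sentences = [s for s in re.split(r"(?<=[.!?])\s+", review) if s]
per_sentence = [label(sentence_score(s)) for s in sentences]
document = label(sum(sentence_score(s) for s in sentences))

print(per_sentence)  # ['positive', 'negative']
print(document)      # positive
```

The document as a whole comes out positive, so the complaint about the battery disappears from the result.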

In this era where big data plays a crucial role, a huge amount of unstructured information can contain a wide variety of scenarios. Most companies want to extract knowledge that is as detailed as possible from it. Unfortunately, plain sentiment analysis cannot fulfill that requirement.


So, what can we do?

Today we know at least four subfields of sentiment analysis: aspect-based sentiment analysis, targeted sentiment analysis, targeted aspect-based sentiment analysis, and concept-based sentiment analysis. Let’s take a look at how each of them can do much more than regular sentiment analysis.

Aspect-based Sentiment Analysis (ABSA)

This is the most common of these subfields. Besides sentiment, there is another thing to take into account: the aspect. Aspects here are a list of predefined categories that depend heavily on the domain of the data. Suppose our data consists of laptop reviews. We would most likely define aspect categories such as portability, connectivity, operation_performance, and other laptop-related things. Many companies use ABSA to learn more about the aspects that matter to them. Unlike plain sentiment analysis, ABSA gives them more detailed and structured information, so they can focus on improving the aspects that tend to receive negative sentiment.

SemEval held a competition on ABSA and released an accompanying dataset (Pontiki et al., 2016). Below is an example from the SemEval Laptop dataset to help you get a better understanding. Given the set of aspects mentioned earlier, the aspects contained in the example are portability and connectivity, both with positive polarity. Please note that the underlined words are not part of the system’s output. They serve as clues and as evidence for the corresponding aspects (portability and connectivity).
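
As a rough sketch of how such output could be used, the Python snippet below tallies polarities per aspect category and lists the categories that receive more negative than positive mentions. The category names come from the text above, but the predictions are hypothetical values invented for this illustration, not real SemEval annotations or the output of any actual model.

```python
from collections import Counter, defaultdict

# Predefined aspect categories named in the text; a real label set is larger.
CATEGORIES = {"portability", "connectivity", "operation_performance"}

# Hypothetical system outputs: one list of (category, polarity) pairs per review.
predictions = [
    [("portability", "positive"), ("connectivity", "positive")],
    [("operation_performance", "negative")],
    [("operation_performance", "negative"), ("portability", "positive")],
    [("connectivity", "negative"), ("operation_performance", "positive")],
]


def polarity_counts(reviews):
    """Tally polarities per aspect category across all reviews."""
    counts = defaultdict(Counter)
    for pairs in reviews:
        for category, polarity in pairs:
            if category not in CATEGORIES:
                raise ValueError(f"unknown aspect category: {category}")
            counts[category][polarity] += 1
    return counts


def mostly_negative(counts):
    """Return categories with more negative than positive mentions, sorted by name."""
    return sorted(c for c, n in counts.items() if n["negative"] > n["positive"])


counts = polarity_counts(predictions)
print(mostly_negative(counts))  # ['operation_performance']
```

Because the categories are fixed in advance, the results from many reviews can be summed directly, which is what makes this form of ABSA convenient for reporting.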

There is also another type of ABSA. Its output and implementation are quite different, but people often use the same term for both, which causes confusion, especially among newcomers to the field. Unlike the previous type, the aspect here is extracted directly from the document. It can be anything, as long as it is stated in the document and represents something being discussed. Usually it takes the form of a word or phrase. So it is safe to say that this type is more domain-independent than the previous one. Below are some examples taken from the SemEval dataset. The aspects that can be extracted here are service and the people, whose polarities are negative and positive respectively.
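
One possible way to represent this kind of output is to store each aspect term together with its position in the text. The sketch below is only an illustration: the sentence is made up (it is not the SemEval sentence), the polarities are assigned by hand, and `AspectTerm` and `locate` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectTerm:
    term: str      # word or phrase copied from the text
    start: int     # character offset where the term begins
    end: int       # character offset just past the term
    polarity: str


def locate(text: str, term: str, polarity: str) -> AspectTerm:
    """Build an AspectTerm from the first occurrence of `term` in `text`."""
    start = text.find(term)
    if start == -1:
        raise ValueError(f"{term!r} does not occur in the text")
    return AspectTerm(term, start, start + len(term), polarity)


# Made-up sentence, not the SemEval one; polarities are hand-assigned.
text = "The service was slow, but the people were friendly."
terms = [
    locate(text, "service", "negative"),
    locate(text, "the people", "positive"),
]

for t in terms:
    # Each term must be recoverable from the text by its offsets.
    assert text[t.start:t.end] == t.term
    print(t.term, (t.start, t.end), t.polarity)
```

Since the terms are taken from the text itself, no predefined category list is needed, which is why this variant carries over to new domains more easily.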

Targeted Sentiment Analysis

This subfield is often called entity-based sentiment analysis because it analyzes the entities that appear in a document or sentence. The underlying assumption is that at least one entity is mentioned. Entities are usually products, people, locations, organizations, and so on. Below is an example taken from Vo and Zhang (2015), where the entities are Windows and OS X, whose polarities are positive and negative respectively.

Targeted Aspect-based Sentiment Analysis (TABSA)

As the name suggests, this task is basically a combination of targeted sentiment analysis and ABSA. There are three components to analyze: entities, aspects, and sentiment. Below is an example from the SentiHood dataset (Saeidi et al., 2016) showing what the sentence and the output may look like.

There are two entities mentioned: Boqueria and Gremio. The design of Boqueria is rated positive, while its service is negative. Both the service (staff) and the food at Gremio are positive. What makes this interesting is that both entities share an aspect (service) but with different sentiment. In other words, this task makes it possible to compare more than one entity. This is very useful for companies, especially when they want to compare their products with their competitors’ and dig up more information about them.
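
The comparison described above can be sketched in a few lines of Python. The four opinions are the ones listed in the paragraph; the `Opinion` type and the `compare_by_aspect` helper are hypothetical names chosen for this illustration, and a real TABSA system would predict these triples rather than have them typed in.

```python
from collections import defaultdict
from typing import NamedTuple


class Opinion(NamedTuple):
    entity: str
    aspect: str
    sentiment: str


# The four opinions described in the text for the example sentence.
opinions = [
    Opinion("Boqueria", "design", "positive"),
    Opinion("Boqueria", "service", "negative"),
    Opinion("Gremio", "service", "positive"),
    Opinion("Gremio", "food", "positive"),
]


def compare_by_aspect(items):
    """Group sentiments by aspect, keeping only aspects shared by two or more entities."""
    by_aspect = defaultdict(dict)
    for op in items:
        by_aspect[op.aspect][op.entity] = op.sentiment
    return {aspect: ents for aspect, ents in by_aspect.items() if len(ents) > 1}


print(compare_by_aspect(opinions))
# {'service': {'Boqueria': 'negative', 'Gremio': 'positive'}}
```

Only service survives the filter, because it is the one aspect mentioned for both entities.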

Concept-based Sentiment Analysis

Concept-based approaches to sentiment analysis focus on a semantic analysis of text through the use of web ontologies or semantic networks, which allow the aggregation of conceptual and affective information associated with natural language opinions (Cambria, 2013). Basically, a concept is similar to an aspect but more general. That is why this approach can produce more detailed output than the others.

Let’s take another look at the 50-word example we saw at the beginning of this post. Using concept-based sentiment analysis, we can get several concepts (or topics) from this review, even though not all of them carry sentiment. Check my balance and make transfers are examples of valid concepts that have no sentiment, while the other three are valid concepts that do have sentiment. The result may seem a little difficult to read and analyze. That is why, in practice, the results are usually visualized as a relational graph, a word cloud, and so on.


Conclusion

To sum up, sentiment analysis is useful for getting a broad picture in any kind of domain, even a general one. But if we would like more fine-grained information, there are a number of options available, depending on which part we want to focus on.


References

[1] M. Pontiki et al., SemEval-2016 Task 5: Aspect Based Sentiment Analysis (2016), Proceedings of the 10th International Workshop on Semantic Evaluation

[2] D.-T. Vo and Y. Zhang, Target-Dependent Twitter Sentiment Classification with Rich Automatic Features (2015), Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence

[3] M. Saeidi et al., SentiHood: Targeted Aspect Based Sentiment Analysis Dataset for Urban Neighbourhoods (2016), Proceedings of COLING 2016

[4] E. Cambria, An Introduction to Concept-Level Sentiment Analysis (2013), Mexican International Conference on Artificial Intelligence

[5] A. Lowe, Concept-level sentiment analysis: The next level of understanding emotion in text feedback

