
InfraNodus: An Excellent Tool for Textual Data Analysis

An introduction to InfraNodus with an example of Google Trends queries during the Covid-19 pandemic

Source: Image by Author

Text mining is the process of transforming unstructured text into a structured format to identify meaningful patterns and new insights. Companies can apply advanced analytical techniques, including deep learning algorithms, to explore hidden relationships in their datasets. This is how IBM defines the broad field of data science that works with text data: speech, statements, reviews, poetry, etc. Such textual datasets are valuable if we can explore them with good statistical software.

An excellent option is InfraNodus, which uses text network analysis as its core framework. I will show a single example with Google Trends data to demonstrate the main principles, but it is worth exploring the many other data mining features that InfraNodus offers.

The core framework: text-network analysis

InfraNodus is based on a text network framework that follows these steps:

  • import your own data source, or use one of the many built-in API data streams, including Twitter feeds, Google Trends, research paper abstracts and titles (PLOS, PubMed), RSS news feeds, and Evernote notes,
  • get a network structure: the network is generated from the text. The most influential words in the graph are shown larger, while words that often occur together are grouped into clusters and given distinct colors. The graph shows the main topics, the most influential terms, and the relations between them,
  • generate insights focusing on sentiment, product reviews, demand and supply in Google queries, clustering, word cloud visualization, etc.
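To make the idea of a text network more concrete, here is a minimal sketch in plain Python. It is not InfraNodus's implementation: the sample queries, the stopword list, the `window` parameter, and the function names are all assumptions made for illustration, and weighted degree is used as a simple stand-in for whatever influence measure the tool applies.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "for", "to", "and", "by", "near", "me"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_text_network(documents, window=2):
    """Count how often two words appear within `window` tokens of each other.

    Returns a Counter mapping (word_a, word_b) pairs, sorted alphabetically,
    to their co-occurrence weight.
    """
    edges = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    edges[tuple(sorted((word, other)))] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights attached to each word (a rough influence score)."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


# Made-up sample queries, not real Google Trends data.
queries = [
    "covid vaccine near me",
    "covid testing near me",
    "covid cases by county",
    "covid vaccine side effects",
]

edges = build_text_network(queries)
degree = weighted_degree(edges)
ranking = sorted(degree, key=degree.get, reverse=True)
```

In this sketch, words with a higher weighted degree would be drawn as bigger nodes. Grouping words into colored clusters would need an additional community detection step, which is left out here.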

Use case: word cloud of Google queries related to Covid

Let’s look at a simple example. With the built-in Google Trends API, we can access data on Google queries and analyze it with a couple of clicks. The parameters are location: the United States, period: 31/10/2021, and related queries to the keyword "covid."

First, we import data directly with the in-built API. We can see four main clusters of topics: the first one refers to vaccination and the location of Covid, the second one to the cases of Covid, reporting and underlying data, the third one relates to the treatment of Covid, and the last one contains the resources on the disease.

Source: Image by Author

Next, we generate a word cloud of words and their relations. Unlike a standard word cloud, this one is built as a network, so it also shows how the terms relate to each other. After tuning the size of the nodes and their labels, we get a word cloud of the terms searched on Google during the Covid period, together with the network of relations between them.

Source: Image by Author

InfraNodus also provides a set of statistics on relations in the data. The terms people searched for most frequently together with the "covid" keyword were vaccine, testing, disease, county, and health.
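A statistic of this kind can be approximated by counting which terms appear alongside the keyword. The sketch below is only an illustration under stated assumptions: the query list is made up, the function name `co_searched_terms` is hypothetical, and InfraNodus may compute its statistics differently.

```python
from collections import Counter


def co_searched_terms(queries, keyword, stopwords=frozenset()):
    """Count terms that appear in the same query as `keyword`.

    Each term is counted at most once per query, and queries that do not
    contain the keyword are ignored.
    """
    counts = Counter()
    for query in queries:
        words = query.lower().split()
        if keyword in words:
            counts.update({w for w in words if w != keyword and w not in stopwords})
    return counts


# Made-up sample queries, not real Google Trends data.
sample_queries = [
    "covid vaccine near me",
    "covid testing near me",
    "covid vaccine booster",
    "covid cases by county",
    "flu vaccine",
]

term_counts = co_searched_terms(
    sample_queries, "covid", stopwords={"near", "me", "by"}
)
top_term = term_counts.most_common(1)[0]
```

Here `term_counts` holds the number of queries in which each term appears next to "covid", so sorting it gives a simple frequency ranking of co-searched terms.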

Source: Image by Author

The statistics I just presented are not surprising given what we see in the media. But imagine you have a more extensive textual dataset that you know nothing about. Here, InfraNodus is very helpful, as it draws the structure of the data and helps you discover relations you had no idea about.

The graphs above are just a tiny sample of what InfraNodus can produce. To see other exciting applications, check out some of the InfraNodus tutorials.

Conclusions

One important thing to mention is pricing. InfraNodus has a free, open-source version, which requires some programming knowledge. The standard cloud version (9 EUR/month), which I used to prepare the graphs, serves well for most smaller projects. I also appreciate that the creators don’t sell the product as a black box and describe how it works in the original paper. This matters for researchers, who can cite the source and use text networks in their own papers.


