Globalizing Trends and Data Science

A closer look at a promising relationship

Prerna Singh
Towards Data Science

Introduction

The term “global” has come to be widely used to study contemporary reality’s many nuances. However, a better understanding of the dynamics of these transformations is necessary in light of the wide-ranging changes occurring around the world. As a result, the field of global studies was born to learn more about the various ways in which globalization is affecting the natural world.

The changes we are experiencing today are unquestionably altering the map we use to organize our knowledge of the world. There is also a profound shift in learning because of unprecedented access to data. To put it another way, we are witnessing not one but two profound shifts. Part of the change in “the territory” we try to depict on a map is nations’ growing social, political, and economic interdependence. At the same time, the amount of data available to us to create such a map has increased exponentially. Ontological and epistemological shifts are co-occurring: the very nature of reality and the way we gather information about it are changing simultaneously. This article examines data science-based methodological tools and their potential impact on global studies from this perspective.

This century’s most significant transformation is the exponential growth of data, which opens the door to new ways of analyzing the world around us. In recent years we have seen the rise of disruptive technologies like big data, which have significantly affected many aspects of society. Until the beginning of this century, information was scarce, expensive, and difficult to obtain, and reasonable conclusions could be drawn only if the information available was of a good enough standard. The massive amounts of data now gathered and stored are changing the way we perceive reality, and this necessitates a radical shift in how we view the world. The value of information no longer lies in specific data points but in the correlation of massive data sets to discover previously unknown patterns. When computationally analyzed, massive data can therefore provide more precise information about patterns, trends, and hidden associations.

This suggests that we are currently in the midst of a data revolution. New technologies are driving an exponential increase in the volume and types of data available, creating previously unimaginable reporting possibilities. As a result, today’s data is bigger, faster, and more detailed than ever before, and it is reshaping the social order.

To date, big data analysis has been applied to various fields of scientific study to resolve complex issues. These include education, health, transportation, media and entertainment, and national security, as well as biomedicine and the environment. Projects that use data extraction techniques to improve or predict employability stand out among these. In agriculture, data science has drawn on plant growth data, greenhouse gas measurements, and climate change impacts on crops to increase productivity. Likewise, big data techniques are expected to play a major role in future advancements in precision medicine and in disease treatment and prevention. The range of applications is wide: the development of The Cancer Genome Atlas to support cancer research; the study of verbal and nonverbal communication with both computational and cognitive methods; the analysis of multiple protein sequences to determine evolutionary linkages and predict molecular structures; and the development of territorial intelligence systems for cities, which can help improve city life for residents.

Global studies have only recently begun to delve into the topic of analyzing large amounts of data. An important debate in global studies is whether or not the methodological possibilities and limitations that data science offers can help us better understand the complexity and diversity of today’s globalized world.

This article approaches that debate in three sections. The first section discusses the theoretical aspects of data science.

The second section considers how data science can be used in global studies.

Finally, we’ll talk about the theoretical and methodological ramifications of incorporating big data into global studies.

1. A conceptual framework for data science

Because of the Internet’s impact on how society generates and consumes information, it is impossible to conceptualize data science without considering this technology. The Internet is a universal network of interconnected computers used to exchange data. In this context, a network that connects many devices, such as computers, file servers, and video cameras, is described as a “complex network”. The technology grew out of the Cold War, when the race for global supremacy between the United States and the Soviet Union extended into the technological realm. The Soviet Union launched Sputnik in 1957, and the United States set up the Advanced Research Projects Agency (ARPA) in response to this threat. This organization eventually laid the groundwork for the Internet’s emergence. In 1962, Paul Baran proposed a communications system protected from external attacks, using computers connected in a decentralized network.

This communications system evolved by integrating the virtual network of university sites; later, business platforms would be incorporated, social networks would be developed, and a series of applications would be generated, such as georeferencing. This virtual connection has generated a continuous and growing body of information. It can be said, metaphorically speaking, that we went from being analog individuals to digital individuals, which has meant that we are now permanently connected from different devices practically at all times. This generates a large amount of data that can be obtained, classified, and analyzed thanks to the advancement of more powerful computers every day.

Initially, the generation of large volumes of information was used by companies specializing in the administration of Internet sites to find opportunities in this data set to increase their profits. However, this new context characterized by the massive generation of information has given way to the establishment of a knowledge-based society, where a new profession stands out, that of knowledge workers, who have skills, abilities and access to the technology that allows them to process the continuous flow of information. This knowledge worker has evolved into a new employee profile, now known as a data scientist. This professional combines statistics, mathematics, programming, and problem-solving skills with data capture and can carry out the proper data cleaning, preparation, and integration activities to locate patterns in the data.

Now, data science is made up of three areas:

1. Big data, which is used to process the data.

2. Data mining, whose purpose is to find patterns, even ones that were not previously imagined.

3. The visualization of the data, whose purpose is to facilitate the understanding of the information clearly and promote its socialization.

To get a sense of what massive data means, consider that, as of March 2018, the Facebook social network registered 2.2 billion monthly active users and 1.45 billion daily active users, 13% more than the previous year; likewise, WhatsApp is estimated to have 1.5 billion monthly active users; around 500 million tweets are sent daily on the Twitter platform, a tenfold increase over the previous six years; and more than 7 billion queries are estimated to be made daily in the Google search engine.

In this sense, three challenges have been identified that social scientists must consider in relation to the phenomenon of big data:

1. The technological problems associated with the storage, security and analysis of growing volumes of data.

2. The commercial value that can be added by generating more effective insights.

3. The social impacts, particularly the implications of using data for personal privacy.

Given the enormous amount of data generated every day, a reference framework has been proposed to describe the information generation process under the term Internet of events (IoE), which is intended to classify all data available on the Internet. Four categories are established. The first is the Internet of the content (IoC), which represents the information generated by people seeking to increase knowledge on particular topics; examples are articles and blogs, encyclopedias like Wikipedia, video platforms like YouTube, and e-books like Google Books. The second, the Internet of the people (IoP), includes the information generated by social interaction; for example, emails, social networks, and virtual forums. The third, the Internet of the Things (IoT), comprises the information generated by objects connected to the network, that is, all things that have a unique identification and a presence in an Internet-like structure. In this sense, things can have an Internet connection or be network tagged. Finally, there is the Internet of the locations (IoL), which includes information that has a spatial dimension derived from the adoption of mobile devices; for example, smartphones generate more and more events with geospatial attributes.

Regarding the second area of data science, data mining can be considered the analysis of observed data sets, generally of great volume, with the aim of finding new relationships between variables and summarizing those data sets in a way that is understandable and useful. Data mining is the science and art of intelligent data analysis, aimed at generating knowledge of interest from the data. It is also considered a fundamental step in knowledge discovery, understood as the process of finding valid, relevant, and potentially useful patterns and making them understandable.

Because data mining is a process for extracting information, it uses methods and models that allow previously unknown relationships or patterns between variables to be defined. These models are of two types: descriptive models, which seek to identify patterns that summarize and explain the behavior of the data; and predictive models, which seek to estimate the future values of a variable of interest based on its historical behavior. The data mining process thus includes data description, estimation, prediction, classification, clustering, and association.
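The difference between the two model types can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the yearly values are invented, and the helper names `describe`, `fit_line`, and `predict` are hypothetical. The descriptive step only summarizes existing data, while the predictive step fits a least-squares trend line and extrapolates it one year ahead.

```python
from statistics import mean, stdev

# Invented yearly values of some variable of interest (illustration only).
years = [2015, 2016, 2017, 2018, 2019]
values = [10.0, 12.0, 14.0, 16.0, 18.0]


def describe(xs):
    """Descriptive model: summarize the behavior of data we already have."""
    return {"mean": mean(xs), "stdev": stdev(xs), "min": min(xs), "max": max(xs)}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predictive model: estimate a future value from the historical trend."""
    return intercept + slope * x


summary = describe(values)
slope, intercept = fit_line(years, values)
forecast_2020 = predict(2020, slope, intercept)

print(summary)
print(forecast_2020)
```

A straight-line trend is only one of many possible predictive models; it is used here because it keeps the contrast with the descriptive summary easy to see.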

The techniques used in these models are classified as supervised or unsupervised. Supervised techniques are used to build models that make predictions. These include genetic algorithms, neural networks, decision trees, and regression analysis. Unsupervised techniques, or knowledge discovery algorithms, are generally used to extract useful information from large volumes of data. Examples of this type are clustering, link analysis, and frequency analysis.
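As a rough illustration of this distinction, the sketch below applies one technique of each kind to the same toy data. The data, the labels, and the function names `fit_stump` and `two_means` are all invented for this example. The supervised step learns a one-level decision tree (a single threshold) from labeled points; the unsupervised step runs k-means clustering with two clusters and never sees the labels.

```python
# Invented toy data: one numeric feature per observation.
features = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]  # used only by the supervised step


def fit_stump(xs, ys):
    """Supervised: choose the threshold that misclassifies the fewest labeled points."""
    best_threshold, best_errors = None, len(xs) + 1
    points = sorted(set(xs))
    # Candidate thresholds are the midpoints between neighboring feature values.
    for left, right in zip(points, points[1:]):
        threshold = (left + right) / 2
        errors = sum(
            ("high" if x > threshold else "low") != y for x, y in zip(xs, ys)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled points into two clusters (k-means, k=2)."""
    centers = [min(xs), max(xs)]  # simple deterministic starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


threshold = fit_stump(features, labels)
centers, groups = two_means(features)

print(threshold)
print(centers, groups)
```

On data this cleanly separated, both approaches recover the same two groups. The practical difference is in what they need: the stump requires labels prepared in advance, whereas the clustering step works on raw values and leaves the interpretation of each cluster to the analyst.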

The last element of data science is visualization. This technique allows us to examine a large amount of data and identify patterns or trends with the help of graphs or representations, using different methods and techniques. Although the visualization of information has indeed been present in the development of humanity, only recently has its use been extended thanks to software with greater processing capacity and the development of so-called libraries that allow easy graphing and representation.

Obviously, each of the components of data science is constantly evolving. Undoubtedly, this fact influences the way scientific research is carried out and impacts different areas of knowledge. Thus, the possibilities of a good relationship between global studies and data science will be analyzed in the next section.

2. Global studies and data science: a promising relationship?

Although the phenomenon of the global is not an exclusive trend of contemporary life, global studies is a relatively new scientific discipline, since only recently have social scientists begun to systematically analyze networks, flows, processes, ideologies and representations of transnational and global systems, both from a historical perspective and from a contemporary approach. Moreover, the academic field of global studies emerged in the current context of growing globalization in response to the search to better understand the changes that have intensified in recent decades and have increased political, economic and social interconnections on a planetary scale. Since the first global studies academic programs and research centers were established in North America, Europe, and Asia in the 1990s, the discipline has flourished widely. Currently, it is estimated that there are approximately two hundred research centers around the world that address the global phenomenon from different angles.

However, when starting from the perspective of global studies, it is necessary to consider that this discipline does not only mean analyzing globalization. In this regard, Nederveen (2013) proposes three aspects in which global studies are clearly distinguished from studies on globalization. First, he suggests that global studies arise from a wave qualitatively different from the one that propelled the analysis of globalization from specific disciplines that already have a long intellectual tradition. Global studies do not address issues tied to particular disciplines that incorporate the global as a subject of study, as is the case with global political economy, global culture, or global communication. In this sense, he proposes that global trends intensified first; later, globalization began to be studied from already established fields of knowledge; finally, today, we are in a third phase in which the discipline of global studies emerges as a different way of analyzing globalization. Based on an interdisciplinary approach, this way of approaching the global goes beyond the traditional boundaries of knowledge contained in specific disciplines. Second, he warns that, from an intellectual perspective, global studies are still a fledgling discipline, what he calls roofless scaffolding. Third, he affirms that global studies can add value beyond traditional globalization studies because they aspire to a multicentric construction of knowledge, which does not take only the Western world as the center of knowledge and historical experience, and because they start from a multilevel perspective in which the local, the national, and the global have equal importance.

The following fundamental characteristics can be highlighted when defining global studies. First, they are transnational because they focus on analyzing phenomena that are cross-border and cross-cultural in content and scope. They are also interdisciplinary because the phenomena they study range from the economic, political, social, and cultural to the religious, ideological, environmental, and biological. In the same way, global studies are both historical and contemporary because it is necessary to examine historical precedents to fully understand current global patterns. Finally, they are often critical and postcolonial. As a general rule, the field of global studies does not uncritically accept the patterns of economic, political, and cultural globalization that have been forged in the West. It recognizes that global problems, dynamics, and trends are perceived differently in different parts of the world, and even within the same country or region, depending on the social and economic status of those who perceive them. This is why global studies advocates using the term “multiple globalization,” which acknowledges that no one dominant paradigm or perspective is more valuable than another.

The collective turn of various disciplines toward understanding globalization and the problems it brings requires a fundamental shift in analytical perspective, one that necessitates a complete revision of the dominant mode of analysis in each discipline. Studies on globalization conducted exclusively within already established disciplines, for example, yield partial approaches, because they are constrained by the traditional theoretical and methodological canons that support and reflect the time and place in which those disciplines originated. Global studies, by contrast, aim to overcome these limitations and incorporate multiple insights about groups that have historically been excluded from disciplinary narratives.

Because of the complexities of its subject matter, global studies is a distinct discipline within the social sciences and humanities. Current paradigms are incapable of comprehending the dynamics and trends of the global within their conceptual frameworks. This fundamental inability stems from our preexisting conceptions of knowledge and the methods we have employed to verify the accuracy of what we think we know. The emergence of global studies as a new scientific discipline may signify that our current paradigms, both ontological and epistemological, are inadequate.

Paradoxically, this inadequacy of existing paradigms gives us an opportunity to explore new ways of conceptualizing the object of study, which we refer to as the global, as well as new methodological proposals aimed at better understanding and explaining this phenomenon. It invites us to imagine new ways of acquiring, analyzing, and interpreting information that are not currently available to us. We can examine whether data science can offer something new to the discipline of global studies and whether it can enrich or complement the dominant methodological approaches from which globalization and its consequences have traditionally been analyzed.

It is also possible to raise the intellectual question of how far new approaches based on data science might alter our understanding of the ontological foundations of the global. When researching global trends, how will we be able to fine-tune our search and correlation of big data at the local, national, or global level? Will our views of the world change, or will we simply “expand” our current knowledge? Will data science reveal new patterns about globalization and its implications, or will it represent a new Eurocentric form of scholarly knowledge that limits the non-Western voices that global studies have attempted to recapture? The next section examines the possible methodological implications of data science for global studies in an effort to provide some answers.

3. The theoretical-methodological implications of data science for global studies

Today, digital “fingerprints” can be used to examine human activity on previously unimaginable scales, allowing us to better understand human behavior and its many facets. Big data has spawned an entirely new field of study, while computer science has provided new methods for generating and collecting data, new analytical and statistical methods, and new ways of visualizing and presenting data. These new techniques and sources of information could radically reshape social science methodologies. Some approaches that have emerged from data science are changing the objects of knowledge while at the same time generating new narratives about social interactions and the human environment. In other words, if we can improve the structure of the prior information on which our estimates are based, we can reduce the uncertainty in the knowledge that we acquire.

A methodological tool or approach, on the other hand, is only one piece in the puzzle that can help us understand trends, establish relationships, generate conclusions, and build plausible explanations and thus contribute to our understanding of the world around us, giving us a “scientific image of the world” by providing the context and meaning of our surroundings. According to this definition, data science, like any other scientific methodology, plays a critical role in the development of scientific knowledge and in discovering what the “scientific image of the world” really is. In addition, each methodological approach is influenced by national traditions and transnational influences that are rooted in the past and present. The inherent limitations of any scientific research method derived from the specific contexts in which they arise are not unrelated to the massive use of data and the research techniques associated with its collection and interpretation.

As a method of social research, data science can bring new insights into global studies. Still, it also has its drawbacks, such as the difficulty of obtaining large amounts of data and of establishing more rigorous correlations and patterns. Algorithms are also increasingly being used to mediate social processes, business transactions, and governmental decisions. There are serious consequences for individuals, groups, and entire societies when we fail to take a critical stance on the ethical implications of algorithmic design and implementation.

The fundamental question we need to ask is what exactly the analysis of massive data can contribute to our traditional methods for obtaining information about the globalizing trends in our world, and how this way of gathering and analyzing data can enrich the contributions of global studies as a scientific discipline. That is to say, how can the interdisciplinary approach, multicentric optics, and multilevel perspective that distinguish global studies be enriched through the use of large-scale data analysis?

First, big data analytics enables us to see previously unobserved trends that are not based on our previous mental patterns. Traditional social science methods rely on generalizations drawn from well-established theories to guide the search for empirical evidence. Data science studies, in contrast, begin with a blank slate and must filter through a sea of data to find the answers they seek. The approach is more like a massive net thrown into the ocean without a thorough investigation beforehand. Unlike traditional quantitative or qualitative methodologies, which decide in advance what data to look for, massive data analysis aims to collect as much data as possible. The ability to work on large amounts of data in a short period has led to the discovery of previously unknown connections between the data. In the field of global studies, migration illustrates this situation. Many theories have been put forth to explain why people migrate around the world. However, what new interpretations of migration, stay, consumption, expectations, and decision-making might be opened up if we could track migrants’ data consumption through their cell phones? What could we deduce from comparing these data sets on such a massive scale? What kinds of connections are possible? Maghrebi and Central American migrants share some similarities in global migration, but they also show significant differences. What are the commonalities? What are the differences?

A second contribution is that recognizing patterns in massive data sets enables us to understand local phenomena better. Because data collection and processing have historically been onerous tasks, most data has been presented at the national level. Since it was impractical to compare multiple cases and multiple variables, the case study has been viewed as the ideal method for learning about dynamics in specific regions or locales. With massive data, however, local patterns can be established and compared to national and transnational ones. It is important to remember that a large portion of massive data is social. As an example, it is possible to examine cultural consumption patterns on a global scale and link them to specific local contexts. This makes it possible to explore questions such as: What are the current trends in Twitter discussions? Where does a particular national or global trend affect a specific location? Is there a strong link between the two? Which social groups do these patterns belong to once they have been identified? Is the global south shaping global trends of discussion too, or is it just the global north? And, if so, are there any patterns that can be used to explain this?

Another contribution is that models with multiple variables can be generated, so the massive use of data overcomes the principle of parsimony. This principle holds that, other things being equal, the simplest explanation is usually the correct one. Given constraints such as time, cost, and space, which have traditionally limited the amount of information that can be analyzed, researchers have tended to adopt simpler models that deliberately reject complexity. As the technological revolution progresses, processing data will become much more straightforward. Complex models with multiple variables can then be applied at the local, national, and transnational scales to establish more complex patterns of interrelationships.

To take a different example, social movements and the fight for human rights on a global scale have been studied extensively in the academic literature. Data science, however, has the potential to help us study how these causes are appropriated and defended from a cyber-activist perspective. What else can we learn from the profiles of those who use social networks to advocate for human rights? Which online behaviors best illustrate how cyber activists connect their local context to global demands? What patterns of interrelation and reaction do these behaviors generate, and how are the two connected?

Furthermore, the real-time analysis of massive amounts of data opens a path to abandoning linearities. It allows access to the continuous flow of information and breaks the chains of predetermined temporalities. Statistics on productive activity and population censuses, for example, have temporal cutoffs that are imposed for practical reasons. However, thanks to advances in data collection and analysis and to algorithms capable of extracting and illustrating large-scale patterns, new temporal cutoffs can be considered. With a constant flow of information and the ability to analyze it in real time, researchers can now study a topic over any period they choose. For example, using thousands of medical records from around the world, how can tracking the emergence and spread of infection outbreaks, such as a pandemic’s unusual and unexpected growth, help us better understand how the global and the local connect? Is there a correlation between the variables involved in this and other events? If patients’ and health professionals’ movements can be used to generate georeferenced data, what new insights might that provide?

The use of massive amounts of data, however, is not a panacea for understanding the current globalization dynamics in our society. There are at least four major risks associated with the non-critical use of big data analytics. The first is an emphasis on empiricist epistemology and a return to positivism as a higher form of knowledge. Yet even when data collection aims to be thorough and to provide a comprehensive picture of the phenomenon under study, sampling bias is inevitable because of the technological platform, data model, and regulatory framework used to gain access to the data. Moreover, all data offer oligoptic views of the world: views from specific vantage points, obtained with specific tools. Data are not natural elements that can be taken at face value as absolute truths, abstracted from the world in an objective and neutral manner; they are created within a complex assemblage that actively shapes their constitution.

A second danger is falling prey to a predictive science that assumes that humans behave in predictable patterns. According to this assumption, human behavior can be predicted by applying appropriate algorithms to previously recorded patterns. In other words, the more data is analyzed, the more accurate the prediction is. This may revive interest in the laws of human behavior and in the debate between free will and structural determinism. That is to say, it is believed that as algorithms become more adept at forecasting natural phenomena like the weather, they will also become better at predicting human behavior in specific situations. In global security, for example, claims have been made that predictive analytics based on data science “promise to secure the future by anticipating the ‘next terrorist attack’ and apprehending would-be criminals before they can strike.” We might falsely conclude that the analysis of massive data can finally establish definitive patterns that explain social behavior at the local, national, and global levels; we might further suggest that such algorithms will establish what we do and what we should do as a result. Relying on preexisting information patterns could shorten our decision-making horizon. It has also been warned that the algorithms used in the new analytics may appear to discover insights automatically, without any questions being asked, yet they were scientifically tested in specific contexts, which are not necessarily universal.

A third risk is that, when it comes to understanding the world today, data science can lead to a false choice between causality and correlation. One aspect of explanation construction may be preferred over the other, favoring correlational tendencies that prevent the establishment of explanations based on the relationship between causes and consequences. However, the numbers don’t tell the whole story: inferring causality from a correlation is incorrect. Large-scale data analysis can detect correlations across a large number of data sources, but it cannot by itself determine whether those correlations are meaningful.

Furthermore, when many pairs of variables are searched for correlations, statistical significance tests can be expected to flag some spurious correlations that arise solely by chance. Narratives and contexts must therefore be used to give the data form and meaning: the data must be mobilized as part of larger processes of interpretation or meaning-making. Axiological neutrality, however, does not mean that science is free from philosophical and political considerations, and the narratives constructed around data, which are not neutral, reflect them. In the end, data science is just one of many stories.
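The point about chance correlations can be checked with a small simulation. This is a sketch under stated assumptions rather than a rigorous statistical procedure: every series below is independent random noise, the sizes `n_obs` and `n_vars` are arbitrary, and the cutoff of 0.44 is only an approximation of the correlation strength usually treated as significant at the 5% level with 20 observations.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return covariance / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_obs = 20    # observations per variable (arbitrary choice)
n_vars = 200  # number of unrelated candidate variables (arbitrary choice)

# A target series and many candidate series, all pure noise with no real link.
target = [rng.gauss(0, 1) for _ in range(n_obs)]
candidates = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

correlations = [pearson(target, series) for series in candidates]

# The strongest correlation found can only grow as more variables are searched.
strongest_of_5 = max(abs(r) for r in correlations[:5])
strongest_of_200 = max(abs(r) for r in correlations)

# Approximate |r| cutoff for "significance" at the 5% level with 20 observations.
cutoff = 0.44
n_flagged = sum(abs(r) > cutoff for r in correlations)

print(strongest_of_5, strongest_of_200, n_flagged)
```

Because no variable here has any real relationship with the target, every correlation that clears the cutoff is spurious by construction. With a 5% threshold, something on the order of one in twenty unrelated variables would be expected to clear it, which is why a wide search without a guiding narrative will reliably turn up "findings".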

Finally, reducing what is real to what is expressed in the massive information flows that we generate through digital interactions in the age of information technology raises the possibility of creating false images of reality, or of the way we spend our money. Remember that the dynamics of exclusion in today’s world create a new type of outcast: the digitally discarded, whose data “footprints” in the cloud are sparse or nonexistent. If we place too much emphasis on data, these people become invisible. This risk goes beyond digital illiteracy: it is a new form of invisibility, exclusion, and segregation.

As the discussion above shows, data science may pose risks for the study of globalization. Still, it also provides numerous benefits and opportunities, including for businesses and the labor market.

Ph.D. in Computer Science | Data Scientist | Machine Learning Researcher | Currently working in Unity Technologies -Weta Digital