# Business Process Management Meets Data Science

## A step up from “business process management” to “intelligent continuous improvement”

Data science is a relatively new practice that draws on mathematics, statistics and data visualization. It has emerged alongside the growing volume of data generated by systems over the last decade, commonly called “big data.”

By exploring huge amounts of data, it is possible to uncover and understand complex trends and behaviours. A better understanding of the data leads to smarter decisions.

For example, Netflix analyzes a large amount of data from a user’s search history and watch list to learn about their habits and interests. With analytic techniques, they are able to guess what other content a user might be interested in watching. Netflix’s suggestions are possible because of powerful algorithms.

Netflix is only one example of big data usage. Amazon, Google, Domino’s (yes, the pizza makers) and NASA all use their own algorithms and approaches to leverage big data.

Application domains are very wide: online sales, translation, delivery directions, exploration of the universe and beyond.

The basics of data science are statistics and data visualization.

### Statistics

In 1989 Ackoff demonstrated how data contributes to building knowledge by defining a hierarchy: “data — information — knowledge”.

Statistical tools are very powerful for building knowledge out of data exploration. Statistics is used to aggregate observations on subjects that share the same property. Mathematical formulas are then applied to those observations to generate data about this property.

For example, Netflix stores observations on TV shows searched for or watched by each user. This allows them to identify a user’s interests.
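
As a minimal sketch of this kind of aggregation, the snippet below counts how often each genre appears in a user's history and ranks the results. The records, the genre labels, and the `top_interests` helper are all hypothetical and made up for illustration; this is not a description of how Netflix actually works.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (user, genre of the show watched or searched for).
observations = [
    ("alice", "sci-fi"),
    ("alice", "sci-fi"),
    ("alice", "documentary"),
    ("bob", "comedy"),
    ("bob", "comedy"),
    ("bob", "sci-fi"),
    ("bob", "comedy"),
]


def top_interests(records, n=2):
    """Aggregate observations per user and return each user's n most frequent genres."""
    counts = defaultdict(Counter)
    for user, genre in records:
        counts[user][genre] += 1
    return {user: [g for g, _ in c.most_common(n)] for user, c in counts.items()}


interests = top_interests(observations)
print(interests)
```

Counting is the simplest possible aggregate; a real system would likely weight recent activity more heavily or combine several signals.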

### Data visualization

Descriptive statistics are used to generate information in the form of tables, graphs, charts, and so on. Large amounts of data are easier to read when represented as charts. A data scientist uses those graphs to better understand the system they are analyzing.
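
To make this concrete, here is a small sketch that summarizes a sample with descriptive statistics and then draws a crude text chart. The `minutes_watched` values are invented for illustration, and the example uses only Python's standard `statistics` module; a real analysis would more likely rely on a plotting library.

```python
import statistics

# Hypothetical sample: minutes watched per day by one user over two weeks.
minutes_watched = [30, 45, 0, 60, 90, 120, 150, 20, 40, 0, 75, 80, 110, 140]

# Descriptive statistics condense the raw observations into a few numbers.
summary = {
    "count": len(minutes_watched),
    "mean": statistics.mean(minutes_watched),
    "median": statistics.median(minutes_watched),
    "stdev": statistics.stdev(minutes_watched),
    "min": min(minutes_watched),
    "max": max(minutes_watched),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}")

# A crude text bar chart: one '#' per 10 minutes watched.
for day, minutes in enumerate(minutes_watched, start=1):
    print(f"day {day:2d} | {'#' * (minutes // 10)}")
```

Even in this rough form, the chart makes the quiet days and the peaks visible at a glance, which is harder to see in the raw list.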

### Predictive models

Mathematical models applied to statistics allow the data scientist to build predictions or suggestions. The next, and most interesting, step is to build predictions by applying artificial intelligence.

Some examples of questions that data science can be used to address:

• What are the habits of the users of a system?
• What are the successive transformations of a given piece of data over a period of time?
• Is there a pattern in business activity?
• What is the probability that a given event occurs?
• How much will a product cost in the coming week?

To “guess” the answer to these questions, data scientists gather a large amount of data produced by the system and apply mathematical formulas to extract information and learn something about a business activity. To determine whether it is reasonable to think that an event will occur, the idea is to look at all the data, identify the preconditions, and check whether they are currently met.
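
The precondition idea can be sketched as a relative frequency: among past cases where the same preconditions held, how often did the event occur? The records below, the field names (`late_payment`, `high_value`, `escalated`), and the `event_probability` helper are hypothetical. This is a deliberately naive estimate, not a full predictive model.

```python
# Hypothetical historical records: which preconditions held, and whether the event occurred.
history = [
    {"late_payment": True, "high_value": True, "escalated": True},
    {"late_payment": True, "high_value": True, "escalated": True},
    {"late_payment": True, "high_value": True, "escalated": False},
    {"late_payment": True, "high_value": False, "escalated": False},
    {"late_payment": False, "high_value": True, "escalated": False},
    {"late_payment": False, "high_value": False, "escalated": False},
]


def event_probability(records, event, preconditions):
    """Estimate P(event | preconditions) as a relative frequency in past records."""
    matching = [
        r for r in records
        if all(r[name] == value for name, value in preconditions.items())
    ]
    if not matching:
        return None  # no past case met the preconditions, so there is no estimate
    return sum(r[event] for r in matching) / len(matching)


# Preconditions that are currently met for the case we want to assess.
current_state = {"late_payment": True, "high_value": True}
p = event_probability(history, "escalated", current_state)
print(p)
```

With so few records the estimate is very unstable; in practice the amount of matching data determines how much the number can be trusted.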

The definition of a business process is based on a model of the business and the organization. Using a BPM-based platform, developers who are automating a process to create a process-based application have full control over WHO must execute WHAT tasks and WHEN.

Who: The way that specific end users (customers, employees) will interact with the process is included in the process definition (through user interfaces: web forms, pages, and portals).

What and When: The specific tasks to be completed, and the sequence of those tasks, are also part of the process definition.

Further, the business data generated by a process application is also clearly identified in the BPMN diagram model.

In short, BPM aims at guaranteeing that users will perform tasks and update data in a pre-defined order, and often within a pre-defined time limit or deadline. Business rules are enforced by the process definition.
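
Because the order and the deadline are part of the process definition, data produced by a process can be checked against them. The sketch below does this after the fact for one logged case. The task names, the two-day deadline, the log format, and the `check_case` helper are assumptions made for illustration; they do not reflect the API or log format of any particular BPM platform.

```python
from datetime import datetime, timedelta

# Hypothetical process definition: tasks in their required order,
# plus a deadline for the whole case.
EXPECTED_ORDER = ["submit_request", "review_request", "approve_request"]
CASE_DEADLINE = timedelta(days=2)


def check_case(events):
    """Return the rule violations for one case; an empty list means it conforms.

    `events` is a list of (task_name, completion_time) tuples in logged order.
    """
    violations = []
    tasks = [name for name, _ in events]
    if tasks != EXPECTED_ORDER:
        violations.append(f"unexpected task sequence: {tasks}")
    if events:
        elapsed = events[-1][1] - events[0][1]
        if elapsed > CASE_DEADLINE:
            violations.append(f"deadline exceeded: case took {elapsed}")
    return violations


start = datetime(2024, 1, 8, 9, 0)
on_time = [
    ("submit_request", start),
    ("review_request", start + timedelta(hours=4)),
    ("approve_request", start + timedelta(days=1)),
]
late_and_out_of_order = [
    ("submit_request", start),
    ("approve_request", start + timedelta(days=1)),
    ("review_request", start + timedelta(days=3)),
]

print(check_case(on_time))
print(check_case(late_and_out_of_order))
```

A BPM engine enforces these rules while the process runs; a check like this one is only useful for analyzing logs once the cases are complete.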

### Data science applied to BPM

BPM provides a constrained workflow for user activities (habits), automates data transformation, and ensures that actions are performed in a pre-defined order. Given that structure, can data science answer specific questions from the data generated by a BPM application?

One of the major difficulties in this challenge is the heterogeneity of the data produced by business processes. Every project using BPM is different, and projects can belong to different business verticals (e.g. e-learning, banking, education, manufacturing, and so on).

At Bonitasoft, we are using data from real environments in actual projects (anonymized, of course). This data can help us create powerful algorithms that are applicable to many business verticals.

We are looking to data science to provide a step up from “business process management” to “intelligent continuous improvement,” using data, algorithms, and BPM structures for predictive models to help make smarter business improvements incrementally and continuously.
