
Alteryx – a worthy Data Platform?

How to use Alteryx for Data Engineering and Science

Photo by Christopher Zarriello on Unsplash

Alteryx is known as a platform that combines analytics, data science and process automation. It integrates easily with many tools, which opens up many interesting use cases. In this walkthrough I used Google BigQuery as both the source and the target platform.

How to get started:

  • Install Alteryx and create a GCP account with the right to create a service account.
  • Install the BigQuery tools for Alteryx.
  • Authenticate against BigQuery via service-to-service authentication or end-user authentication (full guide: [1]).
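
Service-to-service authentication relies on a service account key file in JSON format. Before pointing Alteryx at such a file, a quick sanity check can save some debugging time. The following is only a minimal sketch: the helper name `check_service_account_key` and the example values are hypothetical, and it only verifies that a few fields normally found in a Google Cloud service account key are present, not that the credential actually works.

```python
import json

# Fields normally present in a Google Cloud service account key file (JSON).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json):
    """Return a list of problems found in the key text; an empty list means none."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: " + exc.msg]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = ["missing field: " + name for name in REQUIRED_FIELDS if not key.get(name)]
    # A key for service-to-service authentication has type "service_account".
    if key.get("type") not in (None, "", "service_account"):
        problems.append("unexpected type: " + str(key["type"]))
    return problems


# Hypothetical, shortened key for illustration only (not a working credential).
example_key = json.dumps({
    "type": "service_account",
    "project_id": "my-demo-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "alteryx-loader@my-demo-project.iam.gserviceaccount.com",
})

print(check_service_account_key(example_key))  # an empty list means no problems found
```

In practice you would read the text from the downloaded key file instead of building it in code.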

Use Cases

Here are a few typical use cases I came across, which might give you some inspiration for what Alteryx can do.

Data Preparation

Although cloud platforms like GCP already bring their own data preparation tools (Cloud Dataprep) and services such as the Data Transfer Service, you might choose Alteryx because of its breadth. It also provides ESB as well as data analytics and data science functionality: you can pick one of the many tools to get the job done, or use Python and R code.

Data Prep without coding - Image by Author

Another reason could be the on-premises version, which might be a must-have for you and your company due to data governance concerns. A typical use case would be to take data from any source, or data already loaded into BigQuery, and prepare it for further analytics.

Data Prep Workflow - Image by Author

You can use one of the many built-in data preparation tools (the blue icons in the image above) or, as mentioned, use Python and R code to do the magic. In the end, you can easily load the result back into a new table in BigQuery with the BigQuery Output tool.
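
To give an idea of what the Python route could look like, here is a minimal sketch of typical preparation steps: trimming text, dropping incomplete rows, removing duplicates and parsing types. The column names, the sample rows and the function `prepare_rows` are assumptions made up for illustration. Inside an Alteryx workflow the rows would arrive from the incoming connection rather than from a hard-coded list.

```python
from datetime import datetime


def prepare_rows(rows):
    """Trim text, drop rows without an id, remove duplicates and parse types."""
    seen = set()
    cleaned = []
    for row in rows:
        customer_id = (row.get("customer_id") or "").strip()
        if not customer_id or customer_id in seen:
            continue  # skip incomplete rows and duplicates
        seen.add(customer_id)
        cleaned.append({
            "customer_id": customer_id,
            "name": (row.get("name") or "").strip().title(),
            # Assumes a decimal comma in the source data, e.g. "1200,50".
            "revenue": float((row.get("revenue") or "0").replace(",", ".")),
            # Assumes day.month.year in the source; ISO format is written out.
            "signup_date": datetime.strptime(
                row["signup_date"].strip(), "%d.%m.%Y"
            ).date().isoformat(),
        })
    return cleaned


# Hypothetical raw input with whitespace, a duplicate and a missing id.
raw_rows = [
    {"customer_id": " C001 ", "name": "  anna schmidt", "revenue": "1200,50", "signup_date": "03.02.2020"},
    {"customer_id": "C001", "name": "Anna Schmidt", "revenue": "1200,50", "signup_date": "03.02.2020"},
    {"customer_id": "", "name": "unknown", "revenue": "10", "signup_date": "01.01.2020"},
    {"customer_id": "C002", "name": "BOB LEE ", "revenue": None, "signup_date": " 15.11.2019"},
]

for row in prepare_rows(raw_rows):
    print(row)
```

Each step here corresponds roughly to what you would otherwise assemble from the drag-and-drop preparation tools.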

Data Integration

Another use case is using Alteryx for your ETL/ELT processes. As described above, Alteryx offers a wide set of connectors and data integration tools.

ELT/ETL Workflow - Image by Author

Similar to the previous use case, in this example you can extract data from an MSSQL database, transform it, and finally load it into your data warehouse. The wide range of data sources supported by the input tools (the green icon) is definitely a big plus.

Supported data sources - Image by Author

This field of application is also described in the success story of Tropical Smoothie Cafe, where the data source is AWS [2].
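
The extract, transform and load steps described above can be sketched in plain Python. This is not how Alteryx implements them. It is a self-contained illustration in which two in-memory SQLite databases stand in for the MSSQL source and the data warehouse, and the table names, columns and the function `run_etl` are hypothetical.

```python
import sqlite3


def run_etl(source, target):
    """Copy cleaned orders from the source to the target; return the row count."""
    # Extract: read the raw rows from the source system.
    rows = source.execute(
        "SELECT order_id, country, amount_cents FROM orders"
    ).fetchall()

    # Transform: normalise country codes, convert cents to currency units
    # and drop rows without a positive amount.
    transformed = [
        (order_id, country.strip().upper(), amount_cents / 100)
        for order_id, country, amount_cents in rows
        if amount_cents is not None and amount_cents > 0
    ]

    # Load: write the result into the target table.
    target.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(order_id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO orders_clean VALUES (?, ?, ?)", transformed
    )
    target.commit()
    return len(transformed)


# Stand-in for the source database, filled with hypothetical sample data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, country TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " de ", 1999), (2, "us", 0), (3, "Fr", 4500), (4, "de", None)],
)
source.commit()

# Stand-in for the data warehouse.
target = sqlite3.connect(":memory:")

loaded = run_etl(source, target)
print(loaded, target.execute("SELECT * FROM orders_clean ORDER BY order_id").fetchall())
```

In an ELT variant you would load the raw rows first and run the transformation as SQL inside the warehouse instead.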

Reports and Analytics

Similar to the other cases, but with a focus on analytics, Alteryx can also serve as a vehicle for building reports. Often you have raw data in your data warehouse or data lake and want to build a report from it. Here, you can use Alteryx for heavier data analytics and data science tasks such as preparation, aggregation, statistics and machine learning. Afterwards, you can upload the results to one of the many supported BI tools and BI servers. In this use case, I uploaded the data to a Tableau Server.

Analytics Workflow - Image by Author

Here, Alteryx adds real value to your data process because BI tools often do not offer such a wide variety of analytic capabilities. A similar use case is described by Siemens [3].
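
As a rough illustration of the aggregation and statistics step that happens before the data reaches the BI tool, the sketch below groups hypothetical sales records by region and computes a few summary figures with the Python standard library. The data, the field names and the function `summarise_by_region` are assumptions for illustration. The upload to the BI server is not shown.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise_by_region(records):
    """Group (region, revenue) pairs and compute summary statistics per region."""
    grouped = defaultdict(list)
    for region, revenue in records:
        grouped[region].append(revenue)

    summary = []
    for region in sorted(grouped):
        values = grouped[region]
        summary.append({
            "region": region,
            "orders": len(values),
            "total": sum(values),
            "mean": mean(values),
            # The sample standard deviation needs at least two values.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        })
    return summary


# Hypothetical sales records as (region, revenue) pairs.
sales = [
    ("North", 120.0),
    ("South", 80.0),
    ("North", 100.0),
    ("South", 90.0),
    ("West", 50.0),
]

for line in summarise_by_region(sales):
    print(line)
```

The resulting summary rows are the kind of compact, report-ready table you would then hand over to the BI tool.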

Conclusion

Alteryx is more than just a data analytics tool: it is also suitable for data integration tasks, where its many system and database connectors shine. Together with technologies like BigQuery, as well as many other software products and databases, you can implement data integration and preparation as well as reporting and analytics use cases. What I like is the choice between the many drag-and-drop Alteryx tools and the option of writing Python or R. The combination of both gives you and your company a broad toolset for data integration and analytics to overcome challenges in an agile, data-driven world.

Sources and Further Reading

[1] Alteryx, Google BigQuery Input Tool (2020)

[2] Alteryx, Transforming 4 Billion Rows of Data into Insights with Alteryx, AWS, and Tableau (2018)

[3] Alteryx, Siemens verarbeitet 50 Millionen Datenzeilen innerhalb von Minuten [Siemens processes 50 million rows of data within minutes] (2020)

