What you should know about Googles Data Transfer Service

Bigquery Data Transfer Service automates data transfer in BigQuery on a scheduled, managed basis. The data engineering and analytics teams can lay the foundation for a BigQuery data warehouse without writing a single line of code [1]. Personally, I see two wide fields of application which can you save additional tools and costs.
Data Integration
Sources that can be integrated by the Data Transfer Service are:
SaaS (Software as a Service) applications from Google like:
- Campaign Manager
- Cloud Storage
- Google Ad Manager
- Google Ads
- Google Merchant Center (Beta)
- Google Play
- Search Ads 360 (Beta)
- YouTube – Channel Reports
- YouTube – Rights Holder Reports
External Cloud Storage Providers:
- Amazon S3
Data Warehouses:
- Teradata
- Amazon Redshift
Furthermore, additional third-party transfers are available in the Google Cloud Marketplace [2]. The advantage here is clearly that only SaaS services, i.e. BigQuery and the DTS, are used – without building any infrastructure. In addition, only one service provider is used, namely Google. This saves a lot of construction and maintenance effort. As a result,you only pay for computing power, which of course also accrues with other solutions.
ELT Processing
Besides transferring data, which is already in the name, the service can also be used for data transformations within your Data Warehouse or Data Lake. Here, I would use the modern approach ELT instead of ETL. A short summary of ELT and ETL is, that the ETL process is loading the data into the target system after the transformation takes place within the data integration tool, the ELT approach firstly loads the data into the target system before transforming the data. Often the ELT process is nowadays preferred over the traditional ETL process because it’s simpler to realize and loads the data faster [3].

A possible solution architecture would be the above, where the Data Transfer Service loads the data into the Data Lake or the staging area and then takes over the transformation of the data using simple SQL.

An example for a transformation task in BigQuery is to join two tables, doing some transformation on column data types or values and saving them as a new data object for later analytic purposes. After writing the SQL and clicking on "Schedule query" you can automate it by the popup configuration manager (right side of the picture above).
DTS will give you the possibility to automate the transformation task by scheduling the query e.g. daily on a certain time (similar to batch processing and OLAP operation over night) – but much faster thanks to the scalability within the GCP.
Conclusion
This approach brings certain advantages:
- You only have to use GCP native services (No big tool set is needed.) [2]
- Transformation can be done by SQL (Even less technical experienced staff can implement the data preparation logic.)
- DTS and BigQuery are full SaaS technologies (No worries about infrastructure and scaling.)
- DTS can handle Big Data very well
Since cloud platforms provide us endless scalability and lots of services, enabled just per mouse click, you can quickly build up a Data Warehouse and the needed data pipelines. This example shows you how to set all up by only using two Google services and SQL.
Sources and Further Readings
[1] Google, What is BigQuery Data Transfer Service? (2021)
[2] Google, Drittanbieter-Übertragungen verwenden(2021)
[3] Fivetran, https://fivetran.com/_/api/blog/elt-vs-etl/elt-vs-etl.png