AWS (Amazon Web Services) has been to Amazon what compound interest has been to finance.
Cloud has been all the rage for the past ten years, and the model is one of compounding returns: the larger the organisations you host, the more money you make as the vendor. So it is in the vendor’s interest to help those organisations grow and to retain their business.
That’s the business problem; what are the technical patterns to move data from on-premises to Cloud?
Let’s dive in!
1. Lift and Shift – take the application and all its data as it is and dump it in the Cloud
This is the cheapest option to get started on the Cloud.
In this example, we are migrating PostgreSQL to the Google Cloud Platform (GCP) using GCP’s Cloud SQL service. The goal is to transfer the existing PostgreSQL database to the managed Cloud SQL service without substantial modifications to the database schema or code. The approach involves replicating the database structure, data, and configurations in the Cloud SQL environment, so that you can benefit from Google’s managed database service.
Some adjustments might be required during the migration process, such as compatibility checks, minor modifications to queries or configurations, and testing to ensure optimal performance and compatibility with Cloud SQL. These adjustments are typically aimed at aligning the on-premises PostgreSQL environment with the specific requirements and features of Cloud SQL.
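One common way to replicate structure and data is a dump and restore with PostgreSQL's own `pg_dump` and `pg_restore` tools. The sketch below only builds the two command lines so they can be reviewed before anything runs; the host names, database name, user and file path are hypothetical placeholders, and this is one possible route rather than the method the example necessarily used.

```python
import shlex


def build_dump_command(host, dbname, user, dump_path):
    """Build a pg_dump call that writes a custom-format archive of the source."""
    return [
        "pg_dump",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--format=custom",  # archive format that pg_restore can read
        "--no-owner",       # skip ownership statements tied to on-prem roles
        "--no-acl",         # skip GRANT/REVOKE statements
        f"--file={dump_path}",
    ]


def build_restore_command(host, dbname, user, dump_path):
    """Build the matching pg_restore call against the target instance."""
    return [
        "pg_restore",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--no-owner",
        "--no-acl",
        dump_path,
    ]


# Hypothetical connection details, for illustration only.
dump_cmd = build_dump_command("onprem-db.internal", "sales", "migrator", "sales.dump")
restore_cmd = build_restore_command("10.0.0.5", "sales", "migrator", "sales.dump")

# Print the commands for review instead of executing them.
print(shlex.join(dump_cmd))
print(shlex.join(restore_cmd))
```

Dropping ownership and privilege statements is a typical example of the "minor modifications" mentioned above: roles that exist on-premises may not exist on the managed instance.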
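Cloud SQL equivalents on other platforms include Amazon RDS and Azure SQL; for PostgreSQL specifically, the Azure counterpart is Azure Database for PostgreSQL.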
Why Should I Lift and Shift?
- Cost Efficiency: Project cost savings are one of the main reasons for choosing the lift and shift approach. These are not necessarily operational cost savings (see the next section).
- Time Savings: Lift and shift migration often offers a faster project delivery time than other migration strategies.
- Reduced Disruption: You can minimise disruptions to your operations during migration.
- Familiarity: Your teams can work with familiar technologies and minimise the learning curve of adopting new cloud-native architectures.
Why Shouldn’t I Lift and Shift?
- Missed Cloud Benefits: Without Cloud native technologies, you limit the benefits of scalability, agility and cost optimisation.
- Limited Cost Savings: Although you can gain initial cost savings by avoiding redevelopment efforts, there may be few long-term cost reductions. Without optimising the architecture to be Cloud native, you may incur unnecessary infrastructure costs.
- Inefficiencies and Performance Issues: If an application or its data was not designed to scale or to use cloud resources efficiently, lifting and shifting it to the Cloud may lead to suboptimal performance.
2. Fully Re-Architect – take the application & data’s core components and adapt them for usage in the Cloud
This option gets all the Cloud benefits.
In this example, we are migrating PostgreSQL to GCP’s native data warehouse, BigQuery. Fully re-architecting gives you the opportunity to cleanse the data and reduce existing technical debt, and to re-model the data according to the end business user’s needs. In this approach, data is cleaned and re-modelled before being staged in a Google Cloud Storage bucket. Once the data is staged and ready, it is loaded into the BigQuery target tables.
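To make the cleanse-then-stage step concrete, here is a minimal sketch in plain Python. It cleans a few rows and serialises them as newline-delimited JSON, a file format BigQuery can load from Cloud Storage. The column names and the cleansing rules (normalise emails, drop rows without one, drop duplicate ids) are assumptions made up for illustration; uploading the file to the bucket and running the load job are not shown.

```python
import json
from datetime import date
from typing import Optional


def clean_row(row: dict) -> Optional[dict]:
    """Cleanse one source row; return None when the row should be dropped."""
    email = (row.get("email") or "").strip().lower()
    if not email:
        return None  # assumed rule: rows without an email are unusable
    return {
        "customer_id": int(row["id"]),
        "email": email,
        "signup_date": row["signup_date"].isoformat(),
    }


def to_ndjson(rows: list) -> str:
    """Clean rows, drop duplicate customer ids, serialise one JSON object per line."""
    seen = set()
    lines = []
    for row in rows:
        cleaned = clean_row(row)
        if cleaned is None or cleaned["customer_id"] in seen:
            continue
        seen.add(cleaned["customer_id"])
        lines.append(json.dumps(cleaned, sort_keys=True))
    return "".join(line + "\n" for line in lines)


# Hypothetical rows as they might come out of the on-premises database.
source_rows = [
    {"id": "1", "email": " Ada@Example.com ", "signup_date": date(2021, 3, 4)},
    {"id": "1", "email": "ada@example.com", "signup_date": date(2021, 3, 4)},
    {"id": "2", "email": None, "signup_date": date(2022, 1, 1)},
]

staged = to_ndjson(source_rows)
print(staged, end="")
```

In this sample, the duplicate row and the row without an email are dropped, so only one cleaned record reaches the staged file.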
The data may be the same, but it has been enriched for your use cases, technical debt has been removed, and the Cloud benefits can be realised.
BigQuery equivalents include Amazon Redshift, Azure Synapse Analytics (formerly Azure SQL Data Warehouse), Snowflake and others.
Why Should I Fully Re-Architect?
- Long-Term Cost Efficiency: Re-architecting allows you to capitalise on Cloud benefits such as flexible resource allocation, auto-scaling of computing resources based on demand, and pay-as-you-go pricing models, reducing infrastructure costs and maximising return on investment.
- Seamless Cloud Integration: This option lets you leverage an expansive Cloud ecosystem and integrate with other services such as APIs, along with advanced analytics capabilities, to build more comprehensive and faster data solutions.
- Future-Proofing: Re-architecting provides scalability, agility, and the ability to adapt to evolving business needs, which future-proofs the solution.
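The long-term cost efficiency point can be illustrated with some simple arithmetic. The sketch below compares paying for peak capacity around the clock with paying only for the capacity actually used each hour. The demand profile and the hourly rate are invented numbers, and real pricing is more nuanced (on-demand rates often differ from committed rates), so treat it as an illustration of the principle only.

```python
def fixed_capacity_cost(peak_units: int, hours: int, rate_per_unit_hour: float) -> float:
    """Cost of provisioning for peak demand during every hour."""
    return peak_units * hours * rate_per_unit_hour


def pay_as_you_go_cost(hourly_demand: list, rate_per_unit_hour: float) -> float:
    """Cost when capacity follows demand hour by hour."""
    return sum(hourly_demand) * rate_per_unit_hour


# Hypothetical day: 20 quiet hours at 2 units, 4 busy hours at 10 units.
demand = [2] * 20 + [10] * 4
rate = 0.5  # assumed price per unit-hour

fixed = fixed_capacity_cost(max(demand), len(demand), rate)
elastic = pay_as_you_go_cost(demand, rate)

print(f"fixed capacity: {fixed:.2f}")    # 10 units * 24 h * 0.5 = 120.00
print(f"pay as you go:  {elastic:.2f}")  # 80 unit-hours * 0.5 = 40.00
```

The spikier the demand, the larger the gap between the two figures, which is why workloads that were sized for peak on-premises tend to benefit most from auto-scaling.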
Why Shouldn’t I Fully Re-Architect?
- Complexity & Time Spent: Re-architecting requires careful planning, design, and development efforts. It can be a complex and time-consuming process, especially for large-scale systems.
- Large Upfront Investment: This option involves a significant investment of time, resources, and expertise. It requires skilled technical and Data Architecture professionals to design and implement.
- Breaking Away Dependencies: Your technical debt chickens will come home to roost. Re-architecting will break dependencies on existing systems, impacting integrations with other applications.
Key Questions Data Teams Should Be Asking
- Data Architects: what design decisions must be re-visited before shifting to the Cloud? What will be our data governance model? Who will deal with the business change and data literacy impacts? Are there downstream/upstream dependencies that need to be met?
- Data Engineers: what complex data pipelines can I adapt to be more Cloud friendly? Can I break them down into different steps? How do I ensure the cost of poorly performing pipelines is reduced? Do I get adverse performance due to Cloud compatibility issues? How can I reconcile the data between on-prem and Cloud?
- Data Scientists: is my model affected by this migration? How do I reconcile the model output between on-prem/Cloud? What are my sign-off criteria to ensure data quality has stayed the same or improved?
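The reconciliation questions above can be approached by fingerprinting rows on both sides and comparing the results. The following is a minimal sketch, assuming both tables can be read into lists of dictionaries and share a key column; the table contents and the `id` key are hypothetical. For large tables you would more likely compute counts and checksums inside each database rather than pull every row.

```python
import hashlib


def fingerprint(rows: list, key: str) -> dict:
    """Map each row's key to a SHA-256 hash of its column values."""
    hashes = {}
    for row in rows:
        # Sort columns so the hash does not depend on column order.
        canonical = "|".join(f"{col}={row[col]}" for col in sorted(row))
        hashes[row[key]] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return hashes


def reconcile(source_rows: list, target_rows: list, key: str) -> dict:
    """Report keys that are missing, unexpected, or different in the target."""
    src = fingerprint(source_rows, key)
    tgt = fingerprint(target_rows, key)
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical extracts from the on-premises and Cloud tables.
on_prem = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 30}]
cloud = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 4, "amount": 40}]

report = reconcile(on_prem, cloud, key="id")
print(report)
```

A report with three empty lists is one possible sign-off criterion; anything else points to rows that need investigating before the on-premises system is switched off.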
There are multiple other options between lift and shift and a full re-architecture; a hybrid model with a mixture of both could also be applied. The option you choose depends on the requirements of the business and its long-term data and technology strategy.
The pitfalls of migration cannot be summarised in this short article; if you want to learn more, check out this long-form article.
Top 25 Painful Data Migration Pitfalls You Wish You Had Learned Sooner