Technology Blog – How MYCOM OSI migrates stateful applications to AWS cloud with zero downtime

This post was written by Dirk Michel, SVP, SaaS and Digital Technology at MYCOM OSI; Dattaprasad Sadwelkar, Partner Solutions Architect, WMP at AWS; and Dan Taoka, Partner Solutions Architect, WMP at AWS.

Customers that have decided to embark on a cloud transformation of their applications typically need to create detailed plans on how to secure business continuity during the migration period.

This entails migrating the application workloads to Amazon Web Services (AWS) with minimal or no downtime, as well as ensuring historical datasets and user configurations are also moved and retained.

In this post, we will discuss how MYCOM OSI was able to design workload migrations for its suite of Experience Assurance and Analytics (EAA) applications from on-premises private data center environments into a software-as-a-service (SaaS) model based on AWS.

In today’s fast-paced world, with 24/7 application and database availability, mission-critical stateful applications such as EAA typically can’t afford significant downtime while migrating to the cloud.

MYCOM OSI is an AWS Partner that offers assurance, automation, and analytics SaaS applications for the digital era. MYCOM OSI’s Assurance Cloud Service provides critical end-to-end performance, fault and service quality management, supporting artificial intelligence (AI) and machine learning (ML)-driven closed loop assurance for hybrid, physical, and virtualized networks, across all domains within a SaaS model.

Motivations for Cloud Migration

As an AWS ISV Workload Migration Partner, MYCOM OSI typically identifies two key reasons why organizations are migrating application workloads from on-premises data centers to the AWS Cloud:

  • Cost: With the cloud, you don’t have to lay out the capital up front for servers and data centers. Instead, you pay for it as you consume it, as a variable expense.
  • Elasticity/scalability: That variable expense is lower than what virtually every company can do on its own because AWS has such large scale.

Cost and scalability are very compelling, but so is the need for enterprises to increase their agility and the speed with which they can change their customer experience.

AWS has been architected to be the most flexible and secure cloud computing environment available. AWS core infrastructure is built to satisfy the security requirements for military, global banks, and other high-sensitivity organizations.

AWS uses the same secure hardware and software to build and operate each of the regions, so customers worldwide can benefit from the only commercial cloud that has had its service offerings and associated supply chain vetted and accepted as secure enough for top-secret workloads. This is backed by a deep set of cloud security tools, with more than 230 security, compliance, and governance services and key features.

Challenges and De-Risking Migration to the Cloud

Now that we understand the motivations and benefits of running applications on AWS, we need to enumerate the challenges of the migration itself and how to mitigate the associated risk.

Migrating an existing set of EAA assurance applications that are already running in a private data center environment can be a particular challenge for several reasons:

  • Applications are stateful and ingest and process hot data in near real time, whilst also storing and aggregating processed data into large historical cold and warm datasets. The challenge is to migrate the large historical datasets whilst ensuring no data loss from near real-time data ingestion.
  • Telecommunication network components that emit or allow retrieval of near real-time data do change frequently, as new components are added and existing ones are removed or modified. This places a particular challenge on data collection from the source components, as a methodology is required ensuring data is collected from all sources and only collected once from each source to avoid undue interaction load on the components themselves.
  • A large range of user-defined content such as dashboard definitions, data enrichment, KPI formula definitions, threshold crossing alert definitions, and scheduled exports and email distributions needs to be migrated as well. The challenge here is to ensure newly created user content is moved across to the target system so as to avoid loss of user content.
  • At the time of cutting over to the new cloud-based application, both hot and warm datasets need to be complete and allow for validation, benchmarking, and comparison between the on-premises application and the cloud-based application.

To address these challenges and satisfy the requirements, the design decision was taken to parallel-build the target set of applications on AWS. The parallel-build approach is also known as a “blue-green” upgrade and opens up a range of de-risking options to systematize and automate elements of the upgrade and migration.

AWS provides a range of tested and proven migration tools that support data migration use cases and take away a significant number of challenges. These include AWS Database Migration Service (AWS DMS) and AWS DataSync; both services map to specific areas of the EAA application data pools that are migrated.

Figure 1 – Migration flow and tooling.

AWS DMS facilitates the migration of databases to AWS. It can migrate relational databases, data warehouses, NoSQL databases, and other types of data stores into the AWS Cloud. AWS DMS supports both homogeneous migrations and heterogeneous migrations between different database platforms.

In this case, MYCOM OSI focused on Oracle database migration using Amazon RDS for Oracle. With AWS DMS, MYCOM OSI performed a one-time bulk data and schema migration, followed by replicating ongoing changes to keep the source on-premises database instances in sync with Amazon Relational Database Service (Amazon RDS). This relies on the AWS DMS change data capture (CDC) feature, which captures changes made on the source after the bulk load and applies them to the target.
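
To make the full-load-then-CDC pattern more concrete, here is a minimal sketch of how the parameters for such an AWS DMS task might be assembled in Python. It is an illustration under assumptions rather than MYCOM OSI's actual configuration: the task identifier, the schema name, and the ARNs are hypothetical placeholders, and the function only builds the request parameters without calling AWS.

```python
import json


def build_dms_task_params(schema_name, source_arn, target_arn, instance_arn):
    """Assemble keyword arguments for a full-load-plus-CDC replication task."""
    # Selection rule: include every table in the given schema.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-schema",
                "object-locator": {"schema-name": schema_name, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": "eaa-oracle-full-load-and-cdc",  # hypothetical
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # One-time bulk load first, then ongoing change data capture.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


params = build_dms_task_params(
    "EAA_SCHEMA",  # hypothetical schema name
    "arn:aws:dms:example:source",  # placeholder ARNs, not real resources
    "arn:aws:dms:example:target",
    "arn:aws:dms:example:instance",
)
```

With boto3 installed and credentials configured, a dictionary like this could be passed to `boto3.client("dms").create_replication_task(**params)`; the ARNs would come from endpoints and a replication instance created beforehand.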

MYCOM OSI’s migration design also leverages AWS DataSync, which is used to copy and synchronize EAA historical data pools and application configuration data to AWS. It is a secure, online service that automates and accelerates moving data between on premises and AWS storage services.

AWS DataSync can also copy data between network file system (NFS) shares, server message block (SMB) shares, Hadoop distributed file systems (HDFS), self-managed object storage, AWS Snowcone, Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System (Amazon EFS), Amazon FSx for Windows File Server, and Amazon FSx for Lustre.

Migrating Stateful Applications with Zero Downtime

AWS DataSync and AWS DMS are key elements that help enable a seamless migration experience to AWS. With the key elements of the migration approach briefly explained, let’s discuss the steps involved in the migration journey to the AWS Cloud.

Migration Sequence

MYCOM OSI decided on a reproducible and incremental migration methodology that in aggregate amounts to a blue-green migration sequence for stateful application migrations.

The sequence adopted was defined as follows:

  • Preparation
    • Data center connectivity and networking set up to the AWS Cloud
    • Data center readiness for AWS DataSync agent deployment
    • Oracle RDBMS supplemental logging enabled
  • Parallel build out
    • Pre-provision the AWS footprint
    • Activate/deploy EAA applications with factory defaults
  • Data migration
    • Migrate Oracle RDBMS, application configuration settings, and other data pools as full loads
    • Use CDC across RDBMS and data pools to synchronize the datasets
  • Data source integration migration
    • Target system data ingestion activation
    • Target system hydrates from legacy system
    • Validation and comparison
  • Cutover
    • Deactivate on-premises data collection
    • Activate cloud-based data collection
  • Decommission on premises
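
One way to reason about the sequence above is as an ordered set of phases, where each phase is only left once its exit criteria are met. The sketch below models that idea; the phase names mirror the list, while the notion of a single boolean "gate" per phase is our simplifying assumption and not part of the original methodology.

```python
from enum import IntEnum


class Phase(IntEnum):
    PREPARATION = 1
    PARALLEL_BUILD_OUT = 2
    DATA_MIGRATION = 3
    DATA_SOURCE_INTEGRATION = 4
    CUTOVER = 5
    DECOMMISSION = 6


def next_phase(current, gate_passed):
    """Advance to the following phase only when the exit gate has passed."""
    if not gate_passed or current is Phase.DECOMMISSION:
        return current  # stay put: gate not met, or already at the last phase
    return Phase(current + 1)


phase = Phase.DATA_MIGRATION
phase = next_phase(phase, gate_passed=True)  # moves to DATA_SOURCE_INTEGRATION
```

Keeping the phases strictly ordered makes the sequence reproducible from one customer migration to the next, which is the property the methodology aims for.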

The parallel build-out covers preparatory areas such as on-premises connectivity and networking to AWS. The combination of AWS Direct Connect and AWS PrivateLink is a common preference in the context of mission-critical and latency-sensitive applications.

AWS Direct Connect is the shortest path to AWS Cloud resources. While in transit, the network traffic remains on the AWS global network and never touches the public internet. This reduces the chance of hitting bottlenecks or unexpected increases in latency.

AWS PrivateLink provides private connectivity between virtual private clouds (VPCs), AWS services, and on-premises networks, without exposing traffic to the public internet. Additionally, PrivateLink makes it easy to connect services across different accounts and VPCs to significantly simplify network architecture.

Interface VPC endpoints, powered by AWS PrivateLink, can also connect to services hosted by AWS Partners and supported solutions available in AWS Marketplace. This is an important concept for SaaS, as PrivateLink securely enables the SaaS-provider and SaaS-consumer model and clear service endpoints.

The application migration progresses into the next stage once the connectivity between the data center and the AWS Cloud footprint is established.

Migrating Data History

In MYCOM OSI’s case, there are two main elements to migrating data history:

  • Oracle RDBMS table data.
  • The numerical datastore, which resides directly on a shared filesystem.

AWS DMS does not require on-premises preparation beyond the connectivity to the source Oracle RDBMS. MYCOM OSI right-sizes the AWS DMS replication instances to the size of the on-premises dataset and the speed at which the processing needs to happen.

The most intensive task is the initial full-load and replication of the RDBMS data; once completed, only the ongoing changes need to be replicated via CDC to the target, such as the Amazon RDS for Oracle instance.
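
For planning purposes, the duration of the initial full load can be roughly bounded by the network link alone. The following back-of-the-envelope sketch is illustrative only: the dataset size, link speed, and efficiency factor are hypothetical inputs, and real throughput also depends on the replication instance size, table structure, and source database load.

```python
def estimate_full_load_hours(dataset_gb, link_gbps, efficiency=0.7):
    """Rough transfer time for the initial full load over a dedicated link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable for replication traffic.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    gigabits = dataset_gb * 8  # gigabytes to gigabits
    seconds = gigabits / (link_gbps * efficiency)
    return seconds / 3600


# Hypothetical example: 9,000 GB over a 1 Gbps link at 80% efficiency
# works out to roughly 25 hours.
hours = estimate_full_load_hours(9000, 1.0, efficiency=0.8)
```

An estimate like this is a lower bound on the time needed before the CDC stream becomes the only remaining replication workload.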

Minimal Downtime

AWS DMS helps migrate databases to AWS with virtually no downtime. All data changes to the source database that occur during the migration are continuously replicated to the target, allowing the source database to be fully operational during the migration process.

After the database migration is complete, the target database remains synchronized with the source for as long as you choose, allowing users to switch over the database at a convenient time.

Ongoing Replication

MYCOM OSI defines AWS DMS tasks for either one-time migration or ongoing replication. An ongoing replication task keeps your source and target databases in sync. Once set up, the ongoing replication task will continuously apply source changes to the target with minimal latency.
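
Before switching over, it helps to confirm that ongoing replication has actually caught up. AWS DMS publishes task metrics such as CDCLatencySource and CDCLatencyTarget to Amazon CloudWatch; the sketch below assumes latency samples (in seconds) have already been retrieved from such a metric and applies a simple readiness rule. The threshold and sample count are hypothetical values chosen for illustration.

```python
def replication_caught_up(latency_samples_s, max_latency_s=5.0, min_samples=3):
    """Return True when the most recent samples are all under the threshold.

    latency_samples_s: CDC latency readings in seconds, oldest first.
    """
    recent = latency_samples_s[-min_samples:]
    return len(recent) >= min_samples and all(s <= max_latency_s for s in recent)


# Latency drops as the CDC stream drains the backlog from the full load.
ready = replication_caught_up([120.0, 40.0, 4.0, 2.0, 1.5])
```

Requiring several consecutive low readings, rather than one, avoids declaring readiness on a momentary dip.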

All AWS DMS features such as data validation and transformations are available for any replication task.

Reliable

AWS DMS is also highly resilient and self-healing. It continually monitors source and target databases, network connectivity, and the replication instance. In case of interruption, it automatically restarts the process and continues the migration from where it stopped. A multi-Availability Zone (AZ) option allows you to have high availability for database migration and continuous data replication by enabling redundant replication instances.

Setting up the migration of the numerical datastore is based on the AWS DataSync agent, which runs on a virtual machine in the source data center. The agent commences with the full replication of the historical data store to the target file system in the cloud, and then moves into replicating changes as and when they happen.
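
Conceptually, the step from full replication to replicating only changes can be pictured as comparing file metadata on both sides and transferring the differences. The sketch below illustrates that idea with plain dictionaries; it is not AWS DataSync's internal algorithm, and the file paths and the (size, modification time) manifests are hypothetical.

```python
def plan_sync(source, target):
    """Return (to_copy, to_delete) given {path: (size, mtime)} manifests."""
    # Copy anything that is new on the source or whose metadata differs.
    to_copy = sorted(p for p, meta in source.items() if target.get(p) != meta)
    # Remove anything on the target that no longer exists on the source.
    to_delete = sorted(p for p in target if p not in source)
    return to_copy, to_delete


source = {"kpi/2021.dat": (100, 1), "kpi/2022.dat": (250, 7), "cfg/app.xml": (10, 3)}
target = {"kpi/2021.dat": (100, 1), "kpi/2022.dat": (200, 5), "old.tmp": (1, 1)}
to_copy, to_delete = plan_sync(source, target)
```

After the first full pass the two manifests are nearly identical, so each later pass only moves the small set of files that changed in between.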

Both replication streams run in parallel and continuously in CDC mode until the time of the cutover.

Migrating Hot Data Ingestion

Implementing hot data collection and ingestion in a blue-green context is realized through the concept of indirect data collection.

The existing “blue” data collectors continue running and collecting data from the data sources directly, whilst the new collectors effectively establish a backend data stream to the existing ones. This relay of data avoids both sets of collectors contacting the data sources directly, as illustrated in Figure 2 below.

Figure 2 – Migrating hot data ingestion.

The tradeoff is between protecting source systems from double-collection and the ability to run a full parallel data-load. Both options are possible, hence there is a clear customer choice available.

Equally, a choice can be made by data collection protocol. Typically, data collection protocols would fall into a push/pull model and a publication/subscription model. Protocols that rely on the pull model are most likely the ones that do not easily sustain multiple collection streams and are best migrated via “indirect data relays.” Whilst the relays introduce additional latency to the end-to-end data flow, the avoidance of double-collection can be a priority.
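
The indirect relay idea can be sketched in a few lines: the blue collector remains the only component that polls the source and forwards each batch to the green collector. The class and record names below are hypothetical and stand in for real collectors and network data; the point is that the source sees a single poll while both systems receive the data.

```python
class Source:
    """Stand-in for a network component that should be polled only once."""

    def __init__(self, records):
        self._records = list(records)
        self.poll_count = 0

    def poll(self):
        self.poll_count += 1
        batch, self._records = self._records, []
        return batch


class BlueCollector:
    """Existing collector: polls the source and relays each batch onward."""

    def __init__(self, source):
        self.source = source
        self.store = []
        self.relays = []  # callables that receive a copy of each batch

    def collect(self):
        batch = self.source.poll()
        self.store.extend(batch)
        for relay in self.relays:
            relay(list(batch))


class GreenCollector:
    """New collector: ingests relayed data and never contacts the source."""

    def __init__(self):
        self.store = []

    def ingest(self, batch):
        self.store.extend(batch)


source = Source(["pm-1", "pm-2", "pm-3"])
blue = BlueCollector(source)
green = GreenCollector()
blue.relays.append(green.ingest)
blue.collect()
```

After one collection cycle both stores hold the same records while `source.poll_count` is 1, which is the property that protects pull-based sources from double collection.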

At this stage, the historical dataset and near real-time dataset and ingestion are available in the “green” target application instance running in the AWS Cloud.

Validation and Cutover

The validation phase begins once both the blue and green systems and data collectors are running in parallel. The datasets on both application instances can now be readily compared.

The validation is based on KPIs such as data collection latency, data importation latency, data completeness, and data integrity. Other benchmarks such as computation time for specific user-generated activities and application fluidity can be measured and reported on.
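
As an illustration of the completeness and integrity checks, the sketch below compares two snapshots keyed by record identifier. The key format and KPI values are made up for the example, and a real validation would run against the actual datasets of both application instances.

```python
def compare_datasets(blue, green):
    """Compare {record_key: value} snapshots from the blue and green systems."""
    # Completeness: which blue records have not arrived in green at all?
    missing = sorted(k for k in blue if k not in green)
    # Integrity: which records exist on both sides but differ in value?
    mismatched = sorted(k for k in blue if k in green and blue[k] != green[k])
    completeness = 1.0 if not blue else (len(blue) - len(missing)) / len(blue)
    return {"completeness": completeness, "missing": missing, "mismatched": mismatched}


blue = {"cell-1|10:00": 97.2, "cell-1|10:05": 96.8,
        "cell-2|10:00": 88.0, "cell-2|10:05": 87.5}
green = {"cell-1|10:00": 97.2, "cell-1|10:05": 96.8, "cell-2|10:00": 88.1}
report = compare_datasets(blue, green)
```

Here the report would flag one missing record and one value mismatch, the kind of finding that should be resolved before cutover.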

The cutover itself happens at various layers, including, for example, the user interface URLs as well as the indirect data relays. The data streams that use indirect data relays cut over to direct data collection, whilst the relays are shut down.

Summary

For MYCOM OSI, there were several decisions that had been made to arrive at a reproducible and de-risked approach for customer workload migrations to AWS. These are the key ingredients for successful Experience Assurance and Analytics (EAA) application migrations to AWS:

  • Connect your data center to AWS via AWS Direct Connect and AWS PrivateLink.
  • Prepare for the blue-green migration with a parallel footprint build-out on AWS.
  • Migrate the various data pools with continuous data replication.
  • Create direct and indirect data collection relays for the green system.
  • Validate and cut over to green and decommission blue.

Customers that deploy and operate EAA applications in on-premises data centers have a proven path to migrating to AWS and SaaS. Each customer deployment will typically have some degree of particularities, but having a proven migration methodology can shorten the path to a de-risked workload migration project.

This blog was first posted on aws.amazon.com and co-written by Dirk Michel, SVP SaaS and Digital Technology at MYCOM OSI, in partnership with Dattaprasad Sadwelkar, Partner Solutions Architect, WMP at AWS and Dan Taoka, Partner Solutions Architect, WMP at AWS: https://aws.amazon.com/blogs/containers/mycom-osis-amazon-eks-adoption-journey/