Technological Advancements in Container Orchestration

In today's digital landscape, advancements in cloud infrastructure and container orchestration have empowered businesses to perform complex operations, such as data center migrations, with minimal downtime.

Our in-house team developed UPAY’s complete ecosystem from the ground up, covering everything from software development and integration with various partners to mobile app development, infrastructure readiness, and operational execution.

In our case, we used Kubernetes to orchestrate application services and PostgreSQL as our primary database, with the database itself running efficiently inside the Kubernetes environment. Our system incorporates Mobile Financial Services (MFS) technology, developed in Bangladesh, and has been operational for the past four years. While this is a notable achievement in our country, our MFS solution aims to compete with larger systems in the global market.

Migration Strategy: When planning our data center migration, the flexibility and capabilities of our chosen technologies were crucial to ensuring an efficient transition. Below is a detailed overview of our migration approach:

1. Setting up a New Kubernetes Cluster

We established a new Kubernetes cluster in the target data center to facilitate the migration. This foundational step was essential for ensuring a smooth migration process. 

Cluster Setup: We meticulously configured the new cluster to replicate the existing infrastructure specifications. This ensured that applications would operate consistently post-migration, minimizing compatibility issues. Additionally, we resolved known issues in the old data center configuration.

Service Deployment: Once the cluster was operational, we deployed our services into the new environment. Kubernetes' automation capabilities streamlined this process, allowing hassle-free deployment and management.

2. Logical Replication of the PostgreSQL Database

 A critical migration phase involved the PostgreSQL database, which holds vital data for our MFS system. Rather than performing a risky all-at-once migration, we opted for logical replication to ensure safety and reliability.

Replication Setup: We established logical replication between the PostgreSQL instances in the old and new clusters, so that data changes were applied consistently and with minimal latency across the 120+ databases in our microservices architecture, amounting to over 20TB of data. The 1Gbps bandwidth was a limitation, but careful management allowed for efficient data transfer.
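With more than 120 databases, the publication and subscription statements are a natural candidate for generation rather than hand-writing. The sketch below is only an illustration of that idea: the database name, host, replication role, and the `pub_`/`sub_` naming scheme are all hypothetical, and the article does not say how the statements were actually produced. It relies on standard PostgreSQL syntax (`CREATE PUBLICATION ... FOR ALL TABLES` on the source, `CREATE SUBSCRIPTION ... CONNECTION ... PUBLICATION ...` on the target).

```python
# Sketch: generate logical-replication DDL per database.
# Names, hosts, and roles here are hypothetical placeholders.
# Note: logical replication does not copy table definitions, so the
# schema must already exist on the target before subscribing.


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a PostgreSQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def publication_sql(db: str) -> str:
    """Statement to run inside database `db` on the SOURCE (old) cluster."""
    return f"CREATE PUBLICATION {quote_ident('pub_' + db)} FOR ALL TABLES;"


def subscription_sql(db: str, source_host: str, user: str) -> str:
    """Statement to run inside database `db` on the TARGET (new) cluster."""
    # Simple conninfo; values containing spaces would need extra quoting.
    conninfo = f"host={source_host} dbname={db} user={user}"
    return (
        f"CREATE SUBSCRIPTION {quote_ident('sub_' + db)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident('pub_' + db)};"
    )


if __name__ == "__main__":
    for db in ["wallet", "kyc"]:  # hypothetical database names
        print(publication_sql(db))
        print(subscription_sql(db, "old-db.example.internal", "replicator"))
```

Generating one pair of statements per database keeps the naming uniform, which makes it easier to audit which databases are already replicating.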

Minimal Downtime: A key advantage of logical replication is that the source database stays online while it runs. Users can continue interacting with the system while data is replicated to the new data center.
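To put the 1Gbps constraint in perspective, a back-of-the-envelope estimate of the initial copy time is useful. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a hypothetical `efficiency` factor for protocol overhead and competing traffic; neither assumption comes from the article.

```python
def transfer_hours(data_terabytes: float, link_gbps: float,
                   efficiency: float = 1.0) -> float:
    """Estimate hours needed to move `data_terabytes` over a link.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s.
    `efficiency` is the usable fraction of the link (hypothetical).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # 20 TB over 1 Gbps at full line rate, then at an assumed 70% efficiency.
    print(f"ideal: {transfer_hours(20, 1):.1f} h")
    print(f"at 70%: {transfer_hours(20, 1, 0.7):.1f} h")
```

Even at full line rate, 20TB over 1Gbps takes roughly 44 hours, which is why a replication approach that keeps the source online matters more than raw copy speed.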

 

3. Cluster-to-Cluster Replication Process 

The replication phase required careful synchronization of all services running in Kubernetes to ensure data integrity during the migration.

Data Integrity:  We closely monitored data transfers throughout replication to prevent loss. PostgreSQL’s logical replication features allowed us to maintain data integrity.
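One common way to monitor logical replication is to compare the publisher's current WAL position with the position the subscriber has confirmed. The article does not describe the exact checks used, so the following is only a sketch of that general technique. It assumes the two LSN strings were fetched separately, for example from `pg_current_wal_lsn()` and the `confirmed_flush_lsn` column of `pg_replication_slots` on the source; the sample values are made up.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(source_lsn: str, confirmed_lsn: str) -> int:
    """Bytes of WAL the subscriber has not yet confirmed."""
    return lsn_to_int(source_lsn) - lsn_to_int(confirmed_lsn)


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    print(lag_bytes("16/B374D848", "16/B374D000"))
```

The same difference can be computed server-side with `pg_wal_lsn_diff`; doing it in a script is mainly convenient when collecting lag across many databases into one report.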

Cluster Testing: We thoroughly tested the new environment in parallel with the data replication, ensuring all services functioned correctly. This gave us the confidence to proceed with the final migration. Success criteria included verifying core MFS functionalities such as KYC, financial data, transactions (cash in/out, P2P, mobile recharge), and exclusive UPAY services.

4. Network and IP Setup

We conducted the data center migration without altering the IP configuration to avoid complications. Changing the network setup would have required re-establishing over 150 VPN connections with our partners, a time-consuming and complex task. Maintaining the existing IP addresses ensured continuity in our external communications and avoided potential disruptions to critical partner integrations. This careful planning allowed us to execute the migration seamlessly without needing to renegotiate or reconfigure the VPNs, minimizing downtime and ensuring uninterrupted service.

 Cloudflare Integration: All internet traffic is routed through Cloudflare, which handles DNS resolution. A new subdomain created via Cloudflare was used for testing services before transitioning to the main domain.

5. Storage Setup and Transfer

For our storage solution, we used Ceph, a distributed storage system, and created several buckets, similar to Amazon S3 buckets, for storing dynamic and static data.

 Data Transfer: We conducted research and development (R&D) to identify the most efficient method for transferring approximately 100TB of data. Ultimately, we used rsync to migrate the data from the old data center to the new one. The process was smoother than expected.
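The article does not give the exact rsync invocation, so the following is only a sketch of how such a transfer command might be assembled. The paths, host name, and bandwidth cap are hypothetical. It uses standard rsync options: `-a` (archive mode), `--partial` (keep partially transferred files so an interrupted run can resume), `--bwlimit` (cap throughput, in KiB/s by default), and `--dry-run` (show what would be transferred without copying).

```python
import shlex
from typing import List, Optional


def build_rsync_command(src: str, dest: str,
                        bwlimit_kib_s: Optional[int] = None,
                        dry_run: bool = False) -> List[str]:
    """Build an rsync argument list; does not execute anything."""
    cmd = ["rsync", "-a", "--partial"]
    if bwlimit_kib_s is not None:
        # Leave headroom on a shared link for live replication traffic.
        cmd.append(f"--bwlimit={bwlimit_kib_s}")
    if dry_run:
        cmd.append("--dry-run")
    cmd += [src, dest]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and host; print the command instead of running it.
    cmd = build_rsync_command("/data/objects/", "new-dc.example.internal:/data/objects/",
                              bwlimit_kib_s=100000, dry_run=True)
    print(" ".join(shlex.quote(part) for part in cmd))
```

Capping bandwidth is one way to keep a bulk copy from starving database replication when both share the same link.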

6. Final Migration and Switchover

After successfully replicating and validating the system, the final step involved redirecting traffic from the old data center to the new operational cluster.

Cutover: The final cutover involved rerouting traffic to the new data center, confirming that Kubernetes services were fully operational.

Downtime Mitigation: Thanks to our extensive preparation, testing, and replication efforts, we minimized downtime. Switching the database from replica to master took only three hours, and the sequence values were set accurately on the new master, ensuring a seamless transition for our users.
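Sequence values need attention at cutover because PostgreSQL's logical replication copies table data but, per its documented restrictions, not the current state of sequences. A minimal sketch of the catch-up step is shown below. It assumes the last values were read from the source beforehand (for example from the `pg_sequences` view); the sequence names, values, and the optional safety margin are hypothetical and not taken from the article.

```python
from typing import Dict, List


def quote_literal(value: str) -> str:
    """Quote a PostgreSQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def setval_statements(last_values: Dict[str, int],
                      safety_margin: int = 0) -> List[str]:
    """Build setval() calls to run on the new primary.

    `last_values` maps schema-qualified sequence names to the last value
    observed on the old primary. With is_called = true, the next
    nextval() returns the given value plus the sequence increment.
    """
    statements = []
    for name, last_value in sorted(last_values.items()):
        target = last_value + safety_margin
        statements.append(
            f"SELECT setval({quote_literal(name)}, {target}, true);"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical sequence names and values.
    for stmt in setval_statements({"public.txn_id_seq": 1000}, safety_margin=100):
        print(stmt)
```

A safety margin trades a gap in generated IDs for protection against values consumed on the old primary after the snapshot was taken.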

Advantages of Our Technology Stack

Several key technological advantages underpinned the success of our migration:

Kubernetes: Its container orchestration and resource management capabilities made it straightforward to replicate and deploy services across multiple clusters with minimal manual intervention. However, running the Kubernetes ecosystem in a self-managed data center requires many additional tools and plugins.

PostgreSQL Logical Replication:  Continuous data replication between the old and new data centers ensured data consistency, effectively mitigating the risk of data loss during migration. We also utilized the Crunchy Postgres Operator, which automated database tasks such as backups, scaling, and failovers, ensuring high availability during the migration.

 Microservices Architecture:  Our MFS system's microservices framework allowed individual components to be migrated in isolation, drastically reducing complexity and risk.

Automated Deployment: We adopted a CI/CD (Continuous Integration/Continuous Deployment) pipeline to manage our deployments, significantly streamlining the process. With over 300 services, automation was essential, because deploying each one manually would have been time-consuming and error-prone.

 Zero-Downtime Deployments:  The synergy of these technologies enabled us to migrate services with minimal downtime, ensuring a seamless user experience during the critical transition. 

The successful outcome is a testament to our team’s dedication and expertise, particularly those who were deeply involved in the migration process and worked tirelessly to ensure its smooth execution.

Credits: Md. Shakil Hossain SK Faisal Sujoy Sarkar M. A. Jobayer Bin Bakkre Kanan Mahmud @Md fakrul admin Ashraf Ul Alam @jowel Rana Tawhidul Islam MD Mozahidur Rahman Raman Karmakar and their fellow team members.

