Technological Advancements in Container Orchestration
In today's digital landscape, advancements in cloud infrastructure and container orchestration have empowered businesses to perform complex operations, such as data center migrations, with minimal downtime.
Our in-house team developed UPAY’s complete ecosystem from the ground up, covering everything from software development and integration with various partners to mobile app development, infrastructure readiness, and operational execution.
In our case, we used Kubernetes to orchestrate application services and PostgreSQL as our primary database. The two worked efficiently together, with PostgreSQL running inside the Kubernetes environment. Our system incorporates Mobile Financial Services (MFS) technology, developed in Bangladesh, and has been operational for the past four years. While this is a notable achievement in our country, our MFS solution aims to compete with larger systems in the global market.
Migration Strategy
When planning our data center migration, the flexibility and capabilities of our chosen technologies were crucial to ensuring an efficient transition. Below is a detailed overview of our migration approach:
1. Setting up a New Kubernetes Cluster
We established a new Kubernetes cluster in the target data center to facilitate the migration. This foundational step was essential for ensuring a smooth migration process.
Cluster Setup: We meticulously configured the new cluster to replicate the existing infrastructure specifications. This ensured that applications would operate consistently post-migration, minimizing compatibility issues. Additionally, we resolved known issues in the old data center configuration.
Service Deployment: Once the cluster was operational, we deployed our services into the new environment. Kubernetes' automation capabilities streamlined this process, allowing hassle-free deployment and management.
2. Logical Replication of the PostgreSQL Database
A critical migration phase involved the PostgreSQL database, which holds vital data for our MFS system. Rather than performing a risky all-at-once migration, we opted for logical replication to ensure safety and reliability.
Replication Setup: We established logical replication between the PostgreSQL instances in the old and new clusters, so that changes committed in the old data center were applied in the new one with minimal lag. This covered 120+ databases in our microservices architecture, amounting to over 20TB of data. The 1Gbps bandwidth was a limitation, but careful management allowed for efficient data transfer.
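With 120+ databases, the publication and subscription statements are a natural candidate for generation rather than hand-writing. The following is an illustrative sketch, not the exact tooling used in the migration. The naming scheme (migr_pub_*, migr_sub_*), the host name, and the replication user are hypothetical, and the sketch assumes the schema has already been created on the new cluster, since logical replication does not copy table definitions.

```python
import re


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def replication_statements(database: str, source_host: str, repl_user: str) -> dict[str, str]:
    """Build publisher-side and subscriber-side SQL for one database.

    Assumption: database names are plain lowercase identifiers, so they can
    be embedded in publication and subscription names without quoting.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", database):
        raise ValueError(f"unsupported database name: {database!r}")

    # Hypothetical naming scheme: one publication/subscription pair per database.
    pub_name = f"migr_pub_{database}"
    sub_name = f"migr_sub_{database}"
    conninfo = f"host={source_host} dbname={database} user={repl_user}"

    return {
        # Run against the old cluster (the publisher).
        "publisher": f"CREATE PUBLICATION {pub_name} FOR ALL TABLES;",
        # Run against the new cluster (the subscriber).
        "subscriber": (
            f"CREATE SUBSCRIPTION {sub_name} "
            f"CONNECTION {quote_literal(conninfo)} "
            f"PUBLICATION {pub_name};"
        ),
    }


if __name__ == "__main__":
    for db in ["wallet", "kyc"]:  # placeholder database names
        stmts = replication_statements(db, "old-db.example.internal", "replicator")
        print(stmts["publisher"])
        print(stmts["subscriber"])
```

Generating the statements per database keeps the names predictable, which makes it easier to check later that every database has exactly one active subscription.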
Minimal Downtime: The key advantage of logical replication is that the source database stays online while it runs. This allows users to continue interacting with the system while data is replicated to the new data center.
3. Cluster-to-Cluster Replication Process
The replication phase required careful synchronization of all services running in Kubernetes to ensure data integrity during the migration.
Data Integrity: We closely monitored data transfers throughout replication to prevent loss. PostgreSQL’s logical replication features allowed us to maintain data integrity.
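One common way to monitor a replication stream is to compare write-ahead log positions (LSNs) on the two sides. The sketch below is an assumption about how such a check could look, not a description of the monitoring actually used. It takes LSN strings as PostgreSQL prints them (for example from pg_current_wal_lsn() on the publisher and replay_lsn in pg_stat_replication) and computes the gap in bytes. The 16 MiB threshold is an arbitrary illustrative value.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(publisher_lsn: str, subscriber_lsn: str) -> int:
    """Bytes of WAL the subscriber still has to replay."""
    return parse_lsn(publisher_lsn) - parse_lsn(subscriber_lsn)


def is_caught_up(publisher_lsn: str, subscriber_lsn: str,
                 max_lag_bytes: int = 16 * 1024 * 1024) -> bool:
    """True when the lag is within the (illustrative) threshold."""
    return replication_lag_bytes(publisher_lsn, subscriber_lsn) <= max_lag_bytes


if __name__ == "__main__":
    lag = replication_lag_bytes("1/0", "0/FFFFFFF0")
    print(f"lag: {lag} bytes, caught up: {is_caught_up('1/0', '0/FFFFFFF0')}")
```

PostgreSQL can do the same subtraction server-side with pg_wal_lsn_diff; doing it in a script is mainly useful when the two positions are collected from different clusters.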
Cluster Testing: We thoroughly tested the new environment in parallel with the data replication, ensuring all services functioned correctly. This gave us confidence to proceed with the final migration. Success criteria included verifying core MFS functionalities such as KYC, financial data, transactions (cash in/out, P2P, mobile recharge), and exclusive UPAY services.
4. Network and IP Setup
We conducted the data center migration without altering the IP configuration to avoid complications. Changing the network setup would have required re-establishing over 150 VPN connections with our partners, a time-consuming and complex task. Maintaining the existing IP addresses ensured continuity in our external communications and avoided potential disruptions to critical partner integrations. This careful planning allowed us to execute the migration seamlessly without needing to renegotiate or reconfigure the VPNs, minimizing downtime and ensuring uninterrupted service.
Cloudflare Integration: All internet traffic is routed through Cloudflare, which handles DNS resolution. A new subdomain created via Cloudflare was used for testing services before transitioning to the main domain.
5. Storage Setup and Transfer
For our storage solution, we utilized Ceph, a distributed storage system, and created several buckets similar to Amazon S3 for storing dynamic and static data.
Data Transfer: We conducted research and development (R&D) to identify the most efficient method for transferring approximately 100TB of data. Ultimately, we used rsync to migrate the data from the old data center to the new one. The process was smoother than expected.
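To put the volume in perspective, a rough lower bound on transfer time follows directly from size and link speed. The sketch below assumes decimal terabytes and, as an assumption not stated in the text, that the storage transfer shared the 1Gbps link mentioned for database replication. The efficiency parameter is a hypothetical knob for protocol overhead and competing traffic.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to move size_tb terabytes over a link_gbps link.

    Assumptions: 1 TB = 10**12 bytes, 1 Gbps = 10**9 bits per second, and
    `efficiency` is the fraction of the link actually available (0 < e <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    total_bits = size_tb * 1e12 * 8
    seconds = total_bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    print(f"100 TB at line rate: {transfer_days(100, 1):.2f} days")
    print(f"100 TB at 70% efficiency: {transfer_days(100, 1, 0.7):.2f} days")
```

Even at full line rate, 100TB over 1Gbps takes a little over nine days, which is why an incremental, resumable tool such as rsync is a reasonable fit for this kind of bulk copy.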
6. Final Migration and Switchover
After successfully replicating and validating the system, the final step involved redirecting traffic from the old data center to the new operational cluster.
Cutover: The final cutover involved rerouting traffic to the new data center, confirming that Kubernetes services were fully operational.
Downtime Mitigation: Thanks to our extensive preparation, testing, and replication efforts, we minimized downtime. Switching the database replica in the new data center over to the master role took only three hours, and the sequence values were set accurately on it so that newly generated IDs would continue from where the old database left off, ensuring a seamless transition for our users.
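Sequences need explicit attention at cutover because PostgreSQL's logical replication restrictions state that sequence data is not replicated: the rows arrive, but the sequences on the new side can still be at their initial values. The sketch below shows one way to generate the setval calls from values read on the old database. It is an illustration under assumptions, not the procedure actually used; the sequence name and the headroom parameter are hypothetical.

```python
def setval_statements(last_values: dict[str, int], headroom: int = 0) -> list[str]:
    """Build one setval() call per sequence for the new primary.

    last_values maps a (schema-qualified) sequence name to the last value
    read from the old database. `headroom` optionally skips ahead to leave
    a safety gap; it is an illustrative parameter, not a PostgreSQL feature.
    """
    statements = []
    for name in sorted(last_values):
        target = last_values[name] + headroom
        quoted = "'" + name.replace("'", "''") + "'"
        # Third argument true marks the value as already used,
        # so the next nextval() returns target + 1.
        statements.append(f"SELECT setval({quoted}, {target}, true);")
    return statements


if __name__ == "__main__":
    # Placeholder sequence name and value.
    for stmt in setval_statements({"public.txn_id_seq": 1000}, headroom=50):
        print(stmt)
```

Reading the values after writes have stopped on the old database, and applying the generated statements before traffic is redirected, avoids duplicate-key errors on the first inserts in the new data center.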
Advantages of Our Technology Stack
Several key technological advantages underpinned the success of our migration:
Kubernetes: Its container orchestration and resource management capabilities facilitate effortless replication and deployment of services across multiple clusters with minimal manual intervention. However, many additional tools and plugins are required to run the Kubernetes ecosystem in a self-hosted data center.
PostgreSQL Logical Replication: Continuous data replication between the old and new data centers ensured data consistency, effectively mitigating the risk of data loss during migration. We also utilized the Crunchy Postgres Operator, which automated database tasks such as backups, scaling, and failovers, ensuring high availability during the migration.
Microservices Architecture: Our MFS system's microservices framework allowed individual components to be migrated in isolation, drastically reducing complexity and risk.
Automated Deployment: We adopted a CI/CD (Continuous Integration/Continuous Deployment) pipeline to manage our deployments, significantly streamlining the process. With over 300 services, automation was essential to avoid the time-consuming, error-prone nature of manual deployment.
Minimal-Downtime Deployments: The synergy of these technologies enabled us to migrate services with minimal downtime, ensuring a seamless user experience during the critical transition.
The successful outcome is a testament to our team’s dedication and expertise, particularly those who were deeply involved in the migration process and worked tirelessly to ensure its smooth execution.
Credits: Md. Shakil Hossain SK Faisal Sujoy Sarkar M. A. Jobayer Bin Bakkre Kanan Mahmud @Md fakrul admin Ashraf Ul Alam @jowel Rana Tawhidul Islam MD Mozahidur Rahman Raman Karmakar and their fellow team members.