#Day7 - Database Scaling: Vertical vs Horizontal Approaches
System design basics
Database scaling
There are two broad approaches for database scaling: vertical scaling and horizontal scaling.
Vertical scaling, also referred to as scaling up, means adding more resources such as CPU, RAM, and disk space to an existing machine. Very powerful database servers are available, such as those offered through Amazon Relational Database Service (RDS), with configurations of up to 24 TB of RAM. A server of that size can store and handle a large amount of data. For instance, stackoverflow.com had over 10 million monthly unique visitors in 2013, yet it ran on just one master database. Nevertheless, vertical scaling has notable drawbacks:
Although additional resources like CPU and RAM can be incorporated into the database server, there are inherent hardware limitations. Consequently, a single server may not suffice for handling a large user base.
Because everything runs on one server, that server is a single point of failure.
Vertical scaling tends to incur significant costs, as powerful servers come with considerably higher price tags.
Horizontal scaling, also termed sharding, involves the addition of more servers to a system. Sharding divides large databases into smaller, more manageable parts known as shards. Each shard maintains the same schema, though the data within each shard is unique.
For instance, in the sharded database setup depicted in Figure 1-21, user data is distributed across database servers based on user IDs. Each time data is stored or retrieved, a hash function picks the shard. In this example the hash function is user_id % 4, so a result of 0 maps to shard 0, a result of 1 maps to shard 1, and so on across the four shards.
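The user_id % 4 rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shard names and the helper functions shard_index and shard_for are hypothetical, and a real system would map each shard to an actual database connection.

```python
NUM_SHARDS = 4  # matches the user_id % 4 example in the text

# Hypothetical shard names; a real system would keep connection details here.
SHARDS = [f"shard_{i}" for i in range(NUM_SHARDS)]


def shard_index(user_id: int) -> int:
    """Return the index of the shard that owns this user_id."""
    return user_id % NUM_SHARDS


def shard_for(user_id: int) -> str:
    """Return the (hypothetical) name of the shard that owns this user_id."""
    return SHARDS[shard_index(user_id)]


if __name__ == "__main__":
    for uid in (0, 1, 5, 7):
        print(uid, "->", shard_for(uid))
```

Reads and writes for a given user always go through the same function, so the same user_id always lands on the same shard.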
The most important decision in a sharding strategy is the choice of sharding key, also known as a partition key: one or more columns that determine how data is distributed. In Figure 1-22 the sharding key is "user_id". The sharding key lets the system route each query to the correct database, so data can be retrieved and modified efficiently. It's essential to choose a sharding key that distributes data evenly across shards.
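One rough way to judge whether a candidate sharding key distributes data evenly is to hash a sample of key values and count how many land on each shard. The sketch below is an assumption-laden illustration, not a method from the original post: stable_hash, shard_load, and imbalance are hypothetical helpers, and SHA-256 is used only because Python's built-in hash() is randomized for strings between runs.

```python
import hashlib
from collections import Counter


def stable_hash(key) -> int:
    """Hash any key to an integer that is the same on every run."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_load(keys, num_shards: int = 4) -> Counter:
    """Count how many keys land on each shard."""
    counts = Counter({shard: 0 for shard in range(num_shards)})
    for key in keys:
        counts[stable_hash(key) % num_shards] += 1
    return counts


def imbalance(counts: Counter) -> float:
    """Busiest shard's count divided by the mean count; 1.0 is perfectly even."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean if mean else 0.0


if __name__ == "__main__":
    # A high-cardinality key such as user_id should spread fairly evenly.
    print(imbalance(shard_load(range(10_000))))
    # A key where every row shares one value puts everything on one shard.
    print(imbalance(shard_load(["US"] * 1_000)))
```

A value close to 1.0 suggests the key spreads rows evenly. With four shards, a value of 4.0 means every row landed on a single shard.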
While sharding is an effective technique for database scalability, it brings forth complexities and new challenges to the system.
#systemdesign #engineering #learning