Arpit Bhayani’s Post


60% of the tables I have recently designed do not have the standard auto-increment ID column. Although counter-intuitive, doing this has given me a significant performance boost because of smaller table sizes (data plus index), particularly in cases where we paginate more than we do pointed reads and updates. It required me to drop the ORM, and I happily took that tradeoff for better query control and performance. This design choice was made only after understanding how a relational database (MySQL in my case) stores data and evaluates queries; it is definitely not a one-size-fits-all decision. So, whenever you are designing a schema, always optimize for the most frequent queries; even if that requires a composite primary key, it may be worth it. Also, knowing a bit about database internals always helps. #AsliEngineering #Databases #SystemDesign
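A minimal sketch of the idea, using SQLite as a stand-in (the post is about MySQL/InnoDB; the `feed_items` table and column names are hypothetical). The composite primary key matches the most frequent query (paginating one user's items), and `WITHOUT ROWID` makes SQLite cluster rows on that key, roughly the way InnoDB clusters rows on the primary key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical feed table with no auto-increment id: the composite
# primary key (user_id, created_at) is exactly the access pattern we
# paginate on, and WITHOUT ROWID clusters the rows on that key.
conn.execute("""
    CREATE TABLE feed_items (
        user_id    INTEGER NOT NULL,
        created_at INTEGER NOT NULL,
        payload    TEXT,
        PRIMARY KEY (user_id, created_at)
    ) WITHOUT ROWID
""")

conn.executemany(
    "INSERT INTO feed_items VALUES (?, ?, ?)",
    [(1, t, f"item-{t}") for t in range(1, 101)],
)

# Keyset pagination: seek directly into the clustered index instead of
# OFFSET-scanning, so each page is a cheap range read.
def next_page(user_id, after_ts, size=10):
    return conn.execute(
        "SELECT created_at, payload FROM feed_items"
        " WHERE user_id = ? AND created_at > ?"
        " ORDER BY created_at LIMIT ?",
        (user_id, after_ts, size),
    ).fetchall()

first_page = next_page(1, 0)
print(first_page[0], first_page[-1])  # (1, 'item-1') (10, 'item-10')
```

With no surrogate id column, every page fetch is a single range scan over the clustered key; the tradeoff, as the comments below note, is wider secondary indexes and more hand-written SQL.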

Hitesh Garg

Curious • Senior Software Engineer • 19k+ • Python • Django • Flask • AWS • SQL • Support-turned-Dev

4d

If you don't use the id column then how do you use foreign keys?

Syed Shamail

Fullstack JavaScript Engineer with a focus on Microservices, Multicloud, and DevOps

4d

It would be really good if you could share the use case in which you opted for a non-auto-increment ID column.

Ayush Gupta

EPAM | Ex-FIS | NIT Hamirpur | Author at C#Corner | Certified Azure Developer Associate | Tutor

4d

You still require a unique ID, either a GUID or a computed key. Interestingly, I will try your case to see the performance.

Arpan Mukherjee

Eng Lead - Founding Team @ Reconect.ai

4d

Composite PKs are more useful than they seem. I have been using (tenant_id, id) style keys on Postgres (so the id can easily be a UUID), since we always read/write data for a single tenant only. This keeps options open like partitioning on tenant_id, or even sharding on it (using something like Citus or our own shard-router code). It almost eliminates the noisy-neighbour issue, at least from a query-plan perspective, since all indexes essentially contain tenant_id.

Evan Howlett

Senior Fullstack Software Engineer

4d

I've been working in the opposite direction. The data (tires) I work with is very hierarchical -- you have a brand, a model, and then the size in that model. The old data model has an auto-incrementing id for brand, but model was set up to have a dependent, incremental id based on the brand id, making it a composite key. Generally, the sizes can be identified by a manufacturer's number, but because there are "white label" tires that get sub-branded, it's not a unique identifier. Thus, the composite key of brand id, line id, and manufacturer's number is the primary key. In the new model, I still have the old keys, which I keep as unique keys; however, I've moved the primary key to be a single auto-incrementing id column. You keep the prevented-redundancy guarantee, have the ability to query on either key set (allowing queries to be less complex in some scenarios), get better insertion/deletion performance due to the linear nature of the auto-incrementing id, and only pay the cost of an extra few bytes per record. Really, it's all about knowing the data. Not every dataset is best modeled with a single-column primary key. Not every dataset is best modeled with a composite primary key.
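That surrogate-key-plus-unique-constraint pattern can be sketched like this (SQLite stand-in; the `tire_sizes` table and column names are hypothetical). The single auto-incrementing id becomes the primary key, while the old composite business key survives as a UNIQUE constraint, so the redundancy guarantee is preserved:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Surrogate primary key plus the old composite key as a UNIQUE
# constraint: narrow clustered key, business-key uniqueness intact.
conn.execute("""
    CREATE TABLE tire_sizes (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        brand_id INTEGER NOT NULL,
        line_id  INTEGER NOT NULL,
        mfr_num  TEXT    NOT NULL,
        UNIQUE (brand_id, line_id, mfr_num)
    )
""")

conn.execute(
    "INSERT INTO tire_sizes (brand_id, line_id, mfr_num)"
    " VALUES (1, 1, 'A100')"
)

# The UNIQUE constraint still rejects a duplicate business key,
# even though it is no longer the primary key.
try:
    conn.execute(
        "INSERT INTO tire_sizes (brand_id, line_id, mfr_num)"
        " VALUES (1, 1, 'A100')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```

Foreign keys can then reference the compact `id` instead of the three-column business key, which is the usual argument for this layout.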

Vishal Jaiswal, PMP®

Data Architecting | Leading data strategy and innovative database solutions | Data Governance | Data Management | Cloud Data Architect | Data Modeler | Mentor | Trained 5000+ professionals in Data field

3d

Perfectly said Arpit Bhayani. Designing the database with a forecast of how much data, and what kind of data, will come always helps you design the table structure properly, but as you said, one design does not fit all. Surrogate key (auto-increment, no business value) versus composite primary key is always a debatable topic. If a table with a composite primary key is expected to have millions of rows, the index behind that composite key can grow to a point where CRUD performance degrades badly. In that case, it is much better to use a simple integer ID primary key, whose index will be compact enough, and establish the necessary database constraints to maintain uniqueness. Follow Vishal Jaiswal, PMP® for learning database concepts, building tricks, and cracking database technical interviews.

Javed Akeeb

frontend developer | React.js | Next.js | AI Automation | UI/UX

4d

Many devs go for ORMs by default. But what I have experienced, even as a front-end dev, is that the queries get incrementally slower as the size of the database increases.

Archishman Sengupta

SDE - 1 Fullstack @StackWealth (YC S21) | ICPC Regionalist | 3x YC dev

4d

But a CPK can result in larger index sizes, since composite indexes can be bigger than single-column indexes, which also increases storage and I/O ops, especially for write-heavy workloads. How do you handle this complexity and the potential performance issues?

Subham Tripathi

Senior Software Engineer @ Carrefour | Ex- Goldman Sachs

4d

100% agree with you here; this is something I too discovered while designing for tightly related pointed queries.

