🤔 To use mmap or not? That's a hotly debated topic in the #DATABASE community!

📍 What is mmap?
mmap is a system call that maps a file into a process's virtual address space, so the process can access the file's contents as if they were already in memory. It's like opening a direct window from your application's memory onto the file on disk.

✅ Pros of mmap:
• Simplified code - treat file I/O like memory access
• Zero-copy I/O - no extra copy into a user-space buffer, with potential performance benefits
• Automatic paging by the OS
• Great for shared memory between processes
• Enables lock-free operations using atomic instructions

❌ Cons of mmap:
• Less control over I/O patterns
• Page faults can cause unexpected stalls
• Can perform poorly with default OS settings
• Harder to handle errors (I/O failures arrive as signals such as SIGBUS or SIGSEGV instead of explicit error returns)
• Complex interaction with the OS page cache

🔄 Alternatives:
• Direct I/O
• Buffered I/O with explicit buffer management
• Custom page cache implementations
• Async I/O systems (like io_uring)

🗄️ Who uses what?
Using mmap:
• LMDB
• Apache Cassandra
• Apache Pinot
• RocksDB (for certain operations)
Not using mmap:
• InfluxData https://lnkd.in/geNM4Qcs
• Oracle MySQL https://lnkd.in/gt2t2wMh

🤔 How to decide?
Consider mmap if:
• You need simple shared memory between processes
• Your workload is read-heavy
• You can tune OS parameters appropriately
• You understand the complexity of virtual memory management
Avoid mmap if:
• You need precise control over I/O
• Your workload is write-heavy
• You can't afford unexpected stalls
• You need predictable performance

On one side, we have Andy Pavlo's famous paper "Are You Sure You Want to Use MMAP in Your Database Management System?" (https://lnkd.in/gCyhiqKa).
On the other side, Howard Chu (LMDB creator) strongly advocates for mmap and offers compelling counterarguments in his response: https://lnkd.in/gp9ScUy9

The debate shows how complex this technical choice really is!

What's your experience with mmap in production systems? Let's discuss in the comments! 👇

#databases #programming #technology #softwareengineering #backend
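If you want to see what "treat file I/O like memory access" looks like in practice, here is a minimal Python sketch that reads the same byte range two ways: by slicing a read-only mapping, and with an explicit seek and read. The function names, the file name `pages.bin`, and the 4096-byte layout are assumptions made up for illustration; they are not taken from any of the databases mentioned above.

```python
import mmap
import os
import tempfile


def read_with_mmap(path, offset, length):
    """Read a byte range by mapping the file and slicing it like memory."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # No explicit read call here: touching the slice may page-fault,
            # and the OS loads the page on demand.
            return mm[offset:offset + length]


def read_with_explicit_io(path, offset, length):
    """Read the same byte range with an explicit seek and read."""
    with open(path, "rb") as f:
        f.seek(offset)
        # Failures surface here as exceptions the caller can catch.
        return f.read(length)


def demo():
    """Write a small hypothetical data file and read it back both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pages.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"A" * 4096 + b"hello-mmap" + b"B" * 4096)
        return (
            read_with_mmap(path, 4096, 10),
            read_with_explicit_io(path, 4096, 10),
        )


if __name__ == "__main__":
    print(demo())
```

Both functions return the same bytes. The difference is where the work and the failure modes live: the explicit version decides when I/O happens and gets errors back as exceptions, while the mapped version leaves both to the OS's virtual memory system. That is roughly the trade-off the pros and cons above describe.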
Raghvendra Yadav’s Post
More Relevant Posts
-
💡 𝐙𝐞𝐫𝐨𝐝𝐡𝐚’𝐬 𝐁𝐚𝐜𝐤-𝐎𝐟𝐟𝐢𝐜𝐞: 𝐖𝐡𝐞𝐫𝐞 𝐒𝐐𝐋 𝐐𝐮𝐞𝐫𝐢𝐞𝐬 𝐌𝐞𝐞𝐭... 𝐃𝐮𝐧𝐠 𝐁𝐞𝐞𝐭𝐥𝐞𝐬? I stumbled upon this approach while going through the postgres annual conference event PGCon 23. 𝐈𝐦𝐚𝐠𝐢𝐧𝐞 𝐡𝐚𝐧𝐝𝐥𝐢𝐧𝐠 𝐭𝐡𝐨𝐮𝐬𝐚𝐧𝐝𝐬 𝐨𝐟 𝐜𝐨𝐧𝐜𝐮𝐫𝐫𝐞𝐧𝐭 𝐮𝐬𝐞𝐫𝐬 𝐰𝐢𝐭𝐡𝐨𝐮𝐭 𝐫𝐞𝐩𝐥𝐢𝐜𝐚𝐬, 𝐦𝐚𝐬𝐭𝐞𝐫-𝐬𝐥𝐚𝐯𝐞 𝐃𝐁 𝐬𝐞𝐭𝐮𝐩𝐬, 𝐨𝐫 𝐥𝐨𝐚𝐝 𝐛𝐚𝐥𝐚𝐧𝐜𝐢𝐧𝐠 ! Zerodha’s back-office system does exactly that, with a setup that's more of a minimalist’s dream: - 𝘕𝘰 𝘳𝘦𝘱𝘭𝘪𝘤𝘢𝘴? 𝘕𝘰𝘱𝘦. 𝘈𝘭𝘭 𝘳𝘦𝘲𝘶𝘦𝘴𝘵𝘴 𝘩𝘪𝘵 𝘵𝘩𝘦 𝘮𝘢𝘪𝘯 𝘋𝘉. 🤯 - 𝘓𝘰𝘢𝘥 𝘣𝘢𝘭𝘢𝘯𝘤𝘪𝘯𝘨? 𝘕𝘢𝘩, 𝘯𝘰𝘵 𝘯𝘦𝘦𝘥𝘦𝘥. - 𝘊𝘢𝘤𝘩𝘪𝘯𝘨 𝘭𝘢𝘺𝘦𝘳? 𝘗𝘰𝘴𝘵𝘨𝘳𝘦𝘴 𝘥𝘰𝘦𝘴 𝘵𝘩𝘦 𝘩𝘦𝘢𝘷𝘺 𝘭𝘪𝘧𝘵𝘪𝘯𝘨 𝘰𝘯 𝘵𝘰𝘱 𝘰𝘧 𝘵𝘩𝘦𝘪𝘳 𝘱𝘳𝘪𝘮𝘢𝘳𝘺 𝘋𝘉 𝘧𝘰𝘳 𝘴𝘰𝘳𝘵𝘪𝘯𝘨 𝘢𝘯𝘥 𝘴𝘦𝘢𝘳𝘤𝘩. 💪 And the best part? All SQL queries go through their very own middleware — SQLJobber (a.k.a. DungBeetle), which optimizes query flow, managing the onslaught like a traffic cop on a busy day. 🚦 With Postgres as both the cache and the main DB, this system flies in the face of traditional architectures... yet, it works. Why make things more complex when you can let your SQL queries roll like dung beetles? #databasemanagement #Postgres #SQLJobber #MinimalistArchitecture #ConcurrencyAtScale #Zerodha
-
pgmoneta is a backup / restore solution for PostgreSQL. pgmoneta is named after the Roman Goddess of Memory. See Getting Started on how to get started with pgmoneta. See Configuration on how to configure pgmoneta. #devopskhan
GitHub - pgmoneta/pgmoneta: Backup / restore solution for PostgreSQL
github.com
-
How to use pgbench to test PostgreSQL® performance
dev.to
-
Great overview from Severalnines on MariaDB 10.11, especially from a production-focused perspective. Developers should remember that there are many quality-of-life improvements too, which make building applications that much easier. Extended JSON support is another cool addition in 10.11! https://lnkd.in/dXuSVcMQ
The most noteworthy improvements in MariaDB 10.11 | Severalnines
severalnines.com
-
A list of query optimizer fixes in MariaDB's February batch of Stable Releases: https://lnkd.in/gEp96J9n (Note that this does NOT include development work)
Notable optimizer fixes released in February, 2024
http://petrunia.net
-
Troubleshooting PostgreSQL on Kubernetes With Coroot Attention all DB geeks! 🤓 Are you tired of constantly troubleshooting PostgreSQL on Kubernetes? Say goodbye to those tedious processes and hello to Coroot – the new open source observability tool powered by eBPF. 🚀 With version 1.0 now available, we were eager to see how it could help with database debugging on Kubernetes. In our latest blog post, we dive into the benefits of using Coroot alongside Percona Operator for PostgreSQL. 💡 Learn how this cloud-native tool can simplify your database troubleshooting process and make your life a whole lot easier. ⚡️ Plus, we'll even walk you through the installation process. Don't miss out on this game-changing tool for MySQL and PostgreSQL performance and availability. 👊 #DBgeeks #PostgreSQL #Kubernetes #Coroot #Percona #observability #database #troubleshooting #eBPF #cloudnative #performance #availability #opensource #debugging #automation #LinkedInPost #databaseadmin #databasedeveloper https://lnkd.in/ebBvjEWm
-
🚀 Excited to share our latest blog post: Mastering pgbench for Database Performance Tuning 🚀 In this blog, Bhupathi Shameer Kumar walks through the output of pgbench and shares tips, tricks, and common pitfalls of benchmarking with pgbench. Here is the blog: https://lnkd.in/g4BUph5F 🚀 Aiming New Postgres Horizons 🚀 Stay tuned! For more info, reach us at OpenSource DB or https://opensource-db.com/ #OpenSourceDB #osdb #PostgreSQL #pgbench #tipsandtricks #pitfalls #benchmarking
Mastering pgbench for Database Performance Tuning - OpenSourceDB
https://opensource-db.com
-
✨ Excited to share that I’ve successfully completed a course focused on advanced database concepts and system designs! 📚

This journey deepened my understanding of:
• ACID Properties
• Database Indexing, Partitioning, Replication, and Sharding
• Database Cursors and Concurrency Control (Optimistic, Pessimistic)
• B-Trees in Production Systems
• Database Management Systems vs. Database Engines vs. Embedded Databases
• Exploring MyISAM, InnoDB, RocksDB, LevelDB, and more
• Benefits and Trade-offs of Different Database Engines
• Switching Database Engines in MySQL
• Database Security, including Homomorphic Encryption

A big thank you to Hussein Nasser for guiding me through these valuable insights! 🙌 Looking forward to applying these concepts in real-world projects and optimizing database performance and security! 🚀

#databases #systemdesign #ACID #indexing #sharding #replication #security #MySQL #Postgres
-
🌟 Exciting News Alert! 🌟 Delighted to announce the latest blog post from the Apache Software Foundation shining the spotlight on Apache AGE – the game-changer in graph database technology! 🚀 Read the full blog post now and be a part of shaping the future of graph database technology. 🌐 Read the blog post here: https://lnkd.in/dP4thGgt #ApacheAGE #GraphDatabase #OpenSource #ApacheFoundation #Blog #Post
ASF Project Spotlight: Apache AGE - The Apache Software Foundation Blog
http://news.apache.org
-
What can Postgres’s memory management teach us about building smarter applications? For developers diving into Postgres internals, MemoryContexts provide a powerful framework for managing memory efficiently. In one of the most popular blogs on our site, EDB Staff Engineer Phil Eaton takes a creative approach to exploring this concept by building an HTTP server and web framework from scratch inside of a Postgres extension. This blog is a goldmine for developers looking to refine their understanding of Postgres internals and push the boundaries of what’s possible with Postgres. From managing object lifetimes to optimizing performance, it provides practical insights and hands-on examples. Dive in: https://bit.ly/4fvrDvm #EDBPostgresAI #JustSolveITWithPostgres #PostgreSQL #OpenSource #DatabaseEngineering #DevCommunity
Staff Software Engineer
Great topic! I’ve discussed this briefly with a couple of other Pinot contributors. Opinions were mixed, but mostly in favor of implementing a Buffer Pool Manager (which Andy predicted Pinot would have “in the next few years”). To answer your question: I had a production situation with contention, and the solution was to host real-time and offline tables on separate servers. I don’t have a strong opinion here, but I do think that cluster planning and management are more important.