Spoke with the VP of Data Platform at a Fortune-200 eCommerce company last week: 8K production Spark workloads, $25M+ in annual data platform cloud spend. For 2025, their CFO set a clear mandate: 𝐜𝐮𝐭 𝟏𝟓% 𝐨𝐟 𝐜𝐥𝐨𝐮𝐝 𝐜𝐨𝐬𝐭𝐬.

They have costs allocated down to the penny, but 𝐡𝐞𝐫𝐞’𝐬 𝐭𝐡𝐞 𝐜𝐚𝐭𝐜𝐡: 𝐭𝐡𝐞𝐲 𝐡𝐚𝐯𝐞 𝐧𝐨 𝐢𝐝𝐞𝐚 𝐖𝐇𝐄𝐑𝐄 𝐭𝐡𝐞 𝐖𝐀𝐒𝐓𝐄 𝐢𝐬 𝐨𝐫 𝐇𝐎𝐖 𝐭𝐨 𝐚𝐜𝐭𝐮𝐚𝐥𝐥𝐲 𝐎𝐏𝐓𝐈𝐌𝐈𝐙𝐄 𝐣𝐨𝐛𝐬.

He’s now chasing 80+ data dev teams to find opportunities… So far, the feedback from those teams is:

1️⃣ we don’t actually know ourselves -- monitoring is partial, and optimizations require deep expertise
2️⃣ we can try running experiments on job configurations -- big effort, one-time benefit

All in all – no path yet to hit next year’s targets…

---

It’s 2025. If your team is running Spark at scale, then out-of-the-box, continuous, actionable, transformation-level performance observability isn’t optional anymore – it’s a must. Chasing data dev teams cannot be the strategy.

#DataEngineering #Spark #PerformanceOptimization #definity
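To make “transformation-level observability” concrete, here’s a minimal sketch using nothing but Spark’s stock SparkListener API – not definity’s product, and the class name and idle-time heuristic are purely illustrative. It compares per-stage CPU time to wall-clock task time, so idle executor time (one common form of waste) shows up continuously, per stage, per job:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

// Illustrative listener (hypothetical name): surfaces per-stage waste by
// comparing CPU time to wall-clock task time. A large gap usually means
// executors sat idle (shuffle waits, skew, over-provisioning).
class StageWasteListener extends SparkListener {
  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    val info  = event.stageInfo
    val m     = info.taskMetrics              // metrics aggregated across tasks
    val runMs = m.executorRunTime             // total task wall-clock time, ms
    val cpuMs = m.executorCpuTime / 1000000L  // CPU time is reported in ns
    val idlePct =
      if (runMs > 0) 100.0 * (runMs - cpuMs) / runMs else 0.0
    println(f"stage=${info.stageId} name=${info.name} " +
      f"runMs=$runMs cpuMs=$cpuMs idle=$idlePct%.1f%%")
  }
}
```

Register it with spark.sparkContext.addSparkListener(new StageWasteListener), or cluster-wide via the spark.extraListeners config – no changes to job code. That “no changes to job code” property is the whole point: you can’t ask 80+ teams to instrument themselves.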
Chief Financial Officer at Sqream Technologies Ltd:
To cut cloud costs, consider offloading heavy Spark workloads to SQream’s GPU-accelerated platform. It processes massive data efficiently, reducing compute and storage needs. With built-in observability, SQream pinpoints inefficiencies, simplifying optimization and eliminating waste. Plus, its compression minimizes storage costs, and smaller infrastructure requirements slash expenses. It’s a scalable, continuous solution for meeting your cost-saving goals. FYI - I’m the CFO at SQream, and I’m asking our teams to reduce these costs as well.