Modern Data Stack France

Professional organizations

About

Modern Data Stack meetup: Unleashing data insights with cutting-edge tech. Join us for knowledge-sharing and networking

Industry
Professional organizations
Company size
1 employee
Headquarters
Paris
Type
Educational institution

Locations

Employees at Modern Data Stack France

Updates

  • The Modern Data Stack (MDS) and the use of #datalakes have not sounded the death knell of data modeling; they have catalyzed its evolution. Far from disappearing, modeling has adapted to the new paradigms, moving from a rigid up-front approach to a more flexible methodology applied on demand. This shift reflects the greater flexibility of modern technologies, where schema-on-read supplants schema-on-write and raw, unstructured data can be stored as-is.

    This new era brings its own challenges, notably managing larger and more varied data volumes and designing models suited to machine learning and artificial intelligence. Despite these changes, the importance of modeling remains beyond dispute: it is still crucial for guaranteeing data integrity and consistency, and essential for using data effectively in analysis and decision-making.

    Faced with these stakes, hybrid approaches are emerging that skillfully combine traditional and modern methods. These strategies adapt models to the specific needs of each data layer, from raw sources to final insights. Far from obsolete, data modeling is reinventing itself to meet new requirements, offering businesses the flexibility and adaptability they need in a constantly changing data landscape!

    👉 Join us (online) on Thursday, January 30, 2025 at 5 pm to talk about it!

    View organization page for DATANOSCO

    1,054 followers

    Join us for a lively online debate on data modeling: Kimball vs. One Big Table (OBT) vs. Inmon 👉 Kimball modeling, well established since 1996, relies on fact and dimension tables and offers stronger data governance. Recent studies, however, show that the OBT approach, which favors a single denormalized table, can deliver better performance on modern warehouses such as Redshift, Snowflake, and BigQuery.

    Advantages of OBT:
    ● Simpler queries, with no complex joins
    ● Fast data access and improved response times

    Drawbacks:
    ● Risk of disorganization if columns are not well structured
    ● Weaker compatibility with some BI tools such as Power BI or Tableau

    👉 The question stands: should the Kimball model be abandoned entirely in favor of OBT? This debate is crucial for #datascientists and #dataengineers who want to balance creativity with rigor. Could a hybrid approach be the answer? (A sketch of both styles follows this post.) Join our discussion to explore these questions and share your experiences! Ismael Goulani, Stéphane Heckel, Willis Nana, Axel TIFRANI

    Data modeling: OBT vs Kimball vs Inmon

    www.linkedin.com
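
As promised in the post above, here is a minimal sketch of the two modeling styles on a toy dataset. It assumes DuckDB as the engine purely for illustration; the tables, columns, and values are invented for the example and come from neither the debate nor the meetup.

```python
import duckdb  # pip install duckdb

con = duckdb.connect()

# Kimball-style star schema: a fact table plus a conformed dimension.
con.execute("CREATE TABLE dim_customer (customer_id INT, country TEXT)")
con.execute("CREATE TABLE fact_sales (customer_id INT, amount DECIMAL(10, 2))")
con.execute("INSERT INTO dim_customer VALUES (1, 'FR'), (2, 'DE')")
con.execute("INSERT INTO fact_sales VALUES (1, 100.00), (1, 50.00), (2, 75.00)")

# Star-schema query: needs a join, but the dimension stays governed and reusable.
star = con.execute("""
    SELECT d.country, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_customer d USING (customer_id)
    GROUP BY d.country
""").fetchall()

# One Big Table: denormalize once, then query with no joins at all.
con.execute("""
    CREATE TABLE obt_sales AS
    SELECT f.customer_id, d.country, f.amount
    FROM fact_sales f
    JOIN dim_customer d USING (customer_id)
""")
obt = con.execute(
    "SELECT country, SUM(amount) AS revenue FROM obt_sales GROUP BY country"
).fetchall()

# Same answer either way; the trade-off is governance and reuse (Kimball)
# versus query simplicity and scan speed on columnar engines (OBT).
assert sorted(star) == sorted(obt)
```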

  • Modern Data Stack France reposted this

    View Kai Waehner's profile
    Kai Waehner is an Influencer

    Global Field CTO | Author | International Speaker | Follow me with Data in Motion

    🚀 Queues for Apache Kafka is coming soon in the 4.0 release! Apache Kafka is evolving beyond streaming and will soon support queue-based processing! 🎉 This exciting addition enables parallel consumption of messages, making Kafka more versatile and ideal for workloads requiring flexibility, scalability, and efficiency.

    ✅ Why queues in Kafka?
    • Enhanced flexibility: process messages in order or independently, depending on your use case.
    • Scalability: dynamically adjust the number of consumers to handle traffic spikes efficiently.
    • Hybrid model: combines the benefits of queues with Kafka's log-based architecture for reprocessing, fault tolerance, and reliability.

    💡 What's new? Share groups allow multiple consumers to read from the same partition while optimizing batch processing and enabling re-delivery of unprocessed messages. Perfect for use cases like sales events, inventory management, or real-time analytics.

    📈 The impact: Kafka's new queue support reduces system complexity and bridges the gap between traditional queue systems and real-time streaming, offering a unified platform for diverse workloads. This feature is part of the Apache Kafka 4.0 Early Access release, a step closer to Kafka becoming the central nervous system for modern data workflows! (See the consumer sketch after this post.) #ApacheKafka #DataStreaming #MessageQueues #Scalability #Innovation

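For context on what share groups change, here is a classic consumer-group loop using the confluent-kafka Python client. Share groups are Early Access and, to my knowledge, not yet exposed in Python clients, so this sketch shows only today's model; the broker address, topic, group id, and handler are invented for illustration.

```python
from confluent_kafka import Consumer  # pip install confluent-kafka

def handle(value: bytes) -> None:
    print(value)  # stand-in for real message processing

# Classic consumer group: each partition is owned by exactly one consumer in
# the group, so parallelism is capped by the partition count and progress is
# tracked as a single committed offset per partition.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-workers",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,
})
consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise RuntimeError(msg.error())
        handle(msg.value())
        consumer.commit(msg)  # advance the per-partition offset watermark
finally:
    consumer.close()
```

A share group relaxes the first comment above: several consumers may read the same partition, acknowledging messages individually, and unacknowledged messages are redelivered, which is the queue-like behavior the post describes.
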
  • Modern Data Stack France reposted this

    View Burak Karakan's profile

    Co-founder & CEO @ Bruin

    🚀 Time to break the silence: launching Bruin CLI, our open-source data pipeline tool! Bruin CLI brings together:
    ✅ data ingestion
    ✅ data transformation using SQL & Python
    ✅ running Python in isolated environments using the amazing `uv`
    ✅ built-in data quality checks
    ✅ a VS Code extension for local development

    Bruin ties the end-to-end experience together and enables teams to move much faster. It takes care of the complexity of ingesting data from many sources and transforming it, while treating data quality as a first-class citizen. It brings in data from many different sources, such as Postgres, Kafka, Facebook Ads, and more, and allows running full pipelines locally in isolated environments. Bruin also comes with an open-source VS Code extension that provides syntax highlighting, lineage, Jinja rendering, and a lot more, letting you iterate quickly on your local machine. Bruin can validate your pipelines end-to-end using dry-run/EXPLAIN statements, it takes care of secrets management, and a lot more! Take a look at the repo, and don't forget to give us a star!

  • Modern Data Stack France reposted this

    Did you know that you can query the new Amazon S3 Tables with DuckDB? Yes! You totally can! But everybody seems to be out there grumbling about vendor lock-in and "omg, I can't believe this preview product doesn't support every single query engine out there". Take a moment and breathe, folks. In any case, I made a quick demo of how you can create and query S3 Tables with open-source Spark on Kubernetes **and** how you can read those same tables with DuckDB. (A minimal DuckDB sketch follows the link below.) Enjoy! https://lnkd.in/grez2DeW

    Querying S3 Tables with Spark on Kubernetes and DuckDB

    https://www.youtube.com/
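
The video itself is the demo; as a rough companion, here is a minimal sketch of the DuckDB side, assuming the iceberg and httpfs extensions. The S3 path is a placeholder, and credential setup is omitted because it is deployment-specific.

```python
import duckdb  # pip install duckdb

con = duckdb.connect()
con.execute("INSTALL iceberg")
con.execute("LOAD iceberg")
con.execute("INSTALL httpfs")
con.execute("LOAD httpfs")

# S3 credentials (region, keys, or an assumed role) would be configured here.

# iceberg_scan reads an Iceberg table directly from its metadata; the path
# below is hypothetical and would point at the table the Spark job created.
count = con.execute("""
    SELECT count(*)
    FROM iceberg_scan('s3://my-table-bucket/demo_table/metadata/v1.metadata.json')
""").fetchone()
print(count)
```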

  • Modern Data Stack France reposted this

    View organization page for Apache Iceberg

    21,590 followers

    #Icelovers 💙 don't miss a bit of what happened to #Iceberg as we wrap up the year. 2024 was definitely the year Apache Iceberg consolidated its place in the #Lakehouse landscape 🧊 Cross-vendor integrations, new features, acquisitions: we all witnessed an avalanche of fast-paced innovation as never before. And much more is coming. Help yourselves to the big news that warmed up the weather a little this year 💥 Thanks for growing and contributing to this community 🙏 All of you will lead the Lakehouse era in the years ahead! Looking forward to a surprising 2025 🌟

    AWS S3 Tables, native support for Iceberg tables: https://lnkd.in/dEFNnKTN
    Snowflake Expands Partnership with Microsoft to Improve Interoperability Through Apache Iceberg: https://lnkd.in/d93AXBvA
    Dremio Integrates Apache Iceberg REST to Promote Vendor-Agnostic Ecosystem: https://lnkd.in/dZMB7-ve
    Snowflake introduces Polaris Catalog, an open-source catalog for Apache Iceberg: https://lnkd.in/dGHDidfD
    Databricks Agrees to Acquire Tabular, the Company Founded by the Original Creators of Apache Iceberg: https://lnkd.in/eFniN9t3
    Confluent Tableflow, convert Kafka topics to Iceberg tables: https://lnkd.in/dmTBZ_VW
    Cloudera announced integration with Snowflake, extending its Open Data Lakehouse interoperability: https://lnkd.in/dWUNWrRn

    This page isn't affiliated with the Apache Iceberg project and doesn't represent PMC opinions. For official news, please check the communication channels provided by the project: https://lnkd.in/dQ76H72K

  • Modern Data Stack France reposted this

    View Thomas Ricquebourg's profile

    📊🧑💻🎓 Consultant and trainer: Microsoft Fabric, Power BI, and Azure Data Platform

    🆕 Here is my little guide to getting started with Fabric for free. 🧘 In this first version of the guide, I explain how to take advantage of the trial periods (up to 300 days or even more!) to test this powerful data platform and start your projects with peace of mind.

    🎉 2024: a great year for Fabric. Lots of announcements and conferences, and above all the release of superb books written by enthusiasts, in French:
    📗 Fabric - Le guide complet by Marie Aubert, Charles-Henri Sauget, and Matthieu Roy.
    📗 Microsoft Fabric - De l’analyse à la mise en place d’une plateforme de données unifiée by Christopher MANEU, Romain Casteres, Emilie BEAU, Frederic Gisbert, and Jean-Pierre Riehl.

    💁♂️ For my part, here is my small annual contribution: a practical guide to the Fabric trial to get you started. 💾 Available for download on this post. 🎆 Happy holidays, and enjoy discovering Microsoft Fabric. Happy reading and see you soon 😊 #MicrosoftFabric #DataAnalytics #PowerBI #BusinessIntelligence #Data #Tech974 #LaRéunion

  • Modern Data Stack France reposted this

    📢 New article on the IT blog! 💡 Read the feedback written by Jules-Eugène Plouvier Demets, tech lead, on ensuring data consistency in a microservice application. 👉 Want to find out more about how the team took up the challenge? Read the full article by clicking here: https://lnkd.in/gRh2HbMD #Microservices #TechSolutions #Michelin #SoftwareEngineering #SoftwareDriven #michelintechnology

    Ensuring data consistency in a micro-service application

    blogit.michelin.io

  • Modern Data Stack France reposted this

    View Saurabh Dashora's profile

    Writing the System Design Codex Newsletter

    I asked 11 developers and found that 8 were struggling with data consistency issues. The number one culprit was dual writes.

    Dual writes happen when you have to update two different systems, for example: [1] updating a database, and [2] publishing an event to a different system (say, Kafka). The second step could also be something else entirely, like sending an email. Because these two systems aren't linked, we can't update both in a transactional manner. If the DB update succeeds but a failure occurs after that, the event will never be published. This means inconsistent data.

    What's the solution? 👉 The Transactional Outbox pattern. In this pattern, the transactional logic is pushed into the database. Whenever there is an update in the database, we also update an outbox table in the same transaction. Think of the outbox table as a mailbox: as database updates happen, it fills with letters that have to be delivered to a post office. From an application point of view:
    - Letters = events
    - Post office = Kafka

    All we need now is the postman, who carries those letters from the mailbox (outbox) to the post office (Kafka). This can be an async process, and you have multiple options for implementing it:
    - a separate thread within the original microservice
    - a separate application
    - a Kafka connector or Change Data Capture process that monitors the outbox table
    (A runnable sketch follows this post.)

    👉 But is the outbox pattern perfect? Nothing is, as such. With this pattern you can still get duplicate messages when failures occur; that is the price of an at-least-once delivery guarantee, and it requires downstream systems to de-duplicate messages.

    👉 So, have you used the outbox pattern? For more detailed posts on system design concepts, subscribe to my newsletter. Here's the link: https://lnkd.in/gS9eam6A

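As promised above, here is a runnable sketch of the transactional outbox, using SQLite from the Python standard library so it stays self-contained. The table names and the print-based publish function are invented stand-ins; a real deployment would publish to Kafka and run the relay continuously.

```python
import json
import sqlite3
import uuid

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
con.execute("""CREATE TABLE outbox (
    id TEXT PRIMARY KEY, topic TEXT, payload TEXT, published INTEGER DEFAULT 0)""")

def place_order(order_id: str, total: float) -> None:
    # The business write and the outbox write commit in ONE local transaction,
    # so either both happen or neither does -- no dual write.
    with con:
        con.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        con.execute(
            "INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "orders",
             json.dumps({"order_id": order_id, "total": total})),
        )

def relay_once(publish) -> None:
    # The "postman": deliver unpublished letters from the mailbox (outbox)
    # to the post office (the broker). A crash between publish and UPDATE
    # re-sends the event, hence at-least-once delivery and downstream dedup.
    rows = con.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for event_id, topic, payload in rows:
        publish(topic, payload)
        with con:
            con.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))

place_order("o-1", 42.0)
relay_once(lambda topic, payload: print(f"publish to {topic}: {payload}"))
```
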
  • Modern Data Stack France reposted this

    Last week, we at Turso announced a bold project. Codenamed Limbo, it is a full rewrite of SQLite in Rust. I wrote more about it here: https://lnkd.in/guD78ug8

    The reaction has been beyond fantastic. For 5 days straight, the graph of GitHub stars kept growing vertically, only showing signs of slowing down today. More impressive than that, the project gained more than 10 new contributors last week, some contributing more than once, with very high-quality contributions.

    Limbo is still an experimental project for us, but it is very encouraging to see this. When we forked SQLite into libSQL and offered the community a vision of an open contribution space, where everybody can get a seat at the table and influence the direction of the project, that is what we wanted to achieve. And while we did get new contributors with libSQL, Limbo, though still very early, is already on track to surpass it. In hindsight, the boldness of a rewrite and the technical freedom it allows seem to be the catalyst we were missing, and they are now present with Limbo.

