About
Software architect, manager and entrepreneur with more than 20 years of experience in…
Activity
-
Yesterday, we teamed up with Uniswap Labs for an unforgettable night of alpha and insights. If you were there, you already know it was a…
Liked by Matteo Pelati
-
Presenting to a packed room of brilliant engineers and software developers was daunting. I did my first live demo of Peek at AI Tinkerers’ 3rd meetup…
Liked by Matteo Pelati
-
What an inspiring day at NEOM ! It was a pleasure to welcome the KAUST (King Abdullah University of Science and Technology) TIE Master students and…
Liked by Matteo Pelati
Publications
-
Migrating from RDBMS Data Warehouses to Apache Spark
Databricks - Spark Summit Europe 2018
Many companies are migrating their data warehouses from traditional RDBMSs to big data platforms, and in particular to Apache Spark. This usually requires considerable time and effort: most developers are used to working with RDBMSs and need to ramp up quickly on big data technologies to reach that goal. Having faced this problem multiple times at DBS Bank, we implemented a Spark-based application that helps with the migration process. The application embeds the Spark engine and offers a web UI that lets users create, run, test and deploy jobs interactively. Jobs are written primarily in native SparkSQL or in other flavours of SQL (i.e. TDSQL). In the latter case, an intermediate layer translates vendor-specific SQL constructs into Dataset operations (whenever possible) in order to leverage the features of the Catalyst engine. To offer RDBMS-like operations, the software is integrated with CarbonData as a storage layer, allowing users to perform update or delete operations on data. Among other things, the UI can validate procedures and perform data comparison tasks between different datasets. To simplify deployment, each job can be packaged and released individually. The software produces a metadata file that drives the execution, in batch mode in a production environment, of the same transformations defined in the UI. During the talk we will showcase all of the above features and explain how each of them helps ETL developers at DBS Bank migrate traditional RDBMS SQL code to Spark.
-
Writing and Deploying Interactive Applications Based on Apache Spark
Databricks - Spark Summit Europe 2018
It is often useful to create Spark applications that run in interactive mode rather than batch mode. Think, for instance, of Spark notebooks. This requires exposing a UI and REST APIs that interact with the core Spark engine. On top of this, in an enterprise environment it is always necessary to integrate with authentication and authorization services, in order to impersonate the correct user who is logging in and accessing the data interactively.
In this talk we will discuss how we built an interactive Spark application that is fully integrated with our enterprise environment at DBS Bank. We will present the entire architecture of the framework, showing how we embedded REST APIs and a web UI, how we provision YARN containers dynamically while impersonating the proper user through Kerberos authentication, and how we perform service discovery across the various YARN instances to make the Spark engine accessible from the web.
At the end of the talk, the audience will have a clear understanding of how an interactive enterprise application can be built on top of Spark, and will be able to follow a similar design to implement and deploy interactive applications in their enterprise environment.
-
Data production pipelines: Legacy, practices, and innovation
O'Reilly - Strata Data Conference Singapore
Modern engineering requires machine learning engineers, who are needed to monitor and implement ETL and machine learning models in production. Natalino Busa shares technologies, techniques, and blueprints on how to robustly and reliably manage data science and ETL flows from inception to production.
In particular, Natalino explains how to solve one of the most annoying problems in modern data pipelines, migrating and managing legacy ETL, by generating Spark jobs from a textual representation (NLP and SQL). Natalino also demonstrates an open source web UI implemented in React that transforms high-level representations to Spark code and shows how users are able to capture and discover data in the organization by accessing a metadata service. Natalino also introduces the datalab framework, a Jupyter-powered lightweight framework that allows machine learning scientists and engineers to build a robust production ML system using only notebooks.
-
Multicast routing code in the Linux kernel
Linux Journal
In this article I explain how the Linux kernel manages multicast traffic and how it is possible to interact with it by simply patching some kernel code. Although this is a rather specific topic, it might be useful for anyone interested in multicast routing. If you want to monitor or modify any existing multicast protocol, the information provided below will be useful.
Recommendations received
9 people have recommended Matteo
More activity by Matteo
-
Majoring in Statistics wasn’t part of my original plan—it was, in many ways, an accidental choice. When I first enrolled at the National University…
Liked by Matteo Pelati
-
A sense of PRIDE! and awe..thats what the Transformation Festival brings to me each year! 🙌 An annual event we’ve had at DBS for over 10 years…
Liked by Matteo Pelati
-
Seeing all these people in the Money Abroad community who quit their jobs at VCs, and Big Tech companies like I did opened my eyes. Growing up, we…
Liked by Matteo Pelati
-
Shri Sameer Gupta, Group Chief Analytics Officer & Managing Director, DBS Bank, IIMC Alumnus – 30th PGP Batch, receiving the Distinguished Alumnus…
Liked by Matteo Pelati
-
I recently built a video calling app using flutter, google firebase, all my video calls ( backed by a lot of NAT servers) , goes through this app…
Liked by Matteo Pelati
-
In my last post, I promised to share some highlights and takeaways from the "AI at Scale" panel at the VentureFizz conference at SVB. As I described…
Liked by Matteo Pelati
-
Why Single-Node Engines Are Gaining Ground in Data Processing 👉 Advancements in hardware technology have significantly enhanced the processing…
Liked by Matteo Pelati
-
Day 1 from the #InsightsForum in Singapore organized by GFTN with a special fire chat on Tokenization & Trust by Yazeed Al-Nafjan, CFA and Tanvir…
Liked by Matteo Pelati
-
Yesterday, I had lunch with a prominent VC from Andreessen Horowitz. This individual has backed a dozen household names and is an absolute titan in…
Liked by Matteo Pelati
-
And that's a wrap at #SWITCHSG 2024! 🎉 Thank you to our partners and all who visited our booth for making SGInnovate at SWITCH a success! 🤝 From…
Liked by Matteo Pelati
-
Meet Shobhit Datta, co-founder of HipVan, Singapore's #1 Home and Living App. Shobhit was the most recent guest speaker at my class at NUS Business…
Liked by Matteo Pelati
-
Nave Amerigo Vespucci stands for all Italy represents: elegance, technology design, innovation, tradition! The two years long Amerigo Vespucci World…
Liked by Matteo Pelati
-
Here’s the landscape of Open Source Data Engineering 2024 — 1. Storage Systems: From relational OLTP databases like PostgreSQL and MySQL to…
Liked by Matteo Pelati
-
🌟 Exciting News: Just Received the "LLM Engineer’s Handbook"! 📚 I can hardly contain my excitement as I announce that my copy of the “LLM…
Liked by Matteo Pelati
-
Inspiring Conversations with Jensen Huang! I'm honored to have had the opportunity to interact with Jensen Huang, Founder & CEO of NVIDIA, at the…
Liked by Matteo Pelati