Matteo Pelati

Singapore
5K followers 500+ connections

About

Software architect, manager and entrepreneur with more than 20 years of experience in…

Articles by Matteo

  • The future of banking is no banking

    In recent weeks I had a flashback: it was like going back to 1996, discovering and experimenting with the Internet…

  • Could SuperApps be the next generation banking apps?

    In recent years I have seen a lot of banks putting huge effort into their digital platforms. Most of the work in this area…

  • Thoughts on Open Banking, Blockchain and Identity Management

    I love the completely decentralised concepts behind blockchain technology; I truly believe it can…

  • Looking for the next great idea

    Today I was reading an article on TechCrunch about finding the secret sauce for making a startup successful and, while…

  • Solving the late payment problem with the blockchain

    It is no surprise that late payments between companies are a big problem (especially in Europe), and can seriously put…


Experience

  • LangDB

    Singapore

  • -

    Singapore

  • -

    Singapore

  • -

    Singapore

  • -

    Singapore

  • -

    Singapore

  • -

    Singapore

  • -

    Brescia, Italy

  • -

    Milan, Italy

  • -

    Redmond, WA, USA

  • -

    Milan, Italy

  • -

    Milan, Italy

  • -

    Milan, Italy

Education

Publications

  • Migrating from RDBMS Data Warehouses to Apache Spark

    Databricks - Spark Summit Europe 2018

    Many companies are migrating their data warehouses from traditional RDBMS to big data technologies, and in particular to Apache Spark. This usually requires a lot of effort and time: most developers who are used to working with RDBMS need to ramp up quickly on big data technologies to achieve the goal. Having faced this problem multiple times at DBS Bank, we implemented a Spark-based application which helps during this migration process. The application embeds the Spark engine and offers a web UI that allows users to create, run, test and deploy jobs interactively. Jobs are primarily written in native SparkSQL or in other flavours of SQL (i.e. TDSQL). In the latter case, an intermediate layer translates vendor-specific SQL constructs into Dataset operations (whenever possible) in order to leverage the features of the Catalyst engine. To offer RDBMS-like operations, the software is integrated with CarbonData as a storage layer, allowing users to perform update or delete operations on data. Among other things, the UI offers the possibility of validating procedures and performing data comparison tasks between different datasets. To simplify deployment, each job can be packaged and released individually. The software produces a metadata file that can drive the execution, in batch fashion in a production environment, of the same transformations defined in the UI. During the talk we will showcase all the above features and explain how each of them is helping ETL developers migrate traditional RDBMS SQL code to Spark at DBS Bank.

  • Writing and Deploying Interactive Applications Based on Apache Spark

    Databricks - Spark Summit Europe 2018

    Very often it is useful to create Spark applications that run in interactive mode rather than batch mode. Think, for instance, of Spark notebooks. This requires exposing a UI and REST APIs that interact with the core Spark engine. On top of this, in an enterprise environment, it is always necessary to integrate with authentication and authorization services in order to impersonate the correct user who is logging in and accessing the data interactively.

    In this talk we will discuss how we have built an interactive Spark application that is fully integrated with our enterprise environment at DBS Bank. We will showcase the entire architecture of the framework we have built: how we embedded REST APIs and a web UI, how we provision YARN containers dynamically while impersonating the proper user through Kerberos authentication, and how we perform service discovery across the various YARN instances to make the Spark engine accessible from the web.

    At the end of the talk, the audience will have a clear understanding of how an interactive enterprise application can be built on top of Spark, and will be able to follow a similar design to implement and deploy interactive applications in their enterprise environment.

  • Data production pipelines: Legacy, practices, and innovation

    O'Reilly - Strata Data Conference Singapore

    Modern engineering requires machine learning engineers, who are needed to monitor and implement ETL and machine learning models in production. Natalino Busa shares technologies, techniques, and blueprints on how to robustly and reliably manage data science and ETL flows from inception to production.

    In particular, Natalino explains how to solve one of the most annoying problems in modern data pipelines—migrating and managing legacy ETL—by generating Spark jobs from a textual representation (NLP and SQL). Natalino also demonstrates an open source web UI implemented in React that transforms high-level representations to Spark code and shows how users are able to capture and discover data in the organization by accessing a metadata service. Natalino also introduces the datalab framework, a Jupyter-powered lightweight framework that allows machine learning scientists and engineers to build a robust production ML system only using notebooks.

  • Multicast routing code in the Linux kernel

    Linux Journal

    In this article I explain how the Linux kernel manages multicast traffic and how it is possible to interact with it by simply patching some kernel code. Although this is a rather specific topic, it might be useful for anyone interested in multicast routing. If you want to monitor or modify any existing multicast protocol, the information provided below will be useful.
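
As a rough illustration of the metadata-driven batch execution described in the Spark migration abstract above (transformations defined interactively, exported as a metadata file, then replayed in batch), here is a minimal Python sketch. It is not the DBS Bank implementation: the JSON layout, the `JOB_METADATA` and `run_job` names, and the sample `orders` data are all hypothetical, and the standard-library `sqlite3` module stands in for the Spark SQL engine so that the example runs on its own.

```python
import json
import sqlite3

# Hypothetical metadata format: an ordered list of named SQL steps.
# Each step may read the tables or views produced by earlier steps.
JOB_METADATA = json.loads("""
{
  "job": "daily_totals",
  "steps": [
    {"name": "clean_orders",
     "sql": "SELECT customer, amount FROM orders WHERE amount > 0"},
    {"name": "totals",
     "sql": "SELECT customer, SUM(amount) AS total FROM clean_orders GROUP BY customer"}
  ]
}
""")


def run_job(conn, metadata):
    """Replay every step in order, exposing each result as a view named after the step."""
    for step in metadata["steps"]:
        conn.execute(f'CREATE TEMP VIEW "{step["name"]}" AS {step["sql"]}')
    last = metadata["steps"][-1]["name"]
    return conn.execute(f'SELECT * FROM "{last}"').fetchall()


def demo():
    # sqlite3 is only a stand-in engine (assumption); the source system uses Spark.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("alice", 10.0), ("bob", 5.0), ("alice", 2.5), ("bob", -1.0)],
    )
    return run_job(conn, JOB_METADATA)


if __name__ == "__main__":
    for row in sorted(demo()):
        print(row)
```

With PySpark the same loop would presumably call `spark.sql(step["sql"])` and register each result with `createOrReplaceTempView(step["name"])`; that mapping is an assumption, not something the abstract states.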

