About
My current interests are in big data analytics, data engineering, machine learning…
Articles by Jules
Activity
-
This is huge for Apache Spark™ 🎇 developers, who can connect to or access Spark from anywhere, debug it in their IDE, and view all jobs…
Shared by Jules Damji
-
Fine-tuning LLMs with MLflow — a practical guide Imagine you’re sailing across an ocean, searching for the perfect island. Without a map or compass,…
Liked by Jules Damji
-
𝘾𝙖𝙪𝙨𝙖𝙩𝙚 : a #Python package I recently created to take the first steps in operationalizing causal AI workflows using #MLflow! As recently I…
Liked by Jules Damji
Experience
Education
-
The Johns Hopkins University
Licenses & Certifications
Publications
-
Databricks Product and Engineering Publications
Databricks
More than 90 product, open-source, conference, webinar, and engineering blogs.
-
Learning Spark 2nd Edition
O'Reilly
Book description
Data is bigger, arrives faster, and comes in a variety of formats—and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark.
Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you’ll be able to:
* Learn Python, SQL, Scala, or Java high-level Structured APIs
* Understand Spark operations and SQL Engine
* Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
* Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
* Perform analytics on batch and streaming data using Structured Streaming
* Build reliable data pipelines with open source Delta Lake and Spark
* Develop machine learning pipelines with MLlib and productionize models using MLflow
Patents
-
Web site monitoring system and method
US WO 2002013018 A3
Methods and systems for monitoring transactions and components on web sites are described. A universal monitoring module invokes a transaction agent associated with the web site by, for example, transmitting an HTTP GET command to a unique URL associated with that transaction agent. The transaction agent, which can be customized by the web site's owner to test the reachability and functionality of the web site components, performs at least one transaction on the web site and reports results using a standardized reporting format that can be readily parsed by the universal monitoring module. The universal monitoring module takes appropriate action, e.g., alerting web site responsible parties, based on the reported data. Using this type of monitoring methodology, a scalable yet customized monitoring solution is achieved.
Honors & Awards
-
The Netcenter Dedication and Contribution Award
Vice President Mike Homer
This award recognized my leadership and stewardship in deploying and supporting a key component of Netscape's Member Directory Infrastructure.
-
The Java Cup International Achievement Award, 1996
Java Cup International Judges, including Dr. Eric Schmidt, Scott McNealy, Dr. Gosling, Bill Joy, Marc Andreessen, and Carol Bartz.
This award honored our group's successful completion of the world's first Java Cup International competition, in which 2,700 Java programmers from around the world submitted Java applets for numerous computing categories. The Java Cup winners were recognized at the JavaOne Conference.
-
The Team Award for Quality
Dr. Eric Schmidt
This team award recognized our collaboration software SPARCworks/TeamWare for its quality, robustness, and performance.
Recommendations received
5 people have recommended Jules
More activity by Jules
-
We are thrilled to invite you to Databricks Get Started Days, a half-day virtual event to sharpen your data engineering and analysis skills. This…
Liked by Jules Damji
-
Want to join the most exciting company in Data and AI? I am looking for a #Databricks #Developer #Advocate. Are you passionate about #Data and #AI…
Liked by Jules Damji
-
Our Amsterdam Engineering team recently hosted guests for an evening focused on building a high-performance culture. Paul Leventis, VP of…
Liked by Jules Damji
-
We recently had a great session with Amogh Jahagirdar, Apache Iceberg #PMC member, on #ApacheIceberg #DeletionVectors as part of #IcebergV3 at a…
Liked by Jules Damji
-
MLflow tracing supports Google's just-announced Gemini 2.0 Flash, which brings 2x faster performance and improved capabilities across text, code, and…
Liked by Jules Damji
-
Your data processing job will rarely live alone. Often you will need to combine it with other jobs or tasks to create a data flow. A data flow…
Liked by Jules Damji