Michael Armbrust

Berkeley, California, United States

2K followers 500+ connections

Databricks

University of California, Berkeley

Activity

Navy Federal Credit Union enhances member personalization using Delta Live Tables (DLT) to streamline real-time data pipelines. This enables faster…

Shared by Michael Armbrust
I'm really excited to see us quantify all the cool stuff we've been working on in DLT / Serverless. ⚡️💵 Serverless compute for DLT pipelines offers…

Shared by Michael Armbrust
Unity Catalog now has a beautiful open source user interface. The code is written in TypeScript and is in the unitycatalog/unitycatalog-ui…

Liked by Michael Armbrust

Experience

Databricks

San Francisco Bay Area

San Francisco Bay Area

Mountain View, CA

Education

University of California, Berkeley

2006 - 2012
2002 - 2006

Activities and Societies: Phi Beta Kappa, Delta Lambda Phi, Phi Sigma Pi, Mortar Board, College Mentors for Kids (CMFK), Computer Science Undergraduate Student Board (USB), Queer Student Union, Purdue Science Student Council (PSSC), Science Ambassadors, Boiler Gold Rush (BGR), Engineering Projects in Community Service (EPICS)

Publications

Creating Data Pipelines for PDS Datasets

Jan 2010
We present the details of an image processing pipeline and a new Python library providing a convenient interface to Planetary Data System (PDS) data products. The library aims to be a useful tool for general-purpose PDS processing. Test images have been extracted from existing PDS data products using the library, which will also work with lunar images from LRO/LROC. To process high-volume data sets we employ Hadoop, an open-source framework implementing the Map/Reduce paradigm for writing data-intensive distributed applications. By harnessing a cluster of processing nodes we are able to extract raw images from data products and convert them to web-friendly formats at the rate of gigabytes per minute. The resultant images have been converted using the Python Imaging Library. Additionally, the images have been cropped to postage-stamp images supporting various zoom levels. The final images, along with some metadata, are uploaded to Amazon's S3 data storage system, where they are served. Preliminary tests of the pipeline are promising, having processed 10,000 sample files totaling 30 GB in 15 minutes. The resultant JPEGs totaled only 3 GB after compression. The code base has not only proven successful in its own right, but also shows Python, an interpreted language, to be a viable alternative to more mainstream compiled languages such as C/C++ or Fortran, especially when combined with Hadoop. This work was funded through NASA ROSES NNX09AD34G.

More activity by Michael

BigQuery just added full support for Delta Lake, this is super awesome! With Delta Lake's project UniForm we hope to eliminate differences between…

Liked by Michael Armbrust
I’m happy to announce that we just launched our second Power BI benchmark comparison and this time we focus on another data platform: Databricks. In…

Liked by Michael Armbrust
Delta Live Tables by Databricks are an awesome tool to help streamline ETL workloads via a "declarative" ETL framework. Our Head Architect Vinoo…

Liked by Michael Armbrust
We just released updated financial stats on our last fiscal year! We're seeing acceleration of growth, testament to the democratization of DATA and…

Liked by Michael Armbrust
This is pretty amusing, kudos to Kyle and Vinoth for debunking this. As they say, lies, damn lies, and statistics: https://lnkd.in/g_8ryEjz

Liked by Michael Armbrust
A really awesome use case for DLT! https://lnkd.in/g4rd_zkM

Shared by Michael Armbrust
What started with me spelunking around streaming data with Databricks and Delta Live Table has turned into something pretty cool that, with some…

Liked by Michael Armbrust
Apple is using Delta Lake to handle a massive amount of data - 5PB daily, to be exact! They've even created single Delta tables that store over 20PB…

Liked by Michael Armbrust
Super excited to launch the preview of Unity Catalog's Apache Hive Metastore API, which allows any system that understands Hive to connect to Unity!…

Liked by Michael Armbrust



