Michael Armbrust
Berkeley, California, United States
Activity
-
Navy Federal Credit Union enhances member personalization using Delta Live Tables (DLT) to streamline real-time data pipelines. This enables faster…
Shared by Michael Armbrust
-
I'm really excited to see us quantify all the cool stuff we've been working on in DLT / Serverless. ⚡️💵 Serverless compute for DLT pipelines offers…
Shared by Michael Armbrust
-
Unity Catalog now has a beautiful open source user interface. The code is written in TypeScript and is in the unitycatalog/unitycatalog-ui…
Liked by Michael Armbrust
Experience
Education
-
-
Activities and Societies: Phi Beta Kappa, Delta Lambda Phi, Phi Sigma Pi, Mortar Board, College Mentors for Kids (CMFK), Computer Science Undergraduate Student Board (USB), Queer Student Union, Purdue Science Student Council (PSSC), Science Ambassadors, Boiler Gold Rush (BGR), Engineering Projects in Community Service (EPICS)
Publications
-
Creating Data Pipelines for PDS Datasets
We present the details of an image processing pipeline and a new Python library providing a convenient interface to Planetary Data System (PDS) data products. The library aims to be a useful tool for general-purpose PDS processing. Test images have been extracted from existing PDS data products using the library, which will also work with lunar images from LRO/LROC. To process high-volume data sets we employ Hadoop, an open-source framework implementing the Map/Reduce paradigm for writing data-intensive distributed applications. By harnessing a cluster of processing nodes we are able to extract raw images from data products and convert them to web-friendly formats at a rate of gigabytes per minute. The resultant images have been converted using the Python Imaging Library. Additionally, the images have been cropped to postage-stamp images supporting various zoom levels. The final images, along with some metadata, are uploaded to Amazon's S3 data storage system, from which they are served. Preliminary tests of the pipeline are promising, having processed 10,000 sample files totaling 30 GB in 15 minutes. The resultant JPEGs totaled only 3 GB after compression. The code base has not only proven successful in its own right, but also shows Python, an interpreted language, to be a viable alternative to more mainstream compiled languages such as C/C++ or Fortran, especially when combined with Hadoop. This work was funded through NASA ROSES NNX09AD34G.
More activity by Michael
-
BigQuery just added full support for Delta Lake, this is super awesome! With Delta Lake's project UniForm we hope to eliminate differences between…
Liked by Michael Armbrust
-
I’m happy to announce that we just launched our second Power BI benchmark comparison and this time we focus on another data platform: Databricks. In…
Liked by Michael Armbrust
-
Delta Live Tables by Databricks are an awesome tool to help streamline ETL workloads via a "declarative" ETL framework. Our Head Architect Vinoo…
Liked by Michael Armbrust
-
We just released updated financial stats on our last fiscal year! We're seeing acceleration of growth, testament to the democratization of DATA and…
Liked by Michael Armbrust
-
This is pretty amusing, kudos to Kyle and Vinoth for debunking this. As they say, lies, damn lies, and statistics: https://lnkd.in/g_8ryEjz
Liked by Michael Armbrust
-
A really awesome use case for DLT! https://lnkd.in/g4rd_zkM
Shared by Michael Armbrust
-
What started with me spelunking around streaming data with Databricks and Delta Live Tables has turned into something pretty cool that, with some…
Liked by Michael Armbrust
-
Apple is using Delta Lake to handle a massive amount of data - 5PB daily, to be exact! They've even created single Delta tables that store over 20PB…
Liked by Michael Armbrust
-
Super excited to launch the preview of Unity Catalog's Apache Hive Metastore API, which allows any system that understands Hive to connect to Unity!…
Liked by Michael Armbrust