Deploying robust data and AI assets on the Databricks Intelligence Platform is enabled by adopting software engineering best practices such as code versioning, testing, code packaging, and continuous integration and delivery (CI/CD). In this article, I share how to leverage the latest tool for achieving this goal: Databricks Asset Bundles (DABs). Happy bundling! https://lnkd.in/dnnTkKrB
Rudyar Cortes, PhD’s Post
More Relevant Posts
-
Ever wondered how to adopt software engineering best practices for your data and AI projects on Databricks? Check out DABs!
Databricks Asset Bundles (DABs) are now GA! Streamline data, analytics, and AI project development on Databricks using software development best practices. Explore now and transform your projects. #buildonDatabricks #DABS
Announcing the General Availability of Databricks Asset Bundles
databricks.com
-
Databricks Asset Bundles (DABs) provide a simple, declarative format for describing data and AI projects. DABs help you adopt software engineering best practices for your data and AI projects on the Databricks Platform. DABs facilitate source control, code review, testing, and continuous integration and delivery (CI/CD) for all your data assets as code. Explore now and transform your projects. #buildonDatabricks #DABS
Announcing the General Availability of Databricks Asset Bundles
databricks.com
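For readers who want to see what that declarative format looks like in practice, here is a minimal sketch of a databricks.yml bundle definition; the bundle name, workspace host, notebook path, and job name are placeholders invented for illustration, not values from the announcement.

```yaml
# databricks.yml -- minimal bundle sketch; all names, hosts, and paths are illustrative.
bundle:
  name: my_data_project

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://my-workspace.cloud.databricks.com   # placeholder workspace URL

resources:
  jobs:
    nightly_etl:
      name: nightly_etl
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./src/ingest_notebook.py     # path relative to the bundle root
```

With a definition like this in source control, databricks bundle validate and databricks bundle deploy -t dev take care of pushing the described resources to the workspace.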
-
Continuous Integration & Deployment (CI/CD) is expanding beyond traditional software development. In his latest article, our colleague Eduard-Andrei Popa explores how CI/CD is being adapted to address the unique challenges of data engineering on Databricks. From managing the complex interdependence of code, data, and compute resources to automating deployments with Databricks Asset Bundles, this guide offers practical strategies for building reliable, scalable data pipelines. 🧱 Discover how these principles are applied in real-world scenarios: https://bit.ly/3YIiCJi 🔗 And if you’re ready to elevate your own data engineering projects, explore our solutions here: https://bit.ly/3YsxmuC #DataEngineering #CICD #Databricks #MachineLearning #DataScience #Automation #craftworks
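As a rough illustration of the automation piece, here is a hypothetical CI sketch that validates and deploys a bundle on every push to main. GitHub Actions, the databricks/setup-cli action, the staging target, and the secret names are all assumptions made for the example, not details taken from the article.

```yaml
# .github/workflows/deploy-bundle.yml -- hypothetical CI sketch; the workflow tool,
# target name, and secret names are assumptions, not taken from the article.
name: deploy-bundle
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main        # installs the Databricks CLI
      - name: Validate bundle definition
        run: databricks bundle validate
      - name: Deploy to the staging target
        run: databricks bundle deploy -t staging
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
```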
-
Continuous Integration and Continuous Deployment using Databricks Asset Bundles #databricks #data YouTube link - https://lnkd.in/g5w_XfYT Databricks Asset Bundles are an infrastructure-as-code (IaC) approach to managing your Databricks projects. They are a tool that facilitates the adoption of software engineering best practices, including source control, code review, testing, and continuous integration and delivery (CI/CD), for your data and AI projects. Bundles make it possible to describe Databricks resources such as jobs, pipelines, and notebooks as source files. These source files provide an end-to-end definition of a project, including how it should be structured, tested, and deployed, which makes it easier to collaborate on projects during active development. This video demonstrates how to deploy a Delta Live Tables pipeline and a workflow with Asset Bundles!
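For a sense of what such a deployment describes, here is a sketch of the resources block a bundle like the one in the video might declare for a Delta Live Tables pipeline plus a workflow that refreshes it; the pipeline and job names, catalog, schema, and paths are made up for illustration and are not taken from the video.

```yaml
# Sketch of a bundle's resources block: one DLT pipeline and one job that runs it.
# Names, catalog/schema, and paths are hypothetical.
resources:
  pipelines:
    sales_dlt_pipeline:
      name: sales_dlt_pipeline
      catalog: main                    # Unity Catalog catalog (placeholder)
      target: sales_bronze             # target schema for the pipeline's tables
      libraries:
        - notebook:
            path: ./src/dlt_sales.py   # notebook that defines the DLT tables

  jobs:
    sales_workflow:
      name: sales_workflow
      tasks:
        - task_key: refresh_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.sales_dlt_pipeline.id}   # reference the pipeline above
```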
-
Day 75/100: The future of DataOps. This week, I gave a brief introduction to the main pillars of DataOps. DataOps is still a fairly new concept that needs to mature. So far, data domain experts have done a good job adapting DevOps principles to data and developing an initial vision for DataOps. Compared to classic software engineering, the state of DataOps is still quite immature, and many legacy data stacks do not yet support the automation and observability best practices we know from modern tools. Tools like Airflow pioneered the way into modern DataOps by introducing DevOps capabilities to data. As more data teams embrace DataOps best practices, the importance of DevOps skills and solid software engineering fundamentals will continue to grow for data practitioners. #data #dataops
-
This is HUGE!! We talk to customers weekly about the development life cycle on Databricks, and Databricks Asset Bundles (DABs) are a game changer. DABs help companies bundle resources like jobs, pipelines, and notebooks so you can version, test, deploy, and collaborate on your project as a unit. #Databricks #DataAnalysis #ProductivityBoost
Announcing the General Availability of Databricks Asset Bundles
databricks.com
-
Kubernetes is revolutionizing machine learning (ML) and data engineering by providing robust orchestration capabilities and support for containerized applications. Explore essential aspects of using Kubernetes in these domains:

**Machine Learning on Kubernetes:**
- Platforms like Kubeflow, TFJob, and Argo Workflows streamline the development, orchestration, and deployment of scalable ML workloads.
- Benefit from Horizontal Pod Autoscaling and Node Autoscaling for automatic scaling based on workload requirements (see the sketch after this post).
- Integrate tools like MLflow and DVC with Kubernetes for efficient experiment tracking and model sharing.
- Leverage solutions such as KFServing and Seldon Core for high-performance deployment of ML models at scale.

**Data Engineering on Kubernetes:**
- Deploy Apache Spark and Apache Flink for robust data processing on Kubernetes.
- Utilize Persistent Volumes, Persistent Volume Claims, and Object Storage for reliable data storage and management.
- Manage workflows with Apache Airflow and Argo Workflows for scalability and fault tolerance.
- Employ Prometheus, Grafana, and the ELK Stack for effective monitoring within the Kubernetes environment.

Unlock the advantages of Kubernetes for ML and data engineering: scalability, resource efficiency, portability, and resilience. Keep up with the latest in ML and data engineering using Kubernetes' orchestration capabilities and containerized applications! 🚀 #Kubernetes #MachineLearning #DataEngineering #MLonK8s #DataOps
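To make the autoscaling point concrete, here is a minimal HorizontalPodAutoscaler sketch for a containerized model-serving Deployment; the Deployment name, replica counts, and CPU threshold are invented for illustration.

```yaml
# Hypothetical HPA for a model-serving Deployment named "model-server";
# all names and thresholds are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: model-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70        # add replicas when average CPU exceeds 70%
```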
-
Data versioning is a crucial aspect of MLOps. Did you know that most #machinelearning projects fail due to data issues? That’s why closely collaborating with data engineers is key.

𝗗𝗮𝘁𝗮 𝘃𝗲𝗿𝘀𝗶𝗼𝗻𝗶𝗻𝗴
- Ensures reproducibility
- Facilitates auditing
- Enables rollbacks

You don’t need a fancy setup, a feature store, or a fully fledged MLOps platform straight away. If you can link a version of the data to your code, pipelines, and artifacts, simply by matching timestamps, you are already doing better than most teams out there! #dataengineering #datascience
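One lightweight way to do that linking is a plain manifest written alongside each pipeline run; every name, path, commit hash, and timestamp in this sketch is invented for illustration.

```yaml
# Hypothetical run manifest linking a data snapshot to the code and artifacts that used it.
run_id: "2024-06-01T02-00-00Z_training"
code:
  git_commit: 3f9c2ab                   # commit of the training code (placeholder)
  entrypoint: pipelines/train_model.py
data:
  snapshot_path: s3://example-bucket/features/dt=2024-06-01/
  snapshot_timestamp: "2024-06-01T02:00:00Z"
artifacts:
  model_path: s3://example-bucket/models/2024-06-01T02-00-00Z/
```

Matching the timestamp in the manifest to the data snapshot and the commit hash already gives you reproducibility and rollbacks without any extra platform.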
-
Just finished the course “Deploying Scalable Machine Learning for Data Science” by Dan Sullivan! Check it out: https://lnkd.in/dqA2aStA #machinelearning #datascience.
Certificate of Completion
linkedin.com
-
Just finished the course “Deploying Scalable Machine Learning for Data Science” by Dan Sullivan! Check it out: https://lnkd.in/gVaiXCBT #machinelearning #datascience.
Certificate of Completion
linkedin.com