Databricks Workflows and Snowflake Tasks are never going to replace Airflow. While they can run a sequence of tasks, Airflow's ability to let developers do literally anything in Python, together with its large range of providers, makes it far more flexible. You would not believe how many people I have spoken to who have Tasks calling stored procedures, or Databricks Workflows being called by ADF that themselves call other Databricks Workflows.

But the worst part is visibility. Even though the Airflow UI leaves a lot to be desired, at least you can see everything in one place. That is much better than what the "native" orchestrators offer.

This is why you need a control plane like Orchestra instead of adopting the "let's just keep calling DAGs that call DAGs that call PROCs" approach. That leads to huge technical debt and the question of "how do I see what's going on?", when really the question should be "how can we structure these pipelines optimally?"

Rant over. #dataengineering #workflows #orchestration
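To make the "everything in one place" point concrete, here is a minimal sketch of a single Airflow DAG that triggers a Databricks Workflow and then calls a Snowflake stored procedure, so both steps land in one UI. It assumes Airflow 2.x with the official Databricks and Snowflake providers installed; the connection IDs, job ID, and procedure name are hypothetical.

```python
# Minimal sketch: one DAG spanning Databricks and Snowflake.
# Assumes apache-airflow-providers-databricks and
# apache-airflow-providers-snowflake are installed.
import pendulum

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="elt_pipeline",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule="@daily",
    catchup=False,
) as dag:
    # Trigger an existing Databricks Workflow job by its job ID
    run_databricks_job = DatabricksRunNowOperator(
        task_id="run_databricks_job",
        databricks_conn_id="databricks_default",
        job_id=12345,  # hypothetical job ID
    )

    # Call a Snowflake stored procedure from the same DAG, so the
    # whole chain is visible (and debuggable) in a single place
    call_snowflake_proc = SnowflakeOperator(
        task_id="call_snowflake_proc",
        snowflake_conn_id="snowflake_default",
        sql="CALL refresh_marts();",  # hypothetical stored procedure
    )

    run_databricks_job >> call_snowflake_proc
```

Contrast this with the anti-pattern in the post: the same two steps split across an ADF pipeline and a Snowflake Task, with no single view of the chain.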
"Databricks workflows being called by ADF that themselves call other Databricks workflows" I think can somewhat be attributed to historically built pipelines/practices from past time of limitations/not knowing all features. I've worked at 2 companies where Workflows was virtually all that was needed. If you are ingesting data using APIs/etc, there is limited need for Airflow when it comes to Databricks, except for in some scenarios.
Does it run in self-hosted mode?
If visibility across systems is the worst part (and not orchestrating execution), couldn't that be addressed by something like OpenLineage?
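For context, a minimal sketch of what the OpenLineage approach looks like: each tool emits standardized run events to a common backend, which can then stitch runs from Databricks, Snowflake, ADF, etc. into one lineage graph. This assumes the openlineage-python client and an OpenLineage-compatible backend such as Marquez; the URL, namespace, job name, and producer URI are hypothetical.

```python
# Minimal sketch of emitting an OpenLineage run event.
# Assumes the openlineage-python package and a backend listening
# at the (hypothetical) URL below.
from datetime import datetime, timezone
from uuid import uuid4

from openlineage.client import OpenLineageClient
from openlineage.client.run import Job, Run, RunEvent, RunState

client = OpenLineageClient(url="http://localhost:5000")  # hypothetical Marquez endpoint

run = Run(runId=str(uuid4()))
job = Job(namespace="databricks", name="elt_pipeline.run_databricks_job")

# Emit a START event when the task begins; a matching COMPLETE event
# would follow when it finishes, giving a cross-system view of runs.
client.emit(
    RunEvent(
        eventType=RunState.START,
        eventTime=datetime.now(timezone.utc).isoformat(),
        run=run,
        job=job,
        producer="https://example.com/my-producer",  # hypothetical producer URI
    )
)
```

This addresses observability (seeing what ran, where), though unlike a control plane it does not itself orchestrate execution.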