Benjamin Rogojan’s Post

Fractional Head Of Data | Reach Out For Data Infra And Strategy Consults

I am constantly reminded how solutions like Snowflake, BigQuery, and Databricks make my life easier as a data engineer. If you didn't start on SQL Server, Postgres, or Oracle like Jeff Skoldberg and Ryan H., who pointed out some of the ways CDWs make data professionals' lives easier, then you might be unaware of some of the limitations. So here are a few of the limitations or admin tasks you used to have to deal with.

1. Limited Storage - With most cloud solutions you have unlimited storage (which, sure, comes with an unlimited bill), but you never have to sit there wondering whether a temp table somewhere is causing storage issues, or whether you need to migrate hardware...

2. Limited Compute - This also goes for compute. If you've never had to open up a database activity monitor to see what query is holding up all your other queries, do you even DBA? (Just kidding, nowadays you gotta worry about an accidental 10k query.)

3. General Admin - Ryan Howe covered some of this, but he recently had to deal with trying to release space on his database, and after he released it, the problem still wasn't fixed. You can read more about it in the comments below.

4. Query History - Jeff Skoldberg referenced this one. Technically you can often find your query history buried in the sys tables on traditional DBs, but cloud data warehouses make it so easy. You can easily find query history as well as metadata about the query: how long it took, its query profile, etc.

Now I am sure there are other benefits, which I'd love to hear in the comments below, but I am also sure there are people out there who still prefer using solutions like Postgres for their DW (which I'd also like to hear about)!

Jeff Skoldberg

Cut Data Stack Cost | dbt + Snowflake + Tableau Expert | DM me for data consultation!

3mo

My first client 6 years ago, we were using Postgres for the data warehouse before they adopted Snowflake. I love Postgres, it is my favorite pre-cloud RDBMS... But we were on a WebEx with the DBA every month for one issue or another.
- Logs filling a partition on the Linux machine
- RAM spilling to disk then not getting cleared
- Out of memory errors
Mostly what you said about "limited compute", but it manifests in many ugly ways.

Joe Reis

Author | Data Engineer and Architect | Recovering Data Scientist ™ | Global Keynote Speaker | Professor | Podcaster & Writer | Advisor & Investor

3mo

Constraints breed curmudgeons

Josue A. Bogran

Architect @ Kythera | Advisor to SunnyData, Sigma, and Lumel | Databricks Product Advisory Board Member & Databricks MVP

3mo

*Orchestration enters the list*

I remember having to profile a query by choice on Oracle 11g, not by default.

Yeah, I get it, "The Cloud" and all, with its mighty elasticity and storage/compute decoupling... plus "pay as you go"... Don't get me wrong: I love Snowflake and I am fond of Databricks as well, but let's be honest here for a moment... There is no such stark dichotomy here: just take your beloved DBMS to the cloud (managed or unmanaged), that's all.

That said, and for Postgres in particular, scaling to petabyte scale is not an issue nowadays... plus, and this is a BIG plus, relational DBs have long enjoyed features that "lakehouse" vendors have only very recently released as super awesome, "new" things... ACID, anyone? Transactions? Pure SQL support (including user-defined functions)? Log-based CDC? C'mon!

Last but not least, Postgres is OSS and comes with a huge ecosystem that lets you do pretty much anything on top of it, if it isn't built in already (REST APIs, message queues, polymorphic tables...). And most old, battle-tested relational DBs are suited to both transactional and analytical workloads with minor or no tweaking. "Cloud" just adds lots of complexity around a project as a trade-off for no upfront capex.

Andrew Muñoz

Swiss Army Knife of Oil and Gas

3mo

Safely backing up for PITR is so much more important than I realized. It's much easier to replicate to a cloud Postgres instance or just host there. BigQuery is also becoming a great tool for just paying for what you use. I think it’s much easier and less stressful than maintaining a database server.

VICTOR IWUOHA

Data Engineer | Platform

3mo

Limited compute... and the activity monitor... crazy issue 😂. Nowadays, you just RUN the query. It's scaled under the hood and you get your Snowflake bill later. Then again, if the org is small and can't afford the $$ for a CDW, I'd mostly recommend a Postgres instance. But the architecture of schemas and tables (partitions, etc.) must be planned with care for ease of migration when the org grows bigger.

Gabrielle Zelhof

Senior Data Engineer at Relay Network

3mo

Yes, cloud is nice... I definitely don't think fondly back to on-prem DBs. But it should be noted: it's easy to hemorrhage money with cloud solutions unless you're really savvy with cloud costs and how queries run under the hood, how your storage is set up, etc. And you can STILL get issues like locked tables, lack of resources, long-running queries, etc. with those cloud systems, depending on what you're using and how it's set up. Not to mention that adding another layer on top of hosting like AWS (for example, Databricks) is only going to cost you even more for the added convenience.

That said, my opinion is somewhat in the middle. I think the sweet spot is having a very simple cloud data solution, taking it as close to "bare metal" in the cloud as you can. AWS makes this extremely easy now with S3, Glue, Redshift Serverless, etc. You get the benefits of quick experimentation, as you can easily load S3 data into all sorts of AWS-realm DBs, and if you tier storage correctly you won't be paying an arm and a leg.

Also, to be pedantic (because I'm an engineer and we can never shake being pedantic), you can still use Postgres for your DW while also being in the cloud: RDS, or Redshift, which is based on Postgres.

Looking back, I’m kinda glad I cut my teeth on SSIS and SQL Server.
