data engineering
Jim Collins
Snowflake vs Postgres and Redshift
Decisions made early in the data architecture design process can have far-reaching consequences. Criteria such as the technology chosen, predicted data volumes, and typical usage patterns drive the design.
Snowflake is winning a lot of these comparisons, and its growth has been exceptional over the past few years. It sits in the OLAP (analytical) space rather than OLTP (transactional), and this comparison assumes datasets generally larger than 100 GB. Why would you choose Snowflake over Redshift in a typical AWS environment?
No maintenance overhead
When you go with Postgres RDS or Redshift, you have to perform maintenance to ensure your database is operational. In practice, this means a DevOps engineer has to provision and resize your cluster as it grows. This can be difficult depending on the security and team structures you have in place.
In contrast, Snowflake is a fully managed, web-based service: you load your data and start querying. There is no cluster provisioning.
Better pricing
Postgres RDS and Redshift price based on provisioned capacity: you size your cluster and pay for the machines whether they are busy or not. Snowflake adopts a different strategy here. Storage is cheap (think S3 rates), and compute is billed only while a virtual warehouse is actually running queries. This means that data sets that are rarely queried are much cheaper to maintain.
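The difference between the two pricing styles is easiest to see with some back-of-the-envelope arithmetic. All rates below (node price, storage rate, credit price) are illustrative assumptions, not published prices:

```python
# Back-of-the-envelope cost model contrasting the two pricing styles.
# All rates here are illustrative assumptions, not published prices.

def provisioned_monthly_cost(node_hourly_rate: float, nodes: int) -> float:
    """Redshift/RDS style: pay for the cluster 24/7, queried or not."""
    return node_hourly_rate * nodes * 24 * 30

def usage_based_monthly_cost(storage_tb: float, storage_rate_per_tb: float,
                             compute_hours: float, credits_per_hour: float,
                             credit_price: float) -> float:
    """Snowflake style: cheap flat storage, plus compute billed only
    while a warehouse is running queries."""
    return (storage_tb * storage_rate_per_tb
            + compute_hours * credits_per_hour * credit_price)

# A 2 TB dataset queried ~20 hours a month:
always_on = provisioned_monthly_cost(node_hourly_rate=1.0, nodes=2)
on_demand = usage_based_monthly_cost(storage_tb=2, storage_rate_per_tb=25.0,
                                     compute_hours=20, credits_per_hour=1,
                                     credit_price=3.0)
print(f"provisioned: ${always_on:.0f}/month, usage-based: ${on_demand:.0f}/month")
# → provisioned: $1440/month, usage-based: $110/month
```

For a rarely-queried dataset, the idle cluster dominates the provisioned bill, while the usage-based bill is mostly cheap storage.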
Performance
Snowflake offers more granular performance tuning than Postgres or Redshift: individual queries can be routed to larger virtual warehouses, giving them more compute at execution time. Have a slow-running but ultra-important query? Snowflake lets you tune it at a level of granularity that traditional RDBMSs and Redshift don't.
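A rough sketch of that model: Snowflake warehouses come in t-shirt sizes whose credit burn doubles with each step up (XS = 1 credit/hour, S = 2, M = 4, and so on), and any query can be pointed at any warehouse. The routing rule below is a hypothetical illustration, not a Snowflake feature:

```python
# Snowflake warehouse sizes and their per-hour credit consumption,
# which doubles at each size step.
WAREHOUSE_CREDITS = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def pick_warehouse(query_priority: str) -> str:
    """Hypothetical routing rule: send critical queries to a bigger
    warehouse rather than resizing a shared cluster."""
    return "XLARGE" if query_priority == "critical" else "XSMALL"

wh = pick_warehouse("critical")
# The nightly finance rollup gets 16x the compute of an ad-hoc query,
# without touching anyone else's workload.
print(wh, WAREHOUSE_CREDITS[wh])  # → XLARGE 16
```

The key point is that the resourcing decision is per query, not per cluster: nobody has to resize shared infrastructure to make one workload fast.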
Quality of Life Features
Snowflake makes it very easy to share data between accounts. If someone wants to share a particular table, they can grant another account access to it through Snowflake's built-in sharing features. Doing this with a traditional RDBMS is possible but much clunkier, forcing users into low-level DB admin commands. If a data engineer doesn't need that low-level access, Snowflake's tooling is far more user-friendly.
Different Architectures under the hood
Snowflake stores tables as micro-partitions, each carrying metadata about the range of values it contains. This makes it more efficient under the hood than Postgres or even Redshift: the metadata lets the engine prune partitions that cannot match a query's filter, which improves join and scan performance and provides a host of other benefits around query tuning.
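To make the pruning idea concrete, here is a minimal sketch of how min/max metadata lets a filter skip whole partitions before any data is read. The partition layout and column names are simplified illustrations, not Snowflake's actual on-disk format:

```python
# Minimal sketch of micro-partition pruning: each partition carries
# min/max metadata for a column, so a range filter can discard
# partitions whose range cannot match, without scanning their rows.
from dataclasses import dataclass, field

@dataclass
class MicroPartition:
    min_order_date: str
    max_order_date: str
    rows: list = field(default_factory=list)

def prune(partitions, lo, hi):
    """Keep only partitions whose [min, max] range overlaps [lo, hi]."""
    return [p for p in partitions
            if p.max_order_date >= lo and p.min_order_date <= hi]

parts = [
    MicroPartition("2023-01-01", "2023-03-31"),
    MicroPartition("2023-04-01", "2023-06-30"),
    MicroPartition("2023-07-01", "2023-09-30"),
]
# A filter on May 2023 only needs to scan the Q2 partition.
survivors = prune(parts, "2023-05-01", "2023-05-31")
print(len(survivors))  # → 1
```

In a real warehouse this metadata check is far cheaper than reading data, so selective queries touch only a small fraction of the table.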
Declining popularity of Redshift
BigQuery rules the roost on GCP, and Redshift is losing favour with the data engineering community, in part because of how it couples compute and storage resources. Snowflake is where to be in the AWS ecosystem for large-scale data these days.
Conclusion
The market has largely spoken on Snowflake. Its continuing rise will drive adoption, and companies continue to migrate from Postgres. When your Postgres and Redshift instances are proving difficult to size, you are continually investing in performance optimisations, and your teams are complaining about performance, it's time to move!