A High-Level Overview of Snowflake
Using a modern data warehouse, like Snowflake, can give your organization improved access to your data and dramatically improved analytics. When paired with a BI tool, like Tableau, or a data science platform, like Dataiku, you can gain even faster access to impactful insights that help your organization fuel innovation and drive business decisions.
In this post, we’ll provide a high-level overview of Snowflake, including a description of the tool, why you should use it, pros and cons, and complementary tools and technologies.
Overview of Snowflake
Snowflake was built from the ground up for the cloud, initially starting on AWS and later expanding to Azure and GCP. There are no servers to manage, and compute scales almost without limit. Snowflake separates compute from storage and charges for compute based on the size of each compute cluster (known as a “virtual warehouse”) and the length of time it runs.
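To make the consumption model concrete, here is a minimal sketch of how compute usage might be estimated. It assumes the standard warehouse credit rates (each size up doubles the hourly rate) and per-second billing with a 60-second minimum each time a warehouse starts. The function names and the price per credit are hypothetical, since actual pricing depends on edition, region, and contract.

```python
# Credits consumed per hour by a standard virtual warehouse of each size.
# Assumption: these are the standard rates; check current Snowflake pricing.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

# Assumption: at least 60 seconds are billed each time a warehouse starts.
MINIMUM_BILLED_SECONDS = 60


def estimate_credits(size: str, running_seconds: float) -> float:
    """Estimate credits for one continuous run of a single-cluster warehouse."""
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


def estimate_cost(size: str, running_seconds: float, price_per_credit: float) -> float:
    """Convert estimated credits to currency using a hypothetical credit price."""
    return estimate_credits(size, running_seconds) * price_per_credit


if __name__ == "__main__":
    # A MEDIUM warehouse running for 30 minutes at a hypothetical 3.00 per credit.
    print(estimate_credits("MEDIUM", 1800))
    print(estimate_cost("MEDIUM", 1800, price_per_credit=3.00))
```

The estimate covers compute only. Storage is billed separately, which is the practical consequence of separating the two.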
- Cross-cloud support lets organizations choose which cloud provider to use
- Dynamic compute scaling saves on cost
- Micro-partitioned storage with automatic maintenance
- Rapid auto-scaling of compute nodes allows for increased cost savings and high concurrency on demand
- Built for MPP (massively parallel processing)
- Optimized for read via a columnar backend
- Ability to assign dedicated compute to a workload, which avoids concurrency issues
- High visibility into spend
- Native support for JSON, XML, Avro, Parquet, and ORC semi-structured data formats
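- Snowflake’s SQL dialect (run through clients such as the SnowSQL command-line tool) has slight syntax differences from other SQL dialects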
- Introduction of Snowpark for Snowflake native development
- Full visibility into queries executed, by whom, and how long they ran
- Precision point-in-time restore available via “time-travel” feature
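Two of the features above, native semi-structured data support and time travel, are easier to picture with a query in hand. The sketch below only builds Snowflake SQL strings in Python. Running them would need a live connection (for example, through the Snowflake Connector for Python), which is not shown. The helper names and the table and column names are hypothetical.

```python
def json_path_query(table: str, variant_column: str, path: str, cast_type: str) -> str:
    """Build a query that extracts one typed field from a semi-structured column.

    Uses Snowflake's colon path notation and the :: cast operator.
    Identifiers are interpolated directly, so pass trusted values only.
    """
    return f"SELECT {variant_column}:{path}::{cast_type} AS extracted FROM {table}"


def time_travel_query(table: str, seconds_ago: int) -> str:
    """Build a query that reads a table as it existed `seconds_ago` seconds ago."""
    if seconds_ago < 0:
        raise ValueError("seconds_ago must be non-negative")
    # AT(OFFSET => -N) asks for the table state N seconds before now.
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago})"


if __name__ == "__main__":
    # Hypothetical table "raw_events" with a JSON column named "payload".
    print(json_path_query("raw_events", "payload", "customer.name", "string"))
    # Hypothetical table "orders" as it looked five minutes ago.
    print(time_travel_query("orders", 300))
```

How far back a time-travel query can reach depends on the retention period configured for the table, so treat the five-minute offset as illustrative.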
Why Use Snowflake
Because Snowflake is decoupled from any single cloud vendor, it allows a true multi-cloud experience. You can deploy on Azure, AWS, GCP, or any combination of those cloud services. With near-unlimited scale and minimal management, it offers a best-in-class data platform with a pay-for-what-you-use consumption model.
Pros of Snowflake
- Allows for a multi-cloud experience built on top of existing AWS, Azure, or GCP resources, depending on your preferred platform
- Highly performant queries utilizing uniquely provisioned pay-as-you-go compute and automatically derived partitioning
- Easy implementation of security and role definitions for a less frustrating user experience and easier delineation of cost while keeping data secure
- Integrated ability to share data with partners or other consumers outside of an organization and to supplement data with publicly available datasets within Snowflake
Cons of Snowflake
- Ecosystem of tooling continues to grow as adoption expands, but some features are not readily available
- Due to the paradigm shift in a cloud-born architecture, taking full advantage of Snowflake’s advanced features requires a good understanding of cloud data architecture
Select Complementary Tools and Technologies for Snowflake
- Apache Kafka
- AWS Lambda
- Azure Data Factory
- Power BI
We hope you found this high-level overview of Snowflake helpful. If you’re interested in learning more about Snowflake or other modern data warehouse tools like Amazon Redshift, Azure Synapse, and Google BigQuery, contact us.
The content of this blog is an excerpt of our 2021 Modern Data Warehouse Comparison Guide.
Originally published at https://aptitive.com on November 2, 2021.