
A modern data warehouse like Google BigQuery can give your organization easier access to its data and substantially better analytics. Pair it with a BI tool like Power BI or a data science platform like Dataiku, and your organization can reach impactful insights more quickly, fueling innovation and informing business decisions.
In this post, we’ll provide a high-level overview of Google BigQuery, including a description of the tool, why you should use it, pros and cons, and complementary tools and technologies.
Overview of Google BigQuery
BigQuery is Google’s homegrown columnar database service, built for querying massive data sets. It is deeply embedded in the GCP ecosystem, and its on-demand pricing is based on the amount of data each query scans.
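To make pricing by data queried concrete, here is a minimal Python sketch of the arithmetic. The rate of 5 USD per TiB is a placeholder assumption, not a quoted price, and the table size and the helper name `estimate_query_cost` are hypothetical. Because storage is columnar, a query reads only the columns it references, so selecting fewer columns scans fewer bytes.

```python
# Placeholder rate for illustration only; check current BigQuery pricing.
ASSUMED_USD_PER_TIB = 5.0
BYTES_PER_TIB = 2 ** 40


def estimate_query_cost(bytes_scanned: int,
                        usd_per_tib: float = ASSUMED_USD_PER_TIB) -> float:
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TIB * usd_per_tib


# Hypothetical table: 1 billion rows, each column holding 8-byte values.
rows = 1_000_000_000
one_column = rows * 8          # bytes scanned when selecting a single column
ten_columns = 10 * one_column  # bytes scanned when selecting ten such columns

print(f"1 column:   ${estimate_query_cost(one_column):.4f}")
print(f"10 columns: ${estimate_query_cost(ten_columns):.4f}")
```

The estimate scales linearly with bytes scanned, which is why narrowing the column list is usually the first cost lever to try.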
Value Prop:
- Ability to train and use machine learning models right in the database
- More control over how data is partitioned
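Both points above are expressed in SQL. As a rough sketch, the Python helpers below build a day-partitioned `CREATE TABLE` statement and a BigQuery ML `CREATE MODEL` statement as strings. The helper names, the dataset, table, model, and column names are all hypothetical, and the statements would still need to be submitted to BigQuery to do anything.

```python
def partitioned_table_ddl(table, columns, partition_column, cluster_by=()):
    """Build a CREATE TABLE statement partitioned by day on a TIMESTAMP column.

    Names are interpolated without escaping, so pass trusted values only.
    """
    column_defs = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    ddl = (
        f"CREATE TABLE `{table}` (\n  {column_defs}\n)\n"
        f"PARTITION BY DATE({partition_column})"
    )
    if cluster_by:
        ddl += "\nCLUSTER BY " + ", ".join(cluster_by)
    return ddl


def logistic_model_ddl(model, label_column, training_query):
    """Build a BigQuery ML statement that trains a logistic regression model."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'logistic_reg', input_label_cols = ['{label_column}']) AS\n"
        f"{training_query}"
    )


# Hypothetical names, for illustration only.
print(partitioned_table_ddl(
    "my_dataset.events",
    [("event_ts", "TIMESTAMP"), ("customer_id", "STRING"), ("amount", "NUMERIC")],
    partition_column="event_ts",
    cluster_by=["customer_id"],
))
print(logistic_model_ddl(
    "my_dataset.churn_model",
    "churned",
    "SELECT tenure_months, monthly_spend, churned FROM `my_dataset.customers`",
))
```

Partitioning on the column that queries filter by most often lets BigQuery skip partitions that cannot match, which reduces the data scanned.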
Scalability:
- Google does all resource provisioning behind the scenes, so the infrastructure is managed for the organization
Performance:
- BigQuery transparently and automatically provides highly durable, replicated storage in multiple locations and high availability at no extra charge
- Materialized views allow for accelerated query performance and reduced costs
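A materialized view stores the result of an aggregate query so that later queries can read the precomputed result instead of rescanning the base table. The sketch below builds one such statement in Python. The helper name and the view, table, and column names are hypothetical.

```python
def materialized_view_ddl(view, base_table, group_column, sum_column):
    """Build a CREATE MATERIALIZED VIEW statement that pre-aggregates a SUM.

    Names are interpolated without escaping, so pass trusted values only.
    """
    return (
        f"CREATE MATERIALIZED VIEW `{view}` AS\n"
        f"SELECT {group_column}, SUM({sum_column}) AS total_{sum_column}\n"
        f"FROM `{base_table}`\n"
        f"GROUP BY {group_column}"
    )


# Hypothetical names, for illustration only.
print(materialized_view_ddl(
    "my_dataset.customer_totals_mv",
    "my_dataset.events",
    group_column="customer_id",
    sum_column="amount",
))
```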
Features:
- BigQuery BI Engine: an in-memory analysis service built into BigQuery that enables users to analyze large and complex datasets interactively with sub-second query response time and high concurrency
- A high-speed streaming insertion API provides a powerful foundation for real-time analytics
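As a rough illustration of streaming inserts, the sketch below sends rows in batches through the `insert_rows_json` method of the `google-cloud-bigquery` Python client, which returns a list of per-row errors (empty on success). The helper `stream_rows`, the default batch size of 500, and the table name are our own assumptions. The client is passed in, so the function itself does not import the library.

```python
from typing import Any, Dict, Iterable, List


def stream_rows(client: Any, table_id: str, rows: Iterable[Dict[str, Any]],
                batch_size: int = 500) -> List[Any]:
    """Send rows in batches and collect any per-row errors that come back."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    errors: List[Any] = []
    batch: List[Dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            errors.extend(client.insert_rows_json(table_id, batch))
            batch = []
    if batch:  # flush the final partial batch
        errors.extend(client.insert_rows_json(table_id, batch))
    return errors


# Usage, assuming google-cloud-bigquery is installed and credentials are set up
# (the table name is hypothetical):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   errors = stream_rows(client, "my-project.my_dataset.events",
#                        [{"customer_id": "c1", "amount": 9.5}])
#   if errors:
#       print("Some rows were rejected:", errors)
```

Batching keeps the number of requests down, and an empty error list means every row in every batch was accepted.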
Security:
- Data encrypted at rest and in transit by default
- Ability to restrict data access for users at the column level
Why Use Google BigQuery
BigQuery requires very little management and excels at analyzing extremely large data sets. Existing GCP customers also benefit from its tight integration with other GCP services.
Pros of Google BigQuery
- Fully managed platform that does not require downtime for updates and automatically ensures high availability and geo-redundancy (physical separation of data centers that span multiple geographic locations to increase resiliency)
- Low storage cost, with query speed among the best in the industry for very large data sets
- BigQuery Omni allows for querying data across cloud platforms, including Azure, AWS, and GCP
- Excels at analyzing extremely large data sets and automatically uses artificial intelligence to optimize storage
Cons of Google BigQuery
- Queries that have not been performance-tuned, or that return a lot of redundant data, can become costly very quickly because pricing is based on the amount of data queried
- Works best with flat tables, which can make managing an enterprise data model difficult
- Tooling support outside of the GCP ecosystem is often lacking compared to other platforms
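One way to guard against the first con is to estimate a query's scan size before running it. The sketch below assumes the `google-cloud-bigquery` Python client, where a query submitted with a dry-run job configuration reports `total_bytes_processed` without executing. The helper `guard_query`, the byte budget, and the table name are hypothetical.

```python
from typing import Any


def guard_query(client: Any, sql: str, dry_run_config: Any, max_bytes: int) -> int:
    """Return the estimated bytes a query would scan, or raise if over budget.

    `dry_run_config` is expected to be a job config with dry_run enabled,
    so nothing is actually executed or billed by this call.
    """
    estimated = client.query(sql, job_config=dry_run_config).total_bytes_processed
    if estimated > max_bytes:
        raise RuntimeError(
            f"Query would scan {estimated} bytes, over the {max_bytes} byte budget"
        )
    return estimated


# Usage, assuming google-cloud-bigquery is installed and credentials are set up
# (the table name and budget are hypothetical):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
#   guard_query(client, "SELECT customer_id FROM `my_dataset.events`",
#               config, max_bytes=10 * 2**30)
```

The same client's `QueryJobConfig` also accepts `maximum_bytes_billed`, which makes BigQuery itself reject a query that would exceed the limit.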
Select Complementary Tools and Technologies for Google BigQuery
- Google AI Hub
- Google Cloud Dataflow
- Google Data Studio
- Dataiku
- Looker
- Tableau
We hope you found this high-level overview of Google BigQuery helpful. If you’re interested in Google BigQuery or other modern data warehouse tools like Amazon Redshift, Azure Synapse, and Snowflake, contact us to learn more.
This post is an excerpt from our 2021 Modern Data Warehouse Comparison Guide.
Originally published at https://aptitive.com on July 15, 2021.