Data has become the lifeblood of any modern enterprise. Making decisions on the basis of hunches and intuition can be valuable (assuming your hunches are correct most of the time). But to operate a medium- or large-size business, you need solid data and a way to extract actionable meaning from that data. Effective use of data is central to many companies’ digital transformation initiatives.
The trouble is that modern businesses accumulate data at an ever-accelerating pace. The number of data sources is exploding as well, and now includes innumerable internet-of-things (IoT) sensors and other devices, as well as streaming sources.
Managing, storing, protecting, and analyzing all that data has become more than traditional IT and business intelligence (BI) teams can handle. And the traditional on-premise data center has become a bottleneck that prevents analysts from uncovering valuable insights from enterprise data.
Enterprises have looked to the cloud for solutions, and the cloud has provided.
Several cloud-based database and data warehousing solutions have emerged, and enterprises are finding the value propositions of these solutions quite compelling.
One of the newer cloud-based data warehousing solutions on the market is called Snowflake, developed and supported by a company of the same name. Snowflake has attracted keen interest because of its major advantages:
- Availability on the three major cloud service providers (Amazon Web Services, Microsoft Azure, and Google Cloud Platform)
- Cloud-only design that takes full advantage of the benefits of a cloud environment
- Unique architecture that separates storage, compute, and cloud services functions into independent, scalable layers
For all of its advantages, how does Snowflake measure up to its alternatives? Let’s have a look:
Snowflake vs. On-Premise Solutions
Numerous data warehousing systems are available for on-premise or private cloud deployment. Many of them have been around for a long time and are quite mature, such as Microsoft SQL Server, Teradata, and Oracle Exadata.
Because these alternatives do their jobs on your own network, data transfer speeds are naturally much higher than with Snowflake or any other cloud solution. However, Snowflake makes up for this disadvantage in several ways:
- Multi-level scalable architecture means storage and computing resources can scale up or down as needed according to changing demands. It can accommodate any number of concurrent users. Data object contention, record-locking, and table-locking issues are eliminated. These advantages can’t be found in on-premise solutions deployed on physical servers in a local data center.
- Snowflake offers a data-sharing service that enables entities (your customers, vendors, or other business partners) to share data with one another without copying, storing, or transmitting separate data files.
- Snowflake requires no installation, minimal configuration, and no hardware to purchase or maintain. All performance-tuning is handled automatically in the background.
One area in which on-premise solutions have a clear advantage over Snowflake is transactional data. On-premise solutions can manage or integrate with transactional data.
Snowflake is a cloud-only solution.
It’s not designed to handle or manage transactional data, and even if it were, the network connection to the cloud could become a bottleneck if it had to handle the thousands of transactions per second seen by larger enterprises. Snowflake is optimized as a data warehouse solution to store summarized data and perform queries.
Snowflake vs. Other Cloud Solutions
Snowflake’s main competitors in cloud-based data warehousing include Amazon Redshift, Google BigQuery, and Microsoft Azure SQL Data Warehouse. They run only in their respective cloud services, but Snowflake can run on any or all of them—it’s “cloud agnostic” and can operate on data within or across different cloud services.
Snowflake is more flexible and is ideal for enterprises that have a multi-cloud data landscape.
What the Reviewers Say
Reviewers of Snowflake generally agree on its advantages and disadvantages. The first five points below are strengths; the last four are weaknesses:
- Because it’s compatible with ANSI standard SQL, it has a gentle learning curve and can be used right away by analysts of all skill levels.
- It features separate interfaces for data analysts and data engineers, each optimized for that group's needs and skill sets.
- Snowflake supports many structured, semi-structured, and unstructured data formats: ORC, XML, JSON, and more.
- Snowflake’s per-second pricing means customers pay for only the resources they use.
- The security scheme is highly granular. This enables administrators to exercise tight control over sensitive data and simplifies compliance with various data security regulations.
- Error messages are sometimes cryptic and unhelpful.
- The system does not easily integrate with traditional databases, complicating ETL management.
- The system has limited support for stored procedures.
- The web-based BI workbench suffers from poor UI design and occasionally slow page loads.
As you can see, the disadvantages tend to be more technical in nature; these are shortcomings that will likely be addressed as the product matures.
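To make the per-second pricing point concrete, here is a minimal Python sketch comparing a bill based on actual seconds of use with one that rounds every started hour up to a full hour. The rate and the function names are hypothetical, chosen only for illustration; they are not Snowflake's actual prices or API.

```python
import math

# Hypothetical rate for illustration only; not a real Snowflake price.
HOURLY_RATE = 3.00  # currency units per hour of compute


def per_second_cost(seconds_used: int, hourly_rate: float = HOURLY_RATE) -> float:
    """Cost when the bill follows the actual seconds of compute used."""
    return round(seconds_used * hourly_rate / 3600, 2)


def hourly_rounded_cost(seconds_used: int, hourly_rate: float = HOURLY_RATE) -> float:
    """Cost when every started hour is billed as a full hour."""
    return round(math.ceil(seconds_used / 3600) * hourly_rate, 2)


if __name__ == "__main__":
    workload_seconds = 20 * 60  # a 20-minute burst of queries
    print("per-second billing:   ", per_second_cost(workload_seconds))
    print("hour-rounded billing: ", hourly_rounded_cost(workload_seconds))
```

Under these assumed numbers, a 20-minute workload costs one-third as much when billed by the second as when rounded up to a full hour. The gap matters most for short, bursty workloads.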
The Bottom Line
Snowflake is a robust data warehousing tool that enables customers to take advantage of all that cloud computing has to offer without the headache of setting up and optimizing a cloud environment.
For companies that are outgrowing their ability to build and operate their data warehouses using local resources, Snowflake offers a combination of flexibility, reliability, and performance that deserves an in-depth look.