How Do You Build a Data Warehouse?

September 9, 2019

The short answer is that there are three methods:

  1. You can custom build your own data warehouse (the most difficult and time-intensive method).
  2. You can use a data warehouse service (like Amazon Redshift, Snowflake, Panoply—still time intensive but less work than building a custom DWH).
  3. You can use an end-to-end business intelligence platform that includes data warehousing (the fastest and most direct option, but also the least robust).

The long answer is that it depends on a lot of different factors (which is everyone’s least favorite response).

What’s the role of a data warehouse in business intelligence?

A data warehouse stores massive amounts of data (years of data). Because of its expansive size, it enables your data analyst to perform complex queries that help you dig deep.

How do you build a data warehouse? | Grow

But a data warehouse, while important, is not the beginning and end of business intelligence. Equally important are the systems that support and depend on a data warehouse: your ETL, your analytics software, your data visualization tools (to name a few). Whichever of the three building methods you choose in the list above, you’re going to have to configure your data warehouse with the rest of the tools in your stack.

In this blog post, we’ll discuss the process of building a business intelligence stack around a data warehouse.

Where your data warehouse fits.

Before your data can be stored in your data warehouse, it must be properly cleaned and prepped.

After data is stored in your data warehouse, it's queried and used to create data visualizations.

The data warehouse is sandwiched neatly between the cleaning and prepping layer (ETL), and the querying and visualization layer (BI).

ETL—data prep and normalization

ETL stands for Extract, Transform, Load – the three functions that can be combined into a single tool to prepare your raw data for storage and subsequent analysis.

Your reporting systems (your CRM, ERP, etc) will invariably report data in different formats. In order for your data to be queried all together, it needs to be normalized.

SQL-fluent data analysts should be in charge of your ETL process, ensuring integration with all of your data sources and transforming raw data to normalized data centralized in your data warehouse for subsequent retrieval.

  1. Extract: The data is read from the original sources and the desired subsets extracted for processing.
  2. Transform: The extracted data is cleaned, removing unwanted information and duplicates, and standardized into a recognizable and consistent format.
  3. Load: The cleaned and prepped data is organized and stored in the data warehouse, where it can be queried on demand.

Data warehouse—storage

How do you build a data warehouse? | Grow

Your data warehouse holds your cleaned and prepped data, typically organized in files and folders for easy querying, retrieval, and comparison.

How your data is organized inside your warehouse will dictate how easy and intuitive it is to create metrics. By normalizing your data from different sources into a single easily recognized format, you create optimal conditions for data retrieval, comparison, matching, and pattern spotting.

Business intelligence layer—analytics and visualization

The business intelligence layer is designed to pull the prepped data from the data warehouse in order to build metrics and create visualizations.

Since a data warehouse can hold massive amounts of data that has been gathered from different sources and normalized, you can track patterns over the long term, helping to drive predictive analysis, identify “trigger points,” and suggest next actions.

Building Your Data Warehouse

Option 1: Custom Building Your Own Data Warehouse

Custom building your own data warehouse is a massive development project. It needs to be organized to align with the quantitative measurements used by your business to measure activity (the business objectives of a digital marketing agency are going to look very different from an ecommerce company’s business objectives).

Your data warehouse will also have to be built to communicate and integrate with your data sources, in addition to the other tools in your business intelligence stack (more on that below).

You will then need to configure your own server to support it, dedicate processing power to its management, and deploy a fast server connection to allow your users to access your data warehouse.

There are only a few cases where custom-building a data warehouse is the best option. In most cases, however, the cost and time required to build a data warehouse is prohibitive.

Unless you have the resources to build and maintain a data warehouse, exact knowledge of how you need your data warehouse to be built, and access to a team that understands the finer points of data warehouse construction, you’re probably better off using one of the services that provide data warehouses.

Option 2: Using a Cloud-Hosted Data Warehouse

Alternately, you can select a cloud service to host your data warehouse. In this case, you remove the need to configure the hardware, and if you choose a quality service, access should be fast and easy. The downside to this option is the expense.

Option 3: Using an End-to-End Business Intelligence Platform

An end-to-end platform will not be as robust as a custom data warehouse (even if it does include data warehousing). That being said, unless you’re a massive enterprise business it’s likely that your best option is an end-to-end platform.

(If you’re still unsure whether you need a custom data warehouse or not, you can see our checklist).

An end-to-end platform combines data warehousing storage capabilities with ETL, data visualization, and analytics. It’s an effective one-stop shop.

how do you build a data warehouse? | Grow

Tackling data warehousing

One final word about data warehouses: they’re not absolutely necessary.

They’re a powerful tool and extremely helpful, but they aren’t vital to business intelligence now like they were a decade ago. If you’re on the fence about whether or not you should build a data warehouse, make sure you consider whether or not an alternative system is helpful.

If you're looking for a new, end-to-end business intelligence solution you could give Grow a try. Grow is designed to deliver the power of ETL, data warehousing, and business intelligence in a single SaaS solution, giving you and everyone on your team the tools you need to use big data to its full potential. Ready to see it in action for yourself? Let us know if you’d like to start a free trial.

Are You Ready for Some Fantasy Football?

Are You Ready for Some Fantasy Football?

Read More ›
Who’s Afraid of the Dark Data?

Who’s Afraid of the Dark Data?

Read More ›
How Your Data Moves Through a Data Warehouse

How Your Data Moves Through a Data Warehouse

Read More ›
Join the 1,000s of business leaders winning with grow.

Request a free trial & unlock the answers hiding in your data.