Setting up the prebuilt views in Redshift and PostgreSQL
Snowplow data is stored in a single events table, where the complete event stream for each visitor can be viewed.
Whilst this gives analysts immense power to crunch Snowplow data in a huge number of ways to answer a wide variety of business questions, it can be daunting for new Snowplow users to get started analyzing the Snowplow data set.
To help analysts get productive with Snowplow data faster, we've shipped Snowplow with a number of views that make common analyses much easier: rather than crafting SQL queries themselves, analysts can simply query views of the data that we have already created for them.
This guide covers adding those views to your Snowplow database.
Setting up the views in your database is straightforward: you simply need to run the different .sql files in the analytics section of the repo.
First, we need to identify the relevant SQL files in the repo: if you are running Redshift, they are here; if you are running PostgreSQL, they are here. Now you can use psql to run the SQL files against your Redshift / PostgreSQL database as follows:
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f recipes/recipes-basic.sql
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f recipes/recipes-catalog.sql
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f recipes/recipes-customers.sql
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f cubes/cube-pages.sql
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f cubes/cube-visits.sql
$ psql -h <HOSTNAME> -U <USERNAME> -d <DATABASE NAME> -p <PORT> -f cubes/cube-transactions.sql
You will need to replace <HOSTNAME>, <USERNAME>, <DATABASE NAME> and <PORT> with the appropriate values for your database setup.
Note that you need to run cube-visits.sql before you run cube-transactions.sql.
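For example, assuming a PostgreSQL database called snowplow running locally on the default port with a user called snowplow_user (purely illustrative values, not ones from this guide), the first command would become:
$ psql -h localhost -U snowplow_user -d snowplow -p 5432 -f recipes/recipes-basic.sql
The remaining files are run in exactly the same way; a simple shell loop such as
$ for f in recipes/recipes-basic.sql recipes/recipes-catalog.sql recipes/recipes-customers.sql cubes/cube-pages.sql cubes/cube-visits.sql cubes/cube-transactions.sql; do psql -h localhost -U snowplow_user -d snowplow -p 5432 -f "$f"; done
will run them all in one go while preserving the required order (cube-visits.sql before cube-transactions.sql).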
Once you have set up the views, you should be able to see them in your database. Log into your database using your favorite front end (e.g. Navicat): you should see the new schemas alongside the atomic schema that contains the raw events table, and the public schema that is there by default:
recipes_basic
recipes_customer
recipes_catalog
cubes_ecomm
cubes_pages
cubes_visits
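If you prefer checking from a SQL prompt rather than a GUI, a quick query against the pg_namespace system catalog (available in both PostgreSQL and Redshift) should confirm that the schemas were created, for example:
SELECT nspname
FROM pg_namespace
WHERE nspname LIKE 'recipes%'
   OR nspname LIKE 'cubes%';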
Note that for Redshift users, in some database front ends the new schemas and views will only be accessible if you update the Redshift search path to include the new schemas; doing this is covered in the Redshift setup guide. Even without adding the new schemas to the search path, however, you can still query the views by referencing them with their schema name, e.g.
SELECT *
FROM recipes_basic.uniques_and_visits_by_day
Contrast the results with viewing the atomic.events table directly.
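If you would rather not schema-qualify every query, one option (per the search path note above) is to add the new schemas to the search path for your current session; a sketch using the schema names listed above:
SET search_path TO public, atomic, recipes_basic, recipes_customer, recipes_catalog, cubes_ecomm, cubes_pages, cubes_visits;
For a permanent change on Redshift (per user, or for the cluster as a whole), see the Redshift setup guide referenced above.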
We recommend that you experiment by viewing and plotting the output of the different views, and examine the SQL that was used to generate them (in the files in the repo). By modifying these in simple ways, you can craft your own, bespoke analyses, to meet your own particular business needs.