Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application. This chart bootstraps an Apache Superset deployment on a Kubernetes cluster using the Helm package manager.

Apache Superset is a data exploration and visualization web application.

Superset provides:

  • An intuitive interface to explore and visualize datasets, and create interactive dashboards.
  • A wide array of beautiful visualizations to showcase your data.
  • Easy, code-free, user flows to drill down and slice and dice the data underlying exposed dashboards. The dashboards and charts act as a starting point for deeper analysis.
  • A state-of-the-art SQL editor/IDE exposing a rich metadata browser, and an easy workflow to create visualizations out of any result set.
  • An extensible, high-granularity security model allowing intricate rules on who can access which product features and datasets.
  • Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, …).
  • A lightweight semantic layer that lets you control how data sources are exposed to the user by defining dimensions and metrics.
  • Out-of-the-box support for most SQL-speaking databases.
  • Deep integration with Druid, which allows Superset to stay blazing fast while slicing and dicing large, real-time datasets.
  • Fast-loading dashboards with configurable caching.

Database Support

Superset speaks many SQL dialects through SQLAlchemy, a Python SQL toolkit and ORM that is compatible with most common databases.

Superset can be used to visualize data from most databases:

  • MySQL
  • Postgres
  • Vertica
  • Oracle
  • Microsoft SQL Server
  • SQLite
  • Greenplum
  • Firebird
  • MariaDB
  • Sybase
  • IBM DB2
  • Exasol
  • MonetDB
  • Snowflake
  • Redshift
  • Clickhouse
  • Apache Kylin
  • And more: check whether an SQLAlchemy dialect is available for your database to find out whether it will work with Superset.
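
Once a dialect is available, a database is typically addressed through a SQLAlchemy-style connection URI of the form dialect+driver://user:password@host:port/database. The sketch below assembles such a URI using only the Python standard library. The helper name build_sqlalchemy_uri, the host, the credentials, and the database name are all hypothetical and chosen for illustration; the driver you need depends on your database.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_uri(dialect, user, password, host, port, database, driver=None):
    """Assemble a URI of the form dialect[+driver]://user:password@host:port/database."""
    scheme = f"{dialect}+{driver}" if driver else dialect
    # Percent-encode credentials so characters such as '@' or '/' do not break the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"{scheme}://{credentials}@{host}:{port}/{database}"


# Hypothetical host, credentials, and database name, for illustration only.
postgres_uri = build_sqlalchemy_uri(
    "postgresql", "report_user", "p@ss/word", "db.example.internal", 5432, "analytics",
    driver="psycopg2",
)
print(postgres_uri)
```

Encoding the password matters because an unescaped "@" or "/" would otherwise be read as part of the host or path.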

Features

  • A rich set of data visualizations
  • An easy-to-use interface for exploring and visualizing data
  • Create and share dashboards
  • Enterprise-ready authentication with integration with major authentication providers (database, OpenID, LDAP, OAuth & REMOTE_USER through Flask AppBuilder)
  • An extensible, high-granularity security/permission model allowing intricate rules on who can access individual features and datasets
  • A simple semantic layer, allowing users to control how data sources are displayed in the UI by defining which fields should show up in which drop-down and which aggregations and function metrics are made available to the user
  • Integration with most SQL-speaking RDBMS through SQLAlchemy
  • Deep integration with Druid.io

Selection process

We used Superset in a project for a fitness mobile app with a large, fast-growing customer base. On the one hand, business stakeholders requested a BI tool because they needed a number of specific reports to monitor trend changes in application usage and better understand customer behavior. On the other hand, our data science team could use a BI tool to perform exploratory data analysis on different user cohorts before building machine learning models.

We needed a tool that would satisfy the following requirements:

  • Interactivity. The BI users from the marketing department wanted interactive filters on many fields of different types, e.g., string, date, and integer filters.
  • No coding. Our users were mostly marketing professionals, so all functionality had to be accessible via buttons and other controls.
  • Completely free. We were looking for an open source data visualization tool.

After searching for available solutions, we selected Superset and Pentaho for further evaluation.

Superset was the more attractive tool for us for the following reasons:

  • Superset visualizations looked more appealing to us. Both the customer and our team loved the visualizations right away.
  • Superset is completely open source and implemented in Python, while Pentaho is written in Java. This was a big plus for us because we were mostly a Python team. Looking ahead, we can say that it served us really well.
  • Superset is quite a new tool, so we were interested in testing it. We were already familiar with Pentaho from previous projects, but we were not completely satisfied with it.

A short introduction to the tool

Superset is a data exploration platform designed to be visual, intuitive, and interactive. Superset’s main goal is to make it easy to slice, dice, and visualize data. Its developers claim that Superset can perform analytics at the speed of thought. As mentioned above, this open source data visualization tool is written in Python, using the Flask web framework.

This project was originally named Panoramix, was renamed to Caravel in March 2016, and has been named Superset since November 2016.

Key Features

  • Superset supports 30 types of visualizations.
  • Airbnb uses Superset with Druid, but Superset accepts any data source that SQLAlchemy supports.
  • Configurable caching options for loading dashboards.
  • An easy-to-use constructor for visualizations.
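
As an illustration of the configurable caching mentioned above, Superset reads its settings from a Python configuration module, and cache settings are passed to Flask-Caching through a CACHE_CONFIG dictionary. The fragment below is a minimal sketch, assuming a Redis backend; the Redis host, the key prefix, and the timeout value are hypothetical, and the accepted CACHE_TYPE names depend on the Flask-Caching version in use.

```python
# Sketch of a cache section for a Superset configuration module.
# All values are illustrative assumptions, not recommended defaults.

# Hypothetical helper constant: how long cached results stay valid, in seconds.
CACHE_TIMEOUT_SECONDS = 300

CACHE_CONFIG = {
    # Backend name as understood by Flask-Caching; older versions use "redis".
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": CACHE_TIMEOUT_SECONDS,
    # Prefix that keeps Superset keys apart from other users of the same Redis.
    "CACHE_KEY_PREFIX": "superset_",
    # Hypothetical Redis endpoint.
    "CACHE_REDIS_URL": "redis://redis.example.internal:6379/0",
}
```

A shorter timeout keeps dashboards fresher at the cost of more queries against the underlying database.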
