Data Hub Modeling glossary

Data models

Term

Definition

Data Hub vs Industry Term

Link

Calculated Column

A configurable column in a Pipeline, where records (in the new column) are calculated from those in one or more other columns already in the pipeline.
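As a rough illustration of the idea (not Data Hub's own expression syntax), the Python sketch below derives a new column from two existing ones. The column names `quantity`, `unit_price`, and `line_total`, and the helper `add_calculated_column`, are hypothetical.

```python
# Hypothetical rows already in a pipeline; column names are illustrative only.
rows = [
    {"quantity": 2, "unit_price": 10.0},
    {"quantity": 5, "unit_price": 3.5},
]

def add_calculated_column(rows, name, formula):
    """Return new rows with an extra column computed from existing columns."""
    return [{**row, name: formula(row)} for row in rows]

# The new column is calculated per record from two columns already present.
with_total = add_calculated_column(
    rows, "line_total", lambda r: r["quantity"] * r["unit_price"]
)
```

The original rows are left unchanged; the calculated column exists only in the returned rows.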

Data Hub

Calculated Column

Data Source

A resource type that captures the details needed to extract data from a source. The term also refers to the source of the data itself, for example a database, an API, or a file.

Industry

Data Source

Filter

A modeling configuration that limits pipeline data to the rows that match a set of conditions. There are two types of filter in Data Hub:

Source - Reduces the number of rows retrieved from the source.

Pipeline - Reduces the number of rows returned on a step.
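The difference between the two filter types can be sketched in Python as follows. This is a simplified model under stated assumptions, not Data Hub's implementation: `SOURCE_ROWS`, `extract`, and `filter_step` are hypothetical names.

```python
# Hypothetical source table; in practice this would live in the source system.
SOURCE_ROWS = [
    {"region": "EU", "year": 2023, "amount": 120},
    {"region": "EU", "year": 2024, "amount": 80},
    {"region": "US", "year": 2024, "amount": 200},
]

def extract(source_filter):
    """Source filter: only matching rows are retrieved from the source."""
    return [row for row in SOURCE_ROWS if source_filter(row)]

def filter_step(rows, pipeline_filter):
    """Pipeline filter: reduces the rows returned by a single step."""
    return [row for row in rows if pipeline_filter(row)]

retrieved = extract(lambda r: r["year"] == 2024)                 # rows leaving the source
returned = filter_step(retrieved, lambda r: r["amount"] > 100)   # rows leaving the step
```

In this sketch the source filter decides how much data is pulled at all, while the pipeline filter only narrows data that has already been retrieved.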

Data Hub

Filter

Source Filter

Pipeline Filter

Gateway, Data Gateway

A Windows service installed on a server (with access to company source data) and configured in ZAP Data Hub. Once running, it creates a persistent, two-way connection between the data source(s) and ZAP Data Hub.

Data Hub

Gateway

Hierarchy

An arrangement of data consisting of sets and subsets where every subset is of lower rank than the set.
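A small Python sketch of such an arrangement, using a hypothetical year, quarter, and month hierarchy. The nested dictionary and the `depths` helper are illustrative only; a greater depth corresponds to a lower rank.

```python
# Hypothetical date hierarchy: each subset sits one level below its parent set.
hierarchy = {
    "2024": {
        "Q1": {"Jan": {}, "Feb": {}, "Mar": {}},
        "Q2": {"Apr": {}, "May": {}, "Jun": {}},
    }
}

def depths(tree, depth=0):
    """Yield (member, depth) pairs; depth 0 is the top of the hierarchy."""
    for member, children in tree.items():
        yield member, depth
        yield from depths(children, depth + 1)

depth_of = dict(depths(hierarchy))
```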

Industry

Hierarchy

In Column, Out Column

Each step takes a set of columns (in-columns) and returns a different set of columns (out-columns), based on the step type and any look-ups and calculations it performs.

Note

Out-columns are shown in bold on the step where they are first introduced.
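To make the in-column and out-column idea concrete, here is a minimal Python sketch of a look-up step. The function `lookup_step` and all column names are hypothetical; the sketch only shows how a step's out-columns can differ from its in-columns.

```python
def lookup_step(rows, lookup, key, new_column):
    """Hypothetical look-up step: adds `new_column` by matching `key` in `lookup`."""
    return [{**row, new_column: lookup.get(row[key])} for row in rows]

rows = [{"customer_id": 1, "amount": 50}, {"customer_id": 2, "amount": 75}]
customer_names = {1: "Acme", 2: "Globex"}

in_columns = set(rows[0])
out_rows = lookup_step(rows, customer_names, "customer_id", "customer_name")
out_columns = set(out_rows[0])

# Columns first introduced by this step.
introduced = out_columns - in_columns
```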

Data Hub

Managing Step Columns

Index

A data structure that improves the retrieval speed of data from a database table. Indexes are used to find data quickly without searching every row each time the table is accessed.
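The effect of an index can be observed with Python's built-in `sqlite3` module. SQLite is used here purely for illustration and is an assumption, not the database behind Data Hub; the table `orders` and index `idx_orders_customer` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's description of how it would execute `sql`."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE customer_id = 42"
plan_before = query_plan(query)  # no index yet, so the table has to be scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(query)   # the planner can now search via the index
```

Comparing `plan_before` with `plan_after` shows the query switching from reading the whole table to a search that names the index.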

Industry

Index

Model Server

A configurable resource containing settings for connection to the warehouse.

Data Hub

Model Server

Model, Data-Model

Data modeling complements existing ZAP Data Hub functionality by building cubes from source data.

Data Hub

Model

Modeling, Data-Modeling

Umbrella term for the design functionality of ZAP Data Hub in setting up a data model to consolidate and process data.

Industry

Data-Modeling

Pipeline

A configurable resource that defines a source, steps that transform or augment the data, and a destination. The destination can be a warehouse table or a SQL statement that combines multiple tables.

The industry term for this is data pipeline.
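The three parts of a pipeline (source, steps, destination) can be modeled in a few lines of Python. This is a conceptual sketch only: the `Pipeline` class, its fields, and the in-memory list standing in for a warehouse table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Pipeline:
    """Hypothetical model of a pipeline: a source, ordered steps, a destination."""
    source: Callable[[], list]
    steps: list = field(default_factory=list)
    destination: list = field(default_factory=list)  # stands in for a warehouse table

    def run(self):
        rows = self.source()
        for step in self.steps:  # each step transforms or augments the rows
            rows = step(rows)
        self.destination.extend(rows)
        return self.destination

pipeline = Pipeline(
    source=lambda: [{"amount": 5}, {"amount": -1}, {"amount": 12}],
    steps=[
        lambda rows: [r for r in rows if r["amount"] > 0],                 # value filter
        lambda rows: [{**r, "amount_x2": r["amount"] * 2} for r in rows],  # calculation
    ],
)
warehouse_table = pipeline.run()
```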

Data Hub

Pipelines

Process, Model Process

A single data refresh for ZAP Data Hub. Can include any of the following (a full process includes all):

  • A refresh of staging table data from the source.

  • A refresh of warehouse data by transforming the staging data.

  • A refresh of any other destination, for example an API endpoint or a cube.

Note

By default, a manually triggered process will include a Publish.
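The refresh stages listed above can be sketched as a set of flags, where a full process is the combination of all three. The `Refresh` enum and `run_process` function are hypothetical, and the fixed ordering (staging, then warehouse, then other destinations) is an assumption based on each stage consuming the previous one's output. Publishing is not modeled here.

```python
from enum import Flag, auto

class Refresh(Flag):
    STAGING = auto()             # staging tables refreshed from the source
    WAREHOUSE = auto()           # warehouse refreshed by transforming staging data
    OTHER_DESTINATIONS = auto()  # for example, an API endpoint or a cube

# A full process includes every kind of refresh.
FULL_PROCESS = Refresh.STAGING | Refresh.WAREHOUSE | Refresh.OTHER_DESTINATIONS

def run_process(parts):
    """Return the names of the refreshes a process would perform, in the assumed order."""
    order = (Refresh.STAGING, Refresh.WAREHOUSE, Refresh.OTHER_DESTINATIONS)
    return [stage.name for stage in order if stage in parts]
```

For instance, `run_process(Refresh.WAREHOUSE)` describes a partial process that only refreshes warehouse data.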

Data Hub

Process

Process Configuration

A configurable resource that defines what is processed, and how and when the process takes place. For example, a process configuration could be used to process a select group of sales-related pipelines every hour during business hours to provide timely data.
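The hourly, business-hours example can be sketched as a small scheduling check in Python. Everything here is an assumption made for illustration: the `ProcessConfiguration` class, the 09:00 to 17:00 Monday-to-Friday window, and the pipeline names are hypothetical and do not reflect how Data Hub stores schedules.

```python
from dataclasses import dataclass
from datetime import datetime, time

@dataclass
class ProcessConfiguration:
    pipelines: list            # what is processed
    start: time = time(9, 0)   # assumed start of business hours
    end: time = time(17, 0)    # assumed end of business hours

    def is_due(self, now: datetime) -> bool:
        """True on the hour, Monday to Friday, within the configured window."""
        return (
            now.weekday() < 5
            and now.minute == 0
            and self.start <= now.time() <= self.end
        )

config = ProcessConfiguration(pipelines=["Sales Invoices", "Sales Orders"])
```

A scheduler could call `config.is_due(datetime.now())` once a minute and trigger a process for `config.pipelines` whenever it returns `True`.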

Data Hub

Process Configuration

Publishing

Publishing saves a copy of a model that has processed successfully and is therefore guaranteed to process successfully again.

Data Hub

Publishing

Relationship

Relationships define how the pipelines in the model are connected to one another, by specifying the columns they have in common.
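In Python terms, a relationship can be pictured as the list of shared columns used to match rows from one pipeline to rows in another. The data, the `relationship` dictionary, and the `related_rows` helper below are hypothetical.

```python
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
sales = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 1, "amount": 20},
    {"customer_id": 2, "amount": 75},
]

# Hypothetical relationship: sales connects to customers on a common column.
relationship = {"from": "sales", "to": "customers", "columns": ["customer_id"]}

def related_rows(row, target_rows, columns):
    """Return the target rows whose values match `row` on every shared column."""
    return [t for t in target_rows if all(t[c] == row[c] for c in columns)]

customer_for_first_sale = related_rows(sales[0], customers, relationship["columns"])
```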

Industry

Relationship

Staging, Staging Tables

Transient structures that hold table data (from linked data sources) prior to processing.

Data Hub

Staging

Steps, Pipeline Steps

A pipeline setting that configures the transformations to be applied to column(s) in the pipeline. Examples include a column union, a look-up, value filtering, etc.

Data Hub

Steps

Unified Layer / Unified Data Model

A Unified Layer is a resource that provides a list of columns (and types) that a pipeline must output. A Unified Layer also adds column descriptions to the output. Each pipeline can only have a single Unified Layer applied. A collection of Unified Layers is a Unified Data Model.

ZAP solution models use the ZAP Unified Data Model to standardize pipeline outputs, making it possible for the same analytics to work across multiple models.
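One way to picture a Unified Layer is as a contract that a pipeline's output is checked against. The Python sketch below is illustrative only: `UNIFIED_LAYER`, its columns, and the `conforms` function are hypothetical, and it assumes a pipeline may output extra columns as long as the required ones are present with the right types.

```python
# Hypothetical Unified Layer: required output columns and their types.
UNIFIED_LAYER = {"customer_id": int, "customer_name": str, "amount": float}

def conforms(rows, layer):
    """True when every row has at least the layer's columns, with matching types."""
    return all(
        layer.keys() <= row.keys()
        and all(isinstance(row[col], typ) for col, typ in layer.items())
        for row in rows
    )

good_output = [{"customer_id": 1, "customer_name": "Acme", "amount": 9.5}]
missing_column = [{"customer_id": 1, "amount": 9.5}]
wrong_type = [{"customer_id": "1", "customer_name": "Acme", "amount": 9.5}]
```

Because every conforming pipeline exposes the same columns, downstream analytics written against the layer do not need to know which model produced the data.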

Data Hub

Unified Layer

Warehouse, Data Warehouse

The principal structured output that results from processing the user's data model. The processed warehouse is stored in a SQL database and can optionally serve as the basis for an OLAP cube, which is then used to generate the required analytics.

Industry

Data Warehouse

Warehouse Tables

A Warehouse Table (or Dimension Table) is a table in a data warehouse.

Industry

Warehouse Tables