Data Hub Modeling glossary
Data models
Term | Definition | Data Hub vs Industry Term | Link |
Calculated Column | A configurable column in a pipeline whose values are calculated from the values in one or more other columns already in the pipeline. | Data Hub | |
Data Source | A resource type that captures the details necessary to extract data from a source. Also the source of the data itself, for example a database, an API, or a file. | Industry | |
Filter | A modeling configuration that limits the rows of pipeline data. There are two types of filter in Data Hub: Source, which reduces the number of rows retrieved from the source, and Pipeline, which reduces the number of rows returned by a step. | Data Hub | |
Gateway, Data Gateway | A Windows service installed on a server (with access to company source data) and configured in ZAP Data Hub. Once running, it creates a persistent, two-way connection between the data source(s) and ZAP Data Hub. | Data Hub | |
Hierarchy | An arrangement of data consisting of sets and subsets where every subset is of lower rank than the set. | Industry | |
In Column, Out Column | Each step takes a set of columns (in-columns) and returns a different set of columns (out-columns), based on the step type and any look-ups and calculations. Note: Out-columns are shown in bold on the step where they are first introduced. | Data Hub | |
Index | Data structure that improves the retrieval speed of data from a database table. Indexes are used to quickly find data without having to search every row in a database table every time the table is accessed. | Industry | |
Model Server | A configurable resource containing settings for connection to the warehouse. | Data Hub | |
Model, Data-Model | Data modeling complements existing ZAP Data Hub functionality by building cubes from source data. | Data Hub | |
Modeling, Data-Modeling | Umbrella term for the design functionality of ZAP Data Hub in setting up a data model to consolidate and process data. | Industry | |
Pipeline | A configurable resource that defines a source, steps that transform or augment the data, and a destination. The destination can be a warehouse table or a SQL statement that combines multiple tables. The industry term for this is data pipeline. | Data Hub | |
Process, Model Process | A single data refresh for ZAP Data Hub. Can include any of the following (a full process includes all):
Note: By default, a manually triggered process includes a Publish. | Data Hub | |
Process Configuration | A configurable resource that defines what is processed, and how and when the process takes place. For example, a process configuration could be used to process a select group of sales-related pipelines every hour during business hours to provide timely data. | Data Hub | |
Publishing | Publishing a model saves a copy of a model that has processed successfully and is therefore guaranteed to process successfully again. | Data Hub | |
Relationship | Relationships define how the pipelines in the model are connected to one another, by specifying the columns they have in common. | Industry | |
Staging, Staging Tables | Transient structures that hold table data (from linked data sources) prior to processing. | Data Hub | |
Steps, Pipeline Steps | A pipeline setting that configures the transformations to be applied to column(s) in the pipeline. Examples include a column union, a look-up, value filtering, etc. | Data Hub | |
Unified Layer / Unified Data Model | A Unified Layer is a resource that provides a list of columns (and types) that a pipeline must output. A Unified Layer also adds column descriptions to the output. Each pipeline can have only a single Unified Layer applied. A collection of Unified Layers is a Unified Data Model. ZAP solution models use the ZAP Unified Data Model to standardize pipeline outputs, making it possible for the same analytics to work across multiple models. | Data Hub | |
Warehouse, Data Warehouse | The principal structured output resulting from processing the user's data model. Stored in a SQL database, the processed warehouse is the basis for an optional OLAP cube, which is used to generate the required analytics. | Industry | |
Warehouse Tables | A Warehouse Table (or Dimension Table) is a table in a data warehouse. | Industry | |
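
Several of the terms above (Pipeline, Steps, Filter, Calculated Column, In Column and Out Column) describe one flow: rows are read from a source, optionally filtered, and passed through ordered steps that add or remove data. The Python sketch below illustrates that flow in miniature. It is not ZAP Data Hub code and does not use any Data Hub API. Every function name, column name, and sample row is a hypothetical stand-in chosen for illustration.

```python
# Conceptual sketch only: all names here are hypothetical, not Data Hub APIs.

def calculated_column(name, fn):
    """Build a step that adds one out-column computed from existing in-columns."""
    def step(rows):
        # Copy each row so the source rows are never modified in place.
        return [{**row, name: fn(row)} for row in rows]
    return step


def pipeline_filter(predicate):
    """Build a step that reduces the rows returned by that step."""
    def step(rows):
        return [row for row in rows if predicate(row)]
    return step


def run_pipeline(source_rows, source_filter, steps):
    """Apply the source filter first, then each step in order."""
    # Source filter: fewer rows are retrieved from the source to begin with.
    rows = [dict(row) for row in source_rows if source_filter(row)]
    for step in steps:
        rows = step(rows)
    return rows


# Hypothetical source data standing in for a data source table.
orders = [
    {"order_id": 1, "region": "EU", "quantity": 2, "unit_price": 10.0},
    {"order_id": 2, "region": "US", "quantity": 1, "unit_price": 99.0},
    {"order_id": 3, "region": "EU", "quantity": 1, "unit_price": 4.0},
]

result = run_pipeline(
    orders,
    source_filter=lambda row: row["region"] == "EU",
    steps=[
        # Calculated column: line_total is derived from two in-columns.
        calculated_column("line_total", lambda row: row["quantity"] * row["unit_price"]),
        # Pipeline filter: keep only rows whose calculated total is at least 20.
        pipeline_filter(lambda row: row["line_total"] >= 20.0),
    ],
)

print(result)
```

In this sketch the source filter removes the US order before any step runs, the calculated column introduces `line_total` as a new out-column, and the pipeline filter then drops the small EU order, leaving only order 1.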