Modeling function listing

Overview

Calculated column function templates are user friendly function templates for creating calculated columns in pipelines. They provide predefined SQL-based functions, allowing users to create basic calculated columns on pipelines without knowledge of SQL expressions.

Calculated column templates

This page provides links to information for each calculated column function template.

Aggregate

Function

Description

Average

Outputs the average of the values in the Input Column, grouped by the categories in the Partition Columns provided.

Count

Counts the number of non-NULL values in the Input Column, grouped by the Partition Column(s) provided.

Maximum

Outputs the maximum value in the Input Column, grouped by the categories in the Partition Columns provided (optional).

Minimum

Outputs the minimum value in the Input Column, grouped by the categories in the Partition Columns. provided (optional).

Standard Deviation

Outputs the standard deviation of values in the Input Column, treated as a sample (n = 'number of items' - 1) and grouped by the categories in the Partition Columns provided.

Standard Deviation (Population)

Outputs the standard deviation of values in the Input Column, treated as a population (n = 'number of items') and grouped by the categories in the Partition Columns provided.

Sum

Outputs the sum of values in the Input Column, grouped by the categories in the Partition Columns provided.

Variance

Outputs the variance of values in the Input Column, treated as a sample (n = 'number of items' - 1) and grouped by the categories in the Partition Columns provided.

Variance (Population)

Outputs the variance of values in the Input Column, treated as a population (n = 'number of items') and grouped by the categories in the Partition Columns provided.

Analytic

Function

Description

Cumulative Distribution

Returns the relative position of rows based on a cumulative distribution for the column(s) selected in the Order By step.

First Value

This function will return the first value in the Input Column after sorting (Order By) and grouping (Partitioning).

Last Value

This function will return the last value in the Input Column after sorting (Order By) and grouping (Partitioning).

Offset - Lag

Returns data from preceding rows in the Input Column selected.

Offset - Lead

Returns data from subsequent rows in the Input Column selected.

Percentile (Continuous Distribution)

Returns values representing the Percentile for the Order By column(s) within each partition (optional). This function uses a continuous distribution to determine percentiles and returns a computed value that may not exist in the input column.

Percentile (Discrete Distribution)

Returns values representing the Percentile for the Order By column(s) within each partition (optional). This function uses a discrete distribution to determine percentiles and returns a value from the input column.

Percentile Rank

Returns the rank of rows as a percentile for the column(s) selected in the Order By step.

JSON

Function

Description

JSON String Validity Test

This function determines if there are valid JSON strings in the selected column.

Modify JSON Value

This function returns the input JSON column, with the New Value at the JSON Path specified.

Query JSON (Object)

This function returns the result of the Input JSON Query Expression.

Query JSON (Value)

Returns the value at the specified JSON Path for the Input Column.

Ranking

Function

Description

Ranking Functions

Provides several ranking functions:

Rank: Returns a numerical rank, using duplicate numbers where there are duplicate entries in the Input Column (such as: 1, 2, 2, 2, 5, 6).

Dense Rank: Similar to Rank but values are not skipped after duplicates (such as: 1, 2, 2, 2, 3, 4).

Row Number: Similar to Rank but numbers all rows sequentially, regardless of duplicates (such as: 1, 2, 3, 4, 5, 6).

String

Function

Description

Concatenate

Concatenates (joins) the Input Columns, separated by the Delimiter provided.

Left (Truncate)

This function is used for truncating text, retaining the number of characters Length specified and will retain characters from the left of the string.

Replace String

This function is used to find and replace content in the Input Column. When the Search Pattern is discovered in the Input Column, it will be replaced with the Replacement String.

Right (Truncate)

This function is used for truncating text, retaining the number of characters Length specified and will retain characters from the right of the string.

Trim After Pattern

This function truncates the input column based on the search pattern provided. All content after (and including) the first match for the input search pattern will be removed for each cell.

Trim Before and After Pattern

This function truncates the input column based on the search pattern provided. All content before the first match and after the last match for the input search pattern will be removed for each cell (including the matched content).

Trim To Pattern

This function truncates the input column based on the search pattern provided. All content before (and including) the first match for the input search pattern will be removed for each cell.

Trim Whitespace

Trims leading and trailing whitespace from data in the Input Column.

Time

Function

Description

Create Date

Joins the input Year, Month, and Day columns provided to produce a date.

Date Offset (Date Add)

The date in the Input Date Column will be offset based on the Amount of Offset and the Measure of Time (or Units of Time).

Day from Date Column

Returns the day (DD) value from the Input Date Column.

Difference between Dates

Calculates the difference between two date, time, or datetime columns. The result is returned in the Measure of Time (Units) selected.

Extract Date Component

This function returns the Date Component selected of the input Date, Time, or DateTime data.

Month From Date Column

Returns the month (MM) value from the Input Date Column.

Time Zone Conversion

Converts the source datetime column to another time zone.

Year From Date Column

Returns the year (YYYY) value from the Input Date Column.

Other

Function

Description

First Non-NULL Value

Returns the first non-NULL value in each row for the selected Input Columns.

If (Complex)

The If (Complex) function will evaluate the When, Then, and Else expressions provided.

If (Simple)

The If (Simple) function will evaluate the Input Expression based on the When, Then, and Else expressions provided.

Random Number Between

Returns random integers between Upper and Lower limits provided. The Upper and Lower limits are included in the range of values generated.

SQL Expression

You can enter a scalar (single value) SQL expression to define a new column using the +Calculation button.