Transformation Operations

The following operations can be used within the transformation mapping by setting the type to "operation" and specifying the name and parameters.


map

Maps a source value to a different destination value based on a predefined dictionary.

Mapping Fields

Field | Notes
source_path | The dot notation JSON path to extract data from the input document.
destination_path | The JSON path where the mapped value will be placed in the output document.

Additional Parameters

Field | Type | Description
map | array | An array of objects, each containing a from and a to key. The from key is a regular expression to match against the source value.
alt | string | An alternative value to use if no match is found in the map.

Example

Map product types to a description

{
  "type": "operation",
  "name": "map",
  "source_path": "product.type",
  "destination_path": "product.description",
  "parameters": {
    "alt": "Other Products",
    "map": [
      {
        "from": "^toys.*$",
        "to": "Toys & Games"
      },
      {
        "from": "^audio.*$",
        "to": "Audio Electronics"
      },
      {
        "from": "^pc.*$",
        "to": "Laptop & Desktop Computers"
      }
    ]
  }
}
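
The lookup described above can be sketched in Python. This is only an illustration, not the actual implementation: the function name apply_map is hypothetical, and the sketch assumes that entries are tried in order and that the first matching from pattern wins, which the text does not state.

```python
import re

def apply_map(value, mapping, alt=None):
    """Return the `to` value of the first entry whose `from` regex matches."""
    for entry in mapping:
        # Assumption: entries are evaluated in order with re.search semantics.
        if re.search(entry["from"], value):
            return entry["to"]
    # No pattern matched: fall back to the alternative value.
    return alt

mapping = [
    {"from": "^toys.*$", "to": "Toys & Games"},
    {"from": "^audio.*$", "to": "Audio Electronics"},
    {"from": "^pc.*$", "to": "Laptop & Desktop Computers"},
]

print(apply_map("audio-headphones", mapping, alt="Other Products"))
print(apply_map("garden", mapping, alt="Other Products"))
```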

hash

Generates a hash from the concatenated values of one or more fields.

Mapping Fields

Field | Notes
destination_path | The JSON path where the calculated hash value will be placed in the output document.
destination_format | Set to hexadecimal for a hex result; otherwise a GUID is generated.

Additional Parameters

Field | Type | Description
fields | array | A list of JSON paths to the fields whose values will be concatenated and hashed.
type | string | The hashing algorithm to use. Supported values are MD5 and SHA-256 (default).

Example

Generate an id from a handle and timestamp.

{
  "type": "operation",
  "name": "hash",
  "destination_path": "id",
  "destination_format": "hexadecimal",
  "parameters": {
    "type": "MD5",
    "fields": [
      "handle",
      "timestamp"
    ]
  }
}
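
A rough Python sketch of what such a hash could look like is shown here. Several details are assumptions, because the text does not specify them: the values are stringified and joined with no separator, the result is UTF-8 encoded before hashing, and the GUID is built from the first 16 bytes of the digest. The function name hash_fields is hypothetical.

```python
import hashlib
import uuid

def hash_fields(doc, fields, algorithm="SHA-256", destination_format=None):
    """Hash the concatenated values of `fields` taken from a flat document."""
    # Assumption: values are joined with no separator and encoded as UTF-8.
    joined = "".join(str(doc[name]) for name in fields)
    hashlib_name = {"MD5": "md5", "SHA-256": "sha256"}[algorithm]
    digest = hashlib.new(hashlib_name, joined.encode("utf-8")).digest()
    if destination_format == "hexadecimal":
        return digest.hex()
    # Assumption: the GUID is derived from the first 16 bytes of the digest.
    return str(uuid.UUID(bytes=digest[:16]))

doc = {"handle": "example_user", "timestamp": "2024-01-01T00:00:00Z"}
print(hash_fields(doc, ["handle", "timestamp"], "MD5", "hexadecimal"))
print(hash_fields(doc, ["handle", "timestamp"]))
```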

concat

Joins the string values of multiple fields together.

Mapping Fields

Field | Notes
destination_path | The JSON path where the concatenated value will be placed in the output document.

Additional Parameters

Field | Type | Description
fields | array | A list of JSON paths to the fields whose values will be concatenated.
separator | string | A string to insert between each value. Defaults to an empty string.

Example

Generate an author full name from the first and last name

{
  "type": "operation",
  "name": "concat",
  "destination_path": "author.full_name",
  "parameters": {
    "separator": " ",
    "fields": [
      "author.first_name",
      "author.last_name"
    ]
  }
}
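
As an illustration only, the join can be sketched in Python as follows. The helper get_path and the function concat are hypothetical names, and the sketch assumes that every listed field is present in the input document.

```python
def get_path(doc, path):
    """Resolve a dot-notation path such as 'author.first_name'."""
    node = doc
    for key in path.split("."):
        node = node[key]
    return node

def concat(doc, fields, separator=""):
    """Join the string values of the given paths with the separator."""
    return separator.join(str(get_path(doc, path)) for path in fields)

doc = {"author": {"first_name": "Ada", "last_name": "Lovelace"}}
full_name = concat(doc, ["author.first_name", "author.last_name"], separator=" ")
print(full_name)  # Ada Lovelace
```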

format

Creates a formatted string from the values of one or more fields.

Mapping Fields

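Field | Notes
destination_path | The JSON path where the formatted value will be placed in the output document.
destination_format | The format string. Use {0}, {1}, etc., to reference the values of the listed fields by position.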

Additional Parameters

Field | Type | Description
fields | array | A list of JSON paths to the fields whose values will be used in the format string.

Example

Generate an author full name from the first and last name

{
  "type": "operation",
  "name": "format",
  "destination_path": "author.full_name",
  "destination_format": "{0} {1}",
  "parameters": {
    "fields": [
      "author.first_name",
      "author.last_name"
    ]
  }
}
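
The positional substitution can be approximated with Python's str.format, which happens to accept the same {0} {1} placeholder style. This is a sketch under that assumption; the names get_path and format_fields are hypothetical.

```python
from functools import reduce

def get_path(doc, path):
    """Walk a dot-notation path through nested dictionaries."""
    return reduce(lambda node, key: node[key], path.split("."), doc)

def format_fields(doc, fields, destination_format):
    """Substitute the field values into the format string by position."""
    values = [get_path(doc, path) for path in fields]
    return destination_format.format(*values)

doc = {"author": {"first_name": "Ada", "last_name": "Lovelace"}}
fields = ["author.first_name", "author.last_name"]
print(format_fields(doc, fields, "{0} {1}"))   # Ada Lovelace
print(format_fields(doc, fields, "{1}, {0}"))  # Lovelace, Ada
```

Compared with concat, the format string allows the values to be reordered or surrounded by literal text.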

extract

Extracts a value from a string using a regular expression.

Mapping Fields

Field | Description
source_path | The dot notation JSON path to extract data from the input document.
format | The regular expression to use for extraction.
destination_path | The JSON path where the extracted value(s) will be placed in the output document.
destination_format | An optional format for the destination value. Use {0}, {1}, etc., to reference groups captured by the regular expression. If not set, a value of {0} is assumed. If set to {*}, the values are added as an array at the destination path, which can be useful for extracting values from text content.

Additional Parameters

No additional parameters

Example

Extract @xyz mentions from a field containing free-form text.

{
  "type": "operation",
  "name": "extract",
  "source_path": "main_text",
  "destination_path": "content.mentions[*]",
  "format": "@([A-Za-z0-9_]{4,15})",
  "destination_format": "{*}"
}
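
The following Python sketch illustrates one plausible reading of these rules. It assumes that {0} refers to the whole match and {1} to the first capture group, and that {*} collects the first capture group of every match (or the whole match when the pattern has no groups). None of this is confirmed by the text, and the function name extract is hypothetical.

```python
import re

def extract(text, pattern, destination_format="{0}"):
    """Extract values from text using a regular expression."""
    matches = list(re.finditer(pattern, text))
    if destination_format == "{*}":
        # Assumption: one array element per match, using the first capture
        # group when the pattern defines one, else the whole match.
        return [m.group(1) if m.re.groups else m.group(0) for m in matches]
    if not matches:
        return None
    first = matches[0]
    # Assumption: {0} is the whole match, {1} the first capture group, etc.
    return destination_format.format(first.group(0), *first.groups())

pattern = r"@([A-Za-z0-9_]{4,15})"
text = "Thanks @alice_1 and @bob_2024 for the review"
print(extract(text, pattern, "{*}"))
print(extract(text, pattern))
```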

trim_array

Removes null or empty values from an array.

Requires source_path, format and destination_path to be present in the mapping JSON.

Mapping Fields

Field | Description
source_path | The dot notation JSON path to extract data from the input document.
format | An optional control to specify which values are removed. If not set, both null and empty values are removed (default). null: only null values are removed from the array. empty: only empty values (e.g. "" or {}) are removed from the array.
destination_path | The JSON path where the trimmed array will be placed in the output document. To trim an array in place, set this to the same value as source_path and set source_type to output.

Additional Parameters

No additional parameters

Example

Remove empty hashtag values from the output document.

{
  "type": "operation",
  "name": "trim_array",
  "source_path": "content.hashtags",
  "destination_path": "content.hashtags",
  "format": "null, empty",
  "source_type": "output"
}
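
The filtering rules can be sketched in Python as follows. The function name trim_array and its mode argument (standing in for the format field) are hypothetical, and treating an empty list as "empty" alongside "" and {} is an assumption.

```python
def is_empty(value):
    # Assumption: "empty" covers "", {} and [] but not None, 0 or False.
    return isinstance(value, (str, dict, list)) and len(value) == 0

def trim_array(values, mode=None):
    """Remove null and/or empty values; mode is None, 'null' or 'empty'."""
    drop_null = mode in (None, "null")
    drop_empty = mode in (None, "empty")
    return [
        v for v in values
        if not (drop_null and v is None) and not (drop_empty and is_empty(v))
    ]

hashtags = ["news", None, "", "sport", {}]
print(trim_array(hashtags))           # ['news', 'sport']
print(trim_array(hashtags, "null"))   # ['news', '', 'sport', {}]
print(trim_array(hashtags, "empty"))  # ['news', None, 'sport']
```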

key_to_value

Transforms an object with dynamic keys into an array of objects. This is useful when the set of key names is not known in advance but the destination limits the number of unique keys (e.g. an Elasticsearch index).

Mapping Fields

Field | Notes
source_path | The dot notation JSON path to extract data from the input document.
destination_path | The JSON path where the mapped values will be placed in the output document.

Additional Parameters

Field | Type | Description
key_name | string | The name of the new property that will hold the original key.
value_name | string | The name of the new property that will hold the original value. If the original value is an object, this can be omitted and the key property will be added to the object.

Example

Convert:

  • {"metrics": {"clicks": 10, "views": 100}}

To:

  • {"metrics": [{"metric": "clicks", "value": 10}, {"metric": "views", "value": 100}]}

{
  "type": "operation",
  "name": "key_to_value",
  "source_path": "metrics",
  "destination_path": "metrics",
  "parameters": {
    "key_name": "metric",
    "value_name": "value"
  }
}
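
A minimal Python sketch of this transformation is shown below. The function name key_to_value mirrors the operation name but is otherwise hypothetical, and the handling of an omitted value_name assumes that every original value is an object.

```python
def key_to_value(obj, key_name, value_name=None):
    """Turn {"k": v, ...} into [{key_name: "k", value_name: v}, ...]."""
    result = []
    for key, value in obj.items():
        if value_name is None:
            # Assumption: the value is an object, so the key is added to it.
            result.append({key_name: key, **value})
        else:
            result.append({key_name: key, value_name: value})
    return result

metrics = {"clicks": 10, "views": 100}
print(key_to_value(metrics, "metric", "value"))

nested = {"clicks": {"count": 10}, "views": {"count": 100}}
print(key_to_value(nested, "metric"))
```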

i18n_language_shorthand

Converts a full language name (e.g., "English") to its two or three-letter ISO code.

Mapping Fields

Field | Description
source_path | The JSON path to the source language name.
format | The desired output format. Supported values are "two_letter_code" (default) and "three_letter_code". This is specified in the top-level destination_format property.
destination_path | The JSON path where the language short code will be placed in the output document.

Additional Parameters

No additional parameters

Example

{
  "type": "operation",
  "name": "i18n_language_shorthand",
  "source_path": "language_name",
  "destination_path": "language",
  "format": "two_letter_code"
}
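
Conceptually this is a table lookup, sketched here in Python. The three-entry table and the function name language_shorthand are purely illustrative, and the case-insensitive matching and the None result for unknown names are assumptions. The codes shown are the ISO 639-1 and ISO 639-2 codes for these languages.

```python
# Hypothetical lookup table: language name -> (two-letter, three-letter) code.
LANGUAGES = {
    "english": ("en", "eng"),
    "spanish": ("es", "spa"),
    "japanese": ("ja", "jpn"),
}

def language_shorthand(name, fmt="two_letter_code"):
    """Return the short code for a full language name, or None if unknown."""
    # Assumption: matching ignores case and surrounding whitespace.
    codes = LANGUAGES.get(name.strip().lower())
    if codes is None:
        return None
    return codes[0] if fmt == "two_letter_code" else codes[1]

print(language_shorthand("English"))                       # en
print(language_shorthand("Spanish", "three_letter_code"))  # spa
```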