Azure Blob Storage Egress

Component Configuration

Use the Datastreamer Pipeline to export data directly to a container and folder in your Azure Blob Storage account. This component supports multiple egress types and data collation.


Collation Type

File Collation (the default) is recommended. It groups the documents for a job into files. The alternatives are to collate by message (messages are the internal units into which requests are split for pipeline processing) or to write an individual file for each document received.

Container (Required)

Specify the Azure Blob Storage container name for egress.

Use Metadata Tag (Optional)

Specify the Metadata Tag "name" to be used for the output folder in the container. The Tag "value" is configured as part of job creation. See Creating Jobs (Portal, API). If the Tag is not present on the document/file received by the Azure Blob Storage Egress component, the Metadata Tag value will be used by default as the folder name.

Collation Size

Integer (bytes) specifying the collation size of the output JSON file to be created in the Azure Blob Storage container. While processing a job, the Azure Blob Storage Egress component collates results until this file size is reached, then uploads the file to the Azure Blob Storage container. If the job generates more results, additional files are created with an incrementing number appended to the file name, e.g. "-1", "-2".

The Azure Blob Storage Egress component waits up to 60 seconds for new documents to collate. If no more are received in that time, the collated file is uploaded to the Azure Blob Storage container even if the size limit has not been reached.
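
The size and idle rules described above can be illustrated with a minimal Python sketch. This is not the component's actual implementation: the class name FileCollator, the ".json" extension, the unsuffixed name for the first file, and the in-memory "uploaded" list standing in for the real upload are all assumptions made for illustration.

```python
import json
import time


class FileCollator:
    """Hypothetical sketch: buffer documents, flush on a size or idle limit."""

    def __init__(self, base_name, max_bytes, idle_seconds=60.0, clock=time.monotonic):
        self.base_name = base_name
        self.max_bytes = max_bytes          # stands in for Collation Size
        self.idle_seconds = idle_seconds    # the 60-second wait for new documents
        self.clock = clock
        self.buffer = []
        self.buffered_bytes = 0
        self.last_received = clock()
        self.file_index = 0
        self.uploaded = []                  # (file name, documents); stand-in for the upload

    def add(self, document):
        # Size is approximated as the UTF-8 length of each document's JSON.
        self.buffer.append(document)
        self.buffered_bytes += len(json.dumps(document).encode("utf-8"))
        self.last_received = self.clock()
        if self.buffered_bytes >= self.max_bytes:
            self.flush()

    def tick(self):
        # Call periodically: flushes a partial file once the idle window has passed.
        if self.buffer and self.clock() - self.last_received >= self.idle_seconds:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Assumption: the first file has no suffix, later files get "-1", "-2", ...
        suffix = "" if self.file_index == 0 else f"-{self.file_index}"
        self.uploaded.append((f"{self.base_name}{suffix}.json", self.buffer))
        self.buffer = []
        self.buffered_bytes = 0
        self.file_index += 1
```

In this sketch, a larger max_bytes produces fewer, larger files, while the idle window ensures that the tail of a job is still written out when the size limit is never reached.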

Egress Data

It is recommended to use Documents (the default configuration). For Ingress and Operation Components (e.g. WebSightLine File Fetcher) that process file objects (e.g. images, PDFs), these objects can be retained in cache for additional processing and egressed at the end of the pipeline using the alternative options: Files & Documents or Files only.

Output Format

Options for JSON collation format.

Shared Access Token OR Account Key (Required)

You can use either a Shared Access Token or an Account Key to give the pipeline component access to write files to your Azure Blob Storage container. Add your key or token on the "Keys & Secrets" page from the Portal menu. Make sure write access has been granted.
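
Before adding a token, it can help to confirm locally that it grants write access. The Python sketch below is a hypothetical helper, not part of Datastreamer or the Azure SDK. It relies only on the standard SAS query fields "sp" (signed permissions, where "w" means write) and "se" (expiry time). It does not validate the signature, and it assumes the expiry is given in UTC. The token values in the usage example are placeholders.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def sas_allows_write(sas_token, now=None):
    """Return True if the SAS token lists write permission and has not expired."""
    params = parse_qs(sas_token.lstrip("?"))
    permissions = params.get("sp", [""])[0]
    if "w" not in permissions:
        return False

    expiry_raw = params.get("se", [None])[0]
    if expiry_raw is None:
        # Expiry may instead be defined by a stored access policy; not checked here.
        return True

    # Normalise a trailing "Z" so datetime.fromisoformat accepts the timestamp.
    expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # assumption: expiry is UTC

    now = now or datetime.now(timezone.utc)
    return now < expiry


# Usage with a placeholder token (the signature is not real).
example = "?sv=2022-11-02&sr=c&sp=rwl&se=2030-01-01T00%3A00%3A00Z&sig=PLACEHOLDER"
print(sas_allows_write(example, now=datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

A token that passes this check can still be rejected by Azure, for example if the signature is invalid or the token is scoped to a different container, so treat the result as a quick sanity check only.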

Learn more about Shared Access Signature (SAS) tokens.