Multi-source Entity Recognition Pipeline Template

Import this pipeline template to Unify and detect entities in multiple social sources.

When to use this Template?

This pipeline template is a starting point for analyzing and enriching multiple social sources. It handles the differing schemas of each source and routes documents based on source type.

You can easily customize this pipeline to your needs.

Pipeline Workflow Design

Within this template, the Unify Transformer converts each data source to the Datastreamer Default schema, so all of the following enrichments can run on their default configuration. GenAI Entity Recognition then detects the entities within each post, highlighting organizations, individuals, and more. Because content is not filtered by language, a GenAI-based classifier is used to handle language discrepancies. The JSON Document Router then separates content by source type and passes news content to the Hard News enrichment, which distinguishes fact-based (hard) news from opinion (soft) news articles.
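The routing step described above can be sketched in a few lines of Python. This is only an illustration of how the routing table in this template might be evaluated, not the JSON Document Router's actual implementation. It assumes that rules are checked in order, that a null filter acts as a catch-all, and that the first match wins when "allow_multiroute" is false. The helper names and the "vetric_linkedin" value are hypothetical.

```python
def get_path(document, dotted_path):
    """Look up a dotted path such as "content.body"; return None if absent."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(document, rule_filter):
    """Assumed semantics: a null filter matches everything; "eq" compares values."""
    if rule_filter is None:
        return True
    if rule_filter["operator"] == "eq":
        return get_path(document, rule_filter["path"]) == rule_filter["value"]
    raise ValueError(f"unsupported operator: {rule_filter['operator']}")


def select_routes(document, routing_table, allow_multiroute=False):
    """Return the route indexes a document would take under the assumptions above."""
    selected = []
    for rule in routing_table:
        if matches(document, rule["filter"]):
            selected.append(rule["route"])
            if not allow_multiroute:
                break  # assumed: first matching rule wins
    return selected


# Routing table copied from the template's json-document-router step.
ROUTING_TABLE = [
    {
        "route": 0,
        "filter": {
            "operator": "eq",
            "value_type": "string",
            "path": "data_source",
            "value": "socialgist_news",
        },
    },
    {"route": 1, "filter": None},
]

news_document = {"data_source": "socialgist_news"}
other_document = {"data_source": "vetric_linkedin"}  # hypothetical source value

print(select_routes(news_document, ROUTING_TABLE))   # [0]
print(select_routes(other_document, ROUTING_TABLE))  # [1]
```

Under these assumptions, news documents take route 0 (towards the Hard News enrichment) and everything else falls through to route 1.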

Document Inspector is used in various locations to view the content as it passes through the pipeline, because no static egress (like a data warehouse or webhook) has been specified.

Edit Mode view of the Template Pipeline


Pipeline Architecture Components

This pipeline uses the following components:

  • Ingress (3): Vetric and Socialgist data sources.
  • Transformations (1): Unify Transformer
  • Pipeline Management (2): Document Inspector, JSON Document Router
  • Enrichments (2): GenAI Entity Recognition, Hard News Classifier

Start working with this Pipeline

To import this pipeline, save the JSON below as a ".pipeline" file and import it from your Pipelines screen within Portal.

Update Pipeline Access Tokens

The data source access tokens in the imported pipeline are not automatically mapped to the keys saved in your "Keys and Secrets". You will need to update them to match your own keys.
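If you prefer to update the values before importing, the change can be scripted. The sketch below is a minimal example, assuming the "api_key" and "access_token" properties of the ingress steps are the only fields that need changing. The function name and the replacement values ("MY_VETRIC_KEY", "MY_SOCIALGIST_TOKEN") are hypothetical; use whatever is saved in your own "Keys and Secrets".

```python
import copy


def update_ingress_keys(pipeline, replacements):
    """Return a copy of the pipeline with ingress credential values swapped.

    `replacements` maps a value found in the template (for example
    "VETRICLI") to the value you want in its place.
    """
    updated = copy.deepcopy(pipeline)  # leave the original template untouched
    for step in updated.get("steps", []):
        if step.get("type") != "ingress":
            continue
        properties = step.get("properties", {})
        for field in ("api_key", "access_token"):
            current = properties.get(field)
            if current in replacements:
                properties[field] = replacements[current]
    return updated


# A trimmed copy of the template's steps, kept short for illustration.
template = {
    "steps": [
        {
            "type": "ingress",
            "component_name": "vetric-linkedin-discover-posts-ingress",
            "properties": {"vetric_use_system_key": False, "api_key": "VETRICLI"},
        },
        {
            "type": "ingress",
            "component_name": "socialgist-news-ingress",
            "properties": {"access_token": "SOCIALGIST_ACCESS_TOKEN"},
        },
        {"type": "connector", "component_name": "unify-transform", "properties": {}},
    ]
}

updated = update_ingress_keys(
    template,
    {
        "VETRICLI": "MY_VETRIC_KEY",  # hypothetical key name
        "SOCIALGIST_ACCESS_TOKEN": "MY_SOCIALGIST_TOKEN",  # hypothetical key name
    },
)
for step in updated["steps"]:
    print(step["component_name"], step["properties"])
```

With the full template, you would load the saved file with json.load, apply the function, and write the result back with json.dump before importing it.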

{
  "version": 2,
  "exported_date": "2025-08-07T20:20:53Z",
  "pipeline_name": "Multi-source Company Updates",
  "pipeline_description": "This pipeline processes content from multiple sources for organization updates.",
  "steps": [
    {
      "type": "ingress",
      "component_name": "vetric-linkedin-discover-people-ingress",
      "step_id": "57fq9xg6",
      "properties": {
        "vetric_use_system_key": false,
        "api_key": "VETRICLI"
      },
      "next_steps": [
        "w28x6cby"
      ]
    },
    {
      "type": "ingress",
      "component_name": "socialgist-news-ingress",
      "step_id": "8l24tzas",
      "properties": {
        "access_token": "SOCIALGIST_ACCESS_TOKEN"
      },
      "next_steps": [
        "w28x6cby"
      ]
    },
    {
      "type": "ingress",
      "component_name": "vetric-linkedin-discover-posts-ingress",
      "step_id": "e3jqr7w9",
      "properties": {
        "vetric_use_system_key": false,
        "api_key": "VETRICLI"
      },
      "next_steps": [
        "w28x6cby"
      ]
    },
    {
      "type": "connector",
      "component_name": "unify-transform",
      "step_id": "w28x6cby",
      "properties": {},
      "next_steps": [
        "undjgwyc"
      ]
    },
    {
      "type": "connector",
      "component_name": "genai-entity-extraction",
      "step_id": "undjgwyc",
      "properties": {
        "text": "content.body",
        "destination_path": "enrichment.entities"
      },
      "next_steps": [
        "lchzt46a"
      ]
    },
    {
      "type": "connector",
      "component_name": "json-document-router",
      "step_id": "lchzt46a",
      "properties": {
        "allow_multiroute": false,
        "routing_table": [
          {
            "route": 0,
            "filter": {
              "operator": "eq",
              "value_type": "string",
              "path": "data_source",
              "value": "socialgist_news"
            }
          },
          {
            "route": 1,
            "filter": null
          }
        ]
      },
      "next_steps": [
        "m4xprlo7",
        "lq4c3bl3"
      ]
    },
    {
      "type": "connector",
      "component_name": "hardnews",
      "step_id": "m4xprlo7",
      "properties": {
        "destination_path": "enrichment.hard_news",
        "text": "content.body",
        "language": "enrichment.language",
        "title": "content.title",
        "condition": {
          "operator": "and",
          "conditions": [
            {
              "path": "content.body",
              "default": false,
              "operator": "exists"
            },
            {
              "path": "enrichment.language",
              "value": "en",
              "default": false,
              "operator": "eq"
            }
          ]
        }
      },
      "next_steps": [
        "lq4c3bl3"
      ]
    },
    {
      "type": "egress",
      "component_name": "inspector-egress",
      "step_id": "lq4c3bl3",
      "properties": {}
    }
  ]
}
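After customizing the template, it can be useful to confirm that every "next_steps" entry still points at an existing "step_id". The sketch below is one way to check this and to list the paths a document can take. It uses a condensed copy of the steps above (IDs, component names, and links only), the function names are hypothetical, and it assumes the pipeline has no cycles.

```python
# Condensed copy of the template's steps: only IDs, names, and links.
STEPS = [
    {"step_id": "57fq9xg6", "component_name": "vetric-linkedin-discover-people-ingress",
     "next_steps": ["w28x6cby"]},
    {"step_id": "8l24tzas", "component_name": "socialgist-news-ingress",
     "next_steps": ["w28x6cby"]},
    {"step_id": "e3jqr7w9", "component_name": "vetric-linkedin-discover-posts-ingress",
     "next_steps": ["w28x6cby"]},
    {"step_id": "w28x6cby", "component_name": "unify-transform",
     "next_steps": ["undjgwyc"]},
    {"step_id": "undjgwyc", "component_name": "genai-entity-extraction",
     "next_steps": ["lchzt46a"]},
    {"step_id": "lchzt46a", "component_name": "json-document-router",
     "next_steps": ["m4xprlo7", "lq4c3bl3"]},
    {"step_id": "m4xprlo7", "component_name": "hardnews",
     "next_steps": ["lq4c3bl3"]},
    {"step_id": "lq4c3bl3", "component_name": "inspector-egress"},
]


def find_dangling_links(steps):
    """Return (step_id, target) pairs whose target is not a known step_id."""
    known = {step["step_id"] for step in steps}
    return [
        (step["step_id"], target)
        for step in steps
        for target in step.get("next_steps", [])
        if target not in known
    ]


def trace_paths(steps, start_id):
    """List every path of component names from start_id to a terminal step."""
    by_id = {step["step_id"]: step for step in steps}
    paths = []

    def walk(step_id, trail):
        step = by_id[step_id]
        trail = trail + [step["component_name"]]
        targets = step.get("next_steps", [])
        if not targets:
            paths.append(trail)  # no outgoing links: this path is complete
        for target in targets:
            walk(target, trail)

    walk(start_id, [])
    return paths


print(find_dangling_links(STEPS))  # [] when every link resolves
for path in trace_paths(STEPS, "8l24tzas"):
    print(" -> ".join(path))
```

For the Socialgist news ingress this lists two paths: one through the Hard News enrichment and one directly from the router to the inspector egress.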

Need help?

If you need help with setting up, customizing, or importing this template Pipeline, just let us know!