GenAI Entity Recognition Classifier

AI entity recognition classifier that extracts persons, locations, and organizations from social media data.

The GenAI Entity Recognition Classifier identifies and extracts key entities such as persons, locations, and organizations from English-language content. This AI-powered model returns the recognized entities as a structured list. It supports both real-time and batch processing, which makes it suited to analyzing large-scale data, monitoring engagement, and uncovering actionable patterns.

Adding to your Dynamic Pipeline

You can add this classifier to your Dynamic Pipelines through the "GenAI Entity Recognition Classifier" component. Configuration requires the following fields:

  • Destination Path (Required): The field that stores the output of the GenAI Entity Recognition Classifier. The output is held in enrichment.entities, but you can map it to another field or create a new one. This field contains the recognized entity names as an array.
  • Target Text (Required): The metadata field containing the input text for entity recognition. By default, this is set to content.body, but any field containing relevant text can be used.

If the Gemini model encounters safety issues with certain content, the Gemini API will not generate output for that content.
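
Downstream code may want to tolerate documents for which no output was generated. The Python sketch below assumes, without confirmation from this documentation, that such documents simply lack a usable value at the Destination Path. The function name entities_or_empty and both sample documents are hypothetical.

```python
def entities_or_empty(doc: dict, destination: str = "enrichment.entities") -> list:
    """Return the entity list at the destination path, or [] when it is missing.

    Assumption: a document for which no output was generated carries no
    usable value at the destination path.
    """
    current = doc
    for key in destination.split("."):
        if not isinstance(current, dict) or key not in current:
            return []
        current = current[key]
    # Anything other than a list is treated as "no entities".
    return current if isinstance(current, list) else []


# Hypothetical documents: one enriched, one with no classifier output.
processed = {"enrichment": {"entities": ["Tesla Inc", "California"]}}
blocked = {"content": {"body": "text that produced no output"}}

print(entities_or_empty(processed))
print(entities_or_empty(blocked))
```

Returning an empty list keeps later steps simple, since they can iterate over the result without checking for a missing field first.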

Dynamic Pipeline Example Configuration

The example below demonstrates the dynamic pipeline configuration for the GenAI Entity Recognition Classifier component. If Unify is the preceding step in your pipeline, you can set it up as shown in the example:

  • content.body from the input document is specified as the Target Text for the GenAI Entity Recognition Classifier.
  • enrichment.entities is designated as the Destination Path for storing the output of the GenAI Entity Recognition Classifier.
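
As a rough illustration of how the two fields relate, the Python sketch below reads the Target Text from a dotted path and writes a list of entities to the Destination Path. The helpers get_path and set_path, the sample document, and the placeholder entity list are all hypothetical. They are not part of the product, and the real component performs the extraction itself.

```python
from typing import Any


def get_path(doc: dict, path: str) -> Any:
    """Return the value at a dotted path such as 'content.body', or None if absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def set_path(doc: dict, path: str, value: Any) -> None:
    """Write value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    current = doc
    for key in parents:
        current = current.setdefault(key, {})
    current[leaf] = value


# Hypothetical input document; only the field names come from the configuration.
document = {"content": {"body": "John Smith joined Microsoft Corporation in New York."}}

# Target Text: the field the classifier reads.
target_text = get_path(document, "content.body")

# Placeholder standing in for the classifier's output (not a real model call).
entities = ["John Smith", "Microsoft Corporation", "New York"]

# Destination Path: the field the classifier writes.
set_path(document, "enrichment.entities", entities)
```

Pointing the Destination Path at a different field only changes the second argument of set_path in this sketch; the input document is otherwise left as it was.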

Sample Output

{
  "enrichment": {
    "entities": [
      "John Smith",
      "Microsoft Corporation",
      "New York",
      "Tesla Inc",
      "California"
    ]
  }
}
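
For batch analysis, output in this shape can be aggregated with a few lines of Python. The sketch below is only an illustration: the two JSON strings are made-up documents that follow the sample structure, and it assumes the default enrichment.entities Destination Path.

```python
import json
from collections import Counter

# Hypothetical batch of output documents in the shape shown above.
raw_documents = [
    '{"enrichment": {"entities": ["John Smith", "Microsoft Corporation", "New York"]}}',
    '{"enrichment": {"entities": ["Tesla Inc", "California", "New York"]}}',
]

# Count how often each entity name appears across the batch.
entity_counts = Counter()
for raw in raw_documents:
    doc = json.loads(raw)
    entity_counts.update(doc.get("enrichment", {}).get("entities", []))

# Most frequently mentioned entity and its count.
top_entity, top_count = entity_counts.most_common(1)[0]
print(top_entity, top_count)
```

Because the output is a flat array of names, the entity type (person, location, or organization) is not available to this kind of count.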

Compatible Languages

Currently, the GenAI Entity Recognition Classifier supports content in English only. Other languages will be added incrementally as they are tested and improved.

Language    Language ID (ISO-639)
English     en