AI Entity Recognition Classifier
Instantly uncover key entities from social media data and transform them into meaningful insights!
The AI Entity Recognition Classifier is designed to identify and extract key entities such as persons, locations, and organizations from social media content. This advanced model efficiently recognizes entities across multiple languages, offering valuable insights into user discussions, geographic trends, and organizational mentions. Optimized for both real-time and batch processing, it is an essential tool for analyzing large-scale data, monitoring engagement, and uncovering actionable patterns in diverse contexts.
Adding to your Dynamic Pipeline
Add this classifier to a Dynamic Pipeline by selecting the "AI Entity Recognition" component. It requires the following configuration fields:
- Destination Path (Required): The field that receives the output of the AI Entity Recognition Classifier. By default this is "enrichment.entities", but you can map the output to another existing field or create a new one. The field contains the recognised entity names as an array.
- Target Text (Required): The metadata field containing the input text for entity recognition. By default, this is set to content.body, but any field containing relevant text can be used.
If the Gemini model encounters safety issues with certain content, no entities are produced for that content and you will see a message that the Gemini API failed to generate output.
Dynamic Pipeline Example Configuration
The example below demonstrates the dynamic pipeline configuration for the AI Entity Recognition Classifier component. If Unify is the preceding step in your pipeline, you can set it up as shown in the example:
- content.body from the input document is specified as the Target Text for the AI Entity Recognition Classifier.
- enrichment.entities is designated as the Destination Path for storing the output of the AI Entity Recognition Classifier.
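To make the roles of the two fields concrete, the following is a minimal Python sketch of how a Target Text path and a Destination Path relate to a document. It is an illustration only, not the platform's implementation: the helper names (`get_path`, `set_path`, `apply_entity_step`) and the `fake_recognizer` stand-in for the hosted model are hypothetical, and the sample document is invented for the example.

```python
from typing import Any, Callable, List


def get_path(doc: dict, path: str, default: Any = None) -> Any:
    """Read a dotted path such as 'content.body' from a nested dict."""
    node: Any = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def set_path(doc: dict, path: str, value: Any) -> None:
    """Write a value at a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_entity_step(
    doc: dict,
    recognize: Callable[[str], List[str]],
    target_text: str = "content.body",
    destination_path: str = "enrichment.entities",
) -> dict:
    """Read text from target_text and store the entity array at destination_path."""
    text = get_path(doc, target_text)
    if isinstance(text, str) and text.strip():
        set_path(doc, destination_path, recognize(text))
    return doc


def fake_recognizer(text: str) -> List[str]:
    """Hypothetical stand-in for the hosted model: matches a fixed name list."""
    known = ["Ada Lovelace", "London"]
    return [name for name in known if name in text]


doc = {"content": {"body": "Ada Lovelace was born in London."}}
result = apply_entity_step(doc, fake_recognizer)
print(result["enrichment"]["entities"])  # ['Ada Lovelace', 'London']
```

Changing `target_text` or `destination_path` in this sketch mirrors remapping the two fields in the component configuration: the input is read from one dotted path and the array of entity names is written to another.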
Example Output
Compatible Languages
The classifier supports content in multiple languages. When the input text is in a language other than English, the component automatically detects the language and performs entity recognition accordingly. Language coverage continues to improve because the component uses Google Gemini 1.5 Flash as its back end. According to https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash, the supported languages are:
Language | Language ID (ISO-639) |
---|---|
Arabic | ar |
Bengali | bn |
Bulgarian | bg |
Chinese | zh |
Croatian | hr |
Czech | cs |
Danish | da |
Dutch | nl |
English | en |
Estonian | et |
Finnish | fi |
French | fr |
German | de |
Greek | el |
Hebrew | iw |
Hindi | hi |
Hungarian | hu |
Indonesian | id |
Italian | it |
Japanese | ja |
Korean | ko |
Latvian | lv |
Lithuanian | lt |
Norwegian | no |
Polish | pl |
Portuguese | pt |
Romanian | ro |
Russian | ru |
Serbian | sr |
Slovak | sk |
Slovenian | sl |
Spanish | es |
Swahili | sw |
Swedish | sv |
Thai | th |
Turkish | tr |
Ukrainian | uk |
Vietnamese | vi |