Categorization
This classifier is a specialized multiclass text classifier that categorizes media posts by topic, based on the 17 top-level IPTC Media Topic NewsCodes. It assigns the "Other" label when the text covers multiple topics or does not contain enough information. The classifier outputs one of the following 12 labels, several of which combine more than one IPTC NewsCodes topic.
- Entertainment -> arts, culture, and entertainment
- Conflicts -> conflicts, war, and peace
- Crime -> crime, law, and justice
- Disaster -> disaster, accident, and emergency incident
- Business -> economy, business, and finance
- Health -> health
- Lifestyle -> lifestyle and leisure
- Politics -> politics, religion, and labor
- Technology -> science and technology, environment
- Society -> society and education
- Sport -> sport
- Other -> multiple topics, short text, or not enough information
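For downstream code, it can help to keep the label list in one place. The following is a minimal sketch that transcribes the 12 labels above into a Python lookup table; the names `LABEL_DESCRIPTIONS` and `describe_label` are hypothetical and are not part of the Datastreamer API.

```python
# Lookup table transcribed from the label list above.
# LABEL_DESCRIPTIONS and describe_label are illustrative names only.
LABEL_DESCRIPTIONS = {
    "Entertainment": "arts, culture, and entertainment",
    "Conflicts": "conflicts, war, and peace",
    "Crime": "crime, law, and justice",
    "Disaster": "disaster, accident, and emergency incident",
    "Business": "economy, business, and finance",
    "Health": "health",
    "Lifestyle": "lifestyle and leisure",
    "Politics": "politics, religion, and labor",
    "Technology": "science and technology, environment",
    "Society": "society and education",
    "Sport": "sport",
    "Other": "multiple topics, short text, or not enough information",
}


def describe_label(label: str) -> str:
    """Return the topic description for a label; raise KeyError if unknown."""
    return LABEL_DESCRIPTIONS[label]
```

Raising `KeyError` on an unknown label, rather than silently falling back to "Other", makes it easier to notice if the label set ever changes.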
Statistics
Type | Speed | Partner Type |
---|---|---|
Post-Processing Classifier | Instant | Datastreamer Internal |
Example Use Cases
- Marketing and security companies can use a category classifier in combination with hard news classifiers to identify and track recent news about organizations or people.
- Media companies can aggregate technology news based on categories and published dates.
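As a rough illustration of the second use case, the sketch below counts posts of one category per publication day. The record shape is an assumption made for this example: each post is a plain dict with a `category` label and an ISO 8601 `published` timestamp. The actual document layout returned by a pipeline may differ.

```python
from collections import Counter
from datetime import datetime


def count_by_day(posts, category="Technology"):
    """Count posts with the given category label per publication day.

    Assumes (hypothetically) that each post is a dict with a "category"
    label and an ISO 8601 "published" timestamp.
    """
    counts = Counter()
    for post in posts:
        if post.get("category") != category:
            continue  # skip posts from other categories
        day = datetime.fromisoformat(post["published"]).date().isoformat()
        counts[day] += 1
    # Return the days in chronological order.
    return dict(sorted(counts.items()))
```

Grouping on the calendar date rather than the full timestamp keeps the result small enough to chart or report directly.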
Compatible Data Sources
As a Post-Processing operation, it can be run on any data source.
Recipe Available
The "Using Post-Processing Operations" recipe shows post-processing in action and how to integrate it into your own data pipeline.
Usage
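This operation lets you specify the destination field, the source fields, and the separator. In the example below, `destination_path` sets the destination field, and `language` and `main` under `parameters` point to the source fields.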
```json
{
  "query": {
    ...
  },
  "operations": [
    {
      "name": "category",
      "destination_path": "operations.category",
      "parameters": {
        "language": "enrichment.language",
        "main": "content.body"
      }
    }
  ]
}
```
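The same payload can be assembled programmatically. This is a minimal sketch, assuming the request body has exactly the shape shown above; the function name `build_category_request` and its keyword arguments are hypothetical, and the query is passed in by the caller because its contents are not specified here.

```python
import json


def build_category_request(
    query,
    main_path="content.body",
    language_path="enrichment.language",
    destination_path="operations.category",
):
    """Build a request body that applies the category operation.

    Hypothetical helper: it mirrors the JSON example above and makes no
    assumptions about the contents of the query itself.
    """
    return {
        "query": query,
        "operations": [
            {
                "name": "category",
                "destination_path": destination_path,
                "parameters": {
                    "language": language_path,
                    "main": main_path,
                },
            }
        ],
    }


if __name__ == "__main__":
    # The empty dict is a placeholder; supply your own query here.
    print(json.dumps(build_category_request({}), indent=2))
```

Keeping the field paths as keyword arguments makes it straightforward to point the classifier at a different text field without editing the payload by hand.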