Social Voice Toxicity Detection
Automatically detect harmful, toxic, or inappropriate language in text to protect your community and ensure brand safety
Protect your platform and brand by automatically identifying toxic content. This component uses machine learning models to detect various forms of harmful language, including harassment, hate speech, insults, and threats. It provides detailed toxicity scores and classification categories, enabling you to flag, review, or automatically filter content to maintain a safe and respectful environment.
Component Configuration
Users must specify the Target Text field: the JSON path to the property in the incoming document that contains the text to be analyzed (e.g., `content.body`).

When the Social Voice API returns a response, it is attached to the original document in a `social_voice` property.
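The enrichment flow above can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the dot-delimited path handling, the `social_voice` response shape, and the `fake_scorer` helper are all assumptions for demonstration; a real deployment would call the Social Voice API and receive its own score and category fields.

```python
def resolve_path(document: dict, path: str) -> str:
    """Walk a dot-delimited JSON path (e.g. "content.body") to the target text."""
    value = document
    for key in path.split("."):
        value = value[key]
    return value

def attach_toxicity(document: dict, target_text_path: str, score_fn) -> dict:
    """Score the text at the Target Text path and attach the result
    to the document under a `social_voice` property."""
    text = resolve_path(document, target_text_path)
    document["social_voice"] = score_fn(text)
    return document

# Stand-in scorer with an illustrative response shape; the real
# Social Voice API response fields may differ.
def fake_scorer(text: str) -> dict:
    return {"toxicity": 0.02, "categories": {"insult": 0.01, "threat": 0.0}}

doc = {"content": {"body": "Thanks for the helpful reply!"}}
enriched = attach_toxicity(doc, "content.body", fake_scorer)
```

After enrichment, the original `content` property is untouched and the scores are available at `enriched["social_voice"]`, so downstream steps can flag, review, or filter the document based on its toxicity values.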