Datastreamer Schema

A Datastreamer-designed schema that you can use as a starting point or adopt as your platform schema.

Datastreamer Default Schema and Unify

Datastreamer's pipeline contains a high-powered, AI-enriched transformation service that can convert billions of individual fields per minute into a common schema. This makes it easier to build features and machine learning capabilities that use multiple sources concurrently, and to integrate new data sources and pipeline capabilities.

Using the Datastreamer Default Schema is recommended if you do not already have a schema of your own. It eases adoption and support, and lets you use the defaults and examples included in components.

Unify Transform

Within the Datastreamer Platform, "Unify Transform" converts integrated sources from data partners into the Datastreamer Default Schema. This component also uses Datastreamer's own schema generation product, which reads the available schema of existing content and automatically aligns it to the default schema.
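To illustrate the general idea of aligning a source record to a common nested schema, here is a small Python sketch. It is not Datastreamer's implementation: the flat input keys (post_id, posted_at, permalink, caption), the FIELD_MAP table, and the unify function are all hypothetical. Only the target field names (id, doc_date, data_source, source.link, content.body) are taken from the schema excerpt in this document.

```python
# Hypothetical mapping from flat source keys to dotted paths in the target schema.
FIELD_MAP = {
    "post_id": "id",
    "posted_at": "doc_date",
    "permalink": "source.link",
    "caption": "content.body",
}


def unify(record, data_source, field_map=FIELD_MAP):
    """Build a nested document from a flat record using dotted target paths."""
    doc = {"data_source": data_source}
    for src_key, target_path in field_map.items():
        if src_key not in record:
            continue  # leave missing fields out rather than inventing values
        *parents, leaf = target_path.split(".")
        node = doc
        for key in parents:
            # Create the category object (e.g. "source") on first use.
            node = node.setdefault(key, {})
        node[leaf] = record[src_key]
    return doc


# Made-up flat record from an imaginary source.
flat = {
    "post_id": "123-example",
    "posted_at": "2024-07-25T21:29:05Z",
    "permalink": "https://example.com/p/1",
    "caption": "Hello",
}
unified = unify(flat, "example_source")
print(unified)
```

A table-driven mapping like this keeps per-source differences in data rather than code, which is one common way to approach schema alignment. The actual Unify Transform component may work quite differently.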

Schema Format

The Datastreamer Schema is nested JSON. Fields are grouped by type: fields that relate to the same aspect of a document (Source, Author, Content, Enrichment, etc.) are nested under a shared category.

            "id": "3420227781307210486-wsl_instagram",
            "doc_date": "2024-07-25T21:29:05Z",
            "data_source": "wsl_instagram",
            "source": {
                "link": "https://instagram.com/p/C93FwvDvRr2/"
            },
            "content": {
                "body": "Ruélia vermelha ou Ruélia da Amazônia \n\n🤓 Nome científico..."

View Current Datastreamer Default Schema

Because the Datastreamer Default Schema is continually expanded with new fields, the current version is available at the link below. The sheet also lists many of the common data partner sources and shows whether each field exists for that source.

Datastreamer Default Schema (Google Sheet)


What’s Next

Response examples are provided within each API detail page.