Connect REST Data Sources

🚧

Legacy Functionality

This feature has been deprecated. Please use Dynamic Pipelines instead, which offers the same capabilities with greater flexibility and scalability.

Datastreamer lets you quickly add new data sources to your data pipeline. Adding a new source follows the process below.

For REST sources, Datastreamer acts as a proxy that requests, transforms, and delivers the required data.

Go-Live Checklist

To register a REST source:

  • Have the API key or authentication method to leverage the source.
  • Upload a schema.
  • Connect your source.

Schema Mapping

Datastreamer classifiers and other sources are mapped to Datastreamer's published metadata fields. Mapping your source to these fields is not required, but it is recommended so that you can take advantage of the other features and functionality of the Datastreamer platform.

📘

Metadata Field Template

You can use the following Google Doc as a metadata field template to help develop your schema: Metadata Matching Schema Template

To create the schema, specify for each field the source metadata field (source_path), the destination metadata field (destination_path), and the data type (string, date, etc.). The example below shows a schema named Datas with its field mappings. Each source_path in the original data is mapped to a destination_path in the Datastreamer data schema, along with a data type.

{
  "schema": {
    "name": "Datas",
    "mappings": [
      {
        "source_path": "thread.uuid",
        "destination_path": "id",
        "type": "string"
      },
      {
        "source_path": "thread.published",
        "destination_path": "doc_date",
        "type": "date"
      },
      {
        "source_path": "thread.published",
        "destination_path": "content.published",
        "type": "date"
      },
      {
        "source_path": "thread.url",
        "destination_path": "source.link",
        "type": "string"
      },
      {
        "source_path": "thread.title",
        "destination_path": "content.title",
        "type": "string"
      },
      {
        "source_path": "text",
        "destination_path": "content.body",
        "type": "string"
      }
    ],
    "schema": {
      "nbd": {
        "id": "string",
        "doc_date": "date",
        "source": {
          "link": "string"
        },
        "content": {
          "body": "string",
          "title": "string",
          "published": "string"
        }
      }
    }
  }
}
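
To illustrate what a mapping expresses, here is a minimal Python sketch that copies values from dotted source_path locations to dotted destination_path locations. It is only an illustration of the idea, not Datastreamer's actual implementation: the helper names (get_path, set_path, apply_mappings) and the sample document are hypothetical, and the sketch ignores the type field entirely.

```python
from typing import Any

# A trimmed copy of the mappings from the example schema above.
MAPPINGS = [
    {"source_path": "thread.uuid", "destination_path": "id", "type": "string"},
    {"source_path": "thread.published", "destination_path": "doc_date", "type": "date"},
    {"source_path": "thread.title", "destination_path": "content.title", "type": "string"},
    {"source_path": "text", "destination_path": "content.body", "type": "string"},
]

# Sentinel so that a missing field can be told apart from a field set to None.
_MISSING = object()


def get_path(doc: dict, path: str) -> Any:
    """Follow a dotted path such as 'thread.uuid'; return _MISSING if absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def set_path(doc: dict, path: str, value: Any) -> None:
    """Write a value at a dotted path, creating nested dicts as needed."""
    *parents, leaf = path.split(".")
    current = doc
    for key in parents:
        current = current.setdefault(key, {})
    current[leaf] = value


def apply_mappings(source: dict, mappings: list) -> dict:
    """Build a new document by copying each source_path to its destination_path.

    Hypothetical helper: the 'type' field is not used here, and fields that
    are missing from the source are skipped.
    """
    result: dict = {}
    for mapping in mappings:
        value = get_path(source, mapping["source_path"])
        if value is not _MISSING:
            set_path(result, mapping["destination_path"], value)
    return result


# Hypothetical source document shaped like the source_path values above.
source_doc = {
    "thread": {
        "uuid": "abc-123",
        "published": "2023-01-15T08:00:00Z",
        "title": "Hello",
    },
    "text": "Body text",
}

mapped = apply_mappings(source_doc, MAPPINGS)
print(mapped)
```

In this sketch, thread.uuid ends up at the top-level id field, while thread.title and text are nested together under content, mirroring the destination paths in the schema.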

Submitting a Schema

To submit, validate, modify, or delete a schema, see the examples and guides in the API reference: https://datastreamer.readme.io/reference/post_api-schemas

Connecting your Streaming Datasource

After you have created and submitted a schema, you can send your data through the Document Submission Endpoint.

Using the data in Datastreamer

Once the schema is validated and the streaming data source is successfully connected to Datastreamer, the data is available in the Datastreamer pipeline. Use the Datastreamer APIs and the metadata fields defined in your schema to start working with the integrated data in your application.