Connect Streaming Data Sources
Legacy Functionality
This feature has been deprecated. Please use Dynamic Pipelines instead, which offer the same capabilities with greater flexibility and scalability.
Datastreamer lets you quickly add new data sources to your data pipeline. Adding a new source follows the process below.
Go-Live Checklist
To register a streaming source:
- Have a streaming data source that can deliver to either an AWS S3 bucket or a webhook
- Upload a schema
- Connect your source
Schema Mapping
Datastreamer classifiers and other sources are mapped to Datastreamer's published metadata fields. Mapping your own source to these fields is not required, but it is recommended so that you can take advantage of the other features and functionality of the Datastreamer platform.
Metadata Field Template
You can use the following Google Doc as a metadata field template to help develop your schema: Metadata Matching Schema Template
To create the schema, specify the source metadata field (source_path), the destination metadata field (destination_path), and the data type (string, date, etc.) for each mapping. The example below shows a schema named Datas with its field mappings. Each source_path from the original data is mapped to a destination_path in the Datastreamer data schema, together with a data type.
{
"schema": {
"name": "Datas",
"mappings": [
{
"source_path": "thread.uuid",
"destination_path": "id",
"type": "string"
},
{
"source_path": "thread.published",
"destination_path": "doc_date",
"type": "date"
},
{
"source_path": "thread.published",
"destination_path": "content.published",
"type": "date"
},
{
"source_path": "thread.url",
"destination_path": "source.link",
"type": "string"
},
{
"source_path": "thread.title",
"destination_path": "content.title",
"type": "string"
},
{
"source_path": "text",
"destination_path": "content.body",
"type": "string"
}
],
"schema": {
"nbd": {
"id": "string",
"doc_date": "date",
"source": {
"link": "string"
},
"content": {
"body": "string",
"title": "string",
"published": "string"
}
}
}
}
}
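To make the mapping idea concrete, here is a minimal Python sketch of how a source_path could be resolved against an incoming record and written to its destination_path. It is an illustration only, not Datastreamer's implementation: the helper names (get_path, set_path, convert, apply_mappings), the sample record, the ISO 8601 handling of the date type, and the choice to skip missing source fields are all assumptions.

```python
from datetime import datetime
from typing import Any, Dict, List


def get_path(doc: Dict[str, Any], dotted: str) -> Any:
    """Follow a dotted path such as 'thread.uuid'; return None if a key is missing."""
    node: Any = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def set_path(doc: Dict[str, Any], dotted: str, value: Any) -> None:
    """Set a value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = dotted.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def convert(value: Any, type_name: str) -> Any:
    """Assumed conversions: 'date' is parsed as ISO 8601, anything else becomes a string."""
    if type_name == "date":
        return datetime.fromisoformat(value).isoformat()
    return str(value)


def apply_mappings(source_doc: Dict[str, Any], mappings: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a destination document from a source record and a list of mappings."""
    result: Dict[str, Any] = {}
    for m in mappings:
        value = get_path(source_doc, m["source_path"])
        if value is None:
            continue  # assumption: source fields that are absent are skipped
        set_path(result, m["destination_path"], convert(value, m["type"]))
    return result


# A subset of the mappings from the example schema above.
mappings = [
    {"source_path": "thread.uuid", "destination_path": "id", "type": "string"},
    {"source_path": "thread.published", "destination_path": "doc_date", "type": "date"},
    {"source_path": "thread.title", "destination_path": "content.title", "type": "string"},
    {"source_path": "text", "destination_path": "content.body", "type": "string"},
]

# Hypothetical source record, invented for illustration.
record = {
    "thread": {"uuid": "abc-123", "published": "2023-05-01T12:00:00", "title": "Hello"},
    "text": "Body text",
}

mapped = apply_mappings(record, mappings)
print(mapped)
```

Note that one source field can feed several destination fields, as thread.published does in the example schema (doc_date and content.published).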
Submitting a Schema
Please view the examples and guides available in the API reference to submit, validate, modify, or delete a schema. https://datastreamer.readme.io/reference/post_api-schemas
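Before submitting, it can help to check the schema locally. The sketch below is a hypothetical pre-submission check, not part of the Datastreamer API: it confirms that the text is valid JSON (a trailing comma, for example, is rejected), that every mapping has source_path, destination_path, and type, and that each destination_path is declared in the nested field tree with the same type. The function names and the rule that mapped and declared types should agree are assumptions; the API's own validation is authoritative.

```python
import json
from typing import Any, Dict, List

REQUIRED_KEYS = ("source_path", "destination_path", "type")


def lookup(tree: Dict[str, Any], dotted: str) -> Any:
    """Return the value declared at a dotted path in a field tree, or None."""
    node: Any = tree
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_schema(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    outer = doc.get("schema", {})
    # In the example, fields are declared one level below the inner "schema" key.
    field_trees = [t for t in outer.get("schema", {}).values() if isinstance(t, dict)]
    problems: List[str] = []
    for i, mapping in enumerate(outer.get("mappings", [])):
        missing = [k for k in REQUIRED_KEYS if k not in mapping]
        if missing:
            problems.append(f"mapping {i}: missing {', '.join(missing)}")
            continue
        dest = mapping["destination_path"]
        declared = [d for d in (lookup(t, dest) for t in field_trees) if d is not None]
        if not declared:
            problems.append(f"mapping {i}: {dest} is not declared in the schema")
        elif mapping["type"] not in declared:
            problems.append(
                f"mapping {i}: {dest} is mapped as {mapping['type']} "
                f"but declared as {declared[0]}"
            )
    return problems


good = """{
  "schema": {
    "name": "Datas",
    "mappings": [
      {"source_path": "thread.uuid", "destination_path": "id", "type": "string"}
    ],
    "schema": {"nbd": {"id": "string"}}
  }
}"""

# Same schema with a trailing comma, which strict JSON parsers reject.
bad = good.replace('"id": "string"}', '"id": "string",}')

print(check_schema(good))
print(check_schema(bad))
```

An empty list only means this local check found nothing; the schema still needs to pass validation through the API reference linked above.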
Connecting Your Streaming Data Source
Please reach out to your Data Consultant to integrate streaming data.
Using the Data in Datastreamer
Once the schema is validated and the streaming data source is successfully connected to Datastreamer, the data is available in the Datastreamer pipeline. Use the Datastreamer APIs and the metadata fields defined in your schema to begin using the integrated data within your application.