Datastreamer's pipeline contains a high-powered transformation service that can convert billions of individual fields into a common schema per minute. This makes it easier to build features and machine learning capabilities that use multiple sources concurrently, and to integrate new data partners and pipeline capabilities.
View "Integrated Data Sources" and "Integrated Enrichments" to see which metadata categories apply to individual sources.
Each data source has its own response parameters, determined by the source from which information is fetched. Each data source response is merged with common information such as Source and Enrichment details.
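The merge described above can be pictured with a minimal sketch. The function name `merge_record` and the section keys `source` and `enrichment` are assumptions made for illustration only; the actual response layout may differ.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def merge_record(source_response: Record, source_info: Record, enrichments: Record) -> Record:
    """Combine source-specific fields with the common sections.

    The "source" and "enrichment" keys are hypothetical names chosen
    for this sketch, not confirmed field names.
    """
    record = dict(source_response)          # source-specific fields
    record["source"] = dict(source_info)    # common Source details
    record["enrichment"] = dict(enrichments)  # common Enrichment details
    return record


merged = merge_record(
    {"content": {"body": "Example post text"}},
    {"name": "example_source"},
    {"sentiment": {"label": "neutral"}},
)
print(sorted(merged))
```

Keeping the common sections under their own keys means source-specific fields never collide with them, which is one plausible reason for a layout like this.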
|Common for all sources, this is the URL link directly to the content and can serve as a UID for the content.
|For sources with subtypes (such as dark web content), this covers the subtypes.
|When the content was discovered
|When the content was originally published
|When the content was last updated
|Any hashtags within the content
|Any links found within the content
|A count of social media favorites of the content at the time of index
|A count of social media followers of the content at the time of index
|The social media following count of the content at the time of index
|Profiles mentioned in the content, or to which the content is directed
|Contains all URLs to images that were present within or associated with the content.
|The header (title) of the content.
|A summarization or excerpt of the content.
|The associated domain of the content record.
|Any cashtags within the content
|A count of comments on the social media post
|Location mentioned with the social media post, usually provided by the author.
|Video URL links within the social media post [*]
|A count of times the social media post was shared
|A count of times the social media post was played
|A count of times the social media post was viewed
|A count of times the social media post was liked
|Contains image information associated with the content
|URL to an image found in the content
|Alternative text or accessibility caption associated with an image found in the content
|Name of the author
|Handle of the user
|A personal description of the author
|Location of the author
|Link to the profile image of the author or profile.
|Gender of the author
|URL to the author's profile
|URLs mentioned in the author profile
|User id mentioned in author profile
|Whether the author profile is verified (Boolean: True or False)
|A count of times the author posted videos
|Author phone number in the author profile
|Full name of the individual (if available).
|Avatar or image associated with the person.
|Alternative or maiden names of the individual
|Address associated with the record.
|Country of person
|Region of person
|Place or location of birth.
|Current employment with associated title (if available), at time of document date.
|Volunteer experience of person.
|Available biography of person.
|Any certifications received.
|Any languages spoken.
|If the profile has been verified as accurate by partner.
|Creation date of the profile.
|Date of birth of the person.
|Date of death of the person.
|Convictions (if available) of the person.
|Any family relations present in the result.
|Any business relations present in the result.
|Any business partnerships present in the result.
|Contributions and awards of the person.
|Any lobbying partnerships or relationships.
|Any stakeholder partnerships or relationships.
|Any contributions by the person
|Any certifications obtained by person
|Number of followers for the person
|Number of following for the person
|Headline or title of the person
|Honors mentioned for the person
|User ID for the person in the metadata
|Hashtags related to the person
|Prior first or last name of the person
|Person working/affiliated with the organizations
|Patents obtained by the person
|Projects worked on by the person
|Publications by the person
|Bio summary of a person
|Exam test scores of a person
|Technical skills of a person
|Courses created or completed by person
|Person's website url
|Case number of the offense. Ex: "095978201010"
|Case type of the offense as defined by the source. Ex: "felony", "gross misdemeanor", "Offense Infraction"
|Date the offense occurred. Ex: "2003-07-28"
|Class of the offense
|Subclass of the offense
|Offense description from the source. Ex: "SEXUAL ASSAULT CHILD", "BATTERY", "AGGRAV STALKING"
|Degree of the offense from the source. Ex: "FELONY", "Misdemeanor", "Infraction"
|Field classifying the offense level. Ex: "FELONY", "MISDEMEANOR", "UNKNOWN", "INFRACTION/VIOLATION/ORDIANCE/TRAFFIC"
|Date the defendant was sentenced. Ex: "2004-06-03"
|Type of sentence as defined by the source. Ex: "fined", "failed", "probation"
|Description of the sentence.
|Date the defendant was incarcerated. Ex: "2007-06-03"
|Date the defendant was arrested. Ex: "2003-07-28"
|Arresting agency. Ex: "UNITED STATES MARSHALL'S SERVICE", "Elwood Police Department", "Madison County Sheriff"
|Case number of the offense. Example: "MCRDCRTR03-0001124-008"
|Name of the court that tried the defendant. Ex: "CA Shasta Superior Court", "GA Dept of Corrections (Web)"
|Release date of the offender. Ex: "2019-03-26"
|Field used to calculate the age of the record. Ex: "2003-02-19"
|Date the case was filed. Ex: "2003-02-19"
|Disposition of the offense. Ex: "DISM OTHER", "DEFERRED ADJUDICATION TERMINATED", "Adjudicated Guilty", "NOLLE PROSEQUI", "DISMISSED CASEREFILED"
|Internal field classifying the disposition. Ex: "GUILTY", "NOT GUILTY", "WITHHELD", "NOT DISPOSED", "NO VERDICT", "UNKNOWN"
|Date the offense was disposed. Ex: "2013-03-28"
|County associated with the offense. Ex: "Benton", "JACKSON", "PORTER"
|Jurisdiction associated with the offense (city, state, or county). Ex: "Woodbury", "Black Hawk", "Hudson", "Jersey City Municipal Court"
|Date of conviction. Ex: "2004-06-03"
|The title of the product in the product record
|The brand name of the product (ie: Nike)
|The description on the product record
|The date the product was listed
|The listed price
|The currency of the price, using ISO standard (ie: USD)
|The displayed name of the product seller
|The URL to the profile or site of the seller
|The location of the product
|Links to images of the product from the product record
|List of organizations that were acquired by the company through a merger, consolidation, combination, exchange of shares, or acquisition
|List of affiliated companies, each of which is a minority shareholder of another company
|Hashtags used for the company
|Company type, for example: public company, self-employed, government agency, non-profit
|Description of the organization
|Location of organization headquarters
|Number of followers for the organization
|Founded date for the organization
|User ID of the organization within metadata
|Location of the organization within metadata
|Name of the organization
|Parent company of the organization
|Profile image of the organization
|Specialities of the organization
|Organization headline or tagline
|Handle of the organization
|Organization website URL
|The range within which the organization's number of employees falls
|The extracted sentiment of the source content
|The language of the content, per ISO 639-1: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes#Partial_ISO_639_table
|Display label representing the violence: 'No violence', 'Threat of physical violence', 'Reported violence', 'Sexual violence'
|Displays a score of reported violence (both direct and indirect) arising from our in-house violence classifier. Can be filtered by confidence (.confidence) of 0.0 to 1.0
|Displays the inferred city location, from among the top 31 cities in the United States. Currently only available on English wsl_Twitter
|Displays a score of location inference confidence. Can be filtered by confidence (.confidence) of 0.0 to 1.0
|Displays NLP-extracted locations from within the content.body of the media.
|Displays a score of Location entity recognition confidence. Can be filtered by confidence (.confidence) of 0.0 to 1.0
|Displays NLP-extracted persons from within the content.body of the media.
|Displays a score of Person entity recognition confidence. Can be filtered by confidence (.confidence) of 0.0 to 1.0
|Displays NLP-extracted organizations from within the content.body of the media.
|Displays a score of Organization entity recognition confidence. Can be filtered by confidence (.confidence) of 0.0 to 1.0
|Displays label of whether content has been determined as 'Hard News'.
|Displays a score of Hard News classification confidence. Can be filtered by confidence (.confidence) of 0.0 to 1.0
For more information on Enrichments, view the dedicated page linked in the sidebar.
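Several enrichments above expose a confidence score between 0.0 and 1.0 that can be used for filtering. The sketch below shows one way such a filter might look on the client side. Only the `confidence` suffix comes from the descriptions above; the `enrichment` container, the `violence` key, the `label` key, and the function name are hypothetical.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def filter_by_confidence(
    records: Iterable[Record], enrichment: str, min_confidence: float
) -> List[Record]:
    """Keep records whose named enrichment meets the confidence threshold."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    kept = []
    for record in records:
        # "enrichment" is an assumed container key, not a confirmed field name.
        result = record.get("enrichment", {}).get(enrichment, {})
        confidence = result.get("confidence")
        if confidence is not None and confidence >= min_confidence:
            kept.append(record)
    return kept


sample = [
    {"id": "a", "enrichment": {"violence": {"label": "Reported violence", "confidence": 0.91}}},
    {"id": "b", "enrichment": {"violence": {"label": "No violence", "confidence": 0.42}}},
    {"id": "c", "enrichment": {}},  # enrichment not present for this record
]
confident = filter_by_confidence(sample, "violence", 0.8)
print([r["id"] for r in confident])  # prints ['a']
```

Records without the enrichment are dropped rather than treated as zero confidence, so a missing score is never mistaken for a low one.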
Data source-specific fields
The sections below contain fields that are not common to every data source and apply only to specific data sources.
|The type of content (POST, REPLY or RETWEET)
|Retweet type (NONE, RAW, or QUOTE)
|Link of the original tweet post
|The identification string of that piece of content
|If the user account has been verified
|Unique Twitter ID of the user on Twitter (as usernames can be changed)
|A count of times the Twitter post was tweeted with a quote
|A count of times the Twitter post was replied to
|A count of times the Twitter post was retweeted
|Link to the post or comment being replied to
|The type of content (comment, post)
|Unique Meta ID of the user on Instagram (as usernames can be changed)
|The type of content (TEXT, IMAGE, CAROUSEL, or VIDEO)
|The type of post (POST, QUOTE, REPOST, or REPLY)
|The original post URL if post_type is REPOST, QUOTE or REPLY
|The identification string of that piece of content
|If the user account has been verified
|Unique Meta ID of the user on Threads (as usernames can be changed)
|Number of replies at the moment this post was found.
|True/False: whether the post is an ad
|True/False: whether the post is official
|True/False: whether the post is original
|True/False: whether sharing is enabled for the post
|True/False: whether the stitch option is enabled for the post
|True/False: whether the post is of the activity type
|True/False: whether the duet option is enabled for the post
|True/False: whether the post is for friends
|List of stickers in the TikTok social media post
|List of duet videos for the TikTok social media post
|List of music details showing title, album name, url, duration etc.
|Type of Facebook post
|A count of angry emoji reactions on the Facebook post
|A count of haha emoji reactions on the Facebook post
|A count of like emoji reactions on the Facebook post
|A count of love emoji reactions on the Facebook post
|A count of sad emoji reactions on the Facebook post
|A count of support emoji reactions on the Facebook post
|A count of wow emoji reactions on the Facebook post
|A count of reactions on the Facebook post
|A count of members on the Facebook page
|List of industries mentioned on LinkedIn company and member profiles. This allows a member or company to choose the industry that best matches their interest in a company or the type of work that they do
|The Interests section displays Companies, Schools, Newsletters, and Groups that a LinkedIn member is following, subscribed to or joined, as applicable
|The Interests Top Voices section displays Top Voices that a LinkedIn member follows, is subscribed to, or has joined, as applicable
|Boolean True/False field for the creator mode profile setting, which can help a creator member grow their reach and influence on LinkedIn.
|Boolean True/False field showing whether a member is a LinkedIn influencer: a thought leader within a particular industry who shares organic content with a large LinkedIn audience
|Boolean True/False field showing whether a company or member is a premium subscriber
|A count of interests subscribed to by the member
|A count of kudos on the post
|A count of empathy or support emoji reactions on the post
|List of multiple images and/or videos in a single post
|A count of employees of the organization on LinkedIn
|Boolean True/False field showing whether the member profile is open to new job opportunities
|A count of member connections
|Rank of the news site in the world
|Rank of the news site in the country
|Length of the article, as determined by word count
|A DarkOwl-unique field, measuring the Hackishness of any piece of content.
|Used to search for a desired credit card number in the body of a result. Input is 6-20 digits and supports trailing wildcards.
|Used to search for an individual SSN record.
|Used to search for an individual email address. Note: search the keyword of "thedomain.com" to find email addresses within an entire domain.
|Used to search for a cryptocurrency address in the body of a result. Supported types: bitcoin, dash, ethereum, litecoin, monero, zcash.
|Used to search for an IP address in the body of a result.
|The title of the thread for the content
|True/False on if the post is the first post (head) of a forum thread.
|The date of the original creation of content. While this parallels content.published in many cases, some sources do not provide content.published. This date is always the date of document creation at its source.
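Because some sources do not provide content.published, a consumer may want to fall back to the creation date. The sketch below illustrates that fallback. The `created` key is an assumed name for the creation-date field described above, and the ISO-style date strings follow the "2003-07-28" format used in the examples on this page.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def best_content_date(content: Dict[str, Any]) -> Optional[datetime]:
    """Return the published date if present, otherwise the creation date.

    "created" is a hypothetical key name used for this sketch.
    """
    for key in ("published", "created"):
        raw = content.get(key)
        if raw:
            return datetime.fromisoformat(raw)
    return None


print(best_content_date({"published": "2023-05-01", "created": "2023-05-02"}))
print(best_content_date({"created": "2023-05-02"}))
print(best_content_date({}))
```

Preferring the published date keeps results aligned with what the source displays, while the creation date guarantees a usable timestamp when the source omits one.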