WebSightLine Threads
WebSightLine (WSL) Threads is a high-volume sample of near-time public Threads content.
The WebSightLine Threads component provides a live feed with millions of Threads posts and comments each day.
New to Datastreamer? Start here.
Unify Schema
This data source already uses Unify Schema.
How to use?
WebSightLine Threads is powered by the Jobs System. When interacting with the component, you can define your own job queries.
Search Queries
Filters
Available filters for WebSightLine Threads can be found in the table below:
Filter Name | Description |
---|---|
query | List of keywords or a phrase to search |
max_documents | Set a limit for the number of posts that will be fetched for the search. |
Lucene query syntax is supported in the query field for this component. Here are some basic queries that you can try:
Keywords:
cats
Fields:
title:lucene
Phrases:
"apache lucene"
Wildcards:
tes*
Boolean operators:
cats OR dogs
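The query forms above can be combined programmatically before they are placed in the query field. The following is a minimal sketch, not part of the Datastreamer API: the helper names `lucene_term` and `build_query` are hypothetical, and the sketch only assumes standard Lucene syntax (double quotes for phrases, upper-case AND / OR operators).

```python
def lucene_term(text: str) -> str:
    """Wrap a term in double quotes when it contains whitespace (Lucene phrase)."""
    text = text.strip()
    if any(ch.isspace() for ch in text):
        escaped = text.replace('"', r'\"')  # escape embedded quotes
        return f'"{escaped}"'
    return text


def build_query(terms: list[str], operator: str = "OR") -> str:
    """Join terms with a Lucene boolean operator, skipping empty entries."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(lucene_term(t) for t in terms if t.strip())


print(build_query(["cats", "dogs"]))
print(build_query(["apache lucene", "tes*"], operator="AND"))
```

The first call produces the same string as the boolean example above; the second combines a phrase with a wildcard term.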
Examples
Search for cats or dogs
Query cats or dogs every 12 hours:

You also have the option to use the API. You can use the Code button to extract this example:
curl --location 'https://dev.api.platform.datastreamer.io/api/pipelines/{PIPELINE_ID}/components/{COMPONENT_ID}/jobs?ready=true' \
  --header 'apikey: <your-api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "job_name": "8740eace-160a-468e-a2f5-8d2db803f9f6",
    "data_source": "wsl_threads",
    "query": {
      "query_string": "cats OR dogs"
    },
    "job_type": "periodic",
    "schedule": "0 0 0/12 1/1 * ? *",
    "max_documents": 50
  }'
For more details on creating data collection jobs, see Job Management.
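For readers who prefer Python over curl, the same request can be assembled with the standard library. This is a sketch that mirrors the curl example field for field. The function name `build_job_request` and the placeholder IDs are hypothetical, and the request is only built, not sent. The schedule value appears to follow Quartz-style cron syntax (seconds, minutes, hours, day of month, month, day of week, year), in which `0/12` in the hours field means every 12 hours starting at hour 0; treat that reading as an assumption and confirm it in Job Management.

```python
import json
import urllib.request

# Host and path are copied from the curl example above.
BASE_URL = "https://dev.api.platform.datastreamer.io/api"


def build_job_request(pipeline_id, component_id, api_key, job_name,
                      query_string, schedule, max_documents):
    """Build (but do not send) the POST request that creates a periodic job."""
    url = (f"{BASE_URL}/pipelines/{pipeline_id}"
           f"/components/{component_id}/jobs?ready=true")
    payload = {
        "job_name": job_name,
        "data_source": "wsl_threads",
        "query": {"query_string": query_string},
        "job_type": "periodic",
        "schedule": schedule,
        "max_documents": max_documents,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_job_request(
    pipeline_id="PIPELINE_ID",        # placeholder, replace with your own
    component_id="COMPONENT_ID",      # placeholder, replace with your own
    api_key="<your-api-key>",         # placeholder
    job_name="8740eace-160a-468e-a2f5-8d2db803f9f6",
    query_string="cats OR dogs",
    schedule="0 0 0/12 1/1 * ? *",    # same value as the curl example
    max_documents=50,
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To submit the job, pass the request to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it easy to inspect the payload before any job is created.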
Additional Details
Stats
Searchable Records | Update Frequency | Partner Type |
---|---|---|
45 million (3 months) | Near-time (Max 10-minute latency) | Stream Integrated |
Compatible Metadata Fields
Applicable Metadata Categories | Compatible |
---|---|
Source | Yes |
Content | Yes |
Author | Yes |
Person | No |
Enrichment | Yes |
Organization | No |
Data source-specific fields? | Yes, please see the Metadata page. |
Compatible Classifiers & Models
Classifier & Model | Compatible |
---|---|
Named Entity Recognition | No |
Location_Inference | Yes |
Language | Yes |
Reported_Violence | No |
Sentiment | No |
Hard_News | No |
Compatible Features
Because WebSightLine is a Stream-Integrated partner, all streaming features are available.
Features | Compatible |
---|---|
Search API | Yes |
Date Histograms | Yes |
Term Aggregations | Yes |
Highlighting | Yes |
Fuzzy and Proximate Search | Yes |
Available Fields
The available fields can be changed by the data provider (WebSightLine), but these are currently the possible field names.
Common fields (present in most entries):
id
doc_date
data_source (e.g., "wsl_threads")
source.link (URL to the post)
author.name
author.bio
author.profile_image_source
author.url
author.handle
author.verified (boolean as string, e.g., "False")
content.body (post text)
content.published
content.found (timestamp when post was scraped)
content.found_by (e.g., "wsl_threads_profile_robot")
content.last_updated
content.likes_count
content.followers (author's follower count)
internal.provider_document_id
internal.last_updated
internal.annotations (array of objects with name/value, e.g., "found_with: profile: calfirelnu")
internal.destinations (array, e.g., ["public"])
Content Metadata:
content.hashtags (array of hashtags, e.g., ["dogs", "fyp"])
content.mentions (array of mentioned handles, e.g., ["lostdogrescue"])
content.images (array of objects with url and optional alternative_text)
content.video_urls (array of URLs, if post contains video)
Threads-Specific fields:
threads.content_type (e.g., "TEXT", "IMAGE", "VIDEO", "CAROUSEL")
threads.post_type (e.g., "POST", "REPLY")
threads.post_identifier (unique ID for the post)
threads.user_verified (boolean as string)
threads.user_id
threads.source_link (for replies, links to parent post)
Enrichment Fields:
enrichment.language (e.g., "en", "es", "nl")
enrichment.location_inference_country.label (e.g., "US", "CA")
enrichment.location_inference_country.confidence (float, e.g., 0.8323)
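Two details in the field list are easy to trip over when consuming documents: field names are dotted paths, and the verified flags arrive as strings rather than booleans. The sketch below shows one way to handle both. It assumes documents are delivered as nested JSON objects (the actual layout may differ), and the helper names `get_field` and `as_bool` and the sample document are illustrative, not a real record.

```python
from typing import Any

# Illustrative document built from the example values in the field list;
# the nested structure and the body text are assumptions, not real data.
sample_doc = {
    "data_source": "wsl_threads",
    "author": {"handle": "example_user", "verified": "False"},
    "content": {"body": "Example post text", "hashtags": ["dogs", "fyp"]},
    "threads": {"content_type": "TEXT", "post_type": "POST", "user_verified": "True"},
    "enrichment": {
        "language": "en",
        "location_inference_country": {"label": "US", "confidence": 0.8323},
    },
}


def get_field(doc: dict, dotted_path: str, default: Any = None) -> Any:
    """Walk a dotted path such as 'author.verified' through nested dicts."""
    current: Any = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # field is optional or absent
        current = current[key]
    return current


def as_bool(value: Any) -> bool:
    """Convert 'True' / 'False' strings (or real booleans) to bool."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


print(as_bool(get_field(sample_doc, "author.verified")))
print(get_field(sample_doc, "enrichment.location_inference_country.label"))
print(get_field(sample_doc, "content.video_urls", default=[]))
```

Returning a default for missing paths matters here because fields such as content.video_urls are only present on some posts.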
Built-In Language Detection
WSL_Threads includes a built-in language enrichment provided by WebSightLine by default. The languages currently supported are:
ISO 639-1 | Language |
---|---|
AR | Arabic |
BG | Bulgarian |
CS | Czech |
DA | Danish |
DE | German |
EL | Greek |
EN | English |
ES | Spanish |
ET | Estonian |
FA | Persian |
FI | Finnish |
FR | French |
HE | Hebrew |
HI | Hindi |
HR | Croatian |
HU | Hungarian |
ID | Indonesian |
IT | Italian |
JA | Japanese |
KO | Korean |
MS | Malay |
NL | Dutch |
NO | Norwegian |
PL | Polish |
PT | Portuguese |
RO | Romanian |
RU | Russian |
SL | Slovenian |
SV | Swedish |
TH | Thai |
TR | Turkish |
UK | Ukrainian |
VI | Vietnamese |
ZH | Chinese |
U | Undefined |
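The table lists codes in upper case, while the enrichment.language examples in the field list are lower case ("en", "es", "nl"), so it is safer to normalize case before comparing. The sketch below counts documents per language under that assumption. The names `language_name` and `count_languages` are hypothetical, the mapping is only a subset of the table, and treating a missing language as "U" (Undefined) is an assumption.

```python
from collections import Counter

# Subset of the table above, keyed by upper-case code.
LANGUAGE_NAMES = {"EN": "English", "ES": "Spanish", "NL": "Dutch", "U": "Undefined"}


def language_name(code):
    """Map a language code in any case to its name; missing codes count as 'U'."""
    normalized = (code or "U").strip().upper()
    # Codes outside the subset fall back to the normalized code itself.
    return LANGUAGE_NAMES.get(normalized, normalized)


def count_languages(docs):
    """Count documents per language name, assuming nested enrichment objects."""
    counts = Counter()
    for doc in docs:
        code = doc.get("enrichment", {}).get("language")
        counts[language_name(code)] += 1
    return counts


docs = [
    {"enrichment": {"language": "en"}},
    {"enrichment": {"language": "EN"}},
    {"enrichment": {"language": "es"}},
    {},  # no enrichment present
]
language_counts = count_languages(docs)
print(dict(language_counts))
```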