Socialgist Tiktok
Source Attributes and Stats
Socialgist Tiktok is pre-adapted to the Unify schema and uses Lucene query logic. It is available as a filtered stream that uses Datastreamer's Job system.
There are three types of content: video, comment, and tag. Use the field tiktok.content_type to filter for the type you want.
- Crawling is focused on user and hashtag pages.
- Available data: videos, hashtag stats, and top-level comments. Only two years of data are available in the current rollout.
- Crawl frequency:
  - User and hashtag pages: every 12 hours (includes updating the video metadata stats)
  - Hashtag stats: every 6 hours
  - Comments: periodically for 30 days after initial video discovery (initial crawl, then at 1, 5, 10, and 30 days)
- Initial Collection:
  - User pages: when a user page is added, the 30 most recent videos are captured in the initial crawl, and new videos are captured going forward.
  - Hashtag pages: these pages are not sorted by date, so it is not possible to guarantee the latest videos. Each crawl captures the 30 “top” videos, as ranked by TikTok.
- To participate in the program, submit the user pages and hashtag pages your clients wish to track and analyze.
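The comment crawl schedule above can be sketched as simple date arithmetic. This is only an illustration in Python: the function name `comment_crawl_times` and the constant are hypothetical, and the actual crawl times are described as periodic, so treat the results as approximate checkpoints rather than guaranteed timestamps.

```python
from datetime import datetime, timedelta, timezone

# Offsets in days after initial video discovery, taken from the list above
# (0 stands for the initial crawl). Hypothetical constant name.
COMMENT_CRAWL_OFFSETS_DAYS = (0, 1, 5, 10, 30)


def comment_crawl_times(discovered_at: datetime) -> list[datetime]:
    """Return the approximate times at which comments would be crawled."""
    return [discovered_at + timedelta(days=d) for d in COMMENT_CRAWL_OFFSETS_DAYS]


# Example discovery time (an assumed value for illustration).
discovered = datetime(2024, 7, 15, 16, 35, tzinfo=timezone.utc)
for crawl_time in comment_crawl_times(discovered):
    print(crawl_time.isoformat())
```

After the last checkpoint (30 days), the schedule above implies that no further comment crawls should be expected for that video.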
Example Queries
Here are some queries that you could use within your Jobs for socialgist-tiktok:
Find the keyword drinks in the content body
content.body: drinks
Keywords
cats OR dogs
Comments only
tiktok.content_type: comment
Videos only
tiktok.content_type: video
Comments and Videos
tiktok.content_type: (comment OR video)
Hashtags only
tiktok.content_type: tag
English content
enrichment.language: en
Search by hashtags
tiktok.content_type: video AND content.hashtags:(fashion OR perfume)
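If you build these queries programmatically, a small helper can keep them consistent. The following is a minimal Python sketch that only assembles query strings in the shape of the examples above. The helper names `content_type_clause` and `hashtag_clause` are hypothetical and not part of any Datastreamer SDK. The sketch assumes hashtags are queried without the leading "#", as in the example above, and it does not escape Lucene special characters.

```python
# The three content types documented for tiktok.content_type.
ALLOWED_CONTENT_TYPES = {"video", "comment", "tag"}


def content_type_clause(*types: str) -> str:
    """Build a tiktok.content_type clause for one or more content types."""
    if not types:
        raise ValueError("at least one content type is required")
    unknown = set(types) - ALLOWED_CONTENT_TYPES
    if unknown:
        raise ValueError(f"unknown content types: {sorted(unknown)}")
    if len(types) == 1:
        return f"tiktok.content_type: {types[0]}"
    return f"tiktok.content_type: ({' OR '.join(types)})"


def hashtag_clause(*tags: str) -> str:
    """Build a content.hashtags clause; a leading '#' is stripped (assumption)."""
    if not tags:
        raise ValueError("at least one hashtag is required")
    cleaned = [tag.lstrip("#") for tag in tags]
    return f"content.hashtags:({' OR '.join(cleaned)})"


# Reproduces the "Search by hashtags" example query.
query = " AND ".join(
    [content_type_clause("video"), hashtag_clause("fashion", "perfume")]
)
print(query)
```

The resulting string can then be used as the query of a Job in the same way as the handwritten examples.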
Example Result
Tag
{
  "data_source": "socialgist_tiktok",
  "tiktok": {
    "content_type": "tag"
  },
  "doc_date": "2024-07-15T16:35:28.364Z",
  "source": {
    "link": "https://www.tiktok.com/tag/gadget"
  },
  "content": {
    "hashtags": [
      "#gadget"
    ],
    "views_count": "22000000000",
    "found": "2024-07-15T16:35:28.364Z"
  },
  "id": "c51d1f71907fb416528dda034a78c6d637baeca3ca0adb2eb3820e8e8b21b225"
}
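Note that in this example views_count is delivered as a string and the timestamps are ISO 8601 strings with a "Z" suffix. A minimal Python sketch of reading such a record might look like the following. The `raw` variable holds a copy of the example record as a complete JSON object; how you actually receive records depends on your Job setup and is not shown here.

```python
import json
from datetime import datetime

# Copy of the example tag record, written as a complete JSON object.
raw = """
{
  "data_source": "socialgist_tiktok",
  "tiktok": {"content_type": "tag"},
  "doc_date": "2024-07-15T16:35:28.364Z",
  "source": {"link": "https://www.tiktok.com/tag/gadget"},
  "content": {
    "hashtags": ["#gadget"],
    "views_count": "22000000000",
    "found": "2024-07-15T16:35:28.364Z"
  },
  "id": "c51d1f71907fb416528dda034a78c6d637baeca3ca0adb2eb3820e8e8b21b225"
}
"""

doc = json.loads(raw)

# views_count arrives as a string in this example, so convert it explicitly.
views = int(doc["content"]["views_count"])

# Replace the trailing "Z" with an explicit UTC offset so that
# datetime.fromisoformat also accepts it on Python versions before 3.11.
doc_date = datetime.fromisoformat(doc["doc_date"].replace("Z", "+00:00"))

print(doc["tiktok"]["content_type"], views, doc_date.isoformat())
```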