socialgist-tiktok

Source Attributes and Stats

Socialgist TikTok is pre-adapted to the Datastreamer schema and uses Lucene query logic. It is available as a filtered stream that uses Datastreamer's Job system.

There are three types of content: video, comment, and tag. Use the field tiktok.content_type to filter for the type you want.

  • Crawling is focused on user and hashtag pages.
  • Available data: videos, hashtag stats, and top-level comments. Only 2 years of data are available in the current rollout.
  • Crawl frequency:
    • User and hashtag pages: every 12 hours (includes updating the video metadata stats)
    • Hashtag stats: every 6 hours
    • Comments: periodically for 30 days after initial video discovery (initial crawl, 1, 5, 10, 30 days)
  • Initial Collection:
    • User pages: when a user page is added, its 30 most recent videos are captured in the initial crawl, and new videos are captured going forward.
    • Hashtag pages: these pages are not sorted by date, so capturing the latest videos cannot be guaranteed. Each crawl captures the 30 “top” videos, in an order determined by TikTok.
    • To participate in the program, submit the user pages and hashtag pages your clients wish to track and analyze.
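
The comment crawl schedule above (initial crawl, then 1, 5, 10, and 30 days) can be sketched in Python to estimate when comments for a given video are likely to be refreshed. This is only an illustration: the function name comment_crawl_times is hypothetical, and it assumes each offset is measured from the moment the video is first discovered, which the schedule does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

# Day offsets taken from the crawl schedule: initial crawl, then 1, 5, 10, 30 days.
COMMENT_CRAWL_OFFSETS_DAYS = (0, 1, 5, 10, 30)


def comment_crawl_times(discovered_at):
    """Return the approximate comment crawl times for one video.

    Assumption: every offset is counted from the initial discovery time.
    """
    return [discovered_at + timedelta(days=d) for d in COMMENT_CRAWL_OFFSETS_DAYS]


# Example discovery time (hypothetical value).
discovered = datetime(2024, 7, 15, 16, 35, 28, tzinfo=timezone.utc)
schedule = comment_crawl_times(discovered)
for crawl_time in schedule:
    print(crawl_time.isoformat())
```

After the last entry in this list, no further comment crawls for that video should be expected.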

Example Queries

Here are some queries that you could use within your Jobs for socialgist-tiktok:

Find the keyword drinks
content.body: drinks

Keywords
cats OR dogs

Comments only
tiktok.content_type: comment

Videos only
tiktok.content_type: video

Comments and Videos
tiktok.content_type: (comment OR video)

Hashtags only
tiktok.content_type: tag

English content
enrichment.language: en

Search by hashtags
tiktok.content_type: video AND content.hashtags:(fashion OR perfume)
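
If queries like the ones above are assembled programmatically, a small helper can keep the field names and content types consistent. The following Python sketch only builds query strings; it does not call any Datastreamer API, and the helper names (content_type_filter, hashtag_filter, all_of) are hypothetical. It assumes hashtags are queried without the leading #, as in the example above.

```python
# Content types listed for this source.
ALLOWED_CONTENT_TYPES = {"video", "comment", "tag"}


def content_type_filter(*types):
    """Build a tiktok.content_type clause for one or more content types."""
    if not types:
        raise ValueError("at least one content type is required")
    unknown = set(types) - ALLOWED_CONTENT_TYPES
    if unknown:
        raise ValueError(f"unknown content types: {sorted(unknown)}")
    if len(types) == 1:
        return f"tiktok.content_type: {types[0]}"
    return "tiktok.content_type: (" + " OR ".join(types) + ")"


def hashtag_filter(*tags):
    """Build a content.hashtags clause; a leading '#' is stripped (assumption)."""
    if not tags:
        raise ValueError("at least one hashtag is required")
    cleaned = [tag.lstrip("#") for tag in tags]
    return "content.hashtags:(" + " OR ".join(cleaned) + ")"


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


query = all_of(content_type_filter("video"), hashtag_filter("fashion", "perfume"))
print(query)
```

The resulting string can be pasted into a Job as-is; validating content types up front avoids submitting a query that silently matches nothing.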

Example Result

Tag

{
                "data_source": "socialgist_tiktok",
                "tiktok": {
                    "content_type": "tag"
                },
                "doc_date": "2024-07-15T16:35:28.364Z",
                "source": {
                    "link": "https://www.tiktok.com/tag/gadget"
                },
                "content": {
                    "hashtags": [
                        "#gadget"
                    ],
                    "views_count": "22000000000",
                    "found": "2024-07-15T16:35:28.364Z"
                },
                "id": "c51d1f71907fb416528dda034a78c6d637baeca3ca0adb2eb3820e8e8b21b225"