The Facebook source collects posts, comments, and public page content from Facebook as part of a Data Stream.
Datastreamer selects from available providers automatically based on your Job configuration. You do not need to manage provider accounts or credentials to use this source.
Configuring a Job
When creating a Job for the Facebook source, you define what content to collect. Common configuration options include:
- Keywords / query: terms to search for in post content
- Page or account targets: specific Facebook pages to monitor
- Date range: the time window from which to collect data
- Language filter: restrict results to a specific language
- Document limit: maximum number of documents per Job run
For full configuration details, see Creating Jobs.
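As a rough illustration, the options listed above could be gathered into a single configuration object before a Job is created. The sketch below is hypothetical: the `build_facebook_job` helper and every key name (`source`, `query`, `page_targets`, `date_range`, `language`, `max_documents`) are assumptions made for this example, not the Datastreamer API. The Creating Jobs documentation has the actual parameter names.

```python
from datetime import date
from typing import List, Optional


def build_facebook_job(
    query: str,
    start: date,
    end: date,
    page_targets: Optional[List[str]] = None,
    language: Optional[str] = None,
    max_documents: Optional[int] = None,
) -> dict:
    """Assemble a hypothetical Job configuration for the Facebook source.

    All key names are illustrative assumptions, not the real Job schema.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    if max_documents is not None and max_documents <= 0:
        raise ValueError("max_documents must be a positive integer")

    job = {
        "source": "facebook",
        "query": query,
        "date_range": {"from": start.isoformat(), "to": end.isoformat()},
    }
    # Optional filters are only included when they are actually set.
    if page_targets:
        job["page_targets"] = list(page_targets)
    if language:
        job["language"] = language
    if max_documents is not None:
        job["max_documents"] = max_documents
    return job


# Example values are made up for illustration.
example_job = build_facebook_job(
    query="electric vehicles",
    start=date(2024, 1, 1),
    end=date(2024, 1, 31),
    language="en",
    max_documents=1000,
)
```

Leaving optional filters out of the object when they are unset keeps the configuration minimal, and the early checks catch an inverted date range or a non-positive limit before anything is submitted.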
What Is Collected
Each document returned from the Facebook source represents a post or comment. Fields are mapped to the Datastreamer unified schema and include content text, author metadata, engagement metrics (reactions, comments, shares), and post date.
Platform-specific fields are available under the facebook schema namespace. See the Schema Reference for field details.
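To make the document shape concrete, the sketch below reads the kinds of fields described above from a single document. It is illustrative only: the key names (`content`, `author`, `engagement`, `published_at`, and the contents of the `facebook` namespace) and the sample values are assumptions made for this example. The Schema Reference defines the actual field names.

```python
def summarize_document(doc: dict) -> dict:
    """Extract a few common fields from a hypothetical Facebook document.

    Key names are assumptions for illustration, not the unified schema.
    """
    engagement = doc.get("engagement", {})
    platform_fields = doc.get("facebook", {})  # platform-specific namespace
    return {
        "text": doc.get("content", ""),
        "author": doc.get("author", {}).get("name"),
        "published": doc.get("published_at"),
        # Missing engagement metrics are treated as zero.
        "total_engagement": sum(
            engagement.get(metric, 0)
            for metric in ("reactions", "comments", "shares")
        ),
        "platform_field_names": sorted(platform_fields),
    }


# Made-up sample document, not real collected data.
sample_document = {
    "content": "Our new store opens on Saturday.",
    "author": {"name": "Example Page"},
    "published_at": "2024-01-15T09:30:00Z",
    "engagement": {"reactions": 12, "comments": 3, "shares": 1},
    "facebook": {"post_type": "status"},
}

summary = summarize_document(sample_document)
```

Reading every field with `.get` and a default means a document that lacks a metric or the platform namespace still produces a usable summary.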
Troubleshooting
Job fails or returns no data
- Check that the query or account target is valid and publicly accessible
- Verify that matching content was published within the date range
- Review Job logs for specific errors
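Some of these checks can be approximated locally before a Job is rerun. The sketch below is a hypothetical pre-flight check over a configuration dictionary. The key names (`query`, `page_targets`, `date_range`) and the `preflight_issues` helper are assumptions for this example, not part of the Datastreamer API. It cannot confirm that a page is publicly accessible or that data exists in the range; it only catches obviously invalid input.

```python
from datetime import date
from typing import List


def preflight_issues(job: dict, today: date) -> List[str]:
    """Return human-readable problems found in a hypothetical Job config."""
    issues = []

    # A Job needs something to search for or a page to monitor.
    if not job.get("query") and not job.get("page_targets"):
        issues.append("no query or page target set")

    date_range = job.get("date_range") or {}
    start, end = date_range.get("from"), date_range.get("to")
    if start is None or end is None:
        issues.append("date range is incomplete")
    else:
        if start > end:
            issues.append("date range start is after its end")
        if start > today:
            issues.append("date range lies entirely in the future")
    return issues


# Made-up configuration with an empty query and an inverted date range.
broken_job = {
    "query": "",
    "date_range": {"from": date(2024, 2, 1), "to": date(2024, 1, 1)},
}
found = preflight_issues(broken_job, today=date(2024, 6, 1))
```

An empty result from a check like this does not guarantee that the Job will return data, so the Job logs remain the authoritative place to look for specific errors.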
Unexpectedly high document counts
- Add more specific filters to narrow the query
- Set a document limit on the Job to cap volume per run
- Review the DVU pricing page to understand how document counts affect costs
Provider switch noted in Job logs
- This is expected behavior. If a provider is unavailable, the Job is routed to an alternative automatically. No action is required.
Related
- Sources Overview
- Creating Jobs
- Data Volume Units
- Direct Integrations: use if you need a specific provider directly
