Searchable Storage

Datastreamer-managed and in-pipeline searchable database solution.

About Searchable Storage

Searchable Storage is a managed storage solution that can be used within a Pipeline to manage, deduplicate, store, and search content. Because Pipelines focus on processing data live, in-pipeline storage offers a number of benefits.

Adding Searchable Storage to your Pipelines

You can add Searchable Storage to any of your Pipelines by adding the component. Once the Pipeline version containing the component is deployed, the Platform automatically generates the storage index for you.

Searchable Storage added to the pipeline as a component


Interacting with your Searchable Storage Directly

You can interact with your Searchable Storage directly through the APIs listed below, using your API token for authentication.

API Name | Description | Documentation
Count API | Provides quantities and counts for Lucene-based search queries exceeding 10,000 results. | Documentation
Search API | Full-text search API based on Apache Lucene. | Documentation
Term Aggregation API | Aggregated view of the results, based on the provided parameters. | Documentation
Histogram API | Query the aggregated result based on a timeframe. Mostly used to generate dashboards, reports, and graphs. | Documentation
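
As a rough illustration of calling one of these APIs with an API token, the sketch below builds (but does not send) a search request carrying a Lucene query. The endpoint URL, the `apikey` header name, the body fields `query` and `size`, and the field names inside the example query are all assumptions made for illustration. The linked documentation for each API defines the real values.

```python
import json
import urllib.request

# Hypothetical values: the real endpoint, header name, and body fields
# come from the API documentation linked in the table, not from here.
SEARCH_URL = "https://api.example.com/search"  # placeholder endpoint
AUTH_HEADER = "apikey"                          # assumed header name


def build_search_request(api_token: str, lucene_query: str, size: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a Lucene query."""
    body = json.dumps({"query": lucene_query, "size": size}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={AUTH_HEADER: api_token, "Content-Type": "application/json"},
        method="POST",
    )


# Example: the field names "content" and "lang" are hypothetical.
request = build_search_request("YOUR_API_TOKEN", 'content:"electric cars" AND lang:en')
```

Passing the resulting object to `urllib.request.urlopen` would send it. That step is left out here because the real endpoint is not part of this page.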

Performance and Capacity

Searchable Storage is designed to be instantly responsive, high-performance storage. Users of Searchable Storage have ingested entire social media firehoses and expanded storage to 5-10 terabytes without performance issues. Data is encrypted at rest and stored within the same Google Cloud environment as the Platform, ensuring safety and protection.

Searchable Storage also handles many data performance and data maintenance tasks, such as optimizing the data storage pattern, deduplicating content, and updating stored data when a newer version of duplicate content is received.
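
The deduplicate-and-update behavior described above can be pictured with a small in-memory sketch. This is not how Searchable Storage is implemented. The `doc_id` key, the integer `version` marker, and the `upsert` helper are hypothetical names chosen only to show the idea of keeping one copy per document and replacing it when a newer version arrives.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str   # hypothetical key used to detect duplicates
    version: int  # hypothetical marker: a higher value means a newer copy
    body: str


def upsert(store: dict[str, Document], incoming: Document) -> str:
    """Insert new content, replace older duplicates, skip stale ones."""
    existing = store.get(incoming.doc_id)
    if existing is None:
        store[incoming.doc_id] = incoming
        return "inserted"
    if incoming.version > existing.version:
        store[incoming.doc_id] = incoming
        return "updated"
    return "skipped"


store: dict[str, Document] = {}
first = upsert(store, Document("post-1", 1, "original text"))
second = upsert(store, Document("post-1", 2, "edited text"))
third = upsert(store, Document("post-1", 1, "original text"))
```

After these three calls the store holds a single entry for `post-1`, containing the edited text.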

Use cases for Searchable Storage

Searchable Storage has many uses. Some common use cases are:

  • Deduplication of content, especially content received from multiple sources.
  • Direct usage as a search engine or key product database using the available APIs.
  • Storing content for batch or later processing.
  • Acting as a buffer or backup that holds content during maintenance or outages in a customer's own products.
  • For very complex requirements, serving as the egress of one Pipeline and then as the ingress of many others.
  • Converting a high-volume or highly variable firehose or real-time data source into a more manageable data stream.
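
To make the buffering and batch-processing use cases more concrete, here is a minimal sketch in which an in-memory `deque` stands in for stored content and a consumer drains it in fixed-size batches. The `drain_in_batches` helper is hypothetical and only illustrates the pattern. In practice the content would be read from Searchable Storage through its APIs.

```python
from collections import deque
from typing import Iterator


def drain_in_batches(buffer: deque, batch_size: int) -> Iterator[list]:
    """Yield items from the buffer in order, at most batch_size at a time."""
    while buffer:
        count = min(batch_size, len(buffer))
        yield [buffer.popleft() for _ in range(count)]


# A burst of seven items arrives at once; the consumer handles three at a time.
burst = deque(range(7))
batches = list(drain_in_batches(burst, 3))
```

The burst is absorbed by the buffer, and the downstream consumer sees steady, bounded batches regardless of how unevenly the source delivered them.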

Managing your Searchable Storage

Your Searchable Storage is accessible alongside your other storages within Portal. Your storages are all fully managed; however, you can search directly from this screen and delete storage blocks.

Portal screenshot open to the Manage sidebar panel


Additional management and creation features are coming soon.