Document Inspector

The Inspector UI lets you test and observe a pipeline by fetching documents directly in the browser.

Getting Started

To begin, add a Document Inspector Egress component at any point in the pipeline where you want to fetch documents. The Inspector will cache the latest 1,000 documents per job, as well as up to 1,000 documents not associated with any job. This component does not require any additional configuration.
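The caching behaviour described above can be pictured as one bounded buffer per job, plus one extra buffer for documents with no job. The sketch below is only an illustration of that idea, not the component's actual implementation: the class name, method names, and the `NO_JOB` key are hypothetical, and only the 1,000-document limit comes from the text.

```python
from collections import defaultdict, deque

MAX_DOCS_PER_BUCKET = 1000  # limit stated in the text above
NO_JOB = None  # hypothetical key for documents not associated with any job


class InspectorCacheSketch:
    """Illustrative model only: keeps the newest documents per job."""

    def __init__(self, limit=MAX_DOCS_PER_BUCKET):
        # deque(maxlen=...) drops the oldest entry once the limit is reached
        self._buckets = defaultdict(lambda: deque(maxlen=limit))

    def add(self, document, job_id=NO_JOB):
        """Store a document under its job (or under NO_JOB)."""
        self._buckets[job_id].append(document)

    def latest(self, job_id=NO_JOB):
        """Return the cached documents for a job, newest first."""
        return list(reversed(self._buckets.get(job_id, ())))
```

With a limit of 3, adding five documents to the same job leaves only the three most recent ones, which mirrors how older documents would fall out of a "latest 1,000" cache.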

Once the component is added and the pipeline is deployed, the Inspector features described below become available in the pipeline.

“Standalone” Inspector

When you click Inspect on a deployed Document Inspector Egress, a panel will display the most recent documents processed by that component.

From this panel, you can expand a document to view its full JSON content. The title shown for each document in the list is dynamic: it is taken from the first available field in the following fallback order:

  1. document.content.body
  2. document.doc_date
  3. document.id
  4. Stringified JSON
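The fallback order above can be sketched as a small function. This is a hedged illustration rather than the Inspector's own code: the function name `document_title` is hypothetical, and it assumes that a field counts as "available" when it is present and not empty, which the text does not spell out.

```python
import json


def document_title(document):
    """Pick a list title using the fallback order described above."""
    content = document.get("content")
    # document.content.body is only reachable when content is an object
    body = content.get("body") if isinstance(content, dict) else None

    for candidate in (body, document.get("doc_date"), document.get("id")):
        # Assumption: missing or empty values are treated as unavailable
        if candidate not in (None, ""):
            return str(candidate)

    # Last resort: the whole document as a JSON string
    return json.dumps(document, sort_keys=True)
```

For example, a document with only `doc_date` and `id` would be titled by its `doc_date`, and a document with none of the three fields would be titled by its stringified JSON.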

If your pipeline has more than one Inspector, a Display Inspector selector will be available, allowing you to quickly switch between different Inspectors.

If the Inspector identifies documents with job labels, a Job Filter will be available, enabling you to select and view documents related to a specific job.

❗️

Clear Cache Button

The Clear Cache button removes all documents stored in the Inspector component itself. It does not affect documents in other stages of the pipeline.

To export the documents currently visible in the UI, you can choose between JSON and JSON Lines formats.
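The two export formats differ in layout: a JSON export is a single array containing every document, while JSON Lines puts one document per line. The sketch below shows the difference and how an exported JSON Lines file could be read back. The helper names are hypothetical, and the exact whitespace of the Inspector's own export may differ.

```python
import json


def to_json(documents):
    """One JSON array holding every document."""
    return json.dumps(documents, indent=2)


def to_json_lines(documents):
    """One compact JSON object per line."""
    return "\n".join(json.dumps(doc) for doc in documents)


def from_json_lines(text):
    """Parse JSON Lines text back into a list, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

JSON Lines is usually the more convenient choice for streaming or line-by-line tools, since each line can be parsed on its own, while a JSON array has to be loaded as a whole.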

The Inspector loads 100 documents at a time. You can load additional documents by clicking Load More at the bottom of the list.
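As a rough model of the Load More behaviour, each click can be thought of as requesting the next slice of 100 documents. The function below is a hypothetical sketch of that slicing; only the page size of 100 comes from the text.

```python
PAGE_SIZE = 100  # batch size stated in the text above


def load_more(all_documents, already_shown, page_size=PAGE_SIZE):
    """Return the next batch after the documents already displayed."""
    return all_documents[already_shown:already_shown + page_size]
```

With 250 cached documents, three loads would return 100, 100, and 50 documents, and a fourth would return an empty list.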

Job System Inspector

In components that handle jobs, if any Inspector in the current pipeline has documents related to previous jobs, a Looking Glass button will be available. Clicking this button will take you to a shared Inspector page, pre-configured to display documents from the selected job.

On this shared page, all the features from the Standalone Inspector will be available.

Add & Inspect

Additionally, adding an Inspector component to the pipeline unlocks the Add & Inspect button. This feature allows you to add a job and jump directly to the shared panel with auto-refresh enabled.

🚧

Auto-Refresh

The time it takes for a document to reach the Inspector varies with the data source and the pipeline steps, so the Inspector auto-refreshes for up to 5 minutes before concluding that no documents were found. You can still refresh manually after this timeout.
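The auto-refresh behaviour amounts to polling with a deadline. The sketch below illustrates that pattern under stated assumptions: `fetch_documents` is a hypothetical callable standing in for whatever request the UI makes, and the 5-second polling interval is invented for the example. Only the 5-minute limit comes from the text.

```python
import time

AUTO_REFRESH_TIMEOUT_S = 5 * 60  # 5 minutes, as stated above
POLL_INTERVAL_S = 5  # assumption: the real refresh interval is not documented here


def wait_for_documents(
    fetch_documents,
    timeout_s=AUTO_REFRESH_TIMEOUT_S,
    interval_s=POLL_INTERVAL_S,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll until documents arrive or the timeout elapses.

    Returns the fetched documents, or an empty list on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        documents = fetch_documents()
        if documents:
            return documents
        if clock() >= deadline:
            return []  # timed out; a manual refresh is still possible
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock instead of waiting for real time to pass.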