Assets & Bundles

Assets are composable - they have parent-child relationships. A CSV is one file, but it’s also individual rows you can analyse separately. A PDF is one document, but it’s also pages. Most data you’d want to work with - mbox archives, article feeds, survey exports - can be broken down this way. For example:

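PDF: the document is the parent asset, and each page is a sub-asset.

CSV: the file is the parent asset, and each row is a sub-asset.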

If an adapter for your format doesn’t exist yet, it’s straightforward to build one.
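As a rough illustration of the parent-child model, here is a minimal Python sketch of what a CSV adapter could look like. The `Asset` class, its fields, and the `csv_to_asset` function are hypothetical names chosen for this example, not the platform's actual adapter interface.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical asset model: a node that may hold child assets."""
    kind: str
    title: str
    content: str = ""
    metadata: dict = field(default_factory=dict)
    children: list["Asset"] = field(default_factory=list)


def csv_to_asset(name: str, raw: str) -> Asset:
    """Hypothetical adapter: one parent asset for the file, one child per row."""
    reader = csv.DictReader(io.StringIO(raw))
    columns = list(reader.fieldnames or [])
    parent = Asset(kind="csv", title=name, content=raw, metadata={"columns": columns})
    for index, row in enumerate(reader, start=1):
        # Each row becomes its own sub-asset that can be analysed separately.
        parent.children.append(
            Asset(kind="csv_row", title=f"{name} row {index}", metadata=dict(row))
        )
    return parent


sample = "name,score\nAda,9\nGrace,10\n"
survey = csv_to_asset("survey.csv", sample)
print(survey.title, len(survey.children))  # survey.csv 2
print(survey.children[0].metadata)         # {'name': 'Ada', 'score': '9'}
```

An adapter for another format would follow the same shape under these assumptions: build one parent asset for the whole file, then attach one child per natural unit (page, row, message).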

Formats & Uploading

PDF

The main asset is the PDF with metadata and text content. The pages are sub-assets.

CSV

The CSV file is the parent asset. Its rows are sub-assets.

Articles

Complex content, such as articles with images and links. Our own compositor for writing articles is coming soon.

Web/RSS

Scraped web sources, stored with their metadata as Articles.

Text/Markdown

The simplest asset type: plain text or Markdown.

Images

Images in PNG and JPG formats.

Streaming Sources

Many sources stream data continuously. Set up ongoing discovery and ingestion from:

Web Search

Tavily, SearXNG, and other search APIs

RSS Feeds

RSS & RSS-style XML feeds

Site Discovery

Scrape all items from a page (e.g. cnn.com/economy)

URL Lists

Ingest a list of URLs and detect changes on each refresh.
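One common way to detect changes on refresh is to store a content fingerprint per URL and compare it on the next fetch. The sketch below assumes that approach; the source does not say how change detection is actually implemented, and `fingerprint` and `detect_changes` are hypothetical helpers. Fetching is left out, so page bodies are passed in directly.

```python
import hashlib


def fingerprint(body: str) -> str:
    """Return a stable SHA-256 digest of a fetched page body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def detect_changes(seen: dict[str, str], fetched: dict[str, str]) -> dict[str, list[str]]:
    """Classify each fetched URL as new, changed, or unchanged.

    `seen` maps URL -> fingerprint from the previous refresh and is updated in place.
    `fetched` maps URL -> body text from the current refresh.
    """
    report: dict[str, list[str]] = {"new": [], "changed": [], "unchanged": []}
    for url, body in fetched.items():
        digest = fingerprint(body)
        if url not in seen:
            report["new"].append(url)
        elif seen[url] != digest:
            report["changed"].append(url)
        else:
            report["unchanged"].append(url)
        seen[url] = digest
    return report


seen: dict[str, str] = {}
first = detect_changes(seen, {"https://example.com/a": "v1", "https://example.com/b": "v1"})
second = detect_changes(seen, {"https://example.com/a": "v2", "https://example.com/b": "v1"})
print(first["new"])       # ['https://example.com/a', 'https://example.com/b']
print(second["changed"])  # ['https://example.com/a']
```

Hashing the raw body flags any byte-level difference, including rotating ads or timestamps. A real pipeline would likely hash only the extracted article text.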

Bundles

Bundles organise your assets. Think of them as folders, except that an asset can live in multiple bundles without being duplicated.

By Project

“Climate Policy Research”, “Election Coverage 2024”, “Grant Applications Q1”

By Source

“Government Documents”, “News Articles”, “Academic Papers”

Bundles can nest. Assets can belong to several at once. When you run analysis, you pick which bundles to include.
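To make the membership model concrete, here is a minimal sketch in which bundles hold references to assets (IDs) rather than copies, so one asset can appear in several bundles. The `Bundle` class and `collect_assets` function are hypothetical and only illustrate the idea of selecting bundles for an analysis run.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Hypothetical bundle: holds asset IDs (references, not copies) and child bundles."""
    name: str
    asset_ids: set[str] = field(default_factory=set)
    children: list["Bundle"] = field(default_factory=list)


def collect_assets(bundles: list[Bundle]) -> set[str]:
    """Gather asset IDs from the selected bundles and everything nested inside them."""
    collected: set[str] = set()
    visited: set[int] = set()  # guards against a bundle being reached twice
    stack = list(bundles)
    while stack:
        bundle = stack.pop()
        if id(bundle) in visited:
            continue
        visited.add(id(bundle))
        collected |= bundle.asset_ids
        stack.extend(bundle.children)
    return collected


gov = Bundle("Government Documents", {"pdf-1", "pdf-2"})
news = Bundle("News Articles", {"article-7"})
climate = Bundle("Climate Policy Research", {"pdf-1", "csv-3"}, children=[news])

selected = collect_assets([gov, climate])
print(sorted(selected))  # ['article-7', 'csv-3', 'pdf-1', 'pdf-2']
```

Here `pdf-1` belongs to two bundles but appears once in the result, because the selection is a set of IDs.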
