Assets & Bundles

Upload and organise your research materials

What Are Assets?

Assets are individual pieces of content: PDFs, web articles, CSV files, and text documents. Each asset becomes a searchable, analysable unit in your workspace.

Example: Upload a 50-page government report. The system creates:
  • Parent asset: “Government Report.pdf”
  • Child assets: Individual pages for targeted analysis
  • Extracted text: Searchable content from each page
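
The parent and child structure described above can be pictured with a small data model. This is only a minimal sketch: the `Asset` class, its field names, and the `split_into_pages` and `search` helpers are hypothetical and are not the platform's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """Hypothetical asset record; field names are illustrative only."""
    title: str
    kind: str                      # e.g. "pdf" or "pdf_page"
    text: str = ""                 # extracted, searchable text
    # Back-reference to the parent; excluded from repr/compare to avoid cycles.
    parent: Optional["Asset"] = field(default=None, repr=False, compare=False)
    children: List["Asset"] = field(default_factory=list)


def split_into_pages(title: str, page_texts: List[str]) -> Asset:
    """Build a parent asset with one child asset per page of extracted text."""
    parent = Asset(title=title, kind="pdf")
    for number, text in enumerate(page_texts, start=1):
        child = Asset(title=f"Page {number}", kind="pdf_page", text=text, parent=parent)
        parent.children.append(child)
    return parent


def search(parent: Asset, term: str) -> List[str]:
    """Return titles of child pages whose text contains the term (case-insensitive)."""
    return [c.title for c in parent.children if term.lower() in c.text.lower()]


report = split_into_pages(
    "Government Report.pdf",
    ["Executive summary of the budget.", "Housing policy details.", "Budget tables."],
)
print(search(report, "budget"))
```

Because every page is its own asset, a search can point to the exact pages that match rather than to the whole report.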

Supported Content Types

PDF Documents

Processing: Automatic text extraction with page-by-page breakdown
  • Extract text content from each page
  • Preserve document structure and metadata
  • Handle scanned PDFs with OCR capabilities
  • Create child assets for individual pages
Use Cases: Research papers, government documents, reports, legislation

Web Articles

Processing: Content scraping with media extraction
  • Extract clean article text and metadata
  • Identify and download associated images
  • Parse publication dates and author information
  • Handle RSS feeds and bulk URL processing
Use Cases: News monitoring, blog analysis, social media tracking
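
To illustrate the RSS and metadata steps listed above, here is a rough sketch that reads an RSS 2.0 feed with Python's standard library and pulls out each item's URL, title, author, and publication date. The `feed_items` function, the output keys, and the sample feed are assumptions made for illustration; the platform's own scraper may work quite differently.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from typing import Dict, List, Optional

# Made-up feed used only for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example Feed</title>
  <item>
    <title>First story</title>
    <link>https://example.com/a</link>
    <author>editor@example.com</author>
    <pubDate>Mon, 01 Jan 2024 09:30:00 +0000</pubDate>
  </item>
  <item>
    <title>Second story</title>
    <link>https://example.com/b</link>
  </item>
</channel></rss>"""


def feed_items(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """Return one dict per <item> that has a link, with basic metadata."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # nothing to scrape without a URL
        published = item.findtext("pubDate")
        items.append({
            "url": link,
            "title": (item.findtext("title") or "").strip(),
            "author": item.findtext("author"),
            # RSS 2.0 dates use the RFC 822 format; convert to ISO 8601.
            "published": parsedate_to_datetime(published).isoformat() if published else None,
        })
    return items


items = feed_items(SAMPLE_FEED)
for entry in items:
    print(entry["url"], entry["published"])
```

Each returned URL could then be handed to whatever article scraper is in use; items without a date or author simply carry `None` for those fields.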

Text Content

Processing: Direct text ingestion with metadata support
  • Raw text blocks with custom metadata
  • Structured article creation with embedded assets
  • Support for markdown and rich text formatting
Use Cases: Interview transcripts, survey responses, social media posts

CSV Files

Processing: Parsing with row-level asset creation
  • Automatic delimiter detection
  • Header row identification and validation
  • Create individual assets for each data row
  • Support for large files with streaming processing
Use Cases: Survey data, financial records, voting data, statistical datasets
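
The CSV steps above (delimiter detection, header validation, row-level assets, streaming) can be sketched with Python's built-in `csv` module. This is an illustrative approximation, not the platform's parser: the `iter_row_assets` function, the candidate delimiter list, and the shape of the yielded dictionaries are all assumptions.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def iter_row_assets(stream: TextIO, source_name: str) -> Iterator[Dict[str, object]]:
    """Yield one hypothetical row-level asset dict per CSV data row."""
    sample = stream.read(4096)          # small sample for delimiter sniffing
    stream.seek(0)
    # csv.Sniffer guesses the dialect; the candidate delimiters are an assumption.
    dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")

    reader = csv.DictReader(stream, dialect=dialect)
    headers = reader.fieldnames or []
    # Basic header validation: names must be present, non-blank, and unique.
    if not headers or any(not h.strip() for h in headers) or len(set(headers)) != len(headers):
        raise ValueError(f"{source_name}: header row is missing, blank, or duplicated")

    # Rows are read lazily, so a large file is never loaded into memory at once.
    for number, row in enumerate(reader, start=1):
        yield {"parent": source_name, "title": f"Row {number}", "fields": dict(row)}


data = io.StringIO("id;name;score\n001;Ada;7\n002;Lin;9\n")
assets = list(iter_row_assets(data, "Survey Data.csv"))
print(assets[0]["title"], assets[0]["fields"])
```

Writing the function as a generator is what gives the streaming behaviour: callers can process or store each row asset as it arrives instead of waiting for the whole file.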

Uploading Content

File Upload

The most direct way to add content:
  • Drag and drop files onto the upload area
  • Select files using the file picker
  • Paste URLs for web articles
  • Upload CSV files for structured data

Bulk Upload

For large document sets:
  • Zip files containing multiple documents
  • CSV files with URLs to scrape
  • RSS feeds for continuous monitoring
  • API integration for automated ingestion
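
For the zip option above, a pre-flight check along the following lines could show which files in an archive are ingestible before uploading. It is a sketch under stated assumptions: the `SUPPORTED` extension set and the `list_uploadable` helper are hypothetical, and the real list of accepted file types may differ.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List, Tuple

# Assumed set of ingestible extensions; the platform's actual list may differ.
SUPPORTED = {".pdf", ".csv", ".txt", ".md"}


def list_uploadable(archive_bytes: bytes) -> Tuple[List[str], List[str]]:
    """Split archive members into (accepted, skipped) by file extension."""
    accepted, skipped = [], []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # folders carry no content of their own
            suffix = PurePosixPath(info.filename).suffix.lower()
            (accepted if suffix in SUPPORTED else skipped).append(info.filename)
    return accepted, skipped


# Build a small in-memory archive so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("reports/plan.pdf", b"%PDF-1.4 placeholder")
    archive.writestr("data/survey.csv", "id,answer\n1,yes\n")
    archive.writestr("notes/logo.png", b"not a supported type")

accepted, skipped = list_uploadable(buffer.getvalue())
print("accepted:", accepted)
print("skipped:", skipped)
```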

URL Ingestion

Add web content directly:
  • Single URLs: Individual articles or pages
  • Bulk URLs: Paste multiple URLs at once
  • RSS Feeds: Subscribe to news feeds
  • Search Results: Import search result pages
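
When pasting many URLs at once, it can help to clean the list first. The sketch below shows one possible way to validate and de-duplicate pasted URLs with the standard library. The `parse_pasted_urls` function and its normalisation rules (lower-casing the host, dropping fragments) are assumptions for illustration, not documented platform behaviour.

```python
from typing import List, Tuple
from urllib.parse import urlsplit, urlunsplit


def parse_pasted_urls(pasted: str) -> Tuple[List[str], List[str]]:
    """Return (unique http/https URLs in first-seen order, rejected tokens)."""
    seen, urls, rejected = set(), [], []
    for token in pasted.split():          # split on any whitespace or newline
        parts = urlsplit(token)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            rejected.append(token)
            continue
        # Normalise: lower-case host, default path "/", drop the #fragment.
        normalised = urlunsplit(
            (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
        )
        if normalised not in seen:
            seen.add(normalised)
            urls.append(normalised)
    return urls, rejected


pasted = """https://Example.com/news/1
https://example.com/news/1#comments
ftp://example.com/file
not-a-url
https://example.org"""

urls, rejected = parse_pasted_urls(pasted)
print(urls)
print(rejected)
```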

Organising with Bundles

Bundles are collections of related assets. Think of them as folders for your research materials.

Creating Bundles

Project-Based Organisation:
  • “Climate Policy Analysis”: All climate-related documents
  • “Election Coverage 2024”: News articles from election period
  • “Grant Applications Q1”: Funding requests for review
Source-Based Organisation:
  • “Government Sources”: Official documents and reports
  • “News Outlets”: Media coverage and analysis
  • “Academic Papers”: Research and studies
Topic-Based Organisation:
  • “Immigration Policy”: All immigration-related content
  • “Economic Analysis”: Financial reports and data
  • “Social Issues”: Community and social policy content

Bundle Management

Nested Structure:
Climate Policy Analysis/
├── Government Reports/
│   ├── IPCC Report 2023.pdf
│   └── National Climate Plan.pdf
├── News Coverage/
│   ├── Guardian Articles/
│   └── Reuters Coverage/
└── Academic Research/
    ├── Climate Science Papers/
    └── Policy Analysis Studies/
Cross References:
  • Assets can belong to multiple bundles
  • Create thematic collections without duplicating content
  • Maintain relationships between related documents
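
One way to picture nesting and cross references together is to let bundles hold asset IDs rather than copies of the assets. The following is a minimal sketch of that idea; the `Bundle` class, the `bundles_containing` helper, and the `a1`/`a2` IDs are hypothetical and say nothing about how the platform stores bundles internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Bundle:
    """Hypothetical bundle: holds asset IDs, never copies of the assets."""
    name: str
    asset_ids: Set[str] = field(default_factory=set)
    children: List["Bundle"] = field(default_factory=list)

    def all_asset_ids(self) -> Set[str]:
        """Collect IDs from this bundle and every nested bundle."""
        ids = set(self.asset_ids)
        for child in self.children:
            ids |= child.all_asset_ids()
        return ids


def bundles_containing(asset_id: str, roots: List[Bundle]) -> List[str]:
    """Names of all bundles, at any depth, that directly reference the asset."""
    found = []
    stack = list(reversed(roots))
    while stack:
        bundle = stack.pop()
        if asset_id in bundle.asset_ids:
            found.append(bundle.name)
        stack.extend(reversed(bundle.children))
    return found


# Each asset is stored once, keyed by an illustrative ID.
assets: Dict[str, str] = {
    "a1": "IPCC Report 2023.pdf",
    "a2": "National Climate Plan.pdf",
}

government_reports = Bundle("Government Reports", {"a1", "a2"})
climate = Bundle("Climate Policy Analysis", children=[government_reports])
government_sources = Bundle("Government Sources", {"a2"})

print(bundles_containing("a2", [climate, government_sources]))
print(sorted(climate.all_asset_ids()))
```

Here the National Climate Plan appears in both a project-based and a source-based bundle, while the `assets` dictionary still holds it only once.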

Asset Relationships

Parent-Child Relationships

When processing complex documents, the system creates hierarchical relationships.

PDF Processing:
📄 "Research Report.pdf" (parent)
├── 📝 "Page 1" (child, extracted text)
├── 📝 "Page 2" (child, extracted text)  
├── 📝 "Page 3" (child, extracted text)
└── 🖼️ "Figure 1" (child, extracted image)
Each page becomes a separate asset for targeted analysis while maintaining its connection to the source document.

Web Article Processing:
🌐 "News Article" (parent)
├── 🖼️ "Featured Image" (child, role="featured")
├── 🖼️ "Chart 1" (child, role="content")
└── 🖼️ "Photo Gallery" (child, role="content")
Images are extracted and categorised by their role in the article structure.

CSV Processing:
📊 "Survey Data.csv" (parent)
├── 📋 "Row 1: Respondent 001" (child)
├── 📋 "Row 2: Respondent 002" (child)
├── 📋 "Row 3: Respondent 003" (child)
└── ... (additional rows)
Each row becomes an individual asset for granular analysis and filtering.
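
The hierarchies above could also be represented as flat records that point to their parent, which makes filtering by parent or role straightforward. The record layout below (`id`, `parent_id`, `role`) and the `children_of` helper are assumptions for illustration only, mirroring the web article example.

```python
from typing import Dict, List, Optional

# Illustrative flat records; each child points at its parent by ID.
ASSETS: List[Dict[str, object]] = [
    {"id": 1, "parent_id": None, "title": "News Article", "kind": "web", "role": None},
    {"id": 2, "parent_id": 1, "title": "Featured Image", "kind": "image", "role": "featured"},
    {"id": 3, "parent_id": 1, "title": "Chart 1", "kind": "image", "role": "content"},
    {"id": 4, "parent_id": 1, "title": "Photo Gallery", "kind": "image", "role": "content"},
]


def children_of(parent_id: int, role: Optional[str] = None) -> List[str]:
    """Titles of a parent's children, optionally restricted to one role."""
    return [
        str(a["title"])
        for a in ASSETS
        if a["parent_id"] == parent_id and (role is None or a["role"] == role)
    ]


print(children_of(1))                 # every child of the article
print(children_of(1, role="content")) # only in-body images
```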