Technical Architecture
How HQ and OPOL work under the hood
Open Politics combines powerful data processing capabilities with intuitive user interfaces to democratise political intelligence. This page explains how our system is built and how the components work together.
The following is a very technical section about our tools and methods. If you are more interested in the app (HQ) and what you can do with it as a user, please click here.
Overview
The platform consists of two primary components that operate together:
HQ
Web application that visualises the data and provides analytical tools for working with it
OPOL
Data engine that ingests, processes, and enriches political content
This separation of concerns allows us to evolve each component independently whilst maintaining seamless integration between them.
OPOL: The Data Engine
OPOL serves as the foundation of Open Politics, transforming diverse political data into structured, enriched information ready for analysis.
Diagram: OPOL stack architecture with workflow orchestration
Key Capabilities
Data Ingestion
Collection of data from diverse sources through configurable scrapers
Vector Processing
Conversion of text into numerical vectors for semantic analysis
Entity Recognition
Identification of people, organisations, locations, and other entities
Geocoding
Translation of location mentions into geographical coordinates
LLM Classification
Application of user-defined analytical frameworks using language models
Semantic Search
Finding content based on meaning rather than just keywords
Processing Pipeline
OPOL processes content through a series of interconnected microservices, each handling a specific aspect of the data enrichment pipeline:
Content Ingestion
Scrapers collect articles, documents, and other content from configured sources
Custom sources can be added by creating Python scripts that return data in the format:
URL | Headline | Paragraphs | Source
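A minimal sketch of such a script is shown below. The exact interface OPOL expects is not documented here, so the scrape() entry point and the ScrapedItem container are hypothetical; only the four fields mirror the format above.

```python
# Hypothetical custom source script. Only the four fields (URL, headline,
# paragraphs, source) are prescribed by the docs; the rest is illustrative.
from dataclasses import dataclass


@dataclass
class ScrapedItem:
    url: str
    headline: str
    paragraphs: list[str]
    source: str


def scrape() -> list[ScrapedItem]:
    # A real scraper would fetch and parse pages here,
    # e.g. with requests and BeautifulSoup.
    return [
        ScrapedItem(
            url="https://example.org/articles/1",
            headline="Example headline",
            paragraphs=["First paragraph.", "Second paragraph."],
            source="example.org",
        )
    ]
```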
Vectorisation
Text content is converted into vector embeddings for semantic operations
Vectors enable similarity search, clustering, and other advanced operations
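To make the idea concrete, the sketch below embeds two short documents and ranks them against a query by cosine similarity. The sentence-transformers library and the all-MiniLM-L6-v2 model are assumptions chosen for illustration, not necessarily what OPOL uses.

```python
# Illustrative embedding and similarity ranking; OPOL's actual embedding
# model and library are not specified in these docs.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Parliament passed the new climate bill on Tuesday.",
    "The central bank raised interest rates by 25 basis points.",
]
query = "environmental legislation"

# Encode documents and query into dense vectors.
doc_vecs = model.encode(docs, convert_to_tensor=True)
query_vec = model.encode(query, convert_to_tensor=True)

# Cosine similarity ranks documents by meaning rather than keyword overlap.
scores = util.cos_sim(query_vec, doc_vecs)[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {doc}")
```

Note that the query shares no keywords with the climate article yet should rank it highest; this is the property that distinguishes semantic search from keyword search.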
Entity Extraction
Named entities are identified and categorised using NLP techniques
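A minimal sketch of this step, assuming spaCy for named entity recognition (the actual NLP stack is not specified in these docs):

```python
# Illustrative NER with spaCy; fetch the model once with:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp("Ursula von der Leyen met union leaders in Brussels on Monday.")

# Each entity carries a label such as PERSON, ORG, or GPE (geopolitical entity).
for ent in doc.ents:
    print(ent.text, ent.label_)
```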
Geocoding
Location entities are enriched with geographical coordinates
This enables geospatial visualisation and analysis of political content
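A sketch of the idea, assuming geopy's Nominatim geocoder (OPOL's actual geocoding backend is not specified in these docs and may differ):

```python
# Illustrative geocoding of a location mention into coordinates.
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="opol-docs-example")  # hypothetical agent name

location = geolocator.geocode("Brussels, Belgium")
if location is not None:
    # Latitude/longitude make the mention usable for map-based analysis.
    print(location.latitude, location.longitude)
```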
Classification
User-defined analytical frameworks are applied using LLMs
Classifications can be customised for specific analytical needs
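The general pattern looks like the sketch below: a user-defined schema is turned into an instruction and the model returns structured JSON. The OpenAI SDK, model name, and example schema here are assumptions for illustration; the classification runner itself may work differently.

```python
# Illustrative LLM classification against a user-defined schema.
# The SDK, model, and schema are placeholders, not OPOL's actual setup.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A hypothetical analytical framework defined by the user.
schema = {
    "topic": "one of: economy, environment, security, other",
    "stance": "integer from -2 (strongly against) to 2 (strongly in favour)",
}

article = "Parliament passed the new climate bill on Tuesday..."

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Classify the article. Reply with JSON matching: "
            + json.dumps(schema),
        },
        {"role": "user", "content": article},
    ],
    response_format={"type": "json_object"},
)

print(json.loads(response.choices[0].message.content))
```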
Technical Implementation
OPOL is implemented as a set of interconnected microservices, coordinated through workflow orchestration as shown in the stack architecture diagram above.
HQ: The User Interface
HQ is a modern web application that provides intuitive access to the processed data and analytical capabilities of OPOL.
Key Components
Globe Visualisation
Interactive 3D representation of global political events
- Event clustering and filtering
- Location-based exploration
- Temporal playback
- Entity highlighting
Classification Runner
Interface for defining and applying analytical frameworks
- Schema definition
- Document selection
- Model configuration
- Results visualisation
Search Interface
Unified search across multiple data types
- Semantic search
- Faceted filtering
- Relevance scoring
- Search history
Workspace System
Customisable environments for different analytical needs
- Layout customisation
- Dashboard creation
- Tool integration
- Result sharing
Technical Implementation
- Framework: NextJS-based web application
- Frontend: TypeScript with React components
- Styling: Tailwind CSS for responsive design
- Visualisation: Three.js for 3D globe and graph visualisations
- UI Components: ShadCN for accessible interface elements
- Authentication: Secure user authentication and authorisation
Deployment Options
Open Politics supports multiple deployment scenarios to accommodate different needs:
Self-Hosted
Deploy the entire stack on your own infrastructure
- Complete data sovereignty
- Full customisation options
- Higher resource requirements
- Technical expertise needed
Hybrid
Run HQ locally with a hosted OPOL backend
- Reduced infrastructure requirements
- Data processing handled externally
- Simplified setup and maintenance
- Balance of control and convenience
Fully Hosted
Access through our public hosted instance
- No local infrastructure needed
- Immediate access to features
- Regular updates
- Community data sharing
Custom Deployment
Tailored deployment for specific organisational needs
- Optimised for particular use cases
- Integration with existing systems
- Specialised data sources
- Custom security configurations
System Requirements
For local deployments of the complete stack:
- RAM: 32GB recommended
- Storage: 20GB minimum
- Processor: 4+ cores recommended
A GPU is optional but can significantly improve performance for LLM operations and vector processing.