Introduction

Open Politics’ vision is to democratise political intelligence. The mission is to create an open-source data science and AI toolkit to analyse, summarise, and visualise political information.

A preliminary alpha is running, and we encourage you to experiment with the stack yourself. Please contact us at engage@open-politics.org if you want to request access.

How we engineer data

Why Open Politics Exists

  • Everything in politics, be it news, conflicts or legislative procedures, is hard to keep track of. It’s hard to find the time to read through all the documents and news articles necessary to gain a broad and well-informed understanding of political situations. Technology offers great possibilities to make such processes more accessible. Recently, the advent of Large Language Models has extended the capabilities of textual analysis and understanding. In particular, the ability to formulate tasks in natural language opens up new possibilities for analysing text data, potentially revolutionising the way qualitative and quantitative research can be combined.
  • This project aims to combine the best of natural-language interfaces to LLMs and classical Data Science methods to build tools that provide a comprehensive overview of political topics, including summaries of news articles, information about political actors, and the relationships between them.
  • The goal of this project is to make politics more accessible and understandable for everyone.

Update: SSARE Release

SSARE is Open Politics’ data aggregation system and vector storage endpoint. It aims to create up-to-date and relevant datasets for the LLMs to work with.

A microservice infrastructure continuously scrapes news sites and stores the articles in a vector store and a relational database (Postgres). Sources can be added with Python scripts that yield a dataframe with the columns URL | Headline | Paragraphs | Source. Just clone the service, add your scripts, and bring your own data endpoint into production.
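As a rough illustration of what such a source script might produce, here is a minimal sketch that turns already-scraped records into a pandas dataframe with the four columns named above. The function name `build_articles_frame`, the record layout, the sample data, and the exact column casing are assumptions for illustration only and are not taken from the SSARE codebase; the actual fetching and parsing of a news site is left out.

```python
import pandas as pd

# Column names as listed in this README; the exact casing SSARE expects is an assumption.
COLUMNS = ["URL", "Headline", "Paragraphs", "Source"]


def build_articles_frame(records, source):
    """Build a dataframe with the four expected columns from scraped records.

    `records` is a list of dicts with the hypothetical keys
    "url", "headline" and "paragraphs" (a list of strings).
    """
    rows = []
    for rec in records:
        # Skip incomplete entries rather than storing empty rows.
        if not rec.get("url") or not rec.get("headline"):
            continue
        # Join the non-empty paragraphs into one text field.
        text = " ".join(p.strip() for p in rec.get("paragraphs", []) if p.strip())
        rows.append(
            {
                "URL": rec["url"],
                "Headline": rec["headline"].strip(),
                "Paragraphs": text,
                "Source": source,
            }
        )
    # Passing `columns` keeps the column order stable, even for an empty result.
    return pd.DataFrame(rows, columns=COLUMNS)


# Hypothetical sample records standing in for real scraper output.
sample = [
    {
        "url": "https://example.org/a",
        "headline": " Parliament passes budget ",
        "paragraphs": ["First paragraph. ", "", "Second paragraph."],
    },
    {"url": "", "headline": "Entry without URL", "paragraphs": ["Ignored."]},
]

df = build_articles_frame(sample, source="example_news")
```

With the sample above, only the first record ends up in `df`, since the second has no URL. A real script would replace `sample` with the output of its own scraping logic.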

Want to engage? Look into our Developer Jour Fixe!

  • Interested in the project? Want to contribute? Share a thought?
  • Every Wednesday at 15:30 Berlin time
  • Discord Server: join and talk about the project, ask questions, propose ideas, or just listen in.

Currently needed:

  • Data Scraper Modules
  • Interdisciplinary collaboration on the instruction sets for the LLMs
  • Prompt Engineering suggestions
  • Frontend/UX/UI work

Tasks

Generally researching:

  • Issue/Area Identification
  • Actor Identification
  • Stance Triangulation

Including but not limited to tasks like:

  • Information summarization
  • Vector storage & retrieval
  • Information clustering
  • Entity Extraction (Named Entity Recognition)
  • Q&A Chatbots (for interactive information)
  • Providing historical context
  • Statement & Intention decoding
  • Visual representation of political data
  • Monitoring and alerts
    [..+ unmaintained and heavily overloaded list of features]