Open Politics 🌐
Introduction
Open Politics’ vision is to democratise political intelligence.
The mission is to create an open-source data science and AI toolkit to analyse, summarise, and visualise political information.
A preliminary alpha is running, though we also encourage you to experiment with the stack yourself. To request access, please contact us at engage@open-politics.org.
How we engineer data
Why Open Politics Exists
Everything in politics, be it news, conflicts, or legislative procedures, is hard to keep track of. It is hard to find the time to read through all the documents and news articles needed to gain a broad, well-informed understanding of political situations. Technology offers great possibilities to make such processes more accessible.
Recently, the advent of Large Language Models has extended the capabilities of textual analysis and understanding. In particular, the ability to formulate tasks in natural language opens up new possibilities for analysing text data, potentially revolutionising the way qualitative and quantitative research can be combined.
This project specialises in assembling data, building infrastructure, and embedding tools into meaningful user interfaces.
We generally categorise our work into three pillars
Update: SSARE Release
SSARE is Open Politics’ data aggregation system and vector storage endpoint. It aims to create up-to-date and relevant datasets for the LLMs to work with.
A microservice infrastructure continuously scrapes news sites and stores the articles in a vector store and a relational database (Postgres). Sources can be added with Python scripts that yield a dataframe with the columns URL | Headline | Paragraphs | Source. Just clone the service, add your scripts, and bring your own data endpoint into production.
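As a rough illustration of what such a source script might look like, here is a minimal sketch that turns one article page into a row with the four fields named above. It is not taken from the SSARE codebase: the function names (`parse_article`, `to_dataframe`), the lowercase column spellings, the sample HTML, and the choice of taking the first `<h1>` as headline are all assumptions made for this example. Parsing uses only the Python standard library, and pandas is imported only when the dataframe is actually built.

```python
from html.parser import HTMLParser

# Fields named in the README; the lowercase spelling is an assumption.
COLUMNS = ["url", "headline", "paragraphs", "source"]


class ArticleParser(HTMLParser):
    """Collects the first <h1> as headline and every <p> as a paragraph."""

    def __init__(self):
        super().__init__()
        self.headline = ""
        self.paragraphs = []
        self._tag = None   # tag currently being read: "h1", "p" or None
        self._buffer = []  # text chunks seen inside the current tag

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of a nested element such as <a>
        text = " ".join("".join(self._buffer).split())  # normalise whitespace
        if tag == "h1" and not self.headline:
            self.headline = text
        elif tag == "p" and text:
            self.paragraphs.append(text)
        self._tag = None


def parse_article(url, html, source):
    """Hypothetical helper: build one row for the dataframe from raw HTML."""
    parser = ArticleParser()
    parser.feed(html)
    return {
        "url": url,
        "headline": parser.headline,
        "paragraphs": "\n".join(parser.paragraphs),
        "source": source,
    }


def to_dataframe(rows):
    """Hypothetical helper: convert rows to a pandas DataFrame."""
    import pandas as pd  # imported lazily so parsing works without pandas

    return pd.DataFrame(rows, columns=COLUMNS)


# Made-up sample page standing in for a fetched news article.
SAMPLE_HTML = """
<html><body>
<h1>Parliament passes budget</h1>
<p>The vote took place on Tuesday.</p>
<p>Opposition parties <a href="#">criticised</a> the plan.</p>
</body></html>
"""

row = parse_article("https://example.org/budget", SAMPLE_HTML, "example.org")
```

With pandas installed, `to_dataframe([row])` would produce a one-row dataframe in the column order above. A real scraper would additionally fetch the pages over HTTP and would likely need site-specific selectors rather than the generic `<h1>`/`<p>` rule used here.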
Want to engage? Look into our Developer Jour Fixe!
- Interested in the project? Want to contribute? Share a thought?
- Every Wednesday 15:30 Berlin Time
- Discord Server
Join and talk about the project, ask questions, propose ideas, or just listen in.
Currently needed:
- Data Scraper Modules
- Interdisciplinary collaboration on the instruction sets for the LLMs
- Prompt Engineering suggestions
- Frontend/UX/UI work
Tasks
Generally researching:
- Issue/Area Identification
- Actor Identification
- Stance Triangulation
Including but not limited to tasks like:
- Information summarisation
- Vector storage & retrieval
- Information clustering
- Entity Extraction (Named Entity Recognition)
- Q&A Chatbots (for interactive information)
- Providing historical context
- Statement & Intention decoding
- Visual representation of political data
- Monitoring and alerts
[..+ unmaintained and heavily overloaded list of features]