Classification
How we classify the content in our engine
We use a combination of natural language prompting and structured data validation to perform qualitative content analysis at quantitative scale. This approach allows us to create sophisticated classification schemes that combine the nuanced understanding of LLMs with strict data validation.
The key components of our classification strategy are:
- Natural Language Codebooks: We define our classification criteria in natural language, allowing for rich, nuanced instructions that LLMs can understand and apply consistently.
- Structured Output Enforcement: Using Pydantic models, we strictly define the shape and validation rules of our classification outputs (a minimal schema sketch follows this list).
- Type Safety: We ensure that all classifications conform to our predefined schemas while retaining the flexibility of natural language interpretation.
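To make the Structured Output Enforcement point concrete, here is a minimal schema sketch. The task (sentiment), the label set, and the field names are illustrative assumptions, not the project's actual codebook:

```python
from enum import Enum

from pydantic import BaseModel, Field


class Sentiment(str, Enum):
    """A closed label set: the LLM must pick exactly one of these values."""
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


class Classification(BaseModel):
    """Validated output shape for a single classified text."""
    sentiment: Sentiment
    confidence: float = Field(ge=0.0, le=1.0, description="Self-reported confidence")
    rationale: str = Field(description="One-sentence justification, per the codebook")
```

Any response that does not parse into this shape is rejected at validation time, which is what keeps the natural-language side of the pipeline honest.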
Here’s an example of how these components work together:
We are using a local LLM for this example; to switch to a hosted model, just set the LOCAL_LLM flag in the .env file to False. We route requests through LiteLLM to ensure API compatibility, but you can use any model and endpoint that is Instructor -> LiteLLM compatible (most are).
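Below is a minimal end-to-end sketch that reuses the Classification model from above. It assumes Instructor's LiteLLM integration (instructor.from_litellm); the codebook text, the model names, and the way the LOCAL_LLM flag is read are illustrative assumptions, not the project's exact code:

```python
import os

import instructor
from litellm import completion

# Assumed convention: LOCAL_LLM=True in .env routes to a local model,
# anything else falls back to a hosted one. Both model names are placeholders.
USE_LOCAL = os.getenv("LOCAL_LLM", "True").lower() == "true"
MODEL = "ollama/llama3" if USE_LOCAL else "gpt-4o-mini"

# The natural-language codebook: classification criteria as plain instructions.
CODEBOOK = """Classify the sentiment of the text.
- positive: expresses approval or satisfaction
- neutral: factual or mixed, with no clear leaning
- negative: expresses criticism or dissatisfaction
Justify your label in one sentence."""

# Patch LiteLLM's completion function so every response is parsed and
# validated against the Pydantic schema before our code sees it.
client = instructor.from_litellm(completion)

result = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": CODEBOOK},
        {"role": "user", "content": "The update broke my workflow and support never replied."},
    ],
    response_model=Classification,  # Instructor enforces the schema here
)

print(result.sentiment, result.confidence, result.rationale)
```

Because LiteLLM normalizes provider APIs and Instructor handles schema enforcement, swapping models means changing only the MODEL string; the codebook and the Pydantic validation stay untouched.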