Why one model is never enough
The instinct when building AI-powered learning tools is to route everything through the most capable foundation model available. A sufficiently large model can answer questions, explain concepts, generate examples, and evaluate responses. It does all of these things adequately. The problem is that adequacy is not the standard for learning. A model that generates a plausible explanation of photosynthesis is not the same as a model that detects a specific misconception about the Calvin cycle and generates an explanation targeted precisely at that gap. The second task requires different reasoning, different calibration, and often a different architecture.
The five cognitive task types
- Factual recall: closed-domain retrieval; small, fast model with a high confidence threshold
- Concept formation: requires analogies and examples; larger model with creative latitude
- Procedural application: step-by-step reasoning; chain-of-thought model with validation
- Causal reasoning: multi-hop inference; largest context window, high tolerance for ambiguity
- Metacognitive reflection: assessing the learner's own understanding; specialised evaluator
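One way to picture the mapping above is as a lookup table from task type to a model profile. The sketch below is illustrative only: the model identifiers ("small-fast", "large-creative", and so on), the `ModelProfile` type, and the `route` function are hypothetical names chosen for this example, not part of any described implementation.

```python
from dataclasses import dataclass
from enum import Enum


class TaskType(Enum):
    FACTUAL_RECALL = "factual_recall"
    CONCEPT_FORMATION = "concept_formation"
    PROCEDURAL_APPLICATION = "procedural_application"
    CAUSAL_REASONING = "causal_reasoning"
    METACOGNITIVE_REFLECTION = "metacognitive_reflection"


@dataclass(frozen=True)
class ModelProfile:
    model: str  # hypothetical model identifier, not a real product name
    notes: str  # the trait the list above associates with this task type


# Assumed routing table: one profile per cognitive task type.
ROUTES = {
    TaskType.FACTUAL_RECALL: ModelProfile("small-fast", "high confidence threshold"),
    TaskType.CONCEPT_FORMATION: ModelProfile("large-creative", "creative latitude"),
    TaskType.PROCEDURAL_APPLICATION: ModelProfile("chain-of-thought", "validated steps"),
    TaskType.CAUSAL_REASONING: ModelProfile("largest-context", "high ambiguity tolerance"),
    TaskType.METACOGNITIVE_REFLECTION: ModelProfile("evaluator", "specialised evaluator"),
}


def route(task_type: TaskType) -> ModelProfile:
    """Return the model profile assigned to a task type."""
    return ROUTES[task_type]
```

For example, `route(TaskType.FACTUAL_RECALL).model` yields the small, fast profile. Classifying an incoming learner request into a `TaskType` is the harder problem and is deliberately left out of this sketch.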
Validation before delivery
The routing layer, which matches each task type to a suitable model, is only half of the Talos architecture. Every model output passes through a validation layer before it reaches the learner. This layer checks for factual consistency, appropriate difficulty calibration, and the absence of confabulation. It is not a secondary model judging the output of a primary model; it is a set of lightweight classifiers trained on labelled examples of good and bad learning responses. The result is a system that is slower than a direct API call but significantly more reliable than any single model operating without oversight.
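A validation layer of this kind can be sketched as a set of independent checks that must all pass before a response is delivered. The code below is a minimal illustration under stated assumptions: the `validate` function, the 0.5 threshold, and the three toy checks are hypothetical stand-ins, and simple keyword rules take the place of trained classifiers purely to keep the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A check scores a candidate response in [0, 1]; higher is better.
Check = Callable[[str], float]


@dataclass
class ValidationResult:
    passed: bool
    failed_checks: List[str]


def validate(response: str, checks: Dict[str, Check],
             threshold: float = 0.5) -> ValidationResult:
    """Run every check; the response passes only if none score below threshold."""
    failed = [name for name, check in checks.items() if check(response) < threshold]
    return ValidationResult(passed=not failed, failed_checks=failed)


# Toy placeholders. A real system would call trained classifiers here.
def toy_factual_consistency(text: str) -> float:
    # Placeholder: penalise one known-false statement.
    return 0.0 if "plants absorb oxygen to make glucose" in text.lower() else 1.0


def toy_difficulty(text: str) -> float:
    # Placeholder: treat responses over 60 words as too demanding.
    return 1.0 if len(text.split()) <= 60 else 0.0


def toy_no_confabulation(text: str) -> float:
    # Placeholder: flag an unsupported appeal to authority.
    return 0.0 if "studies show" in text.lower() else 1.0


CHECKS: Dict[str, Check] = {
    "factual_consistency": toy_factual_consistency,
    "difficulty_calibration": toy_difficulty,
    "no_confabulation": toy_no_confabulation,
}
```

Calling `validate(text, CHECKS)` returns which checks failed, so a caller could regenerate the response or fall back to a safer one instead of delivering it. Running the checks adds latency to every response, which is the trade-off the paragraph above describes.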