Study Roadmap
AI-Native 2026
The technical guide for engineers mastering Neuro-Symbolic Context Engineering — from LLM internals and KV cache to MCP, DSPy, ontologies and autonomous agent production.
LLM Internals & Scaling Laws
Transformer architecture, attention mechanisms, KV cache and the laws governing cognitive capability emergence
Transformer Architecture In Depth
Q/K/V as linear projections, Multi-Head vs Grouped-Query Attention, FlashAttention and KV cache mechanics that make inference efficient in long contexts
- Self-attention: Q/K/V projections, √d_k scaling, softmax operation
- Multi-Head vs Multi-Query Attention vs Grouped-Query (GQA) trade-offs
- KV cache: K/V accumulation per token, eviction policy, prefill vs decode phase
- Positional Encoding: RoPE (rotary, relative; long-context extrapolation via frequency scaling), ALiBi (linear bias), NoPE
- FlashAttention 2/3: IO-aware exact attention, SRAM tiling, memory linear in sequence length
- Speculative decoding: drafter + verifier, token acceptance rate and speedup
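The attention and KV cache bullets above can be illustrated with a minimal single-head sketch in plain Python. It assumes Q/K/V are already projected, and the names `attend`, `KVCache` and `decode_step` are hypothetical helpers for illustration, not taken from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, keys, values):
    """Scaled dot-product attention for one query over all cached keys/values."""
    d_k = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


class KVCache:
    """Accumulates one K and one V vector per processed token (no eviction)."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)


def decode_step(cache, q, k, v):
    """Decode phase: store the new token's K/V, then attend over the whole cache."""
    cache.append(k, v)
    return attend(q, cache.keys, cache.values)
```

During decode only the new token's K and V are computed; earlier tokens are read back from the cache, which is what keeps per-token cost from growing with a full recomputation of the prefix.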
Scaling Laws & Capability Emergence
Kaplan and Chinchilla laws, phase transitions for emergent capabilities, BPE tokenization, sparse MoE and post-transformer SSM architectures
- Kaplan (2020): power law between compute, parameters and loss; favored growing parameters faster than data, which later work found undertrains large models
- Chinchilla (2022): tokens ≈ 20× parameters, compute-optimal frontier
- Emergence: BIG-Bench phase transitions, unpredictable capability jumps
- BPE & SentencePiece tokenization: byte-level, vocab size vs coverage trade-off
- MoE (Mixture of Experts): routing, sparse activation, unconfirmed claims about GPT expert counts
- SSM alternatives: Mamba (selective state space), RWKV hybrid approaches
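As a worked example of the Chinchilla rule of thumb above, the sketch below splits a compute budget into parameters and tokens. It assumes the common approximation C ≈ 6·N·D for training FLOPs together with D ≈ 20·N; both constants are approximations, and `compute_optimal` is a hypothetical helper name.

```python
import math

TOKENS_PER_PARAM = 20.0        # rule of thumb from the Chinchilla bullet
FLOPS_PER_PARAM_TOKEN = 6.0    # assumed approximation: C ≈ 6 * N * D


def compute_optimal(compute_flops):
    """Return (parameters, tokens) for a FLOP budget under C = 6*N*D and D = 20*N."""
    # Substituting D = 20*N gives C = 120 * N^2, so N = sqrt(C / 120).
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    n, d = compute_optimal(5.88e23)
    print(f"params ~ {n:.3g}, tokens ~ {d:.3g}")
```

A budget of 5.88e23 FLOPs lands at roughly 70B parameters and 1.4T tokens under these assumptions.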
Context Engineering
The discipline of designing, compressing and managing context to maximize cognitive performance — my core specialty
Context Window Architecture
Token budget management, RAPTOR and compression chains, prefix caching and sliding window policies for 1M+ token contexts
- Budget allocation: static vs dynamic, per-component accounting, headroom policy
- "Lost in the Middle": relative position matters — primacy and recency bias
- RAPTOR: recursive abstractive processing, hierarchical semantic clustering
- KV cache reuse: prefix sharing, cache warming, TTL and invalidation triggers
- Sliding window + chunking: overlap, stride, relevance-based span selection
- Context compression: entropy-weighted pruning, selective summarization chains
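The sliding window bullet (overlap and stride) can be sketched as follows. This is a minimal illustration over an already tokenized sequence; `sliding_chunks` is a hypothetical name and relevance-based span selection is left out.

```python
def sliding_chunks(tokens, size, overlap):
    """Split tokens into windows of `size` that share `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        # Stop once a window has reached the end, so no redundant tail chunk is emitted.
        if start + size >= len(tokens):
            break
    return chunks
```

With ten tokens, `size=4` and `overlap=2`, the stride is 2 and every token except the first and last two appears in exactly two windows.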
Advanced RAG & Memory Systems
Vector store internals (HNSW, IVF, PQ), hybrid BM25+dense retrieval, neural reranking and episodic memory architectures for long-running agents
- Dense retrieval: bi-encoders, cross-encoders, late interaction ColBERT
- HNSW vs IVF+PQ: recall@k, search latency, index size trade-offs
- Hybrid search: BM25 + dense, RRF (Reciprocal Rank Fusion)
- Neural reranking: cross-encoder reranker, MonoT5, listwise rerankers
- Episodic vs semantic memory: MemGPT, Mem0, A-MEM consolidation
- Memory policies: TTL, importance scoring, forgetting curves
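Reciprocal Rank Fusion from the hybrid search bullet is small enough to sketch directly. The version below assumes each retriever returns an ordered list of document ids and uses the commonly cited constant k = 60; the function name and the alphabetical tie-break are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a BM25 ranking `["a", "b", "c"]` with a dense ranking `["b", "c", "a"]` puts `"b"` first, since it sits near the top of both lists.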
Neuro-Symbolic Architecture
The convergence between symbolic reasoning and statistical learning — the foundation of Neuro-Symbolic Context Engineering
DSPy & Declarative LM Programming
DSPy transforms prompt engineering into typed LM module programming — Signature, ChainOfThought, Retrieve and MIPRO/BootstrapFewShot optimizers
- DSPy Signature: typed input/output spec replacing prompt string literals
- Modules: dspy.Predict, ChainOfThought, ReAct, ProgramOfThought, Retrieve
- Optimizers: BootstrapFewShot, MIPRO v2, COPRO — automatic prompt optimization
- Assertions & Suggestions: declarative constraints on output; dspy.Assert enforces a hard constraint, dspy.Suggest is a soft one that triggers a retry with feedback
- TypedPredictor: Pydantic models as output type, automatic validation
- End-to-end pipeline: compilation, traces, evals integrated with optimizer
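The idea behind bootstrapped few-shot optimization can be sketched without any library: run the program on training examples, keep the runs that pass a metric, and reuse them as demonstrations. The code below is a library-free illustration of that idea only. It is not DSPy's API, and `bootstrap_demos`, `render_prompt` and the dictionary keys are hypothetical names.

```python
def bootstrap_demos(program, trainset, metric, max_demos=4):
    """Collect input/output pairs whose prediction passes the metric."""
    demos = []
    for example in trainset:
        prediction = program(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
        if len(demos) >= max_demos:
            break
    return demos


def render_prompt(demos, question):
    """Assemble a few-shot prompt from the collected demos plus the new question."""
    blocks = [f"Q: {d['question']}\nA: {d['answer']}" for d in demos]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)
```

In a real pipeline `program` would call a language model and `metric` would be the eval used by the optimizer; here both are plain callables so the selection logic can be tested in isolation.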
Ontologies, Graphs & Formal Reasoning
OWL/RDF ontology engineering, SPARQL for knowledge graph queries, GraphRAG and first-order logic integration with LLMs
- OWL 2: classes, object properties, axiomatic restrictions (DL expressivity)
- RDF/SPARQL 1.1: triple graphs, SELECT/CONSTRUCT/ASK, property paths
- KG Embeddings: TransE, RotatE, ComplEx — latent space representation
- GraphRAG & Subgraph-RAG: subgraph retrieval as structured context
- Constraint propagation: SAT, CSP solvers as LLM output validators
- Logic programming + LLMs: Prolog, Datalog, Answer Set Programming (ASP)
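To make the RDF and property path bullets concrete, here is a toy in-memory triple store. It is a sketch under simplifying assumptions: triples are plain Python tuples, `match` mimics a single triple pattern with wildcards, and `reachable` approximates a one-or-more path such as `rdfs:subClassOf+`. None of this is a SPARQL engine, and the function names are hypothetical.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def reachable(triples, start, predicate):
    """Nodes reachable from `start` by following `predicate` one or more times."""
    seen = set()
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for _, _, obj in match(triples, s=node, p=predicate):
            if obj not in seen:  # guards against cycles in the graph
                seen.add(obj)
                frontier.append(obj)
    return seen
```

The resulting set of ancestors is the kind of compact, structured subgraph that can be serialized into an LLM prompt as context.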
MCP & Agentic Protocols
Model Context Protocol spec 2025-11-25: transports, tool contracts, OAuth security and agent-to-agent A2A protocol
MCP Internals: JSON-RPC & Transports
Host/Client/Server architecture, JSON-RPC 2.0 over stdio, HTTP+SSE and Streamable HTTP — session lifecycle and capability negotiation
- JSON-RPC 2.0: request/response/notification, error code taxonomy, batching (part of JSON-RPC 2.0, dropped from MCP as of spec 2025-06-18)
- stdio transport: newline-delimited framing, process lifecycle, init sequence
- HTTP+SSE (legacy transport, superseded by Streamable HTTP): SSE for server→client (GET), POST for client→server
- Streamable HTTP (spec 2025-11-25): session resumption, SSE upgrade
- Capability negotiation: initialize handshake, protocol versioning, roots
- Tool annotations: readOnlyHint, destructiveHint, idempotentHint, openWorldHint
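The stdio framing and initialize handshake can be sketched with the standard library alone. The message shapes follow JSON-RPC 2.0 and the MCP `initialize` request; the client name, version string and helper names (`make_initialize_request`, `frame`, `read_frames`) are illustrative assumptions, and no real server is contacted.

```python
import json


def make_initialize_request(request_id, protocol_version="2025-11-25"):
    """Build the first request of an MCP session (client -> server)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }


def frame(message):
    """Newline-delimited framing: one compact JSON object per line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def read_frames(data):
    """Split a byte stream back into JSON-RPC messages."""
    return [json.loads(line) for line in data.decode("utf-8").split("\n") if line.strip()]
```

A notification such as `{"jsonrpc": "2.0", "method": "notifications/initialized"}` carries no `id`, which is how a receiver tells it apart from a request that expects a response.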
MCP Security & Tool Contracts
OAuth 2.1 with PKCE for remote servers, JSON Schema validation for tools, Sampling schema and prompt injection defense via MCP tools
- OAuth 2.1 + PKCE: authorization code flow, token rotation for remote MCP
- Tool JSON Schema: strict input validation, additionalProperties: false
- Sampling schema: temperature, maxTokens, stopSequences and modelPreferences as contract
- Prompt injection via MCP: attack vectors, tool result poisoning, mitigations
- Sandboxing: isolated Docker for destructive tools, read-only mounts
- A2A Protocol (Google): agent-to-agent via HTTP+JSON-RPC, agent cards
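The PKCE part of the OAuth 2.1 bullet reduces to two small functions: generate a random code verifier, then derive the S256 code challenge as the unpadded base64url encoding of its SHA-256 digest. The sketch below uses only the standard library; the function names are hypothetical and the rest of the authorization code flow is omitted.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes=32):
    """Random URL-safe verifier; 32 random bytes encode to 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier):
    """S256 challenge: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.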
Autonomous Agent Patterns
ReAct, Tree-of-Thoughts, Reflexion, MCTS and multi-agent patterns with explicit coordination and Human-in-the-Loop
Reasoning Patterns & Self-Reflection
ReAct (reason+act), Tree-of-Thoughts with beam search and MCTS, Reflexion with verbal memory and Self-Consistency via multiple sampling
- ReAct: thought→action→observation loop, external environment grounding
- Chain-of-Thought (Wei et al.): few-shot exemplar selection, step-by-step reasoning; zero-shot CoT (Kojima et al.)
- Tree-of-Thoughts: reasoning nodes, beam search, BFS vs DFS vs MCTS
- Reflexion (Shinn et al.): episodic state, self-eval criteria, verbal memory
- Self-Consistency: multiple reasoning paths, voting aggregation
- Evaluator-Optimizer: generator + critic loop with defined external criterion
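Self-Consistency's voting step can be sketched as a majority vote over final answers extracted from several sampled reasoning paths. The sampling itself is out of scope here; the function name and the simple normalization (strip and lowercase) are assumptions for illustration.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Majority vote over sampled final answers; returns (answer, agreement ratio)."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)
```

The agreement ratio doubles as a cheap confidence signal: a low ratio suggests the question may deserve more samples or escalation to a stronger verifier.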
Multi-Agent & Orchestration
Orchestrator-Workers, Parallelization, structured inter-agent communication, shared state and Human-in-the-Loop patterns with checkpoints
- Orchestrator-Workers: dynamic delegation, capability and specialization routing
- Parallelization: fan-out + join, rate limiting, concurrency control per tool
- Inter-agent communication: typed message contracts, schema validation
- Shared state: eventual consistency, conflict resolution, CRDT patterns
- Self-healing: automatic diagnosis, retry with backoff, circuit breaker
- HITL (Human-in-the-Loop): checkpoints, interrupt patterns, approval gates
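The circuit breaker named in the self-healing bullet can be sketched as a small wrapper around tool calls. This is a minimal single-threaded version under stated assumptions: the thresholds are arbitrary, the clock is injectable so the behavior can be tested without sleeping, and `CircuitBreaker` and `CircuitOpenError` are hypothetical names.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: half-open state, a single failure reopens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success fully closes the circuit
        return result
```

Retry with backoff would wrap `call` from the outside; the breaker's job is only to stop hammering a tool that keeps failing.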
AI-Native Development
Claude Code, GitHub Copilot, Cursor — and the design of CLAUDE.md, AGENTS.md, instructions, hooks and skills that shape agentic behavior
Claude Code & Copilot — Agentic Loops
Agentic loop perceive→plan→act→reflect, parallel subagents, CLAUDE.md as agent contract and GitHub Copilot agent mode with MCP integration
- Claude Code: subagents, parallel tasks, extended thinking in code review
- CLAUDE.md: project structure, commands, best practices — agent contract
- AGENTS.md: multi-agent coordination, project map, agent skill routing
- Copilot agent mode: inline + sidebar + agent, tool calls, MCP servers
- .instructions.md: applyTo globs, scoped context, instruction layering
- Cursor: .cursor/rules vs .cursorrules, composer context, notepads
Skills, Hooks & Context Injection
SKILL.md design, lifecycle hooks (SessionStart, PostToolUse), automatic context injection and the Neuro-Symbolic Context Engine as single source of truth
- SKILL.md: structure, trigger conditions, domain knowledge packaging
- Hooks: SessionStart (pre-load), PostToolUse (observe), pre-commit (validate)
- Context injection: auto-sync, workspace manifest, pre-loaded knowledge digests
- Neuro-Symbolic Context Engine: projectId, activity routing, depth levels
- Knowledge base: contexts, agents, shared infrastructure, MCP auto-generation
- Self-healing protocol: implement → tsc → vitest → fix loop (max 3 cycles)
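The self-healing protocol in the last bullet (implement, type-check, test, fix, at most 3 cycles) can be sketched as a bounded loop. The version below takes the steps as injected callables so it stays runnable anywhere; wiring `checks` to real `tsc` and `vitest` invocations is left as an assumption, and all names are hypothetical.

```python
def run_checks(checks):
    """Run checks in order; return (name, output) of the first failure, or None."""
    for name, check in checks:
        ok, output = check()
        if not ok:
            return name, output
    return None


def self_heal(implement, checks, fix, max_cycles=3):
    """implement once, then alternate check and fix for at most `max_cycles` fixes."""
    implement()
    for cycle in range(max_cycles):
        failure = run_checks(checks)
        if failure is None:
            return {"ok": True, "fix_cycles": cycle}
        fix(*failure)  # hand the failing step and its output to the fixer
    # Budget exhausted: report whether the last fix happened to succeed.
    return {"ok": run_checks(checks) is None, "fix_cycles": max_cycles}
```

Capping the loop matters more than the exact number: without a bound, an agent can burn tokens indefinitely on a failure it cannot fix, whereas a failed result after the cap is a natural point for a human checkpoint.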
Evaluation, Observability & Production
RAGAS, LLM-as-judge, distributed tracing with LangSmith/Phoenix, adversarial red-teaming and production governance
LLM & RAG Evaluation (Evals)
RAGAS (context_precision, faithfulness, answer_relevancy), LLM-as-judge, Expected Calibration Error, hallucination detection and technical benchmarks
- RAGAS: context_precision, context_recall, faithfulness, answer_relevancy — RAG metrics
- LLM-as-judge: preference modeling, G-Eval, scalable oversight for annotation
- Calibration: ECE (Expected Calibration Error), reliability diagrams, temperature scaling
- Hallucination detection: factuality scoring, entailment classifiers, SelfCheckGPT
- Benchmarks: MMLU, HELM, BIG-Bench, LMSYS Arena Elo, GAIA, SWE-bench
- Evals framework: promptfoo, LangFuse evals, custom harness with CI integration
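Expected Calibration Error from the calibration bullet can be computed in a few lines. The sketch below assumes equal-width confidence bins and binary correctness labels; the bin count of 10 is a common default rather than a requirement, and the function name is illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin size / N) * |accuracy(bin) - mean confidence(bin)|."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need equally sized, non-empty inputs")
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, h in bucket if h) / len(bucket)
        ece += (len(bucket) / total) * abs(accuracy - mean_conf)
    return ece
```

A model that reports 0.9 confidence on four answers and gets three right has an ECE of about 0.15: it is overconfident by fifteen points in that bin.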
Observability & Production Safety
LangSmith and Phoenix/Arize for LLM tracing, adversarial red-teaming, Constitutional AI, guardrails and cost-efficient deployment strategies
- Tracing: LangSmith, Phoenix/Arize — spans, traces, token accounting per request
- Metrics: P95/P99 latency, TTFT (Time-to-First-Token), throughput, tokens/s
- Red-teaming: jailbreaks, indirect injection, data poisoning, model inversion
- Constitutional AI: RLAIF (AI feedback guided by written principles) for helpful, honest and harmless behavior
- Guardrails: NeMo Guardrails, Llama Guard 3, Rebuff prompt injection detector
- Deployment: serverless vs batch inference, cost/quality frontier, caching
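The latency metrics above (P95/P99, TTFT, tokens/s) can be derived from per-request timestamps. This sketch assumes each request record carries `t_start`, `t_first_token`, `t_end` and `output_tokens` fields, which are hypothetical names, and it uses the nearest-rank percentile definition with integer percentiles.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples or not 0 < p <= 100:
        raise ValueError("need samples and 0 < p <= 100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p * n / 100
    return ordered[rank - 1]


def latency_report(requests):
    """Summarize TTFT, end-to-end latency and decode throughput."""
    ttft = [r["t_first_token"] - r["t_start"] for r in requests]
    total = [r["t_end"] - r["t_start"] for r in requests]
    decode_time = sum(r["t_end"] - r["t_first_token"] for r in requests)
    tokens = sum(r["output_tokens"] for r in requests)
    return {
        "ttft_p95": percentile(ttft, 95),
        "latency_p95": percentile(total, 95),
        "latency_p99": percentile(total, 99),
        # Throughput over decode time only, so prefill does not dilute it.
        "tokens_per_s": tokens / decode_time if decode_time > 0 else 0.0,
    }
```

Tail percentiles are reported instead of means because a small share of slow requests dominates user-perceived latency.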