#llm

3 posts

LLM Fundamentals: Parameters, Embeddings, and Attention

How LLM parameters encode meaning, what embedding dimensions actually represent, and why Transformer attention is computed in parallel.

RAG Architecture Fundamentals — pgvector, FastAPI, SSE Streaming, and Embedding Models

Core RAG concepts learned while planning LinguaRAG: offline/online phase separation, SSE streaming mechanics, prompt assembly, and the role of pgvector.

Effective Context Engineering for AI Agents

Anthropic's context engineering guide covers strategies for improving AI agent performance through deliberate token management, moving beyond simple prompt engineering to manage the entire information ecosystem: system instructions, tools, external data, and message history.