LLM Fundamentals: Parameters, Embeddings, and Attention
How LLM parameters encode meaning, what embedding dimensions actually represent, and why Transformer attention is computed in parallel.
3 posts
Core RAG concepts worked out while planning LinguaRAG: offline/online phase separation, SSE streaming mechanics, prompt assembly, and the role of pgvector.
Anthropic's context engineering guide covers strategies for improving AI agent performance through deliberate token management, moving beyond simple prompt engineering to curate the entire information ecosystem: system instructions, tools, external data, and message history.