LLM Fundamentals: Parameters, Embeddings, and Attention
How LLM parameters encode meaning, what embedding dimensions actually represent, and why Transformer attention is computed in parallel.
6 posts
A structured map of GenAI application patterns, with a practical framework for deciding how deep to go in each — calibrated to AI Product Engineer vs. AI Engineer roles.
A comprehensive breakdown of AI pipeline concepts learned building LinguaRAG — a Korean-German textbook AI tutor using RAG, SSE streaming, multi-layer prompts, and pgvector.
How to reliably detect structured unit boundaries in a bilingual PDF, and how to keep boilerplate text from polluting RAG vector chunks.
Hard-won lessons from building a robust PDF chunker for a Korean-German textbook: multiple detection guards, line-level copyright stripping, and RAG behavior verification.
Core RAG concepts understood while planning LinguaRAG: offline/online phase separation, SSE streaming mechanics, prompt assembly, and the role of pgvector.