#rag

6 posts

LLM Fundamentals: Parameters, Embeddings, and Attention

How LLM parameters encode meaning, what embedding dimensions actually represent, and why Transformer attention is computed in parallel.

Generative AI Design Patterns Landscape and AI Engineer Career Positioning

A structured map of GenAI application patterns and a practical framework for deciding how deep to go — calibrated to AI Product Engineer vs AI Engineer roles.

AI Pipeline Patterns from LinguaRAG: RAG, Streaming, and Prompt Architecture

A breakdown of the AI pipeline concepts learned while building LinguaRAG, a Korean-German textbook AI tutor that uses RAG, SSE streaming, multi-layer prompts, and pgvector.

PDF RAG Indexing: Unit Detection and Chunk Noise Filtering

How to reliably detect structured unit boundaries in a bilingual PDF and prevent boilerplate text from polluting RAG vector chunks.

PDF Indexing Pipeline: Unit Detection Guards and Copyright Filtering

Hard-won lessons from building a robust PDF chunker for a Korean-German textbook: multiple detection guards, line-level copyright stripping, and RAG behavior verification.

RAG Architecture Fundamentals — pgvector, FastAPI, SSE Streaming, and Embedding Models

Core RAG concepts worked out while planning LinguaRAG: offline/online phase separation, SSE streaming mechanics, prompt assembly, and the role of pgvector.