Building Blocks · Days 21-30

RAG and Knowledge Systems

Retrieval-augmented generation (RAG) connects models to private knowledge. Learn chunking, vector databases, hybrid search, reranking, metadata filters, GraphRAG, document parsing, and multimodal retrieval.

Intermediate · 10 subtopics · 10 daily blocks

Outcome

Design retrieval systems that parse documents, chunk intelligently, search hybrid indexes, rerank results, and cite grounded answers.

Practice builds

Personal knowledge-base RAG
Laravel docs assistant
PDF citation assistant

What to learn

Chunking strategies: fixed, semantic, structural, late chunking
Vector databases: Qdrant, Pinecone, Weaviate, pgvector, Chroma, Milvus
Hybrid search: BM25 plus vector search and reciprocal rank fusion
Query rewriting, expansion, and HyDE
Multi-hop retrieval and agentic retrieval
Reranking with Cohere, BGE, and Voyage-style models
Metadata filtering and structured retrieval
GraphRAG and knowledge-graph augmentation
Document parsing pipelines: Unstructured, LlamaParse, Docling
Multimodal RAG for images, tables, and charts
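
Of the topics above, reciprocal rank fusion is small enough to sketch in full. The Python below is a minimal illustration, not a reference implementation: the function name `reciprocal_rank_fusion`, the document IDs, and the constant `k = 60` (a commonly used default) are assumptions made for the example. Each document scores the sum of `1 / (k + rank)` over every ranked list it appears in, so a document that ranks well in both BM25 and vector search rises to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered best-first (e.g. BM25, vector search).
    k: smoothing constant; 60 is a common default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because the fusion uses only ranks, it needs no score normalization between BM25 and cosine similarity, which is one reason it is a convenient first choice for hybrid search.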

Daily study plan

Day 21: Parse a PDF or Markdown document into clean text blocks.
Day 22: Compare fixed, semantic, and structural chunking.
Day 23: Store embeddings in pgvector or Qdrant and retrieve top-k chunks.
Day 24: Add BM25 search and combine results with reciprocal rank fusion.
Day 25: Add metadata filters for source, date, topic, and document type.
Day 26: Add a reranker and compare answer quality.
Day 27: Add query rewriting or HyDE for weak user questions.
Day 28: Build citations and show which chunks supported the answer.
Day 29: Explore GraphRAG or structured entity extraction.
Day 30: Package the RAG app with a repeatable ingestion pipeline.
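
As a starting point for the Day 22 comparison, fixed chunking with overlap can be sketched in a few lines of Python. This is a baseline under stated assumptions: the name `chunk_fixed` is hypothetical, and the sketch splits on whitespace words rather than model tokens, which a real pipeline would more likely use. Semantic and structural chunkers can then be measured against it.

```python
def chunk_fixed(text, size=200, overlap=40):
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Word-based splitting is a simplification."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


# Tiny demonstration with ten placeholder words.
sample = " ".join(f"w{i}" for i in range(10))
demo_chunks = chunk_fixed(sample, size=4, overlap=1)
print(demo_chunks)
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing some text twice.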

Resources