Building Context-Aware RAG Models in Python
In this talk, we’ll walk through how to build context-aware Retrieval-Augmented Generation (RAG) systems in Python. The idea is to combine the power of large language models with your own custom data to enable semantic search over that data. Whether you’re creating AI assistants, document Q&A systems, or chatbots, this session — drawing on my experience building several RAG systems at Adobe — will walk you through the fundamentals and advanced implementation of search-augmented LLMs in Python.
The talk will begin with a quick overview of RAG systems and why they matter; then I’ll dive into how to set up semantic search with vector databases and connect it to LLMs for contextual responses. We’ll cover:
1. Introduction to RAG: what it is and why it's useful
2. How to convert documents into vector embeddings for search
3. Using FAISS or Chroma to store and query vectors in Python
4. Hooking up LLMs to retrieve context-aware answers
5. Common challenges (like hallucinations) and how to address them
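To make the retrieval step concrete, here is a minimal, dependency-free sketch of the pipeline in items 2–4: embed documents, rank them by similarity to a query, and assemble the retrieved context into a prompt. The bag-of-words "embedding" and in-memory list are toy stand-ins for a real embedding model and a vector store like FAISS or Chroma; all names here are illustrative, not from the talk itself.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; in practice you'd use a real
    # embedding model and store the vectors in FAISS or Chroma.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "RAG combines retrieval with generation.",
    "FAISS stores dense vectors for similarity search.",
    "Hallucinations occur when models invent unsupported facts.",
]
# "Index" the corpus: pair each document with its embedding.
index = [(d, embed(d)) for d in docs]

def retrieve(query, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

def build_prompt(query):
    # Ground the LLM by prepending retrieved context to the question.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What does FAISS do?"))
```

Swapping the toy embedding for a real model and the list for a vector database changes only `embed`, `index`, and `retrieve`; the prompt-building step, which is what grounds the LLM's answer and mitigates hallucinations, stays the same.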