Basic RAG (Retrieval-Augmented Generation)

Overview

This project is a simple Python implementation of Retrieval-Augmented Generation (RAG).
It demonstrates how to:

Load and split .txt documents
Generate embeddings using OpenAI
Store and query embeddings in ChromaDB
Generate concise answers with context retrieved from documents
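
As a rough illustration of the "split" step above, here is a minimal sketch of fixed-size character chunking with overlap. The function name `split_text` and the `chunk_size` and `overlap` parameters are assumptions made for this example; the actual strategy in data_loader.py may differ.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a chunk
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # otherwise the loop would emit a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades some duplicated storage for a lower chance that a relevant passage is cut in half at a chunk boundary.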

It was created to practice document retrieval, vector databases, and LLM-based question answering.

Tools & Libraries

The project uses the following Python libraries:

openai → to generate embeddings and chat responses
chromadb → as the vector database for storing and retrieving document chunks
python-dotenv → to load API keys from a .env file instead of hard-coding them in the source
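
The sketch below shows one way configuration could be read from a .env file with python-dotenv. It is not taken from config.py: the `Settings` class, the `load_settings` function, and the `EMBEDDING_MODEL` and `CHROMA_PATH` variable names and their defaults are all assumptions made for illustration. Only `OPENAI_API_KEY` is the conventional variable name used by the openai library.

```python
import os
from dataclasses import dataclass

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    # Keeps this sketch importable when python-dotenv is not installed.
    load_dotenv = None


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    embedding_model: str  # hypothetical setting
    chroma_path: str      # hypothetical setting


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, or from .env plus the process environment."""
    if env is None:
        if load_dotenv is not None:
            load_dotenv()  # copies variables from a local .env file into os.environ
        env = os.environ

    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")

    return Settings(
        openai_api_key=api_key,
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        chroma_path=env.get("CHROMA_PATH", "./chroma_db"),
    )
```

Failing early on a missing key gives a clearer error than a rejected API request later in the pipeline. The .env file itself should stay out of version control.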

Files in this Repository

main.py → Entry point to run the pipeline and ask questions interactively
config.py → Handles environment variables and configuration
data_loader.py → Loads documents and splits them into chunks
embeddings.py → Generates embeddings using OpenAI
vector_store.py → Sets up and manages the ChromaDB collection
rag.py → RAG pipeline logic (retrieval + response generation)
articles/ → Folder containing .txt files used as knowledge base
requirements.txt → List of dependencies
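
To make the role of data_loader.py and the articles/ folder more concrete, the following sketch reads every .txt file in a directory. The function name `load_documents` and the returned dictionary shape are hypothetical and chosen only for this example.

```python
from pathlib import Path


def load_documents(directory: str) -> list[dict]:
    """Read every .txt file in `directory` into a list of {"id", "text"} dicts."""
    documents = []
    # Sorting makes the order (and any ids derived from it) reproducible.
    for path in sorted(Path(directory).glob("*.txt")):
        documents.append({
            "id": path.name,
            "text": path.read_text(encoding="utf-8"),
        })
    return documents
```

Calling `load_documents("articles")` would return one entry per text file, with the file name available as a stable identifier for the chunks derived from it.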

Workflow

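1. Load the documents and split them into chunks
2. Create an embedding for each chunk
3. Store the chunks and their embeddings in ChromaDB
4. The user asks a question
5. Retrieve the chunks whose embeddings best match the question
6. The LLM generates an answer using the retrieved chunks as context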
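
The workflow above can be sketched end to end without any external services. This is a toy stand-in, not the project's code: a bag-of-words counter replaces the OpenAI embedding, a small in-memory class replaces the ChromaDB collection, and the final LLM call is left out, with only the prompt being built. All names (`embed`, `cosine`, `InMemoryStore`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for an OpenAI embedding vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Stand-in for a ChromaDB collection: keeps (chunk, vector) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question, n_results=2):
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:n_results]]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question concisely using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Index chunks, retrieve for a question, then build the prompt.
store = InMemoryStore()
store.add([
    "ChromaDB stores embeddings and supports similarity search.",
    "Bananas are rich in potassium.",
    "Embeddings map text to vectors.",
])
question = "Which database stores embeddings?"
top = store.query(question, n_results=1)
prompt = build_prompt(question, top)
```

In the real pipeline, `prompt` would be sent to a chat model. Instructing the model to rely only on the supplied context is what keeps the answer grounded in the retrieved documents.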

About

basic-rag is a minimal implementation of Retrieval-Augmented Generation (RAG). It demonstrates how to index documents, retrieve relevant context with vector search, and use a large language model to generate grounded answers.
