This project provides a streamlined way to ingest GitHub repositories, process their contents, and use Retrieval-Augmented Generation (RAG) with a Language Model to answer questions about the code. This project runs locally and requires Ollama with the LLM of your choice installed.
- GitHub Repository Ingestion: Extracts code summaries, file structures, and content from a given GitHub repository.
- File Storage: Saves ingested data to a local file for persistence.
- Retrieval-Augmented Generation (RAG): Uses a vector-based search system to query indexed documents.
- LLM-Powered Queries: Utilizes an Ollama-powered model to answer questions about the ingested codebase.
- Web Interface: A Flask-based front-end with Bootstrap styling for a clean UI.
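The "vector-based search" step above can be illustrated with a toy, dependency-free sketch. The actual project delegates embedding and indexing to llama-index; the bag-of-words vectors below are stand-ins for real embeddings, used only to show how retrieval ranks documents against a query before the LLM answers:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; the real app uses a llama-index
    # embedding model selected by embed_model_name in app.py.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query vector -- the core of
    # retrieval-augmented generation: only the top-k matches are handed
    # to the LLM as context.
    qv = embed(query)
    return sorted(docs, key=lambda d: cosine(qv, embed(d)), reverse=True)[:k]

docs = [
    "the detect_severity function maps log lines to a severity level",
    "the ingest function downloads a github repository",
]
top = retrieve("which function is used for severity detection", docs)
```

Here `top` holds the document mentioning `detect_severity`, since it shares the most query terms; in the real pipeline that retrieved chunk would be passed to the Ollama model as context.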
Ensure you have Python installed (recommended: Python 3.8+). Then install the required dependencies:
pip install llama-index torch gitingest
Additionally, install Ollama and pull the LLM of your choice:
ollama pull <your-llm-model>
For example: ollama pull llama3
- Clone the repository:
  git clone https://github.com/SHASWATSINGH3101/Git_RAG
  cd Git_RAG
- Run the Flask application:
  python app.py
- Open your browser and go to:
  http://127.0.0.1:5000/
- Enter a GitHub repository URL to ingest code.
- Specify directories and embedding model parameters.
- Input a query to retrieve insights about the code.
- View results in the web interface.
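Under the hood, ingestion boils down to writing the repository's summary, tree, and file contents into `data/data.txt` so the indexer can re-read them without re-fetching the repo. A minimal stdlib sketch of that persistence step (in the real app the three strings come from gitingest; the values here are stand-ins):

```python
from pathlib import Path

def save_ingested(summary: str, tree: str, content: str, docs_dir: str = "data") -> Path:
    # Persist the ingested sections into one text file under docs_dir,
    # creating the directory if it does not exist yet.
    out = Path(docs_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "data.txt"
    path.write_text("\n\n".join([summary, tree, content]), encoding="utf-8")
    return path

# Stand-in values; the real ones are produced by ingesting a repo URL.
p = save_ingested("Repo: Git_RAG", "app.py\nREADME.md", "print('hello')")
```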
project_root/
├── data/
│   ├── data.txt
├── index/
├── app.py
├── requirements.txt
└── README.md
You can modify key parameters in app.py:
- Embedding Model: Change embed_model_name to another supported model.
- LLM Model: Update llm_model_name for different LLM variants.
- Index & Data Storage: Adjust index_dir and docs_dir paths.
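For reference, the corresponding assignments in app.py might look like the following. The parameter names come from the list above; the values are illustrative, not necessarily the project's defaults:

```python
# Illustrative values -- check app.py for the project's actual defaults.
embed_model_name = "BAAI/bge-small-en-v1.5"  # any embedding model llama-index supports
llm_model_name = "llama3"                    # must match a model pulled via `ollama pull`
docs_dir = "data"                            # where ingested repository text is written
index_dir = "index"                          # where the persisted vector index lives
```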
Example query: "Which function is used for severity detection in the code?"
Planned improvements:
- Implement user authentication
- Support for multiple repositories
- Additional visualization tools
This project is licensed under the MIT License.
- My GitHub: SHASWATSINGH3101