
PythonicVarun/Anveshak-AI


🎈 Anveshak AI - Query English Documents in Sanskrit

Anveshak AI is a Retrieval-Augmented Generation (RAG) application built with Streamlit that lets users query English documents in Sanskrit. It builds a vector database for the selected document and answers queries using Ollama and LangChain.

🌟 Features

Upload & Process PDFs - Converts documents into vectorized data for efficient retrieval.

Multi-Language Query Support - Users can ask questions in Sanskrit, and the system retrieves relevant English information.

Advanced AI Models - Utilizes Ollama embeddings and LLM models to enhance query responses.

Seamless Integration - Built with Streamlit, allowing for an interactive and user-friendly experience.

Efficient Query Handling - Uses LangChain for better contextual understanding and accurate responses.
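The retrieval step behind these features can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical names, the bag-of-words `embed` is only a stand-in for an Ollama embedding model, and the in-memory list stands in for ChromaDB. The toy embedding only matches shared words, so the example query is in English; matching a Sanskrit query to English text depends on the real embedding model and LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (stand-in for a real text splitter)."""
    words = text.split()
    step = size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts if words[i:i + size]]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption replacing a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "The Ganges flows through northern India.",
    "Python is a programming language.",
    "Streamlit builds web apps in Python.",
]
top = retrieve("Which river flows through India?", chunks, k=1)
```

In a full RAG pipeline, the retrieved chunks would then be passed to the LLM together with the user's question.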

🚀 Installation & Setup

1️⃣ Clone the Repository

 git clone https://github.com/PythonicVarun/Anveshak-AI.git
 cd Anveshak-AI

2️⃣ Set Up Virtual Environment (Recommended)

 python -m venv venv
 source venv/bin/activate   # For Linux/macOS
 venv\Scripts\activate      # For Windows

3️⃣ Install Dependencies

 pip install -r requirements.txt

4️⃣ Pull the Required Ollama Models

Ensure you have the required models before running the application:

 ollama pull nomic-embed-text
 ollama pull llama2
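To confirm that both models were pulled, one option is to ask the local Ollama server for its model list. The sketch below assumes Ollama is running on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint; `installed_models` and `missing_models` are hypothetical helper names, not part of this repository.

```python
import json
import urllib.request


def installed_models(host: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return required models that are not installed, ignoring tags such as ':latest'."""
    base_names = {name.split(":", 1)[0] for name in installed}
    return [name for name in required if name.split(":", 1)[0] not in base_names]


if __name__ == "__main__":
    required = ["nomic-embed-text", "llama2"]
    try:
        missing = missing_models(required, installed_models())
    except OSError as exc:  # server not running or unreachable
        print(f"Could not reach Ollama: {exc}")
    else:
        print("All models present" if not missing else f"Missing: {missing}")
```

Running `ollama list` in a terminal gives the same information without any code.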

5️⃣ Create Environment File

Copy the provided .env.example to a new file named .env. This file contains the default environment settings including Ollama host, vector DB path, and logging level. You can modify it if needed.

 cp .env.example .env
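For readers unfamiliar with `.env` files, here is a rough sketch of how such a file can be read into a dictionary. The key names in the sample (`OLLAMA_HOST`, `LOG_LEVEL`) are assumptions for illustration only; the actual keys are whatever `.env.example` defines, and the application may well use a dedicated library instead of a hand-written parser like the hypothetical `parse_env` below.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical sample content; the real keys live in .env.example.
sample = '# local settings\nOLLAMA_HOST=http://localhost:11434\nLOG_LEVEL = "INFO"\n\n'
settings = parse_env(sample)
```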

6️⃣ Set Environment Variables

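Set the following environment variable so that protobuf uses its pure-Python implementation, which avoids issues with Protocol Buffers:

 export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

On Windows (Command Prompt), use set instead of export:

 set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python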
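If setting the variable in every shell session is inconvenient, the same effect can be had from Python, provided the assignment runs before any module that imports protobuf is loaded. This is only a sketch of that idea; whether `run.py` already does something similar is not stated here.

```python
import os

# Must run before importing any package that loads protobuf,
# otherwise the setting is read too late to take effect.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
```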

▶️ Running the Application

 python run.py

🔧 How to Use

1️⃣ Upload a PDF or Select a Sample Document from the provided list.

2️⃣ Choose an LLM from the available Ollama models.

3️⃣ Type your query in Sanskrit 📜 in the chatbox.

4️⃣ The system will process the question and return an answer based on the document's content.

5️⃣ Click "Delete Collection" if you want to clear uploaded documents from memory.
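Step 4 above is where retrieved document text and the Sanskrit question are combined into a single prompt for the LLM. The sketch below shows one plausible way to assemble such a prompt; `build_prompt` and the prompt wording are assumptions for illustration, not the template the application actually uses.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved English passages and a Sanskrit question into one prompt."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n"
        "The question is written in Sanskrit; the context is in English.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# "गङ्गा कुत्र वहति?" means "Where does the Ganga flow?"
prompt = build_prompt(
    "गङ्गा कुत्र वहति?",
    ["The Ganges flows through northern India."],
)
```

Numbering the passages makes it easier to trace an answer back to the chunk it came from.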

📦 Dependencies

  • Python 3.9+ 🐍
  • Streamlit (for UI)
  • Ollama & LangChain (for AI processing)
  • ChromaDB (for vector storage)
  • pdfplumber (for PDF parsing)

📜 Submission for Hackademia 2k25

Anveshak AI is built as part of the Hackademia 2k25 hackathon challenge to push the boundaries of AI-assisted multilingual knowledge retrieval! 🚀


🌟 Star ⭐ this repo if you like this project!

"Unlock the power of Sanskrit queries with AI-powered retrieval!" 🚀

Built with ❤️ by Varun Agnihotri!

Follow me on GitHub | X | LinkedIn | Instagram

About

AI agent made for PDF prompting for Hackademia 2K25 ⚡
