Arav AI 🤖✨

Multi-Domain-Humanized-AI-Assistant

Arav is a highly responsive, voice-activated AI assistant with a unique Hinglish personality. Built with Python, Groq, and state-of-the-art speech technologies, Arav is designed to be a helpful, context-aware, and engaging companion.

🚀 Features

Voice-Activated: Hands-free operation using wake words ("Hey Arav").
High-Speed Responses: Powered by the Groq LPU™ Inference Engine with the Llama 3.1 model for near-instantaneous replies.
Unique Personality: A custom "Hinglish" persona that makes interactions feel natural and fun.
Context-Aware: Knows the current time, date, and your location for relevant answers.
Real-time Weather: Integrated with Open-Meteo for live weather updates.
Noise Reduction: Cleans microphone input for better recognition in noisy environments.
Switchable Voices: Choose between male or female voice output.
Conversation Memory: Remembers recent dialogue for follow-up questions.
Extensible Skills:
- Movie Recommendations
- Finding Places
- Emotional Support Conversations
- Weather Forecasting
- Fashion Tips
- General Talk and Q&A

🛠️ How It Works

Arav operates on a simple yet effective loop: Listen → Process → Think → Respond

Wake Word Detection: Listens passively for "Hey Arav".
Command Recording: Starts recording after activation.
Audio Processing: Cleans audio via noise reduction algorithms.
Speech-to-Text (STT): Converts speech to text using Google STT.
Context Assembly: Combines the user's query, history, time, and weather.
LLM Inference: Sends the context to Groq for an intelligent, Hinglish-style response.
Text-to-Speech (TTS): Converts the response into natural-sounding audio.
Audio Playback: Plays the response instantly.

🔧 Tech Stack

LLM Engine: Groq (Llama 3.1 8B Instant)
STT: speech_recognition (Google STT)
TTS: edge-tts
Audio Processing: pyaudio, pydub, sounddevice, noisereduce
APIs: Open-Meteo, TMDB
Language: Python 3.9+
Async Framework: asyncio

⚙️ Setup and Installation

Prerequisites

Python 3.9+
API Keys:
- GroqCloud
- TMDB (optional but recommended)

System dependencies for PyAudio:

Debian/Ubuntu/Raspberry Pi OS

sudo apt-get update && sudo apt-get install portaudio19-dev

macOS
```
brew install portaudio
```

Installation Steps

Clone the repository

git clone https://github.com/your-username/arav-ai.git
cd arav-ai

Create & activate a virtual environment

python3 -m venv venv
source venv/bin/activate

Install dependencies
```
pip install -r requirements.txt
```

Set environment variables
Create a .env file in the root directory:

GROQ_API_KEY="gsk_YourGroqApiKeyHere"
TMDB_API_KEY="YourTmdbApiKeyHere"

📦 requirements.txt

groq
requests
speechrecognition
edge-tts
python-dotenv
pydub
numpy
sounddevice
pyaudio
scipy
noisereduce

📚 Libraries Used

Below are the libraries used throughout the project, grouped and explained for clarity.

🔧 Core Python Libraries

json

Used for saving & loading conversation history in .json format.

os

Handles environment variables, file paths, and file clean-ups.

datetime

Fetches live time info to include in the system prompt.

re

Used for regex filtering to remove emojis / markdown before TTS output.

time

Used for pauses, handling delays, and timing operations in audio processing.

🌐 Networking & API Libraries

requests

Performs external API calls including weather (Open-Meteo) and TMDB movie search.

dotenv (load_dotenv)

Loads API keys (GROQ_API_KEY, TMDB_API_KEY) from .env file securely.

🤖 AI Interaction

groq

Used to call Groq LLM (Llama 3.1) for generating conversational responses.

🎙 Audio Input / Output Libraries

speech_recognition

Handles speech-to-text using Google STT API.

pyaudio

Records microphone audio in raw PCM format.

wave

Saves raw byte-stream frames into .wav audio files.

sounddevice

Used to play final generated speech audio.

pydub

Converts MP3 → WAV, handles buffering, helps playback preparation.

edge_tts

Generates Text-to-Speech using Microsoft Edge Neural TTS voices.

io

Handles in-memory audio buffers for TTS playback.

🔊 Audio Processing & Noise Reduction

numpy

Performs array transformations for audio analysis & TTS playback.

scipy.io.wavfile

Used to read/write numeric WAV data for denoising.

noisereduce

Applies deep learning–based noise reduction on recorded voice commands.

🏠 Hardware Interaction

RPi.GPIO

Controls Raspberry Pi GPIO pins (optional future extension: LED, button, sensor triggers).

📦 Project Overview

This project is a fully offline + online hybrid voice assistant, named Arav, powered by:

Groq Llama 3.1 model

Google Speech Recognition

Microsoft Edge Neural TTS

Noise-reduced microphone input

Wake-word detection

Location-aware weather

Conversational memory

⚙️ Core Architecture Summary

1. Wake-Word Listener

Continuously records 3-second audio chunks and checks if the user said:

["arav", "ara", "aarav", "naru", "hey naru"]

2. Voice Command Recording

Once awakened, it records a 7-second command:

Applies noise reduction

Converts to WAV

Sends to Google STT

3. Command Understanding

Processes:

Voice switching

Location change

Sleep mode

Movie queries

Weather questions

4. LLM Interaction

Forms a system prompt including:

Time

Weather

Location

Personality profile (Hinglish, witty, meme-ish)

Movie recommendation

Then calls Groq API for reply.

5. Speech Output

Cleans text of emojis & markdown → sends to Edge TTS → plays via sounddevice.

📁 Important Configurations
Audio Settings:
MIC_SAMPLE_RATE = 16000
MIC_CHANNELS = 1
MIC_FORMAT = pyaudio.paInt16
MIC_CHUNK_SIZE = 1024

Filenames:

temp_raw_audio.wav  
temp_clean_audio.wav  
conversation_history.json

🔥 Why this Architecture Works Great

✔ Lightweight
✔ Super fast (Groq inference)
✔ Good noise reduction
✔ Real conversational memory
✔ Wake-word based hands-free experience
✔ Human-like Hinglish personality

🚀 Running the App

Install Dependencies

pip install -r requirements.txt

Add your .env file:

GROQ_API_KEY=your_key
TMDB_API_KEY=your_key

Run:

python main.py

Arav waits for your voice: “hey Arav”

→ processes your command → responds naturally.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
Experiment_model.py		Experiment_model.py
Final_Modified_Code.py		Final_Modified_Code.py
README.md		README.md
conversation_history.json		conversation_history.json
led.py		led.py
raw_audio.wav		raw_audio.wav
voice_model.py		voice_model.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Arav AI 🤖✨

🚀 Features

🛠️ How It Works

🔧 Tech Stack

⚙️ Setup and Installation

Prerequisites

Installation Steps

📦 requirements.txt

📚 Libraries Used

🔧 Core Python Libraries

🌐 Networking & API Libraries

🤖 AI Interaction

🎙 Audio Input / Output Libraries

🔊 Audio Processing & Noise Reduction

🏠 Hardware Interaction

📦 Project Overview

⚙️ Core Architecture Summary

1. Wake-Word Listener

2. Voice Command Recording

3. Command Understanding

4. LLM Interaction

5. Speech Output

Arav waits for your voice: “hey Arav”

About

Uh oh!

Releases

Packages

Languages

NarayanTheRocker/Arav__Multi-Domain-Humanized-AI-Assistant

Folders and files

Latest commit

History

Repository files navigation

Arav AI 🤖✨

🚀 Features

🛠️ How It Works

🔧 Tech Stack

⚙️ Setup and Installation

Prerequisites

Installation Steps

📦 requirements.txt

📚 Libraries Used

🔧 Core Python Libraries

🌐 Networking & API Libraries

🤖 AI Interaction

🎙 Audio Input / Output Libraries

🔊 Audio Processing & Noise Reduction

🏠 Hardware Interaction

📦 Project Overview

⚙️ Core Architecture Summary

1. Wake-Word Listener

2. Voice Command Recording

3. Command Understanding

4. LLM Interaction

5. Speech Output

Arav waits for your voice: “hey Arav”

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages