BodyLogic is a Streamlit app that delivers evidence-based fitness guidance using a Retrieval-Augmented Generation (RAG) pipeline, tool-calling, and lightweight conversational memory. It indexes curated YouTube transcripts into Pinecone with all-mpnet-base-v2 embeddings, then answers user questions with OpenAI's GPT via LangChain Runnables.
The Streamlit app uses rag_chain.py, streamlit_app.py, and requirements.txt.
- RAG on curated YouTube content using Pinecone (768-dim embeddings via all-mpnet-base-v2).
- Tool-aware agent: calorie/protein estimator with strict usage rules.
- Conversational memory: sliding window of the last 20 messages to preserve context.
- Streamlit UI with demo prompts and LangSmith tracing enabled for observability.
Ingestion (videotranscripts.py): YouTubeTranscriptApi → chunking (500-character chunks with 50-character overlap) → HuggingFace embeddings → Pinecone upsert.
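A minimal sketch of this flow, with a placeholder video ID and index name (the real logic lives in videotranscripts.py):

```python
# Sketch of the ingestion flow; video ID and index name are placeholders.
import os
from youtube_transcript_api import YouTubeTranscriptApi
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from pinecone import Pinecone

video_id = "VIDEO_ID"  # placeholder
transcript = " ".join(
    seg["text"] for seg in YouTubeTranscriptApi.get_transcript(video_id)
)

# 500-character chunks with 50-character overlap
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(transcript)

# 768-dim embeddings from all-mpnet-base-v2
embedder = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vectors = embedder.embed_documents(chunks)

# Upsert into Pinecone, keeping the chunk text in metadata
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("youtube-qa-index")  # placeholder index name
index.upsert(vectors=[
    (f"{video_id}-{i}", vec, {"text": chunk})
    for i, (vec, chunk) in enumerate(zip(vectors, chunks))
])
```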
Serving chain (LangChain RunnableSequence):
- Retriever step: encodes user query and fetches top-k (5) matches from Pinecone.
- Prompt builder: system message + memory + optional knowledge base context message.
- First LLM call with bound tools decides whether to call tools.
- Tool execution and final LLM response.
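A condensed sketch of how these runnable steps might fit together; the prompt text, tool signature, and stubbed retriever below are assumptions, and rag_chain.py contains the full version including tool execution and the final LLM call:

```python
# Rough shape of the serving chain; the retriever is stubbed and the
# calorie/protein tool signature is an assumption.
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableLambda
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

memory: list = []  # sliding-window chat history (see Memory below)

@tool
def estimate_calories_protein(weight_kg: float, goal: str) -> str:
    """Estimate daily calories and protein for a body weight and goal."""
    # placeholder heuristic numbers
    return f"~{int(weight_kg * 33)} kcal and ~{int(weight_kg * 1.8)} g protein for {goal}"

def retrieve(state: dict) -> dict:
    # Step 1: encode the query and fetch the top-5 matches from Pinecone (stubbed here)
    state["context"] = ""
    return state

def build_prompt(state: dict) -> list:
    # Step 2: system message + memory + optional knowledge-base context
    messages = [SystemMessage(content="You are BodyLogic, an evidence-based fitness coach."), *memory]
    if state["context"]:
        messages.append(SystemMessage(content=f"Knowledge base context:\n{state['context']}"))
    messages.append(HumanMessage(content=state["question"]))
    return messages

# Step 3: the first LLM call has the tool bound so it can decide to call it;
# tool execution and the final response follow in the real chain.
llm = ChatOpenAI(model="gpt-3.5-turbo")  # model name assumed
chain = RunnableLambda(retrieve) | RunnableLambda(build_prompt) | llm.bind_tools(
    [estimate_calories_protein]
)
# chain.invoke({"question": "How much protein do I need?"}) runs steps 1-3
```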
Memory: global Python list capped at 20 messages.
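In code terms this amounts to something like the following (variable names assumed):

```python
# Minimal sketch of the 20-message sliding window.
chat_memory: list = []
MAX_MESSAGES = 20

def remember(message) -> None:
    chat_memory.append(message)
    del chat_memory[:-MAX_MESSAGES]  # keep only the 20 most recent messages
```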
- rag_chain.py — Core chain: initialization, retriever, tools, runnable steps, and chat function.
- app.py — Streamlit UI, sidebar checks, demo prompts, and chat loop.
- videotranscripts.py — Script/notebook to fetch the transcript, chunk, embed, and index into Pinecone.
- Python 3.10+
- Recommended packages:
streamlit, langchain, langchain-openai, langchain-core, langchain-text-splitters, sentence-transformers, pinecone-client, youtube-transcript-api, huggingface-hub, langchain-huggingface, tqdm, python-dotenv.
API keys and configuration are set via Streamlit secrets.
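For example, the app can read them with st.secrets and export them as environment variables (the secret names below are assumptions mirroring the .env keys used by the CLI script further down):

```python
# Hypothetical sketch of wiring Streamlit secrets into environment variables.
import os
import streamlit as st

for key in ("OPENAI_API_KEY", "PINECONE_API_KEY", "LANGCHAIN_API_KEY"):
    os.environ[key] = st.secrets[key]        # assumed secret names
os.environ["LANGCHAIN_TRACING_V2"] = "true"  # enable LangSmith tracing
```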
V2 Bodylogic.py is a Python command-line chatbot that uses a LangChain agent with tools, short-term conversation memory, and a Pinecone vector index to answer fitness questions and general queries. The assistant behaves like a friendly personal trainer, can search a YouTube Q&A knowledge base, and offers utility tools such as word counting, current time, and simple training-plan generation.
- LangChain ZERO_SHOT_REACT_DESCRIPTION agent with ConversationBufferWindowMemory for short conversation context.
- RAG over a Pinecone index (youtube-qa-index) using SentenceTransformers for semantic search.
- Fitness-focused tools: a weekly training plan generator plus helper tools for time and word counting.
- Python 3.9+ installed on your system.
- Accounts and API keys for:
  - OpenAI (chat model gpt-3.5-turbo)
  - Pinecone (vector index already created as youtube-qa-index)
  - LangSmith / LangChain tracing (optional but used here)
Clone your project repository and change into the directory.
Create and activate a virtual environment, then install requirements such as langchain, langchain-openai, pinecone-client, sentence-transformers, python-dotenv, and openai via pip.
Ensure your Pinecone index name matches INDEX_NAME = "youtube-qa-index" or update the script accordingly.
Create a .env file in the project root with at least:
OPENAI_API_KEY=...
PINECONE_API_KEY=...
LANGCHAIN_API_KEY=...
LANGCHAIN_PROJECT=memory-and-tools-rag-agent (or your preferred project name)
The script also enables LangSmith tracing via:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
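A sketch of how the script might load this configuration with python-dotenv (the exact loading code in V2 Bodylogic.py may differ):

```python
# Load the .env file and enable LangSmith tracing (sketch).
import os
from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY, PINECONE_API_KEY, LANGCHAIN_API_KEY, ...
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")
os.environ.setdefault("LANGCHAIN_ENDPOINT", "https://api.smith.langchain.com")
```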
- Embeddings: Uses the SentenceTransformer model flax-sentence-embeddings/all_datasets_v3_mpnet-base to encode queries and retrieve relevant passages from Pinecone.
- RAG: rag_search_func queries the Pinecone index, pulls text or passage_text from metadata, and returns concatenated context for the agent to use in responses.
- Agent: initialize_agent wires the tools, ChatOpenAI model, and ConversationBufferWindowMemory into a ZERO_SHOT_REACT_DESCRIPTION agent that selects tools based on their descriptions.
- Memory: The agent maintains conversational context with LangChain's ConversationBufferWindowMemory, which stores the last 5 messages (a sliding window of k=5).
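A possible shape for rag_search_func, assuming the pinecone v3+ client and the metadata keys described above (the real implementation may differ):

```python
# Sketch of the semantic-search helper backing the rag_search tool.
import os
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("flax-sentence-embeddings/all_datasets_v3_mpnet-base")
index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("youtube-qa-index")

def rag_search_func(query: str, top_k: int = 5) -> str:
    vector = model.encode(query).tolist()
    results = index.query(vector=vector, top_k=top_k, include_metadata=True)
    passages = []
    for match in results.matches:
        metadata = match.metadata or {}
        # Prefer "text", fall back to "passage_text", as described above
        passages.append(metadata.get("text") or metadata.get("passage_text", ""))
    return "\n\n".join(p for p in passages if p)
```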
- rag_search: Looks up information in the YouTube Q&A knowledge base using Pinecone.
- get_current_time: Returns the current date and time string (input ignored).
- word_count: Returns the word count for arbitrary text.
- training_plan: Generates a simple weekly training split from natural-language preferences (days per week, level, equipment).
The system prompt also instructs the assistant when each tool should be preferred for specific query types.
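Approximately, the tool and agent wiring looks like this (tool bodies are abbreviated placeholders, rag_search is sketched above, and the prompt instructions about tool preference live in the real script):

```python
# Simplified sketch of the tool definitions and agent setup.
from datetime import datetime
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.memory import ConversationBufferWindowMemory
from langchain_openai import ChatOpenAI

tools = [
    Tool(name="rag_search",
         func=lambda q: "...context from Pinecone...",  # placeholder; see rag_search_func above
         description="Look up information in the YouTube Q&A knowledge base."),
    Tool(name="get_current_time",
         func=lambda _: datetime.now().isoformat(),
         description="Return the current date and time (input is ignored)."),
    Tool(name="word_count",
         func=lambda text: str(len(text.split())),
         description="Return the word count for the given text."),
    Tool(name="training_plan",
         func=lambda prefs: f"Weekly split for: {prefs}",  # placeholder plan generator
         description="Generate a simple weekly training split from preferences."),
]

agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    memory=ConversationBufferWindowMemory(memory_key="chat_history", k=5, return_messages=True),
    verbose=True,
)
```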
Ensure your .env is correctly configured and the virtual environment is active.
Run the script:
`python "V2 Bodylogic.py"`
Use the CLI loop:
- Type fitness questions like “Can you design a 3-day beginner routine for the gym?” to get a plan using both training_plan and RAG when relevant.
- Ask knowledge-base questions such as “What did the YouTube coach say about walking?” to trigger rag_search over the Pinecone index.
- Ask “What’s the current time?” or “How many words are in: …” to exercise the utility tools.
- Type exit or quit to end the session.
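The loop itself roughly amounts to the following (reusing the agent object from the wiring sketch above):

```python
# Simple REPL: type "exit" or "quit" to end the session.
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in {"exit", "quit"}:
        break
    print("BodyLogic:", agent.run(user_input))
```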
Screenshots of the notebook and of LangSmith tracing in action are included.