cs124/pa7-agent

Programming Assignment 7: Agent!

Late days CANNOT be used on this assignment. Please submit early and often to avoid last minute submission issues!

You should work in groups of 3 members. (To work in a group of size 2, you must get special permission from the staff.) All submissions will be graded according to the same criteria, regardless of group size.

In this assignment you will build a customer service agent that can help with movie ticket bookings and other movie-related requests. The assignment consists of a compulsory part on implementing the collaborative filtering algorithm to recommend movies to the user, and an open-ended part on implementing an LLM agent that can make tool calls, where you are encouraged to be creative and implement interesting functions and tools for the agent to use.

By the end of the assignment, you will submit:

  • Code file (agent.py) for Part 1
  • A text file showing a full transcript of the agent's conversation with the user covering all the features you implemented for Part 1 (transcript_part1.txt)
  • Your implementation for Part 2, preferably in the same file (agent.py), but if necessary, you can create a new file for it to break things down and upload all the agent-related files
  • A text file showing a full transcript of the agent's conversation with the user covering all the features you implemented for Part 2 (transcript_part2.txt)

Important Setup Note

Although this assignment mostly reuses the environment you set up in PA0, we need one additional package. Recall that to activate your PA0 environment you run:

conda activate cs124

After activating your cs124 environment, please run the command below. We will use the dspy library to build our agent because it makes tool calling easier.

pip install -U dspy

Starter Code

We have provided an interface code file for you: repl.py. It implements a common software engineering pattern known as the Read-Eval-Print Loop (REPL). The REPL shows a prompt and reads an input string from the user, then passes the input to an agent class that is responsible for doing the work. The REPL then prints the response generated by the agent class and waits for more input.

In the starter code folder, fire up the REPL by issuing the following command: python3 repl.py

You can type your message in the prompt to the movie agent, and hit enter. To exit the REPL, write :quit.
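The loop described above can be sketched as follows. This is an illustration only, not the contents of repl.py: EchoAgent, run_repl, scripted_reader, and the process method are hypothetical names, and the read and write functions are passed in so that the loop can be exercised without a terminal.

```python
from typing import Callable, Iterable, List


class EchoAgent:
    """Hypothetical stand-in for the real agent class."""

    def process(self, line: str) -> str:
        return f"You said: {line}"


def run_repl(agent, read: Callable[[], str], write: Callable[[str], None]) -> None:
    """Read a line, hand it to the agent, print the response, repeat."""
    while True:
        line = read().strip()
        if line == ":quit":  # exit command used by the provided REPL
            break
        if not line:  # ignore empty input
            continue
        write(agent.process(line))


def scripted_reader(lines: Iterable[str]) -> Callable[[], str]:
    """Return a read() function that replays a fixed list of inputs."""
    iterator = iter(lines)
    return lambda: next(iterator)


# Replay a short scripted session and collect what the REPL would print.
outputs: List[str] = []
run_repl(EchoAgent(), scripted_reader(["hello", "", ":quit"]), outputs.append)
```

For an interactive session, the same function could be called as run_repl(EchoAgent(), lambda: input("> "), print).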

All the code that you will need to write for this assignment will be in agent.py. We will describe the components that you will need to implement in the next sections.

Assignment Part 1: Basic Tool-Use Agent (50 points)

First Tool: Recommend Movies via Collaborative Filtering (25 points)

One of the core functions that your agent has to support is to recommend movies to the user. This is a classic problem in recommender systems, and we will use the collaborative filtering algorithm to solve it. Specifically, you will need to implement the binarize, similarity, and recommend_movies functions in agent.py.

We have included the movie ratings matrix in data/ratings.txt. You can load it using the util.load_ratings function. Moreover, we have populated some synthetic user profiles in synthetic_users.py (which you should not modify because our test cases rely on them). The recommend_movies function takes in the user name and the number of movies to recommend, and returns a list of movie titles to recommend based on collaborative filtering. You should implement item-item collaborative filtering with cosine similarity with no mean-centering or normalization of scores.
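To make the scoring rule concrete, here is a minimal sketch of item-item collaborative filtering with plain cosine similarity on a tiny toy matrix. It is not the starter code: the function names cosine and recommend, the items-by-users layout, the use of 0 for "unrated", the zero-norm handling, and the tie-breaking rule are all assumptions made for illustration, so check agent.py and util.py for the actual conventions.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_ratings: List[float], ratings: List[List[float]], k: int) -> List[int]:
    """Return the indices of the top-k items the user has not rated.

    ratings[i] is the row of item i across all users (items x users);
    user_ratings[j] is the target user's rating of item j (0 = unrated).
    """
    rated = [j for j, r in enumerate(user_ratings) if r != 0]
    scores = []
    for i, r in enumerate(user_ratings):
        if r != 0:
            continue  # only recommend items the user has not rated
        # Weighted sum of the user's ratings, weighted by item-item similarity.
        score = sum(cosine(ratings[i], ratings[j]) * user_ratings[j] for j in rated)
        scores.append((score, i))
    # Highest score first; ties broken by lower item index (an assumption).
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scores[:k]]


# Toy data: 4 items x 3 users, already binarized to +1 / -1 / 0.
toy_ratings = [
    [1, 1, 0],    # item 0
    [1, 1, -1],   # item 1
    [-1, 0, 1],   # item 2
    [0, -1, 1],   # item 3
]
toy_user = [1, 0, 0, -1]  # likes item 0, dislikes item 3
top = recommend(toy_user, toy_ratings, 1)
```

In this toy example item 1 scores higher than item 2, because it is similar to the liked item 0 and dissimilar to the disliked item 3.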

Here is an example of what you should expect to see when you run the REPL:

recommend_movies("Peter", 3)
['Back to the Future (1985)', 'Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)', 'Star Wars: Episode VI - Return of the Jedi (1983)']

If you implement the function correctly, your list of movies should match the example above exactly. The result also makes sense: according to his profile in synthetic_users.py, Peter is a sci-fi fan.

Integrating Tools into an LLM Agent (5 points)

Now that we have built a recommend_movies function, we can integrate it into an LLM agent so that it can make tool calls to the recommend_movies function. We will use the dspy library to build our agent to make tool calling easier.

We have provided a MovieTicketAgent class in agent.py that you can use as a starting point. We have also provided a general_qa function that you can use to answer general questions about the movie ticket agent.

To add these tools to your agent, simply add them to the tools list used to construct the react_agent variable. DSPy will automatically implement the ReAct framework for you (based on https://arxiv.org/abs/2210.03629) to interleave reasoning and tool calls.

react_agent = dspy.ReAct(
    MovieTicketAgent,
    tools = [
        recommend_movies,
        general_qa,
    ]
)

You can then run the REPL script to see your agent in action: python3 repl.py.

You should expect to see your agent making tool calls to the recommend_movies function and answering general questions about the movie ticket agent. Here is an example of what you should expect to see:

Movie Ticket Agent> Hello! I'm the Movie Ticket Agent. How can I help you today?
> My name is Peter, recommend 3 movies to me.
Movie Ticket Agent> 
Prediction(
    trajectory={'thought_0': 'I need to recommend 3 movies to Peter. I will use the recommend_movies tool to generate a list of movie titles for him.', 'tool_name_0': 'recommend_movies', 'tool_args_0': {'user_name': 'Peter', 'k': 3}, 'observation_0': ['Back to the Future (1985)', 'Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)', 'Star Wars: Episode VI - Return of the Jedi (1983)'], 'thought_1': 'I have successfully recommended 3 movies to Peter: "Back to the Future", "Raiders of the Lost Ark", and "Star Wars: Episode VI - Return of the Jedi". Now, I will ask Peter if he would like to book a ticket for any of these movies.', 'tool_name_1': 'finish', 'tool_args_1': {}, 'observation_1': 'Completed.'},
    reasoning='I recommended 3 classic movies to Peter based on his request. The movies are "Back to the Future," "Raiders of the Lost Ark," and "Star Wars: Episode VI - Return of the Jedi." I have completed the recommendation process and am ready to assist him further if he wishes to book a ticket for any of these films.',
    process_result='I have recommended the following 3 movies to you, Peter: "Back to the Future," "Raiders of the Lost Ark," and "Star Wars: Episode VI - Return of the Jedi." Let me know if you would like to book a ticket for any of these!'
)

> give me a plot summary for "Star Wars: Episode VI - Return of the Jedi"
Movie Ticket Agent> 
Prediction(
    trajectory={'thought_0': 'The user is asking for a plot summary of "Star Wars: Episode VI - Return of the Jedi." Since this request does not involve booking tickets or any other specific action related to movie tickets, I will use the general question answering tool to provide the plot summary.', 'tool_name_0': 'general_qa', 'tool_args_0': {'user_request': 'give me a plot summary for "Star Wars: Episode VI - Return of the Jedi"'}, 'observation_0': ['"Star Wars: Episode VI - Return of the Jedi" follows the Rebel Alliance\'s final efforts to defeat the Galactic Empire. The film opens with Luke Skywalker, now a Jedi Knight, attempting to rescue Han Solo from the clutches of the crime lord Jabba the Hutt. After a daring rescue, Luke reunites with his friends Leia Organa, Han, and the droids C-3PO and R2-D2.\n\nMeanwhile, the Empire is constructing a new Death Star, even more powerful than the first. The Rebels learn of this and plan an assault to destroy it. Luke confronts Darth Vader, hoping to redeem his father and bring him back to the light side of the Force. The Emperor, aware of Luke\'s potential, seeks to turn him to the dark side.\n\nAs the Rebel fleet engages the Empire in a massive space battle, Luke faces the Emperor and Vader on the Death Star. In a climactic moment, Vader ultimately chooses to save his son, killing the Emperor and sacrificing himself in the process. The Death Star is destroyed, and the Rebels celebrate their victory, marking the fall of the Empire and the restoration of peace to the galaxy. The film concludes with a sense of hope and renewal, as the heroes reflect on their journey and the future ahead.'], 'thought_1': 'I have successfully provided the user with a plot summary for "Star Wars: Episode VI - Return of the Jedi." Since the user\'s request has been fulfilled, I can now mark the task as complete.', 'tool_name_1': 'finish', 'tool_args_1': {}, 'observation_1': 'Completed.'},
    reasoning='The user requested a plot summary for "Star Wars: Episode VI - Return of the Jedi." I utilized the general question answering tool to provide a detailed summary of the film\'s plot, covering key events and character arcs, which successfully addressed the user\'s request.',
    process_result='The plot summary for "Star Wars: Episode VI - Return of the Jedi" has been provided. The film follows the Rebel Alliance\'s final efforts to defeat the Galactic Empire, highlighting Luke Skywalker\'s journey to rescue Han Solo, confront Darth Vader, and ultimately bring balance to the Force.'
)

Here we are printing the trajectories including the tool calls and observations. You can see that the agent first recommends 3 movies to the user, and then answers the general question about the plot summary of "Star Wars: Episode VI - Return of the Jedi" by making a tool call to the general_qa tool. In your submission, make sure your transcript includes the full trajectories like this too so that we can judge whether your agent made the right tool calls.

Interfacing with Databases (20 points)

An important part of building a customer service agent is to be able to interface with databases to gather and record information about the user and the movies. As a starting point, implement the book_ticket and file_request tools in agent.py and integrate them into your agent.

We have created a mini movie showtime database in agent.py that you can use to gather information about the movies and help with movie booking requests. When your agent books a new ticket, make sure to update the ticket_database and deduct the ticket price from the user's balance. For any request that your agent cannot handle, make a human customer support request by calling the file_request tool, which adds the request to the request_database. You should print the databases whenever you update them.
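The bookkeeping described above might look roughly like the sketch below. Everything here is a simplified assumption: the showtime_prices and user_balances dictionaries, the new_id helper, and the reduced Ticket and Request fields are hypothetical stand-ins for the structures actually defined in agent.py. The tools return strings rather than raising exceptions so that the agent can read the outcome as an observation.

```python
import random
import string
from dataclasses import dataclass
from typing import Dict


@dataclass
class Ticket:
    user_name: str
    movie_title: str


@dataclass
class Request:
    user_request: str
    user_name: str


# Hypothetical in-memory "databases"; the real ones live in agent.py.
showtime_prices: Dict[str, float] = {"Example Movie": 12.0}
user_balances: Dict[str, float] = {"Peter": 36.0}
ticket_database: Dict[str, Ticket] = {}
request_database: Dict[str, Request] = {}


def new_id(length: int = 6) -> str:
    """Random lowercase alphanumeric id (hypothetical helper)."""
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


def book_ticket(user_name: str, movie_title: str) -> str:
    """Book a ticket, record it, and deduct the price from the balance."""
    if movie_title not in showtime_prices:
        return f"No showtime found for {movie_title}."
    if user_name not in user_balances:
        return f"Unknown user {user_name}."
    price = showtime_prices[movie_title]
    if user_balances[user_name] < price:
        return f"Insufficient balance for {user_name}."
    ticket_id = new_id()
    ticket_database[ticket_id] = Ticket(user_name, movie_title)
    user_balances[user_name] -= price
    print("ticket_database:", ticket_database)  # print after every update
    return f"Ticket {ticket_id} booked. New balance is {user_balances[user_name]}."


def file_request(user_request: str, user_name: str = "") -> str:
    """Record a request that needs a human customer support agent."""
    request_id = new_id()
    request_database[request_id] = Request(user_request, user_name)
    print("request_database:", request_database)  # print after every update
    return f"Request {request_id} filed for human support."
```

Checking for a missing showtime, an unknown user, and an insufficient balance before mutating any state keeps the databases consistent when a booking fails.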

Here is an example of what you should expect to see when you run the REPL:

user> alright book a ticket for Peter for Star Wars: Episode VI - Return of the Jedi
Movie Ticket Agent> 
Prediction(
    trajectory={'thought_0': 'I need to book a ticket for Peter for the movie "Star Wars: Episode VI - Return of the Jedi".', 'tool_name_0': 'book_ticket', 'tool_args_0': {'user_name': 'Peter', 'movie_title': 'Star Wars: Episode VI - Return of the Jedi'}, 'observation_0': 'Ticket booked successfully for Peter for the movie Star Wars: Episode VI - Return of the Jedi. The ticket number is fmbbto. Your new balance is 24.0.', 'thought_1': 'The ticket has been successfully booked for Peter. I should now check his balance to confirm the new amount after the ticket purchase.', 'tool_name_1': 'find_balance', 'tool_args_1': {'user_name': 'Peter'}, 'observation_1': 24.0, 'thought_2': 'The ticket has been booked and the balance confirmed. The task is now complete.', 'tool_name_2': 'finish', 'tool_args_2': {}, 'observation_2': 'Completed.'},
    reasoning='I successfully booked a ticket for Peter for the movie "Star Wars: Episode VI - Return of the Jedi". The booking was confirmed, and I also checked his balance after the purchase, which is now 24.0.',
    process_result='The ticket has been booked successfully for Peter. The ticket number is fmbbto, and his new balance is 24.0.'
)
user> print ticket_database

Printing ticket_database:
{'fmbbto': Ticket(user_name='Peter', movie_title='Star Wars: Episode VI - Return of the Jedi', time=Date(year=2025, month=11, day=15, hour=13, minute=0))}


user> can you give me a discount on Star Wars: Episode VI - Return of the Jedi?
Movie Ticket Agent> 
...
process_result='I have filed a request for customer support regarding your discount inquiry. The movie "Star Wars: Episode VI - Return of the Jedi" is scheduled for November 15, 2025, at 1:00 PM. You can book tickets through platforms like Fandango, AMC Theatres, or your local cinema\'s website. Additionally, I recommended some similar movies: \n1. Star Wars: Episode IV - A New Hope\n2. Star Wars: Episode V - The Empire Strikes Back\n3. Guardians of the Galaxy\n\nIf you would like to book a ticket for any of these movies or have any other questions, feel free to ask!'

user> print request_database
Printing request_database:

{'3th6rd': Request(user_request='can you give me a discount on Star Wars: Episode VI - Return of the Jedi?', user_name='')}

Transcript and Submission for Part 1

Once you have finished the above, generate a transcript of the agent's conversation with the user covering the following user questions:

- My name is Peter, recommend 3 movies to me.
- recommend 5 movies to Amy please
- Give me a plot summary for "Lord of the Rings: The Two Towers"
- Book a ticket for Peter for Lord of the Rings: The Two Towers
- print ticket_database
- Can you give me a discount on Lord of the Rings: The Two Towers?
- print request_database

Apart from the above transcript, you are also encouraged to add additional user questions that can showcase the use of all the tools you implemented. Make sure that you save the full trajectories of the agent's responses as we showed above in your transcript. Save the transcript as transcript_part1.txt.

Assignment Part 2: Real-World Extensions (50 points)

So far, our agent is still quite a toy: it relies on some synthetic user profiles and a fake movie database. Now it's your turn to extend your agent with functionality that would make it useful in a real-world scenario. Your task is to implement functions that support the features below. We will outline how you might go about implementing each function, but you are free to implement them in any way you want. We will grade your implementation based on the interaction transcript that your agent has with the user.

Function 1: Web Search (25 points)

LLMs often lack the latest information (say you want to know about a new movie that's coming out soon). To overcome this, we want to integrate web search (through a search API) so that your agent can look up current information. Generally we can break this down into several steps: calling the web search tool, parsing the results, and using the results to answer the user's question.

There are many search APIs out there that are free for low-volume usage. One example is the Bing Search API through SerpAPI. Once you get an API key, you can use something like this to call the web search tool:

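# GoogleSearch is SerpAPI's generic client class; the "engine" parameter selects Bing.
from serpapi import GoogleSearch

params = {
    "engine": "bing",
    "q": "knives out 2025",
    "api_key": "your_api_key"  # replace with your own key
}

search = GoogleSearch(params)
results = search.get_dict()  # parsed response as a dictionary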

You can get the list of links from the results dictionary by:

links = [
    item.get("link")
    for item in results.get("organic_results", [])
    if item.get("link")
]

If a query returns many results, you might also want to handle pagination.
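One way to handle pagination is to keep requesting pages until enough links have been collected or a page comes back empty. The sketch below assumes a caller-supplied fetch_page(offset) function that returns one page shaped like the results dictionary above; collect_links, fetch_page, and fake_fetch are hypothetical names. With SerpAPI, fetch_page would wrap the search call and pass the offset through whichever pagination parameter the chosen engine supports, so check the SerpAPI documentation for its name.

```python
from typing import Callable, Dict, List


def collect_links(
    fetch_page: Callable[[int], Dict],
    max_links: int = 20,
    page_size: int = 10,
) -> List[str]:
    """Gather up to max_links unique links by requesting successive pages."""
    links: List[str] = []
    seen = set()
    offset = 0
    while len(links) < max_links:
        page = fetch_page(offset)
        items = page.get("organic_results", [])
        if not items:
            break  # no more results
        for item in items:
            link = item.get("link")
            if link and link not in seen:  # skip missing and duplicate links
                seen.add(link)
                links.append(link)
                if len(links) == max_links:
                    break
        offset += page_size
    return links


# Fake two-page search backend, used here instead of a real API call.
def fake_fetch(offset: int) -> Dict:
    pages = {
        0: {"organic_results": [{"link": "https://a.example"}, {"link": "https://b.example"}]},
        10: {"organic_results": [{"link": "https://b.example"}, {"link": "https://c.example"}]},
    }
    return pages.get(offset, {})


all_links = collect_links(fake_fetch, max_links=5)
```

Capping the number of links also caps the number of API calls, which matters on a free low-volume plan.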

Once you get a link, you can read its content with tools like BeautifulSoup:

from bs4 import BeautifulSoup

def extract_text(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")

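    # Remove non-content elements (scripts, stylesheets, noscript fallbacks)
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    # Extract the visible text and collapse runs of whitespace
    text = soup.get_text(separator=" ", strip=True)
    return " ".join(text.split())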

By reading the content of the search results, the agent should be able to give the user more accurate and up-to-date information. For example, it should be able to handle a query like "do a web search and then tell me about the upcoming knives out movie in 2025".
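The three steps (search, read, answer) can be tied together in a single helper that produces the text the agent answers from. This is a sketch under stated assumptions: build_search_context, search_links, and fetch_text are hypothetical names, and the search and download functions are injected so that the example runs without network access. In a real tool, search_links would call the search API and fetch_text would download a page and pass it through extract_text.

```python
from typing import Callable, List


def build_search_context(
    query: str,
    search_links: Callable[[str], List[str]],
    fetch_text: Callable[[str], str],
    max_pages: int = 3,
    max_chars_per_page: int = 2000,
) -> str:
    """Search, read the top pages, and return text the LLM can answer from."""
    sections = []
    for link in search_links(query)[:max_pages]:
        try:
            text = fetch_text(link)
        except Exception as error:
            # A page that fails to load should not abort the whole search.
            sections.append(f"Source: {link}\n(could not read page: {error})")
            continue
        # Truncate each page so the combined context stays a manageable size.
        sections.append(f"Source: {link}\n{text[:max_chars_per_page]}")
    if not sections:
        return f"No web results found for: {query}"
    return "\n\n".join(sections)


# Stand-ins for the real search API call and page download.
def fake_search(query: str) -> List[str]:
    return ["https://a.example", "https://broken.example"]


def fake_fetch(link: str) -> str:
    if "broken" in link:
        raise ValueError("timeout")
    return "Example page text about " + link


context = build_search_context(
    "knives out 2025", fake_search, fake_fetch, max_chars_per_page=20
)
```

Keeping the source link next to each excerpt lets the agent cite where an answer came from.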

Function 2: Memory (25 points)

You might notice that the current agent is stateless: it doesn't remember past interactions with the user and you have to explain who you are at every interaction. Try to implement a memory system so that your agent can remember past interactions with the user and use that memory to personalize the conversation. You can assume that within each interaction, the user is the same person.
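As a point of reference, the simplest possible memory is a per-user list of facts with keyword lookup. The sketch below is a toy built on that assumption: SimpleMemory and its methods are hypothetical names, and retrieval by word overlap is much cruder than the embedding-based search a dedicated memory library provides.

```python
from collections import defaultdict
from typing import Dict, List


class SimpleMemory:
    """Keeps a list of remembered facts per user and retrieves by word overlap."""

    def __init__(self) -> None:
        self._facts: Dict[str, List[str]] = defaultdict(list)

    def add(self, user_id: str, fact: str) -> None:
        if fact not in self._facts[user_id]:  # skip exact duplicates
            self._facts[user_id].append(fact)

    def search(self, user_id: str, query: str, limit: int = 3) -> List[str]:
        """Return up to `limit` facts sharing the most words with the query."""
        query_words = set(query.lower().split())
        scored = []
        for position, fact in enumerate(self._facts.get(user_id, [])):
            overlap = len(query_words & set(fact.lower().split()))
            if overlap > 0:
                scored.append((-overlap, position, fact))
        scored.sort()  # most overlap first, then oldest first
        return [fact for _, _, fact in scored[:limit]]


memory = SimpleMemory()
memory.add("alice", "favorite genre is sci-fi")
memory.add("alice", "favorite movie is The Matrix")
memory.add("bob", "favorite movie is Up")
hits = memory.search("alice", "what is my favorite movie", limit=1)
```

Keying every fact by user_id keeps one user's preferences from leaking into another user's conversation.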

There are many ways to implement the memory system. For simplicity, we will briefly outline how you could use an existing agent memory library and integrate it into your DSPy ReAct agent.

We will use a library called Mem0. First, install it by running:

pip install mem0ai

You can initialize the memory system by using:

import os

from mem0 import Memory

# Configure environment
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

# Initialize Mem0 memory system
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}

And you can create tools that interact with the memory system by using:

class MemoryTools:
    """Tools for interacting with the Mem0 memory system."""

    def __init__(self, memory: Memory):
        self.memory = memory

    def store_memory(self, content: str, user_id: str = "default_user") -> str:
        """Store information in memory."""
        try:
            self.memory.add(content, user_id=user_id)
            return f"Stored memory: {content}"
        except Exception as e:
            return f"Error storing memory: {str(e)}"

    def search_memories(self, query: str, user_id: str = "default_user", limit: int = 5) -> str:
        """Search for relevant memories."""
        try:
            results = self.memory.search(query, user_id=user_id, limit=limit)
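            # search() returns a dict; the matches are under the "results" key
            if not results or not results.get("results"):
                return "No relevant memories found."

            memory_text = "Relevant memories found:\n"
            for i, result in enumerate(results["results"], 1):
                memory_text += f"{i}. {result['memory']}\n"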
            return memory_text
        except Exception as e:
            return f"Error searching memories: {str(e)}"

    def get_all_memories(self, user_id: str = "default_user") -> str:
        """Get all memories for a user."""
        try:
            results = self.memory.get_all(user_id=user_id)
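            # get_all() returns a dict; the memories are under the "results" key
            if not results or not results.get("results"):
                return "No memories found for this user."

            memory_text = "All memories for user:\n"
            for i, result in enumerate(results["results"], 1):
                memory_text += f"{i}. {result['memory']}\n"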
            return memory_text
        except Exception as e:
            return f"Error retrieving memories: {str(e)}"

    def update_memory(self, memory_id: str, new_content: str) -> str:
        """Update an existing memory."""
        try:
            self.memory.update(memory_id, new_content)
            return f"Updated memory with new content: {new_content}"
        except Exception as e:
            return f"Error updating memory: {str(e)}"

    def delete_memory(self, memory_id: str) -> str:
        """Delete a specific memory."""
        try:
            self.memory.delete(memory_id)
            return "Memory deleted successfully."
        except Exception as e:
            return f"Error deleting memory: {str(e)}"

And you can integrate it with our DSPy ReAct agent by using:

import dspy


class MemoryQA(dspy.Signature):
    """
    You're a helpful assistant and have access to memory method.
    Whenever you answer a user's input, remember to store the information in memory
    so that you can use it later.
    """
    user_input: str = dspy.InputField()
    response: str = dspy.OutputField()

class MemoryReActAgent(dspy.Module):
    """A ReAct agent enhanced with Mem0 memory capabilities."""

    def __init__(self, memory: Memory):
        super().__init__()
        self.memory_tools = MemoryTools(memory)

        # Create tools list for ReAct
        self.tools = [
            self.memory_tools.store_memory,
            self.memory_tools.search_memories,
            self.memory_tools.get_all_memories,
            self.get_preferences,
            self.update_preferences,
        ]

        # Initialize ReAct with our tools
        self.react = dspy.ReAct(
            signature=MemoryQA,
            tools=self.tools,
            max_iters=6
        )

    def forward(self, user_input: str):
        """Process user input with memory-aware reasoning."""

        return self.react(user_input=user_input)


    def get_preferences(self, category: str = "general", user_id: str = "default_user") -> str:
        """Get user preferences for a specific category."""
        query = f"user preferences {category}"
        return self.memory_tools.search_memories(
            query=query,
            user_id=user_id
        )

    def update_preferences(self, category: str, preference: str, user_id: str = "default_user") -> str:
        """Update user preferences."""
        preference_text = f"User preference for {category}: {preference}"
        return self.memory_tools.store_memory(
            preference_text,
            user_id=user_id
        )

As a minimal demo of how this memory-augmented agent works, try running the following example:

import time


def run_memory_agent_demo():
    """Demonstration of memory-enhanced ReAct agent."""

    # Configure DSPy
    lm = dspy.LM(model='openai/gpt-4o-mini')
    dspy.configure(lm=lm)

    # Initialize memory system
    memory = Memory.from_config(config)

    # Create our agent
    agent = MemoryReActAgent(memory)

    # Sample conversation demonstrating memory capabilities
    print("🧠 Memory-Enhanced ReAct Agent Demo")
    print("=" * 50)

    conversations = [
        "Hi, I'm Alice and I love sci-fi movies.",
        "I'm Alice. My favorite movie is \"The Matrix\".",
        "I'm Alice. What do you remember about my preferred movie genre?",
        "I'm Alice. What is my favorite movie?",
    ]

    for i, user_input in enumerate(conversations, 1):
        print(f"\nπŸ“ User: {user_input}")

        try:
            response = agent(user_input=user_input)
            print(f"πŸ€– Agent: {response.response}")
            time.sleep(1)

        except Exception as e:
            print(f"❌ Error: {e}")

# Run the demonstration
if __name__ == "__main__":
    run_memory_agent_demo()

This is just a minimal demo to illustrate the memory capabilities of the agent. For more details, you can refer to this tutorial: https://dspy.ai/tutorials/mem0_react_agent/. After running through this demo, make sure to also integrate all the movie-related tools and the web search tools from the previous parts into your memory-augmented agent to make it fully functional.

Transcript and Submission for Assignment Part 2

Similar to Part 1, include the above example queries as well as additional user questions that can showcase the use of all the tools you implemented. Save the transcript as transcript_part2.txt. You should make sure to showcase that the agent is able to remember past interactions with the user and use that memory to personalize the conversation, and that it can make tool calls to the web search tool and answer questions based on the latest information.

Submit your assignment via Gradescope. We expect the following files in your final submission:

agent.py
api_keys.py
transcript_part1.txt
transcript_part2.txt
* any auxiliary code files you created for Part 2

We will use your API key only to run the autograder on your submission. It is important that you make sure there is at least $0.10 left in your account. If you would like to work out an alternative accommodation, please make a private Ed post.
