implemented KG and added chatbot #48

Open
vishalgondke wants to merge 1 commit into yogeshhk:master from vishalgondke:implement-kg-and-chatbot

Conversation

@vishalgondke vishalgondke commented Jan 28, 2026

This PR addresses #38 and #45 to some extent.

Here is how the implementation works currently:

  • Specify the resume file you want to build a KG for and run main.py.
  • After the resume is successfully fetched and parsed, a JSON object is created and stored in parsed_resume_llm_based.json.
  • This JSON file is then used to create a Neo4j graph. Currently it is set to overwrite the graph every time main.py runs for a new resume, but it can be set to "False" here (see the screenshot below) so that the graph keeps nodes from multiple resumes.
[screenshot]
  • For the chatbot functionality, run src/app.py. To answer user queries, the chatbot uses the same model that was previously used to extract the relevant information from the resume.
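The JSON-to-graph step above could look roughly like the sketch below, which turns a parsed-resume dict into self-contained Cypher MERGE statements. The function name, node labels (Person, Education, Skill), and JSON field names are assumptions for illustration; the actual labels and properties in main.py may differ.

```python
# Hypothetical sketch: build Cypher MERGE statements from the parsed-resume
# JSON. Each statement is self-contained so it can be run as its own query
# against Neo4j (e.g. via the official neo4j Python driver).

def resume_to_cypher(parsed: dict) -> list[str]:
    """Return one MERGE statement per person/education/skill fact."""
    name = parsed["name"]
    person = f"MERGE (p:Person {{name: '{name}'}})"
    stmts = [person]
    for edu in parsed.get("education", []):
        stmts.append(
            f"{person} "
            f"MERGE (e:Education {{degree: '{edu['degree']}'}}) "
            f"MERGE (p)-[:STUDIED_AT]->(e)"
        )
    for skill in parsed.get("skills", []):
        stmts.append(
            f"{person} "
            f"MERGE (s:Skill {{name: '{skill}'}}) "
            f"MERGE (p)-[:HAS_SKILL]->(s)"
        )
    return stmts
```

Using MERGE (rather than CREATE) is what makes the "overwrite vs. accumulate" choice cheap: re-running on the same resume just matches the existing nodes instead of duplicating them.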

I have used the Neo4j graph database, and this is how the visualization looks there:
[screenshot]

This is how the current Streamlit representation looks:
[screenshot]

Example chatbot questions:
This is how a chatbot query works:
[screenshot]

Note:
Please create a Neo4j cloud/Aura instance and add these 3 fields to your .env file for the graph implementation part:

  • NEO4J_URI
  • NEO4J_USERNAME
  • NEO4J_PASSWORD
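A minimal sketch of how those three settings might be read and validated before opening the driver connection. This assumes the .env file has already been loaded into the environment (the repo presumably uses python-dotenv's load_dotenv() for that); the helper name neo4j_config is made up for illustration.

```python
import os

# Assumes NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD are already in the
# environment (e.g. after python-dotenv's load_dotenv()). Failing fast with a
# clear message beats a confusing driver error later.
def neo4j_config() -> dict:
    required = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")
    missing = [k for k in required if not os.environ.get(k)]
    if missing:
        raise RuntimeError(f"Missing .env settings: {', '.join(missing)}")
    return {
        "uri": os.environ["NEO4J_URI"],
        "auth": (os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"]),
    }
```

The returned dict matches the shape expected by the official driver's GraphDatabase.driver(uri, auth=...) call.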

Improvements:

  1. google/flan-t5-large accepts only 512 input tokens. For larger resumes we could switch to other text2text generation models such as Mistral 7B or google/flan-t5-xl, or chunking could also help.
  2. Information extraction (via the LLM-based method) can certainly be improved. Have a look at the extracted JSON to identify the fields that need work; prompt modification should solve this.
  3. The chatbot currently responds to user queries but has no memory yet. Adding memory to retain context would make it more interactive.
  4. The chatbot may face issues with semantic understanding / paraphrase generalization; see the examples below.
    Query 1: I used the word "education" in my query, and since there is an Education node in our Neo4j graph, the chatbot could provide a relevant response:
[screenshot]

Query 2: I used the word "studied" in my query, and the chatbot failed to respond because it could not correlate "education" and "studied":
[screenshot]

Oops, this time it actually did work, but I am still keeping a note of it in case a similar issue occurs again. We may need to change the prompt to tackle this, or perhaps the model. Please let me know if there are other approaches.
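For improvement 1, the chunking route could look like the sketch below: split long resume text into overlapping pieces that fit the 512-token limit, then run extraction per chunk and merge the JSON. Whitespace-separated words stand in for real tokens here; in practice you would count with the model's own tokenizer (AutoTokenizer.from_pretrained("google/flan-t5-large")), and the function name is made up.

```python
# Rough chunking sketch for the 512-token limit of flan-t5-large.
# Overlapping chunks reduce the chance of splitting a resume section
# (e.g. one job entry) across a chunk boundary.

def chunk_text(text: str, max_tokens: int = 512, overlap: int = 32) -> list[str]:
    words = text.split()  # crude stand-in for real tokenization
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this chunk already covers the tail of the text
    return chunks
```

Each chunk would then go through the same LLM extraction prompt, with the per-chunk JSON objects merged before building the graph.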
