investigating separating out documents from the rest of the message h… #95
base: main
Conversation
…istory and instructions.
anarchivist left a comment
Interesting work - I'm excited to see this proceeding.
willa/chatbot/graph_manager.py
Outdated
"start_index": str(doc.metadata.get('start_index')) if doc.metadata.get('start_index') else '',
"total_pages": str(doc.metadata.get('total_pages')) if doc.metadata.get('total_pages') else '',
Is there a reason why we're returning empty strings here and not None?
I think the model is expecting a string.
And we don't actually need these two values; I just thought they might be helpful to the model. We could drop both if we want.
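As a framework-free sketch of the coercion pattern under discussion (`normalize_metadata` and `as_str` are hypothetical helpers, not code from the PR): note that checking `is not None` instead of truthiness also keeps a legitimate `start_index` of 0 from collapsing to `''`.

```python
def normalize_metadata(metadata):
    """Coerce optional metadata fields to the strings the model expects."""
    def as_str(key):
        value = metadata.get(key)
        # Returning '' for missing values, per the current diff; returning
        # None instead is the alternative raised in review.
        return str(value) if value is not None else ''

    return {
        "start_index": as_str("start_index"),
        "total_pages": as_str("total_pages"),
    }
```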
prompt = get_langfuse_prompt()
system_messages = prompt.invoke({'context': docs_context,
                                 'question': latest_message.content})
system_messages = prompt.invoke({})
What's this doing? Where is the user's question being inserted?
The user's question is passed as a user message like it always is. There actually might be unrelated cleanup work for us to do with that. Right now, we're passing the user query as a user message twice (due to the summary bit) and then again as part of the system message (instruction prompt).
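A framework-free sketch of the duplication described above (`build_messages` is a hypothetical stand-in for `prompt.invoke({'context': ..., 'question': ...})`, not the project's code): the question is interpolated into the system/instruction prompt and also sent again as a user message.

```python
def build_messages(system_template, context, question):
    """Fill the system template with context and question, then append
    the same question as a separate user message."""
    system = system_template.format(context=context, question=question)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},  # question appears a second time here
    ]

msgs = build_messages(
    "Answer using this context: {context}\nQuestion: {question}",
    "retrieved document text",
    "What are the admission criteria?",
)
```

The cleanup suggested in the comment would amount to picking one of these paths for the question rather than all of them.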
- this gets the Cohere-specific response field that includes citations for the response text
9cce4d4 to e999fa8
additional_model_request_fields={"documents": documents},
additional_model_response_field_paths=["/citations"]
)
citations = response.response_metadata.get('additionalModelResponseFields').get('citations') if response.response_metadata else None
What's neat about this is that the citations returned by Cohere can be used to tie specific parts of the response message back to the documents we passed above on line 145.
An example list of citations looks like this:
[
{
"start": 184,
"end": 229,
"text": "admission criteria and standards of practice.",
"document_ids": ["doc_0"]
},
{
"start": 260,
"end": 275,
"text": "different roles",
"document_ids": ["doc_1", "doc_2"]
}
]
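Since each citation carries character offsets (`start`/`end`) into the response text plus the supporting `document_ids`, the spans can be rendered back onto the reply. A minimal sketch (`annotate_citations` is an illustrative helper, not code from this PR):

```python
def annotate_citations(text, citations):
    """Insert [doc_ids] markers after each cited span.

    Work right to left so earlier character offsets stay valid as
    markers are inserted.
    """
    for c in sorted(citations, key=lambda c: c["end"], reverse=True):
        marker = "[" + ", ".join(c["document_ids"]) + "]"
        text = text[:c["end"]] + marker + text[c["end"]:]
    return text

reply = "Roles differ by site."
cited = annotate_citations(reply, [
    {"start": 0, "end": 5, "text": "Roles", "document_ids": ["doc_1"]},
    {"start": 16, "end": 20, "text": "site", "document_ids": ["doc_2", "doc_3"]},
])
# cited == "Roles[doc_1] differ by site[doc_2, doc_3]."
```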
- temporarily add raw citations to response.