Describe the issue
I am attempting to create a simple QA RAG chat between the RetrieveUserProxyAgent and the AssistantAgent. My code is below. Note that I am using one textbook in .txt format in my "./data/csc" directory for the context; I won't provide the textbook here, but I will provide the retrieved context in its entirety so others can test on their end, so be prepared for a long copy/paste section below. See the code and output:
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from autogen.cache import Cache
# Use Ollama's llama3:8b model for the User and Assistant Agents
llm_config = {
    "llama3:8b": {
        "config_list": [
            {
                "model": "llama3:8b",
                "api_key": "NA",
                "api_type": "openai",
                "base_url": "http://localhost:11434/v1",
            },
        ],
        "temperature": 0.01,
        "timeout": 7200,
    }
}
# The task to ask the AssistantAgent to perform
CLIENT_REQUEST = "Ten years ago, what percentage of the average investor's financial assets (bank accounts, registered retirement savings plans, pension, insurance, etc.) were stocks, and what has that percentage grown to today?"
# Desired sentence to reference from context: "Ten years ago, 22% of the average investor’s financial assets (bank accounts, registered retirement savings plans, pension, insurance, etc.) were stocks. Today, this share has grown to 30%."
# The configuration for the RetrieveUserProxyAgent
custom_retrieve_config = {
    "task": "qa",
    "vector_db": "chroma",
    "db_config": {},
    "docs_path": "./data/csc",
    "extra_docs": True,
    "new_docs": True,
    "model": llm_config["llama3:8b"]["config_list"][0]["model"],
    "chunk_token_size": 1024,
    "context_max_tokens": 4096,
    "chunk_mode": "multi_lines",
    "must_break_at_empty_line": True,
    "embedding_model": "all-MiniLM-L6-v2",
    "embedding_function": None,
    "customized_prompt": None,
    "update_context": True,
    "collection_name": "CSCCollection",
    "get_or_create": False,
    "overwrite": False,
    # "custom_token_count_function": len,
    "custom_text_split_function": None,
    # "custom_text_types": None,
    "recursive": True,
    "distance_threshold": -1,
}
# Create a Client UserProxyAgent to interact with the AssistantAgent.
client = RetrieveUserProxyAgent(
    name="Client",
    human_input_mode="ALWAYS",
    is_termination_msg=None,
    retrieve_config=custom_retrieve_config,
    max_consecutive_auto_reply=3,
    code_execution_config=False,
    default_auto_reply=None,
    llm_config=llm_config["llama3:8b"],
    system_message="You are the Client UserProxyAgent asking the Assistant for answers to your questions.",
    description="UserProxyAgent that asks the Assistant for answers to questions.",
)
# Create an AssistantAgent to generate answers for the Client.
assistant = AssistantAgent(
    name="Assistant",
    system_message="You are a helpful AssistantAgent that answers questions asked by the Client.",
    llm_config=llm_config["llama3:8b"],
    is_termination_msg=None,
    human_input_mode="ALWAYS",
    default_auto_reply="TERMINATE",
    max_consecutive_auto_reply=3,  # used only when human_input_mode is not ALWAYS
    code_execution_config=False,
    description="AssistantAgent that generates answers to questions asked by the Client.",
)
# Reset the AssistantAgent to clear its memory
def _reset_agent(agent):
    agent.reset()
# START
if __name__ == "__main__":
    _reset_agent(assistant)
    with Cache.disk() as cache:
        response = client.initiate_chat(
            assistant,
            clear_history=False,
            silent=False,
            cache=cache,
            max_turns=None,
            message=client.message_generator,
            problem=CLIENT_REQUEST,
            n_results=3,
        )
    print(response)
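Before running the script, it may be worth confirming that the Ollama OpenAI-compatible endpoint the config points at is actually serving llama3:8b. A minimal check, using the openai package that pyautogen already pulls in (the URL and model name are taken from the config above):

# Sanity-check sketch: hit the local Ollama endpoint directly.
from openai import OpenAI

ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="NA")
reply = ollama.chat.completions.create(
    model="llama3:8b",
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)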
Steps to reproduce
- Python 3.11
- Windows 10
- ollama version 0.3.10, with llama3:8b in my example
- pyautogen==0.3.0

Create a "./data/csc" folder, make a .txt file in it, and copy/paste this context:

Screenshots and logs
Here is the log of the conversation:
Client (to Assistant):
You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
You must give as short an answer as possible.
User's question is: Ten years ago, what percentage of the average investor's financial assets (bank accounts, registered retirement savings plans, pension, insurance, etc.) were stocks, and what has that percentage grown to today?
Context is: For example, by the end of 2007, individual Canadians had just over $500 billion in personal
savings deposits at the chartered banks alone (Source: Bloomberg). They had many more billions
of dollars at other financial intermediaries such as trust companies, credit unions and investment
etc etc etc etc
--------------------------------------------------------------------------------
Replying as Assistant. Provide feedback to Client. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:
>>>>>>>> NO HUMAN INPUT RECEIVED.
>>>>>>>> USING AUTO REPLY...
[autogen.oai.client: 09-16 11:51:48] {349} WARNING - Model llama3:8b is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
Assistant (to Client):
I'm a retrieve augmented chatbot! I can answer your questions based on my knowledge. What's your question?
--------------------------------------------------------------------------------
Replying as Client. Provide feedback to Assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Answer the question I asked in the original prompt.
Client (to Assistant):
Answer the question I asked in the original prompt.
--------------------------------------------------------------------------------
Replying as Assistant. Provide feedback to Client. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:
>>>>>>>> NO HUMAN INPUT RECEIVED.
>>>>>>>> USING AUTO REPLY...
[autogen.oai.client: 09-16 11:52:28] {349} WARNING - Model llama3:8b is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
Assistant (to Client):
You originally asked me to be a "helpful AssistantAgent that answers questions asked by the Client." So, I'll do just that!
What's your question? I'm here to help!
So clearly the Assistant is unable to answer the question it was asked in the first prompt, and has no memory of it when asked to answer it a second time.
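As an aside, the repeated "Model llama3:8b is not found" line in the log is only a pricing warning. Per the warning text itself, it can be silenced by adding a price field to the config_list entry, something like this (the two numbers are placeholder per-1k-token prices, zeroed here since the local model is free):

"config_list": [
    {
        "model": "llama3:8b",
        "api_key": "NA",
        "api_type": "openai",
        "base_url": "http://localhost:11434/v1",
        "price": [0.0, 0.0],  # [prompt_price_per_1k, completion_price_per_1k] placeholders
    },
],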
Additional Information
One thing I noticed is that even when setting n_results to 3, or even 1, the retrieved context is still long. It makes me wonder whether the context being provided to the Assistant is too large for it to find the question, let alone the answer. I've tried other combinations, such as "one_line" for the chunk_mode and a range of chunk token sizes, and still no dice.
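Here is a rough sketch of how I've been inspecting the chunks the retriever returns before they are stuffed into the prompt. It leans on retrieve_docs() and the internal _results attribute, whose exact shape seems to vary by pyautogen version (with vector_db="chroma" in 0.3.x it appears to be a list of (document, distance) pairs per query), so treat this as a debugging aid rather than stable API:

# Debugging sketch: dump the chunks retrieved for the question.
client.retrieve_docs(CLIENT_REQUEST, n_results=3)
for doc, distance in client._results[0]:
    content = doc["content"]
    print(f"distance={distance:.4f}  length={len(content)} chars")
    print(content[:300])
    print("-" * 40)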
I would also like the assistant to have a "memory" of the responses. For example, even if it didn't "get" the context from the first prompt, it should surely be able to see and review it on the second prompt. Instead, each prompt behaves like its own entity, and nothing is "remembered" afterward. So could someone test on their end to fix the original issue, and perhaps also show the proper syntax for storing and reusing a conversation's history? Thanks.
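In case it helps, here is the direction I was imagining for the memory part, based on my reading of the ConversableAgent docs. clear_history and chat_messages are the pieces I believe matter, but consider this a sketch rather than a confirmed recipe:

# Sketch: keeping one running transcript across turns.
# ConversableAgent stores a per-partner history in .chat_messages, and
# initiate_chat(clear_history=False) should append to it rather than reset it.
with Cache.disk() as cache:
    client.initiate_chat(
        assistant,
        message=client.message_generator,
        problem=CLIENT_REQUEST,
        n_results=3,
        clear_history=False,  # keep earlier turns visible to the assistant
        cache=cache,
    )
    # A follow-up sent into the same transcript, so the assistant can
    # review the first prompt and its retrieved context.
    client.send("Answer the question I asked in the original prompt.", assistant)

# Inspect what the assistant has stored for this conversation partner.
for msg in assistant.chat_messages[client]:
    print(msg["role"], ":", str(msg["content"])[:120])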