Bug Report: The model often starts creating repetitive sequences of tokens #220
Comments
Thank you for reporting this issue.
Thank you, but unfortunately I did not receive a response from any of them.
nbdh
We are experiencing the same issue.
I posted the same issue on the Gemini forum. It would be nice if you could make some noise there too, to bring attention to the problem: https://discuss.ai.google.dev/t/bug-report-the-model-often-starts-creating-repetitive-sequences-of-tokens/6445
@rossanodr Done. Did you manage to make any progress on this?
No :(
Description of the bug:
Summary:
When using the “gemini-1.5-flash” model for generating long texts, the model often starts creating repetitive sequences of tokens, leading to an infinite loop and exhausting the token limit. This issue is observed with both the Vertex and Gemini APIs.
Example:
```
“The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed…”
```
Steps to Reproduce:
1. Use the "gemini-1.5-flash" model via Vertex or Gemini API.
2. Generate a long text (e.g., legal or technical document).
3. Observe the generated output for repetition of phrases or sentences.
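For the last step, checking the output by eye gets tedious on long documents. The following is a minimal sketch of an automated check, not part of any Gemini or Vertex SDK: the helper name `find_repeated_sentences` and the threshold `min_repeats=3` are assumptions chosen for illustration, and the sentence splitting is deliberately naive.

```python
import re
from collections import Counter
from typing import Dict


def find_repeated_sentences(text: str, min_repeats: int = 3) -> Dict[str, int]:
    """Return each sentence that occurs at least `min_repeats` times in `text`.

    Hypothetical helper for inspecting model output; the threshold is arbitrary.
    """
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter(sentences)
    return {s: n for s, n in counts.items() if n >= min_repeats}


# Sample built from the repeated sentence quoted in this report.
sample = (
    "The judgment can be appealed in a motion for reconsideration, "
    "claiming that the judge did not consider the evidence properly. "
) * 3 + "The judgment can be appealed…"

repeated = find_repeated_sentences(sample)
for sentence, count in repeated.items():
    print(f"{count}x: {sentence}")
```

A non-empty result suggests the response has entered the loop described here. Legitimate documents can also repeat short sentences, so the threshold may need tuning for a given use case.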
Expected Behavior:
The model should generate coherent and non-repetitive text.
Actual Behavior:
The model begins to repeat sequences of tokens indefinitely, leading to the maximum token limit being reached.
Impact:
Wastes tokens and API usage limits.
Generates unusable text, necessitating additional requests and costs.
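One way to limit the wasted output on the client side is to consume the response as a stream and stop reading once the tail starts repeating. The sketch below assumes only a generic iterable of text chunks (`chunks`); the names `tail_is_looping` and `stop_on_loop`, the 80-character window, and the repeat threshold are all hypothetical choices, not SDK features. It does not stop the model from looping, and whether abandoning a stream early reduces billed tokens is not confirmed in this report.

```python
from typing import Iterable, Iterator


def tail_is_looping(text: str, window: int = 80, min_repeats: int = 3) -> bool:
    """True if the last `window` characters occur at least `min_repeats` times in `text`."""
    if len(text) < window * min_repeats:
        return False  # not enough text yet to contain that many repeats
    tail = text[-window:]
    return text.count(tail) >= min_repeats


def stop_on_loop(
    chunks: Iterable[str], window: int = 80, min_repeats: int = 3
) -> Iterator[str]:
    """Yield chunks until the accumulated text appears to be stuck in a loop."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        yield chunk
        if tail_is_looping(seen, window, min_repeats):
            break  # stop consuming the stream


# Simulated stream: one normal chunk followed by the repeated sentence from this report.
sentence = (
    "The judgment can be appealed in a motion for reconsideration, "
    "claiming that the judge did not consider the evidence properly. "
)
fake_stream = ["Intro text. "] + [sentence] * 50
kept = list(stop_on_loop(fake_stream))
print(f"consumed {len(kept)} of {len(fake_stream)} chunks")
```

The guard compares raw character windows instead of sentences, so it also catches loops that do not align with sentence boundaries. The cost is that it can misfire on text that legitimately repeats long passages.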
Reproduction Rate:
Occurs frequently with long text generation tasks.
Workaround:
Currently, there is no known workaround to prevent this issue.
Request for Resolution:
Investigate the cause of the repetitive token generation.
Implement a fix to prevent the model from entering a repetitive loop.
Provide a mechanism for users to request refunds or credits for tokens wasted due to this bug.
Actual vs expected behavior:
Actual: “The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed…”
Expected: “The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. ”
Any other information you'd like to share?
No response