Bug Report the model often starts creating repetitive sequences of tokens #220

rossanodr · 2024-06-26T22:53:15Z

Description of the bug:

Summary:
When using the “gemini-1.5-flash” model for generating long texts, the model often starts creating repetitive sequences of tokens, leading to an infinite loop and exhausting the token limit. This issue is observed with both the Vertex and Gemini APIs.

Example:
“The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed…”

Steps to Reproduce:

Use the "gemini-1.5-flash" model via Vertex or Gemini API.
Generate a long text (e.g., legal or technical document).
Observe the generated output for repetition of phrases or sentences.

Expected Behavior:
The model should generate coherent and non-repetitive text.

Actual Behavior:
The model begins to repeat sequences of tokens indefinitely, leading to the maximum token limit being reached.
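
To make the behavior above measurable, here is a minimal sketch (not part of the original report) that counts how many times in a row the same sentence appears in a generated text. The function name `longest_sentence_run` and the naive sentence splitting are assumptions made for illustration only.

```python
import re
from typing import Tuple


def longest_sentence_run(text: str) -> Tuple[str, int]:
    """Return the sentence with the longest run of consecutive repeats, and the run length."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    # This is an assumption that is good enough for a sketch, not a real sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    best, best_run = "", 0
    current, run = None, 0
    for sentence in sentences:
        if sentence == current:
            run += 1
        else:
            current, run = sentence, 1
        if run > best_run:
            best, best_run = sentence, run
    return best, best_run


if __name__ == "__main__":
    sample = (
        "The judgment can be appealed in a motion for reconsideration, "
        "claiming that the judge did not consider the evidence properly. "
    ) * 3 + "The judgment can be appealed..."
    sentence, count = longest_sentence_run(sample)
    print(count, "consecutive repeats of:", sentence)
```

A run length well above 1 on a long output is a cheap signal that a response has entered the loop described here and should be discarded or retried.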

Impact:

Wastes tokens and API usage limits.
Generates unusable text, necessitating additional requests and costs.

Reproduction Rate:
Occurs frequently with long text generation tasks.

Workaround:
Currently, there is no known workaround to prevent this issue.

Request for Resolution:

Investigate the cause of the repetitive token generation.
Implement a fix to prevent the model from entering a repetitive loop.
Provide a mechanism for users to request refunds or credits for tokens wasted due to this bug.

Actual vs expected behavior:

Actual: “The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed…”

Expected: “The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. ”

Any other information you'd like to share?

No response

singhniraj08 · 2024-06-27T06:04:01Z

@rossanodr,

Thank you for reporting this issue.
This repository is for issues related to the Gemini API Cookbook quickstarts and examples. For issues with the Gemini API itself, we suggest using the "Send Feedback" option in the Gemini docs. Ref: Screenshot below. You can also post this issue on the Google AI forum.

rossanodr · 2024-06-27T13:03:05Z

Thank you, but unfortunately I did not receive any response from either of them.

ghost · 2024-06-29T11:36:29Z

nbdh

mioruggieroguida · 2024-07-04T10:22:24Z

We are experiencing the same issue

rossanodr · 2024-07-04T12:08:51Z

I posted the same issue on the Gemini forum. It would be nice if you could make some noise there too, to bring attention to the problem: https://discuss.ai.google.dev/t/bug-report-the-model-often-starts-creating-repetitive-sequences-of-tokens/6445

mioruggieroguida · 2024-07-10T08:23:45Z

@rossanodr Done.

Did you manage to make any progress on this?

rossanodr · 2024-07-10T18:07:18Z

No :(
Unfortunately, I think the problem is with Gemini itself. It happens with many different prompts. The main factor seems to be a large context. Say your prompt is something like, "Read the document below and make a list of all dates of birthdays on it {list}". If the document is large, there is a chance the model starts repeating the same date until it reaches the token limit.
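
For anyone streaming responses, a client-side guard can at least cap the tokens wasted once the loop starts. The sketch below is an assumption, not a fix and not part of any Gemini SDK: `tail_repeats` and `guarded_stream` are hypothetical helpers, the `chunks` iterable stands in for whatever text chunks a streaming response yields, and the thresholds are arbitrary.

```python
from typing import Iterable, Iterator


def tail_repeats(text: str, min_unit: int = 20, max_unit: int = 400,
                 min_repeats: int = 3) -> bool:
    """True if `text` ends with a unit of min_unit..max_unit characters
    repeated `min_repeats` times back to back."""
    for unit in range(min_unit, max_unit + 1):
        span = unit * min_repeats
        if span > len(text):
            break  # longer units cannot fit either
        tail = text[-span:]
        if tail == tail[:unit] * min_repeats:
            return True
    return False


def guarded_stream(chunks: Iterable[str], **kwargs) -> Iterator[str]:
    """Yield chunks until the accumulated text ends in a repeated unit, then stop."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        yield chunk
        if tail_repeats(seen, **kwargs):
            break  # stop consuming the stream; the caller can retry or truncate


if __name__ == "__main__":
    # Simulated stream: a short intro followed by the same sentence over and over.
    fake_stream = ["Intro text. "] + ["The judge did not consider the evidence. "] * 10
    received = list(guarded_stream(fake_stream))
    print(f"stopped after {len(received)} of {len(fake_stream)} chunks")
```

Short repeated units, such as a single date, are still caught once enough copies accumulate, because several copies together form a longer repeating unit. Legitimately repetitive output (tables, boilerplate clauses) can trigger false positives, so the thresholds would need tuning per use case.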

singhniraj08 added the type:bug, status:awaiting response, and component:other labels on Jun 27, 2024.
