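Understand and count tokens

Gemini and other generative AI models process input and output at a granularity called a token.

This guide explains how to get the context window of a specific model, and how to count tokens for use cases such as text input, chat, multimodal input, and system instructions and tools.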

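About tokens

Tokens can be single characters such as "z" or whole words such as "cat". Long words are broken up into several tokens. The set of all tokens used by the model is called the vocabulary, and the process of splitting text into tokens is called tokenization.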

For Gemini models, a token is equivalent to about 4 characters. 100 tokens are about 60 to 80 English words.
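
As a rough illustration of the 4-characters-per-token rule of thumb, the sketch below estimates a token count from text length alone. The name `estimate_tokens` and the constant are illustrative assumptions, not part of the Gemini SDK. The real count comes from the model's tokenizer and can differ, especially for non-English text.

```python
import math

# Rule of thumb from the text: one token is roughly 4 characters.
# This is an approximation, not the model's real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count alone."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sample))  # 11 (44 characters / 4)
```

An estimate like this is only useful for quick sizing; for billing or limit checks, prefer the counts returned by the API.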

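When billing is enabled, the cost of a call to the Gemini API is determined in part by the number of input and output tokens, so knowing how to count tokens can be helpful.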
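
Because billing depends partly on token counts, a back-of-the-envelope cost estimate might look like the sketch below. The per-million-token prices are placeholder values chosen purely for illustration, and `estimate_cost` is a hypothetical helper, not an SDK function. Real rates vary by model and should be taken from the current pricing page.

```python
from decimal import Decimal

# HYPOTHETICAL prices in USD per 1 million tokens; not real Gemini API rates.
INPUT_PRICE_PER_MILLION = Decimal("0.10")
OUTPUT_PRICE_PER_MILLION = Decimal("0.40")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one call from its input and output token counts."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * INPUT_PRICE_PER_MILLION / million
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION / million
    )


# Example: a call with 11 input tokens and 73 output tokens.
print(estimate_cost(11, 73))
```

`Decimal` is used instead of `float` so that very small per-call amounts add up without binary rounding noise.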


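Context windows

The models available through the Gemini API have context windows that are measured in tokens. The context window defines how much input you can provide and how much output the model can generate. You can determine the size of the context window by calling the API or by looking in the models documentation.

In the following example, you can see that the gemini-1.0-pro-001 model has an input limit of about 30,000 tokens and an output limit of about 2,000 tokens, which means a context window of about 32,000 tokens.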

import google.generativeai as genai

# Assumes the SDK has already been configured with an API key.
model_info = genai.get_model("models/gemini-1.0-pro-001")

# Returns the "context window" for the model,
# which is the combined input and output token limits.
print(f"{model_info.input_token_limit=}")
print(f"{model_info.output_token_limit=}")
# ( input_token_limit=30720, output_token_limit=2048 )

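As another example, if you instead requested the token limits for a model like gemini-1.5-flash-001, you would see that it has a context window of about 1 million tokens.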
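
The reported limits can be used to check whether a request fits before sending it. The sketch below plugs in the gemini-1.0-pro-001 limits from the example output above. `ModelLimits` and `fits` are hypothetical names introduced for illustration, not SDK types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    """Token limits as reported by `input_token_limit` / `output_token_limit`."""

    input_token_limit: int
    output_token_limit: int

    @property
    def context_window(self) -> int:
        # The context window is the combined input and output limits.
        return self.input_token_limit + self.output_token_limit

    def fits(self, prompt_tokens: int) -> bool:
        """Return True if a prompt of this size is within the input limit."""
        return 0 <= prompt_tokens <= self.input_token_limit


# Limits taken from the gemini-1.0-pro-001 example output.
limits = ModelLimits(input_token_limit=30720, output_token_limit=2048)
print(limits.context_window)  # 32768
print(limits.fits(10))        # True
print(limits.fits(40_000))    # False
```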

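Count tokens

All input to and output from the Gemini API is tokenized, including text, image files, and other non-text modalities.

You can count tokens in the following ways:

Call count_tokens with the input of the request. This returns the number of tokens in the input only (total_tokens).
Use the usage_metadata attribute on the response after calling generate_content. This returns the input and output token counts separately (prompt_token_count and candidates_token_count) as well as the combined count (total_token_count).

Count text tokens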

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "The quick brown fox jumps over the lazy dog."

# Call `count_tokens` to get the input token count (`total_tokens`).
print("total_tokens: ", model.count_tokens(prompt))
# ( total_tokens: 10 )

response = model.generate_content(prompt)

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 11, candidates_token_count: 73, total_token_count: 84 )
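
When an application makes several calls, it can be handy to keep a running total of the counts reported in `usage_metadata`. The sketch below uses a plain dataclass, `Usage`, as a stand-in for the response metadata so that it runs without an API key. The class and its `add` helper are assumptions for illustration; only the field names come from the output shown above. It treats the total as the sum of the two parts, which matches the text example (11 + 73 = 84).

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for the fields printed from `response.usage_metadata`."""

    prompt_token_count: int = 0
    candidates_token_count: int = 0

    @property
    def total_token_count(self) -> int:
        # Combined input and output tokens.
        return self.prompt_token_count + self.candidates_token_count

    def add(self, other: "Usage") -> "Usage":
        """Return a new Usage holding the sum of both records."""
        return Usage(
            self.prompt_token_count + other.prompt_token_count,
            self.candidates_token_count + other.candidates_token_count,
        )


# First record copied from the text example; the second is made up.
running = Usage().add(Usage(11, 73)).add(Usage(20, 5))
print(running.total_token_count)  # 109
```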

Count multi-turn (chat) tokens

model = genai.GenerativeModel("models/gemini-1.5-flash")

chat = model.start_chat(
    history=[
        {"role": "user", "parts": "Hi my name is Bob"},
        {"role": "model", "parts": "Hi Bob!"},
    ]
)
# Call `count_tokens` to get the input token count (`total_tokens`).
print(model.count_tokens(chat.history))
# ( total_tokens: 10 )

response = chat.send_message(
    "In one sentence, explain how a computer works to a young child."
)

# On the response for `send_message`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 25, candidates_token_count: 21, total_token_count: 46 )

from google.generativeai.types.content_types import to_contents

# You can call `count_tokens` on the combined history and content of the next turn.
print(model.count_tokens(chat.history + to_contents("What is the meaning of life?")))
# ( total_tokens: 56 )

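Count multimodal tokens

All input to the Gemini API is tokenized, including text, image files, and other non-text modalities. Note the following high-level key points about tokenization of multimodal input:

Images are considered to be a fixed size, so they consume a fixed number of tokens (currently 258 tokens), regardless of their display or file size.
Video and audio files are converted to tokens at the following fixed rates: video at 263 tokens per second and audio at 32 tokens per second.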
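
The fixed rates above make it possible to estimate the media portion of a request offline. The following is a minimal sketch under that assumption. `estimate_media_tokens` is a hypothetical helper, and how the service rounds partial seconds is not stated in the text, so the rounding here is a guess. Tokens for any accompanying text prompt are not included.

```python
# Fixed rates quoted in the text (current values; they may change).
TOKENS_PER_IMAGE = 258
VIDEO_TOKENS_PER_SECOND = 263
AUDIO_TOKENS_PER_SECOND = 32


def estimate_media_tokens(
    images: int = 0, video_seconds: float = 0.0, audio_seconds: float = 0.0
) -> int:
    """Estimate input tokens for media, excluding any accompanying text."""
    if images < 0 or video_seconds < 0 or audio_seconds < 0:
        raise ValueError("counts and durations must be non-negative")
    total = (
        images * TOKENS_PER_IMAGE
        + video_seconds * VIDEO_TOKENS_PER_SECOND
        + audio_seconds * AUDIO_TOKENS_PER_SECOND
    )
    # Assumption: round to the nearest whole token.
    return round(total)


# Two images, a 10-second video clip, and 60 seconds of audio.
print(estimate_media_tokens(images=2, video_seconds=10, audio_seconds=60))  # 5066
```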

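Image files

During processing, the Gemini API considers images to be a fixed size, so they consume a fixed number of tokens (currently 258 tokens), regardless of their display or file size.

Example that uses an image uploaded with the File API: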

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "Tell me about this image"
your_image_file = genai.upload_file(path="image.jpg")

# Call `count_tokens` to get the input token count
# of the combined text and file (`total_tokens`).
# An image's display or file size does not affect its token count.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens([prompt, your_image_file]))
# ( total_tokens: 263 )

response = model.generate_content([prompt, your_image_file])
response.text
# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 264, candidates_token_count: 80, total_token_count: 345 )

Example that provides the image as inline data:

import PIL.Image

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "Tell me about this image"
your_image_file = PIL.Image.open("image.jpg")

# Call `count_tokens` to get the input token count
# of the combined text and file (`total_tokens`).
# An image's display or file size does not affect its token count.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens([prompt, your_image_file]))
# ( total_tokens: 263 )

response = model.generate_content([prompt, your_image_file])

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 264, candidates_token_count: 80, total_token_count: 345 )

Video or audio files

Audio and video are each converted to tokens at the following fixed rates:

Video: 263 tokens per second
Audio: 32 tokens per second
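
Read in the other direction, these rates give a rough upper bound on how much media fits in a given token budget. The sketch below assumes the whole budget is spent on a single media file. `max_seconds` is a hypothetical helper, and the 100,000-token budget is an arbitrary illustrative figure, not a model limit.

```python
# Fixed rates quoted in the text (current values; they may change).
VIDEO_TOKENS_PER_SECOND = 263
AUDIO_TOKENS_PER_SECOND = 32


def max_seconds(token_budget: int, tokens_per_second: int) -> int:
    """Return the whole seconds of media that fit in the given token budget."""
    if token_budget < 0 or tokens_per_second <= 0:
        raise ValueError("budget must be >= 0 and rate must be > 0")
    return token_budget // tokens_per_second


budget = 100_000  # arbitrary example budget
print(max_seconds(budget, VIDEO_TOKENS_PER_SECOND))  # 380
print(max_seconds(budget, AUDIO_TOKENS_PER_SECOND))  # 3125
```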

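import pathlib
import time

# Assumption: `media` is the directory that holds the sample video.
media = pathlib.Path("media")

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "Tell me about this video"
your_file = genai.upload_file(path=media / "Big_Buck_Bunny.mp4")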

# Videos need to be processed before you can use them.
while your_file.state.name == "PROCESSING":
    print("processing video...")
    time.sleep(5)
    your_file = genai.get_file(your_file.name)

# Call `count_tokens` to get the input token count
# of the combined text and video/audio file (`total_tokens`).
# A video or audio file is converted to tokens at a fixed rate of tokens per second.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens([prompt, your_file]))
# ( total_tokens: 300 )

response = model.generate_content([prompt, your_file])

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 301, candidates_token_count: 60, total_token_count: 361 )

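System instructions and tools

System instructions and tools also count towards the total token count for the input.

If you use system instructions, the total_tokens count increases to reflect the addition of system_instruction.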

model = genai.GenerativeModel(model_name="gemini-1.5-flash")

prompt = "The quick brown fox jumps over the lazy dog."

print(model.count_tokens(prompt))
# ( total_tokens: 10 )

model = genai.GenerativeModel(
    model_name="gemini-1.5-flash", system_instruction="You are a cat. Your name is Neko."
)

# The total token count includes everything sent to the `generate_content` request.
# When you use system instructions, the total token count increases.
print(model.count_tokens(prompt))
# ( total_tokens: 21 )

If you use function calling, the total_tokens count increases to reflect the addition of tools.

model = genai.GenerativeModel(model_name="gemini-1.5-flash")

prompt = "I have 57 cats, each owns 44 mittens, how many mittens is that in total?"

print(model.count_tokens(prompt))
# ( total_tokens: 22 )

def add(a: float, b: float):
    """returns a + b."""
    return a + b

def subtract(a: float, b: float):
    """returns a - b."""
    return a - b

def multiply(a: float, b: float):
    """returns a * b."""
    return a * b

def divide(a: float, b: float):
    """returns a / b."""
    return a / b

model = genai.GenerativeModel(
    "models/gemini-1.5-flash-001", tools=[add, subtract, multiply, divide]
)

# The total token count includes everything sent to the `generate_content` request.
# When you use tools (like function calling), the total token count increases.
print(model.count_tokens(prompt))
# ( total_tokens: 206 )
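
The cost of a system instruction or a set of tools can be read off as the difference between two `count_tokens` results for the same prompt. The sketch below only does that subtraction, using the counts reported in the examples in this section. `token_overhead` is a hypothetical helper name.

```python
def token_overhead(baseline_tokens: int, augmented_tokens: int) -> int:
    """Tokens added by extras such as a system instruction or tool declarations."""
    return augmented_tokens - baseline_tokens


# Counts copied from the two examples in this section.
print(token_overhead(10, 21))   # 11  (system instruction)
print(token_overhead(22, 206))  # 184 (four tool declarations)
```

Since this overhead is sent with every request that uses the same system instruction or tools, it is worth including in any per-call budget.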