Token counting for Audio input #161

Open
aalhayali opened this issue May 21, 2024 · 2 comments
Labels
component:quickstarts Issues/PR referencing quickstarts folder status:triaged Issue/PR triaged to the corresponding sub-team type:help Support-related issues

Comments

@aalhayali

Description of the feature request:

It would be great to have a close estimate of how many tokens one minute of audio costs. The guide says, "Audio and video are each converted to tokens at a fixed rate of tokens per minute." Taking the guide's audio example as a reference point, ~44 minutes of audio came to 83,552 tokens, which works out to ~1,899 tokens per minute of audio (83552/44). Is my understanding correct? Also, does the number of tokens change with the audio input format (e.g., WAV vs. MP3)?
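The arithmetic in the question can be checked with a few lines of Python. This is only a sketch: the 83,552-token total and the ~44-minute duration are the figures quoted above, and `tokens_per_minute` is a hypothetical helper name, not part of any SDK. Because the duration is approximate, the resulting rate is approximate too.

```python
def tokens_per_minute(total_tokens: int, duration_minutes: float) -> float:
    """Average token rate for a clip, given its total tokens and duration."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    return total_tokens / duration_minutes


# Figures quoted in the question: 83,552 tokens for roughly 44 minutes of audio.
rate = tokens_per_minute(83552, 44)
print(f"{rate:.1f} tokens/minute ({rate / 60:.2f} tokens/second)")
```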

What problem are you trying to solve with this feature?

Estimating the count of tokens for audio input.

Any other information you'd like to share?

No response

@aalhayali aalhayali added component:examples Issues/PR referencing examples folder component:quickstarts Issues/PR referencing quickstarts folder type:feature request New feature request/enhancement labels May 21, 2024
@singhniraj08 singhniraj08 added type:help Support-related issues status:triaged Issue/PR triaged to the corresponding sub-team and removed type:feature request New feature request/enhancement component:examples Issues/PR referencing examples folder labels May 22, 2024
@SivaMalasani

SivaMalasani commented May 22, 2024

@aalhayali
Token count depends on the duration of the audio, not on the size or format of the audio input.
I tested how audio format and file size affect the number of tokens generated from audio, using clips of the same length (3.07 minutes) in several formats (mp3, wav, flac, aac, and m4a). The token count did not depend on the format or file size; it depended only on the audio's duration. In other words, clips of the same length produced the same number of tokens, regardless of format or file size.
Please find the gist
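If the token count depends only on duration, a rough estimate needs just the clip length and a per-minute rate. The sketch below assumes the rate derived earlier in this thread (83552/44, itself based on an approximate duration), so treat the result as a ballpark figure rather than a measured count. `estimate_audio_tokens` and `ASSUMED_TOKENS_PER_MINUTE` are hypothetical names used for illustration.

```python
# Assumption: rate taken from the question above (83,552 tokens / ~44 minutes).
ASSUMED_TOKENS_PER_MINUTE = 83552 / 44


def estimate_audio_tokens(
    duration_minutes: float,
    tokens_per_minute: float = ASSUMED_TOKENS_PER_MINUTE,
) -> int:
    """Estimate tokens from duration alone; format and file size are not inputs."""
    if duration_minutes < 0:
        raise ValueError("duration_minutes must not be negative")
    return round(duration_minutes * tokens_per_minute)


# Same estimate for any of the tested formats, since only the length matters.
print(estimate_audio_tokens(3.07))
```

For an exact figure, ask the model to count the tokens for the actual file instead of estimating.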

@lucianommartins
Contributor

Hi @aalhayali, the current version of Audio.ipynb includes an example of how to call model.count_tokens() on an audio file uploaded through the File API.

Could you check it? I think it is exactly what you are looking for.

cheers,
Luciano Martins.
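For readers who do not have the notebook open, the call described above might look roughly like the following with the `google-generativeai` Python SDK. This is a minimal sketch, not the notebook's code: the model name, the `GOOGLE_API_KEY` environment variable, and the helper names `count_audio_tokens` and `run_example` are assumptions made for illustration.

```python
import os


def count_audio_tokens(model, audio_file) -> int:
    """Return the token count the model reports for one uploaded audio file."""
    response = model.count_tokens([audio_file])
    return response.total_tokens


def run_example(audio_path: str, model_name: str = "gemini-1.5-flash") -> int:
    """Upload a local audio file and count its tokens (needs network access)."""
    # Imported lazily so count_audio_tokens can be used without the SDK installed.
    import google.generativeai as genai

    # Assumption: the API key is provided in the GOOGLE_API_KEY environment variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    audio_file = genai.upload_file(path=audio_path)  # uploads through the File API
    model = genai.GenerativeModel(model_name)  # model name is an assumption
    return count_audio_tokens(model, audio_file)
```

Calling `run_example("sample.mp3")` with a real file path and a valid key returns the total token count. Dividing that count by the clip's length in minutes gives the per-minute rate asked about in the original question.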
