Can Gemini handle PDF files? #158

helai78 · 2024-05-21T09:10:41Z

Description of the feature request:

https://ai.google.dev/gemini-api/docs/prompting_with_media?lang=python
Based on the link above, it seems that Gemini does not work with PDF files.
Is my understanding right?

What problem are you trying to solve with this feature?

No response

Any other information you'd like to share?

No response

singhniraj08 · 2024-05-22T05:07:36Z

@helai78, as shown in the documentation, the supported text formats are listed there. The Gemini API does not support PDF files, because the application/pdf MIME type is not supported yet. Alternatively, you can use AI Studio to work with PDF files using Gemini. Thank you!

helai78 · 2024-05-22T06:31:29Z

Hello @singhniraj08, thank you for your clarification.

The AI Studio you mentioned is the Vertex AI Gemini API, which can handle PDF files. Vertex AI is part of Google Cloud, which means 90 days free for me. Is my understanding correct?

Could you tell me any alternatives for handling PDF files with Gemini 1.5 Pro?

Thanks in advance.

anusonawane · 2024-07-08T06:19:44Z

Hello @helai78,
Currently, there's no direct support for uploading PDF files, but we can work around this by converting the PDF to images and extracting text separately.
https://github.com/google-gemini/cookbook/blob/main/quickstarts/PDF_Files.ipynb
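The workaround described above could look roughly like the sketch below. It is not taken from the linked notebook: it assumes the third-party packages `pypdf` (text layer) and `pdf2image` (page rendering, which needs poppler installed), and the helper names `extract_page_texts`, `render_page_images`, and `build_prompt_parts` are hypothetical.

```python
from typing import Any, List, Sequence


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the text layer of each page ("" when a page has none)."""
    from pypdf import PdfReader  # assumption: pip install pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def render_page_images(pdf_path: str, dpi: int = 150) -> List[Any]:
    """Render each page to a PIL image."""
    from pdf2image import convert_from_path  # assumption: pip install pdf2image

    return convert_from_path(pdf_path, dpi=dpi)


def build_prompt_parts(
    question: str, page_texts: Sequence[str], page_images: Sequence[Any]
) -> List[Any]:
    """Interleave per-page text and page images after the question."""
    if len(page_texts) != len(page_images):
        raise ValueError("expected one text entry and one image per page")
    parts: List[Any] = [question]
    for number, (text, image) in enumerate(zip(page_texts, page_images), start=1):
        parts.append(f"--- Page {number} text ---\n{text}")
        parts.append(image)
    return parts
```

The resulting list mixes strings and images, so it could then be passed to a multimodal call such as `generate_content(parts)` on a `gemini-1.5-pro` model in the Python SDK, assuming that call accepts PIL images alongside text.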

helai78 · 2024-07-10T03:28:21Z


Hello @anusonawane,
I do almost the same thing you described: I use Tesseract to OCR the text from the images.
The problem is that the images need to be sorted into types: text, data chart, and picture. OCR only works well for images of text, not for data charts or pictures, and I only have a limited number of tokens. Still, it is a very good challenge...
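One way to approach the categorization problem, given a tight token budget, might be to let the OCR result itself decide: pages where Tesseract finds enough real words are sent as text, and everything else (likely charts or pictures) is sent as an image. This is only a rough heuristic sketch. It assumes the third-party `pytesseract` package plus the Tesseract binary, and `looks_like_text_page`, `route_pages`, and the `min_words` threshold are hypothetical names and values that would need tuning.

```python
from typing import Any, Callable, List, Sequence


def ocr_image(image: Any) -> str:
    """Run Tesseract OCR on a PIL image."""
    import pytesseract  # assumption: pip install pytesseract

    return pytesseract.image_to_string(image)


def looks_like_text_page(ocr_text: str, min_words: int = 40) -> bool:
    """Guess that a page is mostly text if OCR found enough alphabetic words."""
    words = [w for w in ocr_text.split() if any(c.isalpha() for c in w)]
    return len(words) >= min_words


def route_pages(
    images: Sequence[Any],
    ocr: Callable[[Any], str] = ocr_image,
    min_words: int = 40,
) -> List[Any]:
    """Send text-heavy pages as OCR text and the rest as images."""
    parts: List[Any] = []
    for image in images:
        text = ocr(image)
        if looks_like_text_page(text, min_words):
            parts.append(text.strip())  # text page: send the OCR text only
        else:
            parts.append(image)  # chart or picture: let the model see it
    return parts
```

The `ocr` parameter is injectable so the routing logic can be tried without Tesseract installed, for example by passing a stub function.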

helai78 added the labels component:examples, component:quickstarts, and type:feature request · May 21, 2024

singhniraj08 added the labels status:awaiting response and type:help, and removed the labels type:feature request and component:examples · May 22, 2024
