# MediaPipe LLM Inference task for web

## Overview

This web sample demonstrates how to use the LLM Inference API to run common text-to-text generation tasks, such as information retrieval, email drafting, and document summarization, directly in the browser.
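
Under the hood, `index.js` creates an `LlmInference` task from the `@mediapipe/tasks-genai` package and calls it with a prompt. The following is a minimal sketch of that flow, not the sample's exact code; the model file name, prompt, and option values are assumptions you should adapt:

```js
// Minimal sketch (assumed names/values; the sample's index.js may differ).
import {FilesetResolver, LlmInference} from
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai';

async function runDemo() {
  // Load the WASM assets that back the GenAI tasks.
  const genaiFileset = await FilesetResolver.forGenAiTasks(
      'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');

  // Create the task from a model file hosted next to index.html.
  const llmInference = await LlmInference.createFromOptions(genaiFileset, {
    baseOptions: {modelAssetPath: 'gemma-2b-it-gpu-int4.bin'},  // assumed name
    maxTokens: 1000,
    topK: 40,
    temperature: 0.8,
  });

  // Run a text-to-text generation and print the full response.
  const response = await llmInference.generateResponse(
      'Summarize the benefits of on-device inference in two sentences.');
  console.log(response);
}

runDemo();
```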

## Prerequisites

* A browser with WebGPU support (e.g. Chrome on macOS or Windows).
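
If you are unsure whether your browser qualifies, a quick probe of the WebGPU API (exposed as `navigator.gpu`) settles it; this sketch can be pasted into the DevTools console, which supports top-level `await`:

```js
// Rough WebGPU capability check; paste into the browser's DevTools console.
if ('gpu' in navigator) {
  const adapter = await navigator.gpu.requestAdapter();
  console.log(adapter ? 'WebGPU adapter found.' : 'WebGPU present, but no adapter.');
} else {
  console.log('WebGPU is not supported in this browser.');
}
```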

## Running the demo

Follow these instructions to run the sample on your device:

1. Make a folder for the task named `llm_task`, and copy the `index.html` and `index.js` files into it.
2. Download Gemma 2B (TensorFlow Lite `2b-it-gpu-int4` or `2b-it-gpu-int8`) into the `llm_task` folder, or convert an external LLM (Phi-2, Falcon, or StableLM) following the guide (only the GPU backend is currently supported).
3. In your `index.js` file, update `modelFileName` with your model file's name (see the sketch after this list).
4. Run `python3 -m http.server 8000` from the `llm_task` folder to host the three files (or `python -m SimpleHTTPServer 8000` for older Python versions).
5. Open `localhost:8000` in Chrome. The button on the webpage will be enabled once the task is ready (roughly 10 seconds).
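
For step 3, the edit is a single line near the top of `index.js`. A minimal sketch, assuming the downloaded file is named `gemma-2b-it-gpu-int4.bin` (substitute your actual file name):

```js
// index.js: point the task at the model file you copied into llm_task/.
// 'gemma-2b-it-gpu-int4.bin' is an example name; use your model's file name.
const modelFileName = 'gemma-2b-it-gpu-int4.bin';
```

After steps 1 and 2, `llm_task` should contain `index.html`, `index.js`, and the model file; these are the three files the server in step 4 hosts.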