Hi there,
I noticed the "vllm" tag on models recently and think it might lead to some confusion: the tag apparently serves to highlight models that can be served with the vLLM package (as in "virtual LLM", I guess).
However, this takes away the functional reading of the tag, i.e. a "visual LLM" that can take images or video as input and reason about those input types.
While there's a clear task tag around for both types, maybe a dedicated serving-type tag would be beneficial (besides vLLM, Ollama comes to mind)? Let me know what you think.
Best,
M
Examples:
Task text-gen, vllm tag:
No task tag but vllm tag:
Task img-text-gen (+ multimodal, probably the best way w/o an extra serving tag around):