Article Multimodal Embedding & Reranker Models with Sentence Transformers 2 days ago • 28
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 561
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 203k • 1.58k
Article Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers Feb 1, 2022 • 15