MiniCPM-V 4.0: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Updated Aug 12, 2025 - Python
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
ModelScope: bring the notion of Model-as-a-Service to life.
Start building LLM-empowered multi-agent applications in an easier way.
A state-of-the-art open visual language model | Multimodal pretrained model
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Bilingual (Chinese and English) multimodal conversational language model
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
[EMNLP 2024🔥] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Represent, send, store and search multimodal data
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
GPT4V-level open-source multi-modal model based on Llama3-8B
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
[TMM 2025🔥] Mixture-of-Experts for Large Vision-Language Models
A collection of classic and cutting-edge industry papers in recommendation, advertising, and search.
A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI moderation, custom indexes/knowledge base, YouTube summarizer, and more!