🤖 InternVL2.5-4B Multimodal Chat

Welcome to the InternVL2.5-4B chat interface! This AI assistant can:

  • 💬 Have conversations with text
  • 🖼️ Analyze and describe images
  • 🎥 Process and understand videos
  • 📝 Extract text from images (OCR)
  • 🎯 Answer questions about visual content

Instructions:

  1. Type your message in the text box
  2. Optionally upload an image or video
  3. Click "Send" to get a response
  4. Use "Clear" to reset the conversation

📎 Upload Media

Supported formats:

  • Images: JPG, PNG, WEBP, GIF
  • Videos: MP4, AVI, MOV, WEBM

Tips:

  • For images: Ask about content, extract text, or describe what you see
  • For videos: Ask for descriptions, analysis, or specific details
  • You can upload one media file at a time
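
The format lists above can be checked on the client side before uploading. The following is a minimal sketch, assuming files are told apart by extension; the helper name `classify_media` is hypothetical and not part of this interface, and treating `.jpeg` as an alias of JPG is also an assumption.

```python
from pathlib import Path

# Extensions taken from the "Supported formats" list above.
# Assumption: ".jpeg" is accepted as an alias of JPG.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".webm"}


def classify_media(path: str) -> str:
    """Return "image" or "video" for a supported file, else raise ValueError."""
    ext = Path(path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"Unsupported media format: {path!r}")
```

Calling `classify_media("photo.PNG")` returns `"image"`, while a file such as `notes.txt` raises `ValueError`, so an unsupported file can be rejected before any upload is attempted.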

💡 Example Prompts


About InternVL2.5-4B: A multimodal large language model with roughly 4 billion parameters, developed by Shanghai AI Laboratory, Tsinghua University, and partners.

API Usage: This interface supports API calls. The chat endpoint accepts JSON with message, image, and video fields.
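
A request to the chat endpoint might be assembled as in the sketch below. Only the three field names (`message`, `image`, `video`) come from the description above. The URL `http://localhost:7860/chat`, the base64 encoding of media, the JSON reply format, and the helper names `build_payload` and `send_chat` are all assumptions made for illustration, so check the deployment's actual API before relying on them.

```python
import base64
import json
import urllib.request
from typing import Optional

# Hypothetical address: the real host and route depend on the deployment.
CHAT_URL = "http://localhost:7860/chat"


def _encode(data: Optional[bytes]) -> Optional[str]:
    # Assumption: media travels as base64 text; the server may instead
    # expect a file path or a URL.
    return None if data is None else base64.b64encode(data).decode("ascii")


def build_payload(message: str,
                  image: Optional[bytes] = None,
                  video: Optional[bytes] = None) -> str:
    """Build a JSON body with the message, image, and video fields."""
    if image is not None and video is not None:
        # The interface accepts one media file at a time.
        raise ValueError("Upload one media file at a time")
    return json.dumps({
        "message": message,
        "image": _encode(image),
        "video": _encode(video),
    })


def send_chat(payload: str, url: str = CHAT_URL) -> dict:
    """POST the JSON body and decode the reply (a JSON reply is assumed)."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For a text-only request, `send_chat(build_payload("Describe this model"))` would be enough; to attach an image, read the file in binary mode and pass the bytes as `image`.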