Cradawx@alien.top to LocalLLaMA@poweruser.forum · 10 months ago
ShareGPT4V - New multi-modal model, improves on LLaVA (sharegpt4v.github.io)
M0ULINIER@alien.top · 10 months ago:
https://preview.redd.it/vnony8f0ax1c1.png?width=1080&format=pjpg&auto=webp&s=dc261252751a0a1e209d9049854895688de25fa4
The benchmark is in their GitHub, even if it's hard to be sure of benchmarks in current times.

lakolda@alien.top · 10 months ago:
This isn't comparing with the 13B version of LLaVA. I'd be curious to see that.

justletmefuckinggo@alien.top · 10 months ago:
I'm new here, but is this true multimodality, or is it the LLM communicating with a vision model? And what exactly are those four models being benchmarked on here?
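On the multimodality question above: ShareGPT4V reuses the LLaVA-style design, where a pretrained vision encoder (a CLIP ViT) produces image patch features and a small projector maps them into the LLM's token-embedding space, so the image is fed to the LLM as "soft tokens" in the same sequence as the text rather than two separate models exchanging text. Below is a minimal sketch of that projector idea; the class name, dimensions, and patch counts are illustrative assumptions, not ShareGPT4V's actual code.

```python
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    """Maps vision-encoder patch features into the LLM's embedding space.

    Hypothetical dimensions: a CLIP ViT-L/14 encoder outputs 1024-d patch
    features; a 7B LLaMA-style LLM uses 4096-d token embeddings.
    """
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # LLaVA-1.5 (which ShareGPT4V builds on) uses a small MLP projector;
        # a two-layer MLP with GELU is sketched here.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (batch, num_patches, vision_dim) from the vision encoder
        return self.proj(patch_feats)  # (batch, num_patches, llm_dim)

if __name__ == "__main__":
    projector = VisionProjector()
    # e.g. a 336px image at patch size 14 gives 24x24 = 576 patches
    fake_patches = torch.randn(1, 576, 1024)
    image_tokens = projector(fake_patches)
    print(image_tokens.shape)  # torch.Size([1, 576, 4096])
```

The projected patch embeddings are concatenated with the text token embeddings and passed through the LLM as one sequence, which is why this is usually considered tighter integration than an LLM calling a separate captioning model.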