[D] Any alternatives to GPT4V Vision API endpoint?

bluzkluz@alien.top · 2 years ago

vatsadev@alien.top · 2 years ago

There’s Fuyu-8B, but it has no commercial license.

It can really cover the “GPT-4 reads websites” use case and stuff like that, and it’s helpful with complex charts too. Other than that, LLaVA is your best hope.

thomasxin@alien.top · 2 years ago

Here are a couple that haven’t been mentioned; they’re quite a lot weaker than GPT4V though, as is to be expected from small models.

mincksthethird@alien.top · 2 years ago

Have you checked out the new release from OpenVL? Their vision API is gaining traction and might fit your needs.

brunoezechutari@alien.top · 2 years ago

Have you checked out LLaVA? It’s still early-stage, but it seems like a promising alternative. Not sure about commercial offerings though.