I keep digging up GPT-4V prototypes shared on X, e.g. narration for videos (source), posture correction (source), etc.

As foundation models in computer vision become even more accessible, will the field regain some attention (relative to the LLM hype)?

  • glitch83@alien.topB
    1 year ago

    Maybe? Vision has been around in industry a lot longer than NLP. It has permeated some challenging areas, like embedded and edge deployments, due to privacy and other requirements. If foundation models can’t run on the edge, then I can imagine them affecting only a small portion of vision applications.