I am happy OpenAI just joined the RAG game in terms of user interface.
The “backend” offering from OpenAI for API embedding models has always been quite underwhelming, to the point that multiple models outperform OpenAI's latest models. Their old v1 embedding models were so bad that any sentence-transformer would win against them, while being 10-15x cheaper.
Here is a post with benchmarks from LlamaIndex last week (https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83), showing that open-source models from Jina and BGE are on par with or better than the offerings from Google, OpenAI, and even Cohere.
The only thing missing was a piece of OSS infrastructure, and that is what https://github.com/michaelfeil/infinity is here for.
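For context on what such embedding infrastructure is used for: a server like infinity computes embedding vectors, and RAG retrieval then reduces to ranking your documents by cosine similarity against the query vector. A minimal sketch of that ranking step (the toy vectors here are illustrative stand-ins for real embeddings, not output from any particular model):

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    # return document indices sorted best match first
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )

# toy 2-d vectors standing in for real embedding output
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
print(rank(query, docs))  # doc 1 points almost the same way as the query
```

In practice the vectors come from whichever embedding model the server hosts, and a vector DB replaces the brute-force loop, but the scoring is the same idea.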
Obviously, you can run everything at OpenAI (user interface, embeddings, vector DB, LLM) - but I guess that’s not what r/LocalLLaMA stands for.