I came across this interesting problem in RAG, what I call Relevance Extraction.

After retrieving relevant documents (or chunks), these chunks are often large and may contain several portions irrelevant to the query at hand. Stuffing the entire chunk into an LLM prompt hurts both token cost and response accuracy (the irrelevant text distracts the LLM), and can also push you into context-length limits.

So a critical step in most pipelines is Relevance Extraction: use the LLM to extract verbatim only the portions relevant to the query. This is known by other names, e.g. LangChain calls it Contextual Compression, and the RECOMP paper calls it Extractive Compression.

Thinking about how best to do this, I realized it is highly inefficient to simply ask the LLM to “parrot” out the relevant portions of the text: this is obviously slow, it consumes valuable token-generation space (which can again push you into context-length limits), and of course it is expensive -- e.g. for GPT-4 we know generation costs 6c/1K tokens vs 3c/1K tokens for input.
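To make the cost asymmetry concrete, here is a back-of-the-envelope comparison. The chunk and extract sizes below are made-up, illustrative numbers; only the 3c/6c per-1K-token GPT-4 rates come from the paragraph above:

```python
# Back-of-the-envelope cost comparison; the token counts are assumptions,
# only the GPT-4 rates (3c/1K input, 6c/1K output) come from the text above.
INPUT_RATE, OUTPUT_RATE = 0.03, 0.06  # dollars per 1K tokens

chunk_tokens = 1_000     # size of a retrieved chunk (assumed)
relevant_tokens = 300    # relevant portion the LLM would parrot back verbatim (assumed)
numbers_tokens = 10      # a terse reply like "sentences 2,5-7" (assumed)

parrot_cost = (chunk_tokens * INPUT_RATE + relevant_tokens * OUTPUT_RATE) / 1_000
numbering_cost = (chunk_tokens * INPUT_RATE + numbers_tokens * OUTPUT_RATE) / 1_000

print(f"parroting: ${parrot_cost:.4f} per chunk")    # $0.0480
print(f"numbering: ${numbering_cost:.4f} per chunk")  # $0.0306
```

(The numbering approach does add a few input tokens for the sentence-number annotations, but input tokens are the cheap side of the ledger and are processed in a single forward pass, whereas output tokens are generated one at a time -- which is where most of the latency savings come from.)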

I realized the best way (or at least a good way) to do this is to number the sentences and have the LLM simply spit out the relevant sentence numbers. Langroid’s unique Multi-Agent + function-calling architecture allows an elegant implementation of this in the RelevanceExtractorAgent: the agent annotates the docs with sentence numbers and instructs the LLM to pick out the numbers of the relevant sentences, rather than the sentences themselves, via a function call (SegmentExtractTool); the agent’s function-handler then interprets this message and extracts the indicated sentences by number. To extract from a set of passages, Langroid automatically runs this async + concurrently, so latencies in practice are much, much lower than with the sentence-parroting approach.
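For anyone who wants the gist without digging into Langroid internals, here is a minimal, library-agnostic sketch of the technique. The prompt wording and the `llm_complete` callable are placeholders of my own, not Langroid’s actual prompts or API; Langroid does this step via a function call (SegmentExtractTool) rather than free-form text:

```python
import re

def number_sentences(text: str) -> tuple[str, list[str]]:
    """Naively split text into sentences and prefix each with its index."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    numbered = " ".join(f"[{i}] {s}" for i, s in enumerate(sentences, start=1))
    return numbered, sentences

def extract_relevant(passage: str, query: str, llm_complete) -> str:
    """Ask the LLM for relevant sentence NUMBERS only, then map back to text.

    `llm_complete(prompt) -> str` is a placeholder for whatever LLM client you use.
    """
    numbered, sentences = number_sentences(passage)
    prompt = (
        f"Passage with numbered sentences:\n{numbered}\n\n"
        f"Query: {query}\n"
        "Reply with ONLY the numbers of the sentences relevant to the query, "
        "comma-separated (e.g. 2,5). Reply NONE if nothing is relevant."
    )
    reply = llm_complete(prompt)
    if reply.strip().upper() == "NONE":
        return ""
    idxs = [int(n) for n in re.findall(r"\d+", reply)]
    return " ".join(sentences[i - 1] for i in idxs if 1 <= i <= len(sentences))
```

The LLM’s output is a handful of digits instead of whole sentences, which is where the speed and cost savings come from; a structured function call (as in Langroid) also makes the reply more reliable to parse than the regex above.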

[Full disclosure: I am the lead dev of Langroid]

I thought this numbering idea is fairly obvious in theory, so I looked at LangChain’s equivalent, LLMChainExtractor.compress_docs (they call this Contextual Compression), and was surprised to see it uses the simple “parrot” method, i.e. the LLM writes out whole sentences verbatim from its input. I thought it would be interesting to compare Langroid vs LangChain; you can see the comparison in this Colab.

On the specific example in the notebook, with GPT-4, the Langroid numbering approach is 22x faster than LangChain’s parrot method (LangChain takes 145 secs vs under 7 secs for Langroid) and 36% cheaper (~900 output tokens with LangChain vs ~40 with Langroid). (I promise the “parrot” name is not inspired by their logo :)

I wonder if anyone has thoughts on relevance extraction, or other approaches. At the very least, I hope Langroid’s implementation is useful to you – you can use DocChatAgent.get_verbatim_extracts(query, docs) as part of your pipeline, regardless of whether you are using Langroid for your entire system or not.
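If you want to try just this piece, here is a rough usage sketch. The import paths, config defaults, and Document/DocMetaData fields are written from memory and may differ across Langroid versions, so treat them as assumptions; only the get_verbatim_extracts(query, docs) call is the method referenced above:

```python
# Sketch only: import paths and Document/DocMetaData fields are assumptions
# and may differ by Langroid version; an OpenAI API key is assumed to be set.
from langroid.agent.special.doc_chat_agent import DocChatAgent, DocChatAgentConfig
from langroid.mytypes import Document, DocMetaData

agent = DocChatAgent(DocChatAgentConfig())

docs = [
    Document(
        content="Giraffes are tall. They eat mostly leaves. Lions hunt in groups.",
        metadata=DocMetaData(source="example"),
    ),
]

# Returns verbatim extracts of only the sentences relevant to the query,
# obtained via the sentence-numbering approach described in the post.
extracts = agent.get_verbatim_extracts("What do giraffes eat?", docs)
for d in extracts:
    print(d.content)
```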

  • spirobel@alien.top · 11 months ago

    just to double check: you embed the sentence numbers into the context, right?

    so the llm will see: “1: Giraffes have long necks. 2: They eat mostly leaves….”

    or does the llm learn by itself what sentence is what number?

    The general optimization behind this is to reduce the number of tokens to generate even at a slight increase in context size, correct?

    Wonder where the trade off here is … there are probably more tricks like this, but I assume at some point there will be diminishing returns, where the added context size makes it not worth it …

  • PopeSalmon@alien.top · 11 months ago

    yeah i thought of numbering sections too, i agree that is or should be obvious, just now it occurred to me what if you took an embedding of each sentence & compared those, intuitively it seems like you might be able to avoid calling a model at all b/c shouldn’t the relevant sentences just be closer to the search

    • SatoshiNotMe@alien.top (OP) · 11 months ago

      > intuitively it seems like you might be able to avoid calling a model at all b/c shouldn’t the relevant sentences just be closer to the search

      Not really, as I mention in my reply to u/jsfour above: Embeddings will give you similarity to the query, whereas an LLM can identify relevance to answering a query. Specifically, embeddings won’t be able to handle cross-references (e.g. “Giraffes are tall. They eat mostly leaves.”, where the second sentence only answers a question about giraffe diets if “They” is resolved), and won’t be able to zoom in on answers -- e.g. the President Biden question I mention there.