[D] - What is the latest in tree-based approaches for LLMs? Has there been any significant research using RL for this?

30299578815310@alien.top · 2 years ago

Yeah, I've also noticed that two kinds of tree-based approaches are being researched.

One works at the sequence / thought level, like Tree of Thoughts (which branches out from chain-of-thought prompting), where the model talks to itself and explores alternative reasoning steps in order to find the best solution. The other works at the decoding / token level, where the tree is used to search for the best next sequence of tokens. In principle you could combine the two and have nested trees.
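To make the token-level version concrete, here is a minimal sketch using beam search, which is one common way to search a tree of continuations. Everything here is an assumption for illustration: `toy_next_token_logprobs` is a hypothetical hand-written lookup table standing in for a real model, and the tokens and probabilities are made up.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]

def toy_next_token_logprobs(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM: next-token log-probs for a prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"x": 0.9, "y": 0.1},
    }
    probs = table.get(prefix, {"<eos>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}

def beam_search(
    next_logprobs: Callable[[Prefix], Dict[str, float]],
    beam_width: int,
    max_len: int,
) -> List[Tuple[Prefix, float]]:
    """Expand a tree of token sequences, keeping the best `beam_width` per depth."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            # Each surviving prefix is a tree node; its children are next tokens.
            for tok, logprob in next_logprobs(prefix).items():
                candidates.append((prefix + (tok,), score + logprob))
        # Prune the tree: keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

greedy = beam_search(toy_next_token_logprobs, beam_width=1, max_len=2)[0]
best = beam_search(toy_next_token_logprobs, beam_width=2, max_len=2)[0]
print("greedy:", greedy[0], "prob", round(math.exp(greedy[1]), 2))
print("beam-2:", best[0], "prob", round(math.exp(best[1]), 2))
```

In this toy table, greedy decoding commits to "a" first and ends up with a sequence of probability 0.30, while keeping two branches alive finds ("b", "x") at 0.36. A thought-level tree would look structurally similar, with whole reasoning steps as nodes instead of tokens.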

But yeah, I think AlphaGo-style self-learning is what is really missing here. In principle, even without a tree, nothing stops us from putting an LLM in an environment where it gets rewarded for succeeding at a task (like solving math problems), and then just letting it rip.
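As a rough sketch of what "reward and let it rip" could look like without any tree, here is a REINFORCE-style policy-gradient loop. This is a toy under stated assumptions, not how anyone actually trains an LLM: the "policy" is just a softmax over four hypothetical candidate answers to one addition problem, and `reward`, `train`, the learning rate, and the step count are all made up for illustration.

```python
import math
import random
from typing import List, Tuple

def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward(problem: Tuple[int, int], answer: int) -> float:
    """Environment feedback: 1.0 if the answer solves the problem, else 0.0."""
    a, b = problem
    return 1.0 if answer == a + b else 0.0

def train(problem, candidates, steps=500, lr=0.5, seed=0) -> List[float]:
    """REINFORCE on a toy softmax policy (a stand-in for an LLM)."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)  # start from a uniform policy
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an answer from the current policy.
        i = rng.choices(range(len(candidates)), weights=probs)[0]
        r = reward(problem, candidates[i])
        # Gradient of log pi(i) w.r.t. logit j is (1 if j == i else 0) - probs[j].
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * r * grad
    return softmax(logits)

candidates = [4, 5, 6, 7]  # hypothetical answers the policy can choose from
final_probs = train((2, 3), candidates)
print({c: round(p, 3) for c, p in zip(candidates, final_probs)})
```

With a 0/1 reward and no baseline, the policy only updates when it happens to sample the correct answer, so probability mass drifts toward 5 over time. The hard part with a real LLM is that correct samples can be extremely rare at the start, which is one reason people want to pair this with search.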