Paper link http://arxiv.org/abs/2310.18338

Description We introduce DaSLaM, which uses a decomposition generator to break complex problems into subproblems that require fewer reasoning steps. These subproblems are answered by a solver. We use a relatively small (13B-parameter) LM as the decomposition generator, trained with policy gradient optimization to interact with a solver LM (treated as a black box) and guide it through subproblems, making our method solver-agnostic. Evaluation on multiple reasoning datasets reveals that with our method, a 175-billion-parameter LM (text-davinci-003) can match or even outperform its orders-of-magnitude larger successor, GPT-4. Additionally, we show that DaSLaM is not limited by the solver's capabilities as a function of scale; e.g., solver LMs of diverse sizes show significant performance improvements with our solver-agnostic decomposition technique.
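
For intuition, here is a minimal sketch of the decompose-then-solve loop the abstract describes. Everything here is an illustrative assumption, not the paper's actual interface: `query_lm` is a hypothetical wrapper around whatever completion API you use, the model names are placeholder defaults, and the prompt formats are invented; the paper's trained decomposer and RL setup are not reproduced.

```python
def query_lm(model: str, prompt: str) -> str:
    """Hypothetical placeholder for a call to a black-box LM endpoint."""
    raise NotImplementedError("wire up your LM provider here")


def solve_with_decomposition(question: str,
                             decomposer: str = "decomposer-13b",
                             solver: str = "text-davinci-003") -> str:
    # 1. Ask the solver for an initial attempt at the full problem.
    initial = query_lm(solver, f"Q: {question}\nA: Let's think step by step.")

    # 2. The small decomposition generator reads the question plus the
    #    solver's first attempt and emits simpler subquestions.
    sub_prompt = (f"Question: {question}\n"
                  f"Initial attempt: {initial}\n"
                  "Break this into simpler subquestions, one per line:")
    subquestions = [s for s in query_lm(decomposer, sub_prompt).splitlines()
                    if s.strip()]

    # 3. The solver answers each subquestion in turn; earlier answers are
    #    carried forward as context. The solver is only ever queried as a
    #    black box, which is what makes the approach solver-agnostic.
    context = ""
    for sq in subquestions:
        ans = query_lm(solver, f"{context}Q: {sq}\nA:")
        context += f"Q: {sq}\nA: {ans}\n"

    # 4. Finally, the solver re-answers the original question, now guided
    #    by the solved subproblems.
    return query_lm(solver, f"{context}Q: {question}\nA:")
```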

  • Single_Ring4886@alien.top · 1 year ago

    This is the thing I was talking about some months ago, after I first saw GPT-4. Glad someone actually worked in this direction, as ideas are somewhat cheap these days :)