The point is that the frozen, adapted layers account for the vast majority of the model's parameters, and that is where the large memory savings come from: you never take gradients with respect to the frozen layers, only with respect to the adapter layers and whatever parts of the original model remain trainable, so no gradients or optimizer state need to be stored for the bulk of the weights.
This is of course not necessarily true for smaller models, where the adapters can make up a larger fraction of the total parameter count.
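As a rough illustration of the parameter-count disparity, here is a minimal NumPy sketch of a LoRA-style adapted linear layer (the class and its names are hypothetical, not from any particular library): the large base weight `W` is frozen, while only the small low-rank factors `A` and `B` would receive gradients during training.

```python
import numpy as np

class LowRankAdaptedLinear:
    """Hypothetical LoRA-style layer: frozen base weight W plus
    trainable low-rank adapter factors A and B."""

    def __init__(self, d_in, d_out, rank, rng):
        self.W = rng.standard_normal((d_out, d_in))        # frozen base weight
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # trainable adapter
        self.B = np.zeros((d_out, rank))                   # trainable adapter

    def forward(self, x):
        # Effective weight is W + B @ A; W itself never changes in training.
        return (self.W + self.B @ self.A) @ x

    def trainable_params(self):
        return self.A.size + self.B.size

    def frozen_params(self):
        return self.W.size


rng = np.random.default_rng(0)
layer = LowRankAdaptedLinear(d_in=4096, d_out=4096, rank=8, rng=rng)
print(layer.frozen_params())     # 16777216 frozen weights
print(layer.trainable_params())  # 65536 adapter weights, under 0.4% of the base
```

Since gradients and optimizer state (e.g. Adam's two moment buffers) are only kept for the ~0.4% of parameters in `A` and `B`, the training-time memory footprint shrinks dramatically even though the frozen weights still occupy memory for the forward pass.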