ninjasaid13@alien.top to LocalLLaMA@poweruser.forum · 1 year ago
LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B (arxiv.org) · 13 comments
squareOfTwo@alien.top · 1 year ago
They and their made-up pseudo-scientific “alignment” piss me off so much.

No, a model won’t just have a stroke of genius and decide to hack into a computer, for many reasons.

Hallucination is one of them. Guessed a wrong token for a program? Oops, the attack doesn’t work. And don’t forget that the tokens don’t fit into the context window.