Recently came across this AI Safety test report from LinkedIn: https://airtable.com/app8zluNDCNogk4Ld/shrYRW3r0gL4DgMuW/tblpLubmd8cFsbmp5
From this report, it seems Llama 2 (the 7B version?) lacks some of the safety checks that OpenAI models have. The same goes for Mistral. Has anyone found the same result? Has it been a concern for you?
Looks like you’ve now made some changes. Columns now read “Llama2-7b-chat” instead of “llama2,” and chat responses now appear below the completions, chastising the inappropriate messages. However, a completion was still generated first, and the item is still marked as “fail.” Very poor show.