Is it just me, or are LLMs still not good enough to translate from English into a more gendered language like Portuguese?

Ok_Shape3437@alien.top · 1 year ago

Is it just me, or are LLMs still not good enough to translate from English into a more gendered language like Portuguese?

bot-333@alien.top · 1 year ago

Don’t use an LLM for translation, use an MT model for that.

which as far as I know, isn’t an AI.

Google Translate uses an AI, but not an LLM. It uses “neural machine translation”, so basically an MT model.

_-inside-_@alien.top · 1 year ago

I also have plans for generating Portuguese content, and yes, they are also quite poor at translating, even worse when we’re talking about European Portuguese. I’m considering using specialized translation models, such as Argos Translate, but I haven’t tested it yet; I’m short on hardware resources and this is a pet project.

PopeSalmon@alien.top · 1 year ago

google translate is an ai, it works in a fairly similar way to llms, it’s just a little less general

trevr0n@alien.top · 1 year ago

ChatGPT and GPT-4 have been significantly more accurate than Google Translate for Norwegian.

CedricLimousin@alien.top · 1 year ago

As far as I’ve tried this kind of thing, it works well in French.

And I think there is more French than Portuguese in the majority of training data sets, so it might be related?

staviq@alien.top · 1 year ago

Edit: Compare the grammar and spelling of properly uncensored models with censored ones, when it comes to “gendered” grammar and natural language concepts. There is a huge difference.

That’s not a question of language, that’s the result of promoting diversity through censorship, and the worst part is, none of the big companies give the faintest whiff of a fuck about diversity or minorities, they are just crossing things off their marketing bucket list.

Censorship causes massive performance degradation. If you mindlessly force a model to have a bias, no matter the subject, the model will propagate this bias throughout its entire base knowledge.

Good or bad, a bias is a bias, because we are talking about computers, which are deterministic and literal and computer code only ever takes things literally, and LLMs are barely the first step towards generalisation and unassisted extrapolation.

Even when the general concept of, say, fighting gender discrimination is good at its core, force-feeding that to an LLM, which is computer code after all, will do stuff like making it completely lose the concept of gender, including linguistic concepts, solely because they share the word “gender” throughout the literature and therefore the training dataset.

Yes, discrimination is stupid, racism is stupid, forcing everybody to live exactly the one “right” way is stupid.

But using censorship to fight this is hands down the dumbest and laziest way possible.

PopeSalmon@alien.top · 1 year ago

i really feel like you should find a different word for bots not being as dirty as you want than “censorship”, a bot deciding to be circumspect isn’t what “censorship” means, there are lots of countries where you can get killed for being a journalist, there’s a lot of places where it’s really hard to get news out b/c of direct government control of all of the media, so i don’t think it’s polite to use the same word for you wish robots were more racist

Anxious-Durian1773@alien.top · 1 year ago

ChatGPT suggested the term ‘ideological engineering’ but I’m not sure that fully captures the nature of the problem.

Barafu@alien.top · 1 year ago

When you ask a bot who Jim Hopkins is, and it says it will not answer this question because it respects people’s private lives, then yes, I want bots and people to be more racist.

Distinct-Target7503@alien.top · 1 year ago

Claude 2 and GPT-4 do a pretty good job for English to Italian and vice versa. GPT-3.5 seems a little more “robotic” for English > Italian, but works fine for Italian > English.

Obviously that’s only a feeling… I haven’t done any objective tests.