An MIT study finds non-clinical information in patient messages, like typos, extra whitespace, or colorful language, can reduce the accuracy of a large language model deployed to make treatment recommendations. The LLMs were consistently less accurate for female patients, even when all gender markers were removed from the text.
It’s not an assumption; it’s just a matter of practical reality. If we’re at best a decade off from that point, why pretend it could suddenly and unexpectedly improve until it’s unrecognizable from its current state? LLMs are neat, and scientists should keep working on them. If it weren’t for all the nonsense “AI” hype we have currently, I’d expect to see them used rarely but quite successfully, since they’d be getting used on merit, not hype.