Basically, the government said it wanted greater accountability for models and AI, so it added ambiguous requirements that are kind of impossible to fully enforce but that also create other problems for the current generation of AI.
For example, a model must not be a national security threat. That’s mostly meant to deal with misinformation, but it’s open-ended enough that any model can suddenly become noncompliant overnight with a single court order.
Citations are a big one that people don’t like, but I actually like it, because I feel like models should be able to tell where they got their data from, to limit hallucinations. It’s just not something that the current technical frameworks make easy or even possible, because the training process itself is designed in a way that makes this impossible in most cases. Companies don’t like it because it gets rid of the ambiguity they previously had about the datasets they were using. It brings to light the fact that companies that make these models now have to explicitly make sure that every single bit of data they use for training allows them to do so, and that it is something they are OK with being presented as a citation.
This brings back all the intellectual property stuff that is currently being ignored by the groups that are going around and stealing data for training. It also cuts the shadow elements out of the market. They don’t like that, but I think it’s a good thing.
Most of the people complaining about this are just angry because they understand that it also widens the moat, while at the same time directly harming OpenAI as the current incumbent that does not use citations the correct way.
Q* was completely explained; OpenAI itself explained what it was. I was even able to make a YouTube video about it because their explanation was so clear that I could explain it as if you were five years old.
I don’t understand how people believe this is a secretive thing and I don’t understand why people aren’t talking about how simple it is.
Why is everybody talking about this like it’s some grand secret?
I mean, the algorithm is expensive to run, but it’s not that hard to understand.
Can somebody please explain why everybody’s acting like this is such a big secret thing?