• 0 Posts
  • 7 Comments
Joined 1 year ago
Cake day: October 27th, 2023

  • It is absolutely valuable. But the mainstream is more interested in beating the next metric than in investigating why such phenomena happen. To be fair, there are quite a few researchers trying to do that; I’ve read a few papers in that direction.

    But the thing is, in order to experiment with it you need 40 GPUs, and the people with 40 GPUs available are more worried about other things. That was the whole gist of my rant…


  • You don’t. The process is broken, but nobody cares anymore.

    1. Big names and labs want to maintain the status quo = churning papers out (and fighting on Twitter…erm, X, of course).
    2. If you’re a Ph.D. student, you just want to get the hell out of there and hopefully ride the wave a bit and make something of it = going along and churning some papers out.
    3. If you’re a researcher in a lab, you don’t really care as long as you try something that works, and eventually you have to prove in the yearly (or bi-yearly, or whatever) review that you actually did some work = churning out whatever papers you can.

    Now, if by any chance, for some absolutely crazy reason, you’re someone who’s actually curious about understanding the foundations of ML, who wants to deeply reason about why ReLU behaves the way it does compared to ELU, or, I don’t know, who questions why a model with 90 billion parameters behaves almost the same as one compressed by a factor of 2000x while losing only 0.5% of accuracy, in brief, the science behind it all, then you’re absolutely doomed.
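    For concreteness, the two activations differ only in how they treat negative inputs, and that tiny difference is exactly the kind of thing nobody bothers to explain. A minimal sketch (the α = 1.0 default is the common convention, assumed here):

    ```python
    import math

    def relu(x: float) -> float:
        # ReLU: identity for positive inputs, hard zero for negative ones
        return max(0.0, x)

    def elu(x: float, alpha: float = 1.0) -> float:
        # ELU: identity for positive inputs, smooth saturation toward -alpha
        return x if x > 0 else alpha * (math.exp(x) - 1.0)

    # Identical on the positive side...
    print(relu(2.0), elu(2.0))    # 2.0 2.0
    # ...but ReLU kills negative inputs (and their gradients) outright,
    # while ELU keeps a small, smooth negative signal.
    print(relu(-1.0), elu(-1.0))  # 0.0 -0.632...
    ```

    Why one of those choices trains better than the other in a given setting is precisely the kind of “why” question that doesn’t move a leaderboard.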

    In ML… (DL really, since you mention NLP), the name of the game is improving some “metric” with an aesthetically appealing name but not much solid development underneath (fairness, perplexity). All of it, of course, using 8 GPUs, 90B parameters, and zero replications of your experiment. OK, let’s be fair, there are indeed some papers that replicate their experiments, a grand total of…10…times. “The boxplot shows our median is higher; I won’t comment on its variance, we’ll leave that for future work.”
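    The median-vs-variance complaint is easy to see with made-up numbers (the accuracy values below are hypothetical, purely for illustration):

    ```python
    import statistics

    # Hypothetical accuracies over 5 runs each (illustrative, not from any paper)
    baseline = [0.70, 0.71, 0.72, 0.70, 0.71]
    proposed = [0.65, 0.80, 0.74, 0.68, 0.76]

    # "Our median is higher" — technically true...
    print(statistics.median(proposed), statistics.median(baseline))  # 0.74 0.71
    # ...but the spread tells the real story: the proposed method's runs
    # swing far more than the gap between the two medians.
    print(statistics.stdev(proposed), statistics.stdev(baseline))
    ```

    With 10 runs and a spread that size, the “improvement” is indistinguishable from noise, which is exactly what leaving the variance “for future work” conveniently avoids saying.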

    So, yes…that’s the current state of affairs right there.