I started my PhD in NLP a year or so before the advent of Transformers, and finished it just as ChatGPT was unveiled (I literally defended a week before). Halfway through, I felt the sudden acceleration of NLP, with so much happening everywhere all at once. Before that, knowing one's domain and the state-of-the-art GCN, CNN or BERT architectures was enough.

Since then, I've been working in a semi-related area (computer-assisted humanities) as a data engineer/software developer/ML engineer (it's a small team, so many hats). I haven't followed the latest news much, so I recently tried to get up to speed with recent developments.

But there are so many! Everywhere. Even just in NLP, not counting all the other fields such as reinforcement learning, computer vision, the fundamentals of ML, etc. It is damn near impossible to gain an in-depth understanding of a model: they are so complex, and so numerous. All of them are built on top of other ones, so you also need to read up on those to understand anything. I follow some people on LinkedIn who drop new names every week or so. Looking for papers in top conferences is also daunting, as there is no guarantee that an award-winning paper will translate into an actual system, while companies churn out new architectures without making the research paper/methodology public. It's overwhelming.

So I guess my question is twofold: how does one get up to speed after a year of not really being in the field? And how does one keep up after that?

  • eh-tk@alien.topB

    There’s an online group called Transferred Learnings that holds monthly sessions on the latest developments. They are private though, and vet everyone to make sure you’re actually working in ML.

      • eh-tk@alien.topB

        I believe you can join as long as you’ve been working in ML historically.

        They just want to avoid non-technical folks, more generally.

  • bgighjigftuik@alien.topB

    No one does. Do you really think that engineers/researchers at OpenAI, Google Brain/DeepMind, MS Research, Meta Research, etc. are up to date on all topics?

    We're not. We just focus on our current field of expertise/daily job for the most part. Professors at universities usually have a wider (but not deeper) view, but only the top ones.

    • CursedCrystalCoconut@alien.topOPB

      Then it's kind of sad, because a lot of discoveries have been made by looking at what other disciplines were doing and cross-pollinating (genetic algorithms, attention, etc.). Plus, how does one know if they want to branch into another domain? But you're right, there is too much…

  • Tommassino@alien.topB

    Idk, depends on your standards of what it means to keep up. I skim, pick out things that seem relevant/useful to whatever I focus on right now, and put more time into that paper/blog/whatever. I think everybody does the same.

    • CursedCrystalCoconut@alien.topOPB

      Yes, it seems from all the answers that I just try to go too deep. Unfortunately it feels like nowadays it's just tweaking and trying architectures, with no common thread or big mechanism to learn about, the way kernels or attention once were.

  • mofoss@alien.topB

    I don't. I'd rather focus on drilling into every nook and cranny of the attention mechanism so that reading any of these papers becomes easier.

    I’d say if you truly understand Transformers both theoretically and intuitively, you’re already in the top 10% of MLEs. Though I’d imagine most PhDs understand it.
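
    For what it's worth, the core mechanism really does fit in a few lines. Here's a minimal NumPy sketch of single-head scaled dot-product attention (no masking, no multi-head projections; an illustration, not production code):

    ```python
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Single-head scaled dot-product attention.
        Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
        return weights @ V                                # weighted sum of the values

    # toy self-attention: 3 tokens, d_k = d_v = 4
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    print(scaled_dot_product_attention(x, x, x).shape)    # (3, 4)
    ```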

    • Zywoo_fan@alien.topB

      "Truly understand Transformers theoretically"!? Could you please share references that explain the theory around Transformers?

  • new_name_who_dis_@alien.topB

    It's basically impossible to be completely caught up, so don't feel bad. I am not really sure it's all that useful either: you should know of technologies/techniques/architectures and what they are used for, but you don't need to know the details of how they work or how to implement them from scratch. Just being aware means you know what to research when the appropriate problem comes your way.

    Also, a lot of the newest stuff is just hype and won't stick. If you've been in ML research since 2017 (when Transformers came out) you should know that. How many different CNN architectures came out between ResNet in 2016 (or '15?) and now? And still, most people simply use ResNet.

  • NinthImmortal@alien.topB

    The Last Week in AI podcast is great for keeping up to date, and you can do a deep dive on whatever catches your interest.

  • Gramious@alien.topB

    My tactic is to start by checking the papers that actually GET INTO major conferences (NeurIPS, ICLR and ICML are a good start). This narrows the search considerably. Doing a Google Scholar search, for example, will just yield an insurmountable number of papers. This is, in part, due to the standard "make it public before it is accepted" methodology (arXiv preprints are fantastic, but they also increase the noise level dramatically).

    Now, having been burnt by the chaos of the review processes at the aforementioned conferences, I am certainly aware that their publications are by no means the "gold standard", but the notion of peer review, including the improvements it is meant to drive, is powerful nonetheless.

    • CursedCrystalCoconut@alien.topOPB

      That helps narrow it down. Though many discoveries are not published anymore. Reminds me of Mikolov, who was rejected pretty much everywhere, and word vectors ended up being such a big deal. Or the fact that OpenAI does not publish their models.

  • Gwendeith@alien.topB

    Honestly, I mostly just follow Hugging Face's blog and articles. I know there are the latest fancy attention improvements, alternatives to RLHF, GPU-whatever optimizations, etc., but I'm not going to implement those myself. If it's not in Hugging Face's ecosystem, then I most likely wouldn't use it in my daily work/production code anyway.
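
    To give a sense of what that looks like day to day, most of it is just the high-level pipeline API (a minimal sketch; the task name here is just an example, and the default checkpoint it pulls may differ between versions):

    ```python
    from transformers import pipeline

    # downloads a pretrained checkpoint from the Hub on first use
    classifier = pipeline("sentiment-analysis")

    print(classifier("Keeping up with ML papers is exhausting."))
    # e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]
    ```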

    • CursedCrystalCoconut@alien.topOPB

      Hugging Face is for sure a godsend, even though I'm still at a semi-loss with their API. It has changed so much, and there is so much more now, that it has become a little confusing. Nothing a little work can't fix! But that raises a question for me: how do these people manage to put out every model so fast?

      • Gwendeith@alien.topB

        Yeah, reading all their latest releases already takes me a lot of time, so I mostly just stop there. They also don't have a lot of documentation for their latest stuff, so it takes a bit of time to figure things out. I think their packages will settle down to a more stable state after a year or two, once the NLP hype cools down a bit.

  • mr_stargazer@alien.topB

    You don’t. The process is broken, but nobody cares anymore.

    1. Big names and labs want to maintain the status quo = churning papers out (and fighting on Twitter… erm, X, of course).
    2. If you're a Ph.D. student, you just want to get the hell out of there and hopefully ride the wave a bit and make some money = going along and churning some papers out.
    3. If you're a researcher in a lab, you don't really care as long as you try something that works and, eventually, you can prove in the yearly/bi-yearly/whatever review that you actually did some work = churn whatever paper out there.

    Now, if by any chance, for some absolutely crazy reason, you're someone who's actually curious about understanding the foundations of ML, who wants to deeply reason about why "ReLU" behaves the way it does compared to "ELU", or, I don't know, who questions why a model with 90 billion parameters behaves almost the same after being compressed by a factor of 2000x while losing only 0.5% accuracy (in brief, the science behind it all), then you're absolutely doomed.

    In ML… (DL, since you mention NLP), the name of the game is improving some "metric" with an aesthetically appealing name but not-so-strong underlying development (fairness, perplexity). All of this, of course, using 8 GPUs, 90B parameters and zero replications of your experiment. Ok, let's be fair, there are indeed some papers that replicate their experiments a grand total of… 10… times. "The boxplot shows our median is higher; I won't comment on its variance, we will leave that for future work."

    So, yes…that’s the current state of affairs right there.

    • CursedCrystalCoconut@alien.topOPB

      You managed to put into words what bugs me about the field nowadays. What kills me most is that third paragraph of yours: no one cares what the model does IRL, only how it improves a metric on a benchmark task and dataset. When the measure becomes the objective, you're not doing proper science anymore.

    • Western-Image7125@alien.topB

      Hold on, why is it useless to understand why a model which is 2000x smaller has only a 0.5% reduction in accuracy? Isn’t that insanely valuable?

      • mr_stargazer@alien.topB

        It is absolutely valuable. But the mainstream is more interested in beating the next metric than in investigating why such phenomena happen. To be fair, there are quite a few researchers trying to do that; I've read a few papers going in that direction.

        But the thing is, in order to experiment with it you need 40 GPUs, and the people with 40 GPUs available are more worried about other things. That was the whole gist of my rant…

  • Kitchen_Tie_9695@alien.topB

    Know some general ideas like attention, diffusion, vector DBs, backprop, dropout, U-Net, etc., and where they work best. Additionally, know the SOTA models for general use cases. When a new use case arises you should know where to dig and how to code it. Have a strong understanding of all these new concepts, and feel free to code them yourself. Most papers are just combinations of these general ideas. Only when you are on a project should you read the in-depth papers on that use case.
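
    To illustrate "code them yourself": something like dropout really is only a few lines. A rough sketch of inverted dropout, assuming NumPy (illustration only, real frameworks already ship this):

    ```python
    import numpy as np

    def dropout(x, p=0.5, training=True, rng=None):
        """Inverted dropout: zero activations with probability p during training,
        rescaling the survivors so the expected activation stays the same."""
        if not training or p == 0.0:
            return x
        rng = rng or np.random.default_rng()
        mask = rng.random(x.shape) >= p       # keep each unit with probability 1 - p
        return x * mask / (1.0 - p)

    h = np.ones((2, 4))
    print(dropout(h, p=0.5))                  # about half the entries zeroed, the rest scaled to 2.0
    ```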

  • Vityou@alien.topB

    The main trick is learning to filter out the bs “attention aware physics informed multimodal graph centric two-stage transformer attention LLM with clip-aware positional embeddings for text-to-image-to-audio-to-image-again finetuned representation learning for dog vs cat recognition and also blockchain” papers with no code.

    That still leaves you with quite a few good papers, so you need to narrow down to your specific research area. There's no way you can stay caught up on all of ML.

    • CursedCrystalCoconut@alien.topOPB

      Yeah, those bs ones pop up everywhere. If only there were some model to sort those from the good ones… And I'm kind of giving up on being caught up, seeing all the answers.

  • KelseyFrog@alien.topB

    I spend an hour each morning going through the preprints on arxiv.org, skimming a half dozen or so and selecting perhaps one to save for the weekly symposium (I like to have at least one really good paper a week to share).

    It's usually easy to tell if a paper is a follow-up or a response to another, and if that's the case, I might skim those too. These get supplemented with what pops up here and on HN, which might extend back a few months (higher signal, less noise).

    This is enough to feel like I have my finger on the pulse of one topic within ML.
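
    If anyone wants to automate the first pass, the arXiv export API makes the morning scan scriptable. A rough sketch using only the standard library (the category and result count are just example parameters):

    ```python
    import urllib.request
    import xml.etree.ElementTree as ET

    # fetch the 10 most recently submitted cs.CL preprints from the arXiv Atom API
    url = ("http://export.arxiv.org/api/query?search_query=cat:cs.CL"
           "&sortBy=submittedDate&sortOrder=descending&max_results=10")
    with urllib.request.urlopen(url) as resp:
        feed = ET.fromstring(resp.read())

    ns = {"atom": "http://www.w3.org/2005/Atom"}
    for entry in feed.findall("atom:entry", ns):
        title = entry.find("atom:title", ns).text.strip()
        link = entry.find("atom:id", ns).text
        print(f"{title}\n  {link}\n")
    ```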

    • CursedCrystalCoconut@alien.topOPB

      Wow, that is a lot of work. It's awesome that you manage to stay on top of the latest and keep your finger on the pulse of AI, as you said. That is the kind of discipline I cannot follow. Just one hour at work in the morning would destroy the rest of my day ^^