With the advent of LLMs, multimodality, and “general purpose” AIs sitting on unimaginable amounts of money, computing power, and data: I’m graduating and want to start a PhD, but I feel quite disheartened given the huge results obtained simply by “brute-forcing”, and given the ever-growing hype in machine learning that could result in a bubble of data scientists, ML researchers, and so on.
The same as always, using lightgbm.
Joke aside, you can train an LLM to give the result of 1+1 and it will sometimes be right. That’s an expensive way of solving that problem.
You can also develop a simple calculator that will always get an accurate answer.
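To make that concrete, here’s a minimal Python sketch (the `ask_llm` stub is hypothetical, standing in for whatever hosted model you’d actually call): the calculator path is exact and essentially free, while the LLM path costs a network call and is only probably right.

```python
import ast
import operator

# Deterministic calculator: parses arithmetic and always returns the exact answer.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculate(expr: str) -> float:
    def eval_node(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](eval_node(node.left), eval_node(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return eval_node(ast.parse(expr, mode="eval").body)

def ask_llm(prompt: str) -> str:
    # Hypothetical stub for a hosted LLM call: expensive, nondeterministic,
    # and not guaranteed to return "2" for "1+1".
    raise NotImplementedError("swap in your model provider of choice")

print(calculate("1+1"))  # -> 2, every time, for free
```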
My point is that simply because the algorithms you mentioned ‘can’ solve a problem doesn’t mean they are the best solution for it. There are a bunch of NLP problems for which LLMs are subpar, for example.
The future of ML in startups will be the same as it currently is: find the best solution to the problem given the particularities of the problem and the business constraints (i.e. money).
I think the future will be about applying deep learning across all kinds of domains to advance humanity, so being an ML researcher or building a startup still has a lot of potential.
The most important thing is not to be afraid of applying deep learning to different domains and to simply try it out.
When I did my PhD, starting around 20 years ago, things were different. Feature engineering within a data domain was a pretty common way to specialize. Neural networks were old-fashioned function approximators that went out with shoulder pads. The future was structured Bayes nets, or whatever was going to replace conditional random fields.
I’d listen to my PIs talk about how things were different - how I was leaning too much on the power of models to just learn things, and I had to focus more on the precise choice of model. When I pointed out that given the right feature engineering, the models basically performed the same, they’d dismiss that, saying I was leaving a lot on the table by not deriving a more fit-to-purpose model.
These days, I look at the work the junior modelers I supervise perform, and I urge them to at least look at the data, because you can’t really understand the problem you’re working on without getting your hands on the data. But they’d rather focus on aggregate performance metrics than examine their outliers. After all, with large datasets, you may not be able to even look at all your outliers to recognize patterns. And how could you do as good a job as one of today’s giant networks?
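For what it’s worth, here’s a minimal sketch of what “look at your outliers” buys you (synthetic data, purely illustrative): the aggregate fit looks fine, but ranking the residuals exposes a systematic labeling problem that no summary metric would name.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=1000, n_features=5, noise=5.0, random_state=0)
y[::100] += 200  # inject a pattern of corrupted labels, the kind you only find by looking

model = LinearRegression().fit(X, y)
residuals = np.abs(model.predict(X) - y)

# Aggregate metrics hide the pattern; the top residuals expose it.
worst = np.argsort(residuals)[-10:][::-1]
for i in worst:
    print(f"idx={i}  |error|={residuals[i]:.1f}")  # every idx is a multiple of 100
```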
Then there’s LLMs, where you may be handed something that can already basically solve your problem. That turns even more ways of working on their heads.
But the fact is, these patterns repeat.
You’re going to study something that won’t be as relevant in 20 years. That’s the way of all rapidly moving fields. And it’s okay. All code ever written will one day be dust. Even if we don’t like to admit it, the same is true of every publication. Industry or Academia, you build things that move us forward, slowly, in fits and starts. That’s the broader game you opt to play in this field.
ML will change. So will everything else.
Did the advent of “general purpose” computers result in a bubble of software engineers?
Stop following the hype. LLMs are not that impressive. People are not asking hard-hitting questions or using them for serious projects.
It is literally pure hype.
Explainability and interpretability of ML are going to become more and more important as models are used in applications where there can be serious consequences when things don’t go right.
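For example, a common starting point is per-prediction attribution; here’s a minimal sketch using the shap library on a toy tree model (dataset and model are illustrative assumptions, not tied to any particular application):

```python
# A minimal interpretability sketch using SHAP on a tree-based model.
import shap
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=8, random_state=0)
model = xgb.XGBRegressor(n_estimators=100).fit(X, y)

# TreeExplainer computes per-feature contributions for each prediction,
# so you can explain individual decisions, not just aggregate accuracy.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:10])
print(shap_values.shape)  # (10, 8): one attribution per sample per feature
```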
I mean, there are plenty of major areas in ML that LLMs cannot even begin to address (e.g. processing time-series data, where XGBoost still reigns supreme, edge ML, etc.). Also, keep in mind that most of the people at major LLM groups are PhDs, so chances are that if you wanna work even on LLMs, having a PhD will help. After all, scaling is good, but if your research shows more efficient training pathways, the difference can be 9-figure sums for these companies.
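On the time-series point, the usual recipe is to flatten the series into lagged features and let gradient boosting do the rest; a minimal sketch (synthetic data and hyperparameters are illustrative):

```python
# Forecasting a time series with XGBoost via lagged features.
import numpy as np
import pandas as pd
import xgboost as xgb

rng = np.random.default_rng(0)
series = np.sin(np.arange(600) / 20) + rng.normal(0, 0.1, 600)

# Turn the series into a supervised problem: predict y[t] from lags t-1..t-7.
df = pd.DataFrame({"y": series})
for lag in range(1, 8):
    df[f"lag_{lag}"] = df["y"].shift(lag)
df = df.dropna()

X, y = df.drop(columns="y"), df["y"]
split = int(len(df) * 0.8)  # time-ordered split, no shuffling
model = xgb.XGBRegressor(n_estimators=200, max_depth=4)
model.fit(X[:split], y[:split])
preds = model.predict(X[split:])
print(np.mean((preds - y[split:]) ** 2))  # test MSE
```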
The bubble is about to burst. Six more months, tops. At that point people will realize that next-word prediction, while impressive, is just not very useful in most business contexts, and that the use cases that might be interesting don’t need a universe-sized model. Useful applications of natural language processing are quite limited, despite what everyone says right now. The idea of replacing workers in large numbers is a pipe dream; if the idea has merit at all, it will be for those who promote it the most (talking heads).
!RemindMe 6 months
Interesting take, but 180 degrees from what I believe. Seeing how amazing transformers are, carving a path to AGI for the first time, I think we’ll have the ability to replace the average office worker within 5 years. Any Excel and PowerPoint task, phone call, contracting and negotiation with clients, controlling and finance, coding and analytics task can be automated. It’s easier than you make it out to be. Most people do disappointingly simple stuff on a daily basis and spend most of their time on politics and dicking around.

A migration project where you move data from a legacy system to a new one sometimes takes 20 people half a year. It’s like reinventing the wheel every single time, simply because the systems are different and unique, not because the work itself is particularly difficult. That’s something that could be done in an hour.
I hope you’re right, but I’m betting on a different horse. I think many people will be blindsided. And I think politics will have a hard time handling the transition to a UBI or whatever solution they come up with fast enough.
Transformers are not a path to AGI. They’re too dumb and static. Active Inference is where it’s at.
Not so sure about this timeline; we kept the social media company delusion going from ~2016 to 2021.
Depends on whether the bubble you are referring to is the current crop of Transformers, which will keep running for a good while. However, the bubble will be made larger once Karl Friston’s group starts releasing and low-power Transformer-optimised optical chips start production next year. I can’t see an end to it, as most current issues are in the process of being solved, including smaller, faster, less hallucinatory low-compute models. We’re only just hitting the multimodal on-ramp, and that journey has far to go.
All technology will have a natural language interface within 10 years. We will have embodied reasoning machines at most a decade later. Your perspective is skewed by your business context, perhaps.
Divide those timescales by 4 and you are on the mark.
I come from a chemistry standpoint. Sometimes the dataset is so small that it doesn’t matter how big a model you throw at it; it’s not gonna work. Instead you need to get creative, employing techniques such as active learning or delta-ML. I think that’s still very much in its infancy as a field. I’m still an undergrad tho, so I might be wrong.
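For anyone curious, active learning in its simplest form is an uncertainty-sampling loop; a minimal sketch (synthetic data and model choice are my own illustrative assumptions; a real chemistry workflow would query an experiment or a high-level calculation instead of a stored label):

```python
# Minimal active-learning sketch (uncertainty sampling) for small datasets.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
labeled = list(range(10))               # start with only 10 labeled points
pool = [i for i in range(len(X)) if i not in labeled]

model = RandomForestClassifier(random_state=0)
for _ in range(20):                     # label 20 more points, one at a time
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Query the pool point the model is least certain about.
    uncertainty = 1 - proba.max(axis=1)
    pick = pool[int(np.argmax(uncertainty))]
    labeled.append(pick)                # in practice: run the experiment / ask the oracle
    pool.remove(pick)

print(model.score(X[pool], y[pool]))    # accuracy on the remaining pool
```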
The bubble is already here. Since the end of the pandemic, huge layoffs have happened in ML departments, and the market is flooded with job-seekers right now.
My friend’s startup - a small no-name ML startup that pays 50k Canadian dollars a year - recently posted a job offer and received more than 1,000 applications in 2 days.
Well, I guess it’s not only laid-off people; everyone and their dog wants into the field. We also had one ML job listed and got tons of CVs (not thousands, more like 200, even though we’re a public company, remote-first, and the role pays more in the 150-200k US$ range).
There was really everything there, from carpenters who studied physics 10 years ago, to accountants, to a guy with 20 YoE on everything from spaceships to submarines.
But yeah, definitely much more qualified folks than for developer roles. Quite a few I would have hired if I could have.
But surely not 1000 PhDs?
Actually, he told me they got many PhDs, but from unrelated fields like biology and so on.
Why are ML departments laying people off?
I am not sure; I work in academia and we didn’t have this problem.
I suppose it is the backlash from the hype that started in 2015, when random companies started hiring data science teams and later realized they didn’t need them. Most notably, the big tech companies fired many ML people at the end of the pandemic, which flooded the market with highly skilled job-seekers and has made it hard for newcomers up to this day, as far as I have been told.
Join a startup to work on these things. You’ll very quickly realize why people are still pursuing PhDs in the field.
Could you explain why people keep pursuing PhDs? Is it because this is a broad field where a lot can still be discovered, or are there other reasons?
Nationalized infrastructure built by megacorp contractors is my predicted future for AI research in the 2030s.
I don’t doubt there are multi-trillion parameter multi-modal models being dreamed up right now by the US DoD and OpenAI to run psyops online, detect and recruit agents, push against foreign propaganda, etc…
And that’s okay. I’d rather stable organizations held the reins than some l33Tcode bro (and in the US’s case, the org is controlled by a democratically elected person).
Are you one of these so-called psyop models?
They forgot to append “Imagine you are a decent human being” to their prompt.