The absolute basic mathematics required to understand basic ML/DL is calculus, linear algebra, probability, and some convex optimisation. We are all aware of that.
But ML and DL have become a vast field, both in breadth and depth. A single person can’t understand the field entirely. There are specialisations, sub-specialisations, and more besides.
If you work in a branch of ML/DL research where other math fundamentals are needed to understand research papers and do innovative research, could you mention your field of work and the math fundamentals required to gain entry into it?
How about nonlinear optimization? It’s the best way to go if you want convergence guarantees for your algorithms. Concepts like duality, conjugate functions, convexity, and dealing with constraints are all over the place. Bertsekas is a good reference.
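In case those terms are unfamiliar, here is a minimal sketch of the two central definitions (standard textbook forms, not tied to any one reference):

```latex
% Convex (Fenchel) conjugate of f:
f^{*}(y) = \sup_{x}\; \big( \langle y, x \rangle - f(x) \big)

% Lagrangian dual of  \min f(x) \text{ s.t. } g(x) \le 0:
d(\lambda) = \inf_{x}\; \big( f(x) + \lambda^{\top} g(x) \big),
\qquad \lambda \ge 0

% Weak duality: \sup_{\lambda \ge 0} d(\lambda) \le \inf_{g(x) \le 0} f(x).
```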
Measure theory, differential geometry and optimal transport are great fields if you are going for theoretical ML.
How exactly is differential geometry applicable in ML?
Maybe if you could be more specific…
Oh, almost forgot: this kind of question should be posted on r/learnmachinelearning, r/mlquestions, or the sticky thread.
Geometric deep learning is a relatively small but growing field heavily based on group theory and representation theory. My own research on the subject was quite foundational/general and also required differential geometry, gauge theory, harmonic analysis, and functional analysis. Everything centered around equivariance: building problem-dependent local/global symmetries into the network architecture in order to make use of weight sharing and reduce the amount of data the network needs to learn.
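As a toy illustration of the core idea, here is a minimal check (my own sketch, not from any particular paper) that an ordinary 1D convolution layer is translation-equivariant; the group/gauge-equivariant architectures generalise exactly this property to richer symmetry groups:

```python
import numpy as np

def conv1d_circular(x, w):
    """Circular cross-correlation (the usual 'conv layer' convention):
    the same filter weights are applied at every position."""
    n = len(x)
    return np.array([sum(w[k] * x[(i + k) % n] for k in range(len(w)))
                     for i in range(n)])

x = np.random.randn(8)   # input signal
w = np.random.randn(3)   # shared filter weights
shift = 2

# Translation equivariance: shifting the input and then convolving
# gives the same result as convolving and then shifting the output.
lhs = conv1d_circular(np.roll(x, shift), w)
rhs = np.roll(conv1d_circular(x, w), shift)
assert np.allclose(lhs, rhs)
```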
I’m working on a problem in this area now. I’m currently reading the paper Equivariant Neural Rendering, but it doesn’t seem very sophisticated. Can you recommend any better geometric approaches to the novel view synthesis problem? Over the past few days I have been reading a lot by Hinton about how CNNs are bad at geometry, but his own preferred solution, Capsule Networks, doesn’t seem to scale very well.
I’m an undergraduate, and I was first introduced to this field two years ago through a blog post on gauge-equivariant CNNs. I was working as a software engineer at the time, but the elegance of it all made me go back to college. Do you have any recommendations for projects at the undergrad level, or people/programs to reach out to? (I have a thesis class next semester and I’d really love to do it on GDL.)
Measurement theory, causal inference, Bayesian stats, the whole megillah.
Density estimation in general: optimal transport for pathwise gradients, and virtually everything related to maximum likelihood estimation.
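For context, the objective all of this revolves around (the standard MLE definition):

```latex
\hat{\theta} = \arg\max_{\theta} \frac{1}{n} \sum_{i=1}^{n} \log p_{\theta}(x_i)
% As n grows, this is equivalent to minimizing
% \mathrm{KL}(p_{\text{data}} \,\|\, p_{\theta}).
```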
Signals and systems and differential equations for CV and audio
I do scientific machine learning, with a particular focus on numerical methods and computational biology.
The other big piece of fundamental mathematics needed is differential equations: ODEs at least, but ideally also SDEs, PDEs, and numerics. (Soapbox moment: I find it kind of unusual how poorly this is taught outside of math/engineering courses, given that it’s easily the single most successful modelling paradigm of the past few centuries.)
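If you have never seen a numerical ODE solver, the core idea fits in a few lines. A minimal explicit-Euler sketch (illustration only; in practice you would use a proper solver such as scipy.integrate.solve_ivp):

```python
import numpy as np

def euler(f, y0, t0, t1, n_steps):
    """Explicit Euler: y_{k+1} = y_k + h * f(t_k, y_k)."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = y + h * f(t, y)
        t = t + h
    return y

# dy/dt = -y with y(0) = 1 has exact solution y(t) = exp(-t).
approx = euler(lambda t, y: -y, 1.0, 0.0, 1.0, 1000)
print(approx, np.exp(-1.0))  # ~0.3677 vs ~0.3679
```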
Just Know Stuff also has a short list of things I consider worth knowing.
Ok this is… a lot of stuff. Understand probability through measure theory?
Sounds freaking fun to me!
Try not to get too focused on knowing everything on such lists, but do skim whatever seems feasible; it’s nice to have a toolbox in your head.
Most students who get hired still don’t know much about proper code architecture, design patterns, or decoupling, all of which are essential for proper development, but they get to learn on the job. Having been an ML engineer for a couple of years, I still haven’t picked up a lot of statistics, or sometimes even architectures, because they have never been relevant to our use cases.
At most companies I have worked at or interviewed with, you are expected to learn, not to know. Essentially you need to be above a baseline for the job, but you should be able to substantiate why you are a good learner who can pick up anything. One caveat: if you limit your search to the biggest companies with unlimited applicant pools, the baseline for minimum requirements will definitely rise, and arbitrary filters will be set up just to thin out the masses and interview only the most notable outliers.
What about automatic differentiation too!
This isn’t really a fundamental piece of mathematics; it’s just an algorithm built on the chain rule.
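To make that concrete: forward-mode autodiff is literally the chain rule applied one operation at a time. A toy dual-number sketch (illustrative only; real systems are engineered very differently):

```python
from dataclasses import dataclass
import math

@dataclass
class Dual:
    """A dual number (value, derivative); arithmetic on it carries the chain rule."""
    val: float
    dot: float  # derivative with respect to the input variable

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def sin(x):
    # chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x * sin(x)] at x = 2, seeding dx/dx = 1:
x = Dual(2.0, 1.0)
print((x * sin(x)).dot)  # sin(2) + 2*cos(2) ≈ 0.0770
```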
Information theory is another field of math/ECE that is heavily used in theoretical ML.
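The most familiar example in practice: the standard classification loss is cross-entropy (standard definitions):

```latex
H(p) = -\sum_{x} p(x) \log p(x), \qquad
H(p, q) = -\sum_{x} p(x) \log q(x) = H(p) + \mathrm{KL}(p \,\|\, q)
% So minimizing cross-entropy in the model q = p_\theta is the same as
% minimizing \mathrm{KL}(p \,\|\, p_\theta).
```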
A little bit of functional analysis wouldn’t hurt. Knowing your Fourier/Laplace operators can help clarify some of the logic underlying CNNs and Neural ODEs.
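For example, the convolution theorem is a big part of that connection. A quick numpy sanity check (circular convolution, to match the DFT):

```python
import numpy as np

n = 16
x = np.random.randn(n)
w = np.random.randn(n)

# Circular convolution computed directly...
direct = np.array([sum(x[k] * w[(i - k) % n] for k in range(n))
                   for i in range(n)])

# ...equals pointwise multiplication in the Fourier domain.
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(w)).real
assert np.allclose(direct, via_fft)
```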
Group theory seems to be ramping up with the invariance/equivariance properties of some specialized neural networks (e.g. rotation-equivariant nets, gauge-equivariant nets).
Causal inference is another emerging hot topic, but it’s a bit scatterbrained at the moment, since there are multiple competing schools of thought surrounding it.
Tbh, nothing outside of some basic knowledge of what y = f(x) looks like. It’s really more about coding ability and an understanding of the machine. This is coming from someone with degrees in both math and physics. I wish my mathematical skills were more relevant than my coding skills, but they simply aren’t. Just know how to optimize code and you’re good.