I’m a data engineer who somehow ended up as a software developer. Many of my friends are now working with the OpenAI API to add generative capabilities to their products, but they lack A LOT of context when it comes to how LLMs actually work.
This is why I started writing popular-science-style articles that unpack AI concepts for software developers working on real-world applications. It started kind of slow; honestly, my early pieces were a bit too “brainy” for them. But now I’ve found a voice that resonates with this audience much better, and I want to ramp up my writing cadence.
I would love to hear your thoughts on what concepts I should write about next.
What gets you excited that you find hard to explain to someone with a different background?
Why does gradient descent have good inductive biases? Do the inductive biases of non-gradient-based optimizers differ?
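A concrete toy might help here. Below is a pure-Python sketch (my own illustration, not from any particular paper) of gradient descent’s implicit bias on an underdetermined problem: started from the origin, GD converges to the minimum-norm solution, while a different optimizer (exact coordinate descent) lands on a different global minimum.

```python
# Underdetermined problem: minimize (w1 + w2 - 2)^2.
# Every point on the line w1 + w2 = 2 is a global minimum.

def grad(w):
    # gradient of (w1 + w2 - 2)^2 is 2*(w1 + w2 - 2) in each coordinate
    g = 2.0 * (w[0] + w[1] - 2.0)
    return [g, g]

# Gradient descent from the origin: the update direction always lies
# along (1, 1), so GD stays on the line w1 = w2.
w = [0.0, 0.0]
for _ in range(1000):
    g = grad(w)
    w = [w[0] - 0.1 * g[0], w[1] - 0.1 * g[1]]
print(w)  # -> approximately [1.0, 1.0], the minimum-norm solution

# Exact coordinate descent from the origin: minimizing over w1 alone
# (with w2 fixed at 0) already reaches zero loss.
v = [0.0, 0.0]
v[0] = 2.0 - v[1]
print(v)  # -> [2.0, 0.0], a different global minimum with larger norm
```

Same loss, same starting point, two different optimizers, two different solutions: that gap is exactly what “inductive bias of the optimizer” means.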
Causality
I feel like I’m seeing a lot on causality these days, for example from Schölkopf’s lab.
it’s somewhere in the neurons. would cost the company a lot to get to it though. best not to worry about it /s
learning dynamics and geometry. this definitely gets some attention, but almost always in the context of scaling. it’s a pretty interesting topic in its own right.
C*-algebras and their influence on the topology of a neural network
bahaha, if operator algebras actually made their way into deep learning, that would be awesome! I say that as an ex-operator algebraist :P
Some areas that I hope to make time to explore someday, which are relatively obscure (to my knowledge): text-to-knowledge-graph, cross-domain input generalization, and schematic synthesis from 3D models/point clouds.
VC Dimension
Kolmogorov Complexity
Bayesian optimization for hyperparameter search.
CV applications for Home, life, and IoT.
Manifold learning! It seems so cool, but every time I dig into it, I feel like I need a PhD in math to understand the theory.
Yes! I think/hope ultimately this will unlock AGI.
State space models and their derivatives.
They have demonstrated better performance than Transformers on very long sequences, with linear rather than quadratic computational cost, and on paper they also generalize better to non-NLP tasks.
However, training them is more difficult, so in practice they perform worse outside of those few very-long-sequence tasks. But with a bit more development, they could become the most impactful AI technology in years.
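For readers who haven’t met them: at their core, state space models are just a linear recurrence. This scalar pure-Python sketch (a toy with made-up parameter values, nothing like a trained S4/Mamba layer) shows why one pass over a length-L sequence costs O(L), versus the O(L²) pairwise attention of a Transformer:

```python
# Minimal discrete linear state space model (scalar state for clarity):
#   x[t+1] = a * x[t] + b * u[t]   (state update)
#   y[t]   = c * x[t]              (readout)

def ssm_scan(u, a=0.9, b=0.5, c=1.0, x0=0.0):
    # One sequential pass: O(L) time and O(1) memory for the state.
    x, ys = x0, []
    for ut in u:
        ys.append(c * x)
        x = a * x + b * ut
    return ys

ys = ssm_scan([1.0, 0.0, 0.0, 0.0])
# impulse response: [0.0, 0.5, 0.45, 0.405], i.e. b*c*a^(t-1)
# decaying geometrically -- the state carries the input forward in time
```

Real SSM layers learn the (matrix-valued) a, b, c and add clever parameterizations to keep long-range memory stable, but the linear-cost scan above is the reason for the scaling advantage.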
Do you have anything in particular you think is worth sharing? I'm trying to implement model predictive control in MATLAB and I'm working on an LSTM surrogate model. Just last week I found MATLAB tools for neural state space models, and I've been wondering if I just uncovered a big blind spot of mine.
It sounds like you've come across exactly what I meant.
I have a couple of papers on the topic if you’re interested in those. There’s also a PyTorch implementation of a neural state space model by the authors of the original paper: https://github.com/HazyResearch/state-spaces
Gonna toot my own research direction: artificial intelligence x complex systems. I’m talking differentiable self-organization (e.g., neural cellular automata), interacting particle systems (e.g., particle Lenia), and other neural dynamical systems where emergent behaviour and self-organization are key characteristics.
Other than Alex Mordvintsev and his co-authors, Sebastian Risi and his co-authors, and I suppose David Ha with his new company, I don’t see much work in this intersection of fields.
I think there’s a lot to unlock here, particularly when the task at hand benefits greatly from a decentralized and/or compute-adaptive approach with robustness requirements. Swarm Learning comes to mind. Or generative modelling with/of complex systems, like decentralized flow (or Schrödinger) matching for modelling interacting particle systems (e.g., fluids, gases, pedestrian traffic).
Could you please point to a few recent research papers in this area?
Last year, when I heard from one of my friends (a medical data researcher at Harvard) that he and his colleagues were doing research on federated learning, I knew this topic was going to be trending for the next few years.
I think this is the most important topic in this thread so far.
Why is that? You’ve got me curious
Morality training environments: design game-theory environments so that multiple RL agents end up with a strong bias toward cooperation.
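The classic toy version of this is the iterated prisoner’s dilemma. Here’s a pure-Python sketch (standard payoff values; the strategies are hand-coded for illustration, not learned by RL agents):

```python
# Iterated prisoner's dilemma payoffs, (row player, column player):
#   both cooperate -> (3, 3), both defect -> (1, 1),
#   defect vs cooperate -> (5, 0), cooperate vs defect -> (0, 5).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strat_a, strat_b, rounds=100):
    # Each strategy sees only the opponent's move history.
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        ra, rb = PAYOFF[(a, b)]
        score_a += ra
        score_b += rb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

def tit_for_tat(opp_history):
    # cooperate first, then copy the opponent's last move
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

print(play(tit_for_tat, tit_for_tat))    # -> (300, 300): cooperation holds
print(play(tit_for_tat, always_defect))  # -> (99, 104): defection locks in
```

The environment-design question is then: what payoff structures, population dynamics, or repeated-interaction setups make cooperative policies the ones RL training actually converges to?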
AI explainability and why it’s garbage
I’m particularly excited about AI accelerating theorem provers and optimization problems (think: traveling salesman). These problems are NP-hard and scale very poorly. We would see huge efficiency gains in most industries if they scaled better. Recently there has been some very exciting research in using neural networks to accelerate and scale MILP and LP solvers.
For reference, optimization problems include:
- SpaceX rocket landing
- Car navigation systems
- Electric grid operations/markets
- Portfolio optimization
- Stock and options trading
- Airline fleet operations
- Ship/Truck logistics
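To make the scaling point concrete, here’s a pure-Python toy contrasting brute-force TSP with a cheap heuristic (my own illustration; the neural-accelerated MILP/LP work is far more sophisticated than this):

```python
import itertools
import math

def tour_length(order, pts):
    # total length of the closed tour visiting pts in the given order
    return sum(math.dist(pts[order[i]], pts[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def exact_tsp(pts):
    # Brute force: checks (n-1)! tours -- hopeless beyond ~12 cities.
    n = len(pts)
    best = min(itertools.permutations(range(1, n)),
               key=lambda p: tour_length((0,) + p, pts))
    return (0,) + best

def greedy_tsp(pts):
    # Nearest-neighbour heuristic: O(n^2), but no optimality guarantee.
    unvisited = set(range(1, len(pts)))
    tour = [0]
    while unvisited:
        nxt = min(unvisited, key=lambda j: math.dist(pts[tour[-1]], pts[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

pts = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 2)]
best = exact_tsp(pts)
quick = greedy_tsp(pts)
# the heuristic's tour can never beat the brute-force optimum
```

The gap between the exponential exact solver and the fast-but-suboptimal heuristic is exactly the niche the learned-solver research targets: keep near-heuristic speed while closing the optimality gap.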
Form parsing. Hugely important topic and the only approach I know about is Microsoft’s LayoutLM model.
Probabilistic programming is also an interesting topic (used for Bayesian Statistics/Probabilistic ML/Probabilistic Graphical Models/Cognitive AI/etc.)
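For anyone wondering what that means in practice: a probabilistic program specifies a prior and a likelihood, and the runtime computes the posterior for you. Here’s the core idea by hand in pure Python, using a grid approximation for a coin-bias model (a toy; PPLs like PyMC or Stan handle far richer models, typically with MCMC rather than a grid):

```python
# Posterior over a coin's bias p after observing 7 heads in 10 flips,
# approximated on a grid of candidate values for p.

def posterior_grid(heads, flips, n=1001):
    grid = [i / (n - 1) for i in range(n)]
    prior = [1.0] * n  # flat prior on [0, 1], i.e. Beta(1, 1)
    likelihood = [p ** heads * (1 - p) ** (flips - heads) for p in grid]
    unnorm = [pr * li for pr, li in zip(prior, likelihood)]
    z = sum(unnorm)  # normalizing constant
    return grid, [u / z for u in unnorm]

grid, post = posterior_grid(heads=7, flips=10)
mean = sum(p * w for p, w in zip(grid, post))
# posterior mean is close to (7+1)/(10+2) = 2/3, the exact Beta(8, 4) mean
```

Swapping in a hierarchical model or a graphical model only changes the model specification; the inference machinery stays the same, which is the whole appeal of probabilistic programming.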