I’m a data engineer who somehow ended up as a software developer. So many of my friends are now working with the OpenAI API to add generative capabilities to their products, but they lack A LOT of context when it comes to how LLMs actually work.

This is why I started writing popular-science style articles that unpack AI concepts for software developers working on real-world applications. It started kind of slow; honestly, I wrote a bit too “brainy” for them. But now I’ve found a voice that resonates with this audience much better, and I want to ramp up my writing cadence.

I would love to hear your thoughts on what concepts I should write about next.
What gets you excited, and what do you find hard to explain to someone with a different background?

  • bestgreatestsuper@alien.top

    Why does gradient descent have good inductive biases? Do the inductive biases of non-gradient-based optimizers differ?
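
    A minimal numpy sketch of one concrete inductive bias (toy problem, made-up numbers): on an underdetermined least-squares problem, gradient descent from a zero initialization converges to the minimum-L2-norm solution that fits the data.

      import numpy as np

      # Toy setup: 5 equations, 20 unknowns, so infinitely many exact solutions.
      rng = np.random.default_rng(0)
      A = rng.normal(size=(5, 20))
      b = rng.normal(size=5)

      w = np.zeros(20)  # the zero init matters for this result
      lr = 0.01
      for _ in range(20000):
          w -= lr * A.T @ (A @ w - b)  # gradient of 0.5 * ||A w - b||^2

      w_min_norm = np.linalg.pinv(A) @ b  # closed-form minimum-norm solution
      print(np.allclose(w, w_min_norm, atol=1e-4))  # True: GD picked min-norm

    A non-gradient optimizer, say a genetic algorithm, has no particular reason to land on that minimum-norm solution even when it fits the data equally well, which is one concrete way the inductive biases differ.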

    • RandomTensor@alien.top

      I feel like I’m seeing a lot on causality these days, for example from Schölkopf’s lab.

    • samrus@alien.top

      it’s somewhere in the neurons. would cost the company a lot to get to it though. best not to worry about it /s

  • DigThatData@alien.top

    learning dynamics and geometry. this definitely gets some attention, but almost always in the context of scaling. it’s a pretty interesting topic in its own right.
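
    One concrete handle on the geometry along a training trajectory is sharpness, the largest-magnitude Hessian eigenvalue. A rough numpy sketch (toy quartic loss, every name here is mine), estimating it by power iteration on finite-difference Hessian-vector products:

      import numpy as np

      # Hypothetical toy loss L(w) = sum(w**4 - w**2); its gradient:
      def loss_grad(w):
          return 4 * w**3 - 2 * w

      def sharpness(w, grad_fn, iters=50, eps=1e-4):
          """Largest-magnitude Hessian eigenvalue via power iteration, using
          finite-difference Hessian-vector products: Hv ~ (g(w+eps*v)-g(w))/eps."""
          v = np.random.default_rng(0).normal(size=w.shape)
          v /= np.linalg.norm(v)
          for _ in range(iters):
              hv = (grad_fn(w + eps * v) - grad_fn(w)) / eps
              v = hv / np.linalg.norm(hv)
          return v @ ((grad_fn(w + eps * v) - grad_fn(w)) / eps)  # Rayleigh quotient

      # Watch the local curvature change as plain gradient descent runs.
      w = np.full(10, 2.0)
      for step in range(101):
          w -= 0.01 * loss_grad(w)
          if step % 25 == 0:
              print(step, round(sharpness(w, loss_grad), 2))  # falls as w settles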

    • AbjectDrink3276@alien.top

      bahaha if operator algebras actually made their way into deep learning that would be awesome! I say that as an ex-operator algebraist :P

  • michelin_chalupa@alien.top

    Some areas that I hope to make time to explore someday, and that are relatively obscure (to my knowledge), are text-to-knowledge-graph extraction, cross-domain input generalization, and schematic synthesis from 3D models/point clouds.

  • -Django@alien.top

    Manifold learning! It seems so cool, but every time I dig into it, I feel like I need a PhD in math to understand the theory.
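
    To be fair, the practical side is a lot friendlier than the theory. A minimal sketch, assuming scikit-learn is available: Isomap unrolling the classic swiss roll, i.e. recovering the 2-D surface that a cloud of 3-D points actually lives on.

      import numpy as np
      from sklearn.datasets import make_swiss_roll
      from sklearn.manifold import Isomap

      # 3-D points that really live on a 2-D sheet rolled up in space.
      X, color = make_swiss_roll(n_samples=1500, random_state=0)

      # Isomap: build a k-NN graph, approximate geodesic (along-the-surface)
      # distances by shortest paths, then embed so Euclidean distances match.
      embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
      print(embedding.shape)  # (1500, 2): the roll, unrolled into 2-D coordinates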

  • General_Service_8209@alien.top

    State space models and their derivatives.

    They have demonstrated better performance than Transformers on very long sequences, with linear instead of quadratic computational cost, and on paper they also generalize better to non-NLP tasks.

    However, training them is more difficult, so in practice they perform worse outside of those few very-long-sequence tasks. But with a bit more development, they could become the most impactful AI technology in years.
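
    A minimal numpy sketch of the core recurrence behind S4-style models (toy parameters; real models use careful initializations like HiPPO), mainly to show where the linear cost comes from:

      import numpy as np

      def ssm_scan(A, B, C, u):
          """Discrete linear state-space recurrence over a length-T input:
              x[k] = A @ x[k-1] + B * u[k],   y[k] = C @ x[k]
          One fixed-size matrix-vector product per step => O(T) in sequence
          length, versus a Transformer's O(T^2) pairwise attention."""
          x = np.zeros(A.shape[0])
          ys = []
          for u_k in u:  # T steps, constant work each
              x = A @ x + B * u_k
              ys.append(C @ x)
          return np.array(ys)

      n = 8  # state size
      A = 0.95 * np.eye(n) + 0.05 * np.diag(np.ones(n - 1), 1)  # stable toy dynamics
      B = np.ones(n) / n
      C = np.ones(n)
      u = np.sin(np.linspace(0, 10, 1000))  # length-1000 input sequence
      print(ssm_scan(A, B, C, u).shape)     # (1000,)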

    • BEEIKLMRU@alien.top

      Do you have anything in particular you think is worth sharing? I’m trying to implement model predictive control in MATLAB, and I’m working on an LSTM surrogate model. Just last week I found MATLAB tools for neural state-space models, and I’ve been wondering if I just uncovered a big blind spot of mine.
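
      Not an answer to the MATLAB question, but here is a rough Python sketch of the overall loop (random-shooting MPC against a stand-in surrogate; every name and number is made up), in case it helps sanity-check the structure:

        import numpy as np

        # Hypothetical stand-in for a learned surrogate (e.g. an LSTM one-step
        # model): maps (state, action) -> next state. Here: damped toy dynamics.
        def surrogate_step(x, a):
            return 0.9 * x + 0.1 * a

        def mpc_action(x0, horizon=10, n_candidates=256, target=1.0,
                       rng=np.random.default_rng(0)):
            """Random-shooting MPC: sample candidate action sequences, roll each
            out through the surrogate, return the first action of the best one."""
            actions = rng.uniform(-1, 1, size=(n_candidates, horizon))
            costs = np.zeros(n_candidates)
            for i in range(n_candidates):
                x = x0
                for a in actions[i]:
                    x = surrogate_step(x, a)
                    costs[i] += (x - target) ** 2  # quadratic tracking cost
            return actions[np.argmin(costs), 0]    # receding horizon: apply step one

        x = 0.0
        for t in range(20):               # closed loop: replan from each new state
            a = mpc_action(x)
            x = surrogate_step(x, a)      # in practice this would be the real plant
        print(round(x, 3))                # state driven toward the target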

  • tesfaldet@alien.top

    Gonna toot my own research direction: artificial intelligence x complex systems. I’m talking differentiable self-organization (e.g., neural cellular automata), interacting particle systems (e.g., particle Lenia), and other neural dynamical systems where emergent behaviour and self-organization are key characteristics.

    Other than Alex Mordvintsev and his co-authors, Sebastian Risi and his co-authors, and I suppose David Ha with his new company, I don’t see much work in this intersection of fields.

    I think there’s a lot to unlock here, particularly if the task at hand benefits greatly from a decentralized and/or a compute-adaptive approach, with robustness requirements. Swarm Learning already comes to mind. Or generative modelling with/of complex systems, like decentralized flow (or Schrödinger) matching for modelling interacting particle systems (e.g., fluids, gases, pedestrian traffic).
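
    For anyone curious what an NCA update actually does, a rough numpy/scipy sketch of the mechanics from Growing Neural Cellular Automata (random untrained weights, so it won’t grow anything; it just shows the perceive / update / stochastic-mask loop):

      import numpy as np
      from scipy.signal import convolve2d

      rng = np.random.default_rng(0)
      CH, H, W = 8, 32, 32                      # channels per cell, grid size
      state = 0.1 * rng.normal(size=(CH, H, W))

      # Fixed perception filters (identity + Sobel x/y), as in the paper.
      sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) / 8.0
      ident = np.zeros((3, 3))
      ident[1, 1] = 1.0
      filters = [ident, sobel_x, sobel_x.T]

      # Per-cell update network (the paper zero-inits the last layer; small
      # random weights here so the demo visibly changes the state).
      W1 = 0.1 * rng.normal(size=(3 * CH, 32))
      W2 = 0.01 * rng.normal(size=(32, CH))

      def nca_step(state, fire_rate=0.5):
          # 1. Perceive: every channel convolved with every fixed filter.
          perc = np.concatenate([
              np.stack([convolve2d(c, f, mode="same", boundary="wrap") for c in state])
              for f in filters
          ])                                            # (3*CH, H, W)
          # 2. Same tiny MLP applied at every cell.
          h = np.maximum(perc.reshape(3 * CH, -1).T @ W1, 0)  # ReLU
          dx = (h @ W2).T.reshape(CH, H, W)
          # 3. Stochastic residual update: each cell fires independently.
          mask = rng.random((1, H, W)) < fire_rate
          return state + dx * mask

      state = nca_step(state)
      print(state.shape)  # (8, 32, 32)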

    • lunaticAKE@alien.top

      Last year, when I heard from one of my friends, a medical data researcher at Harvard, that he and his colleagues were doing research on federated learning, I knew this topic was going to be trendy for the next few years.
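
      The core of FedAvg fits in a few lines; a toy numpy sketch (linear model, made-up data): clients fit the model locally on data that never leaves them, and the server only ever sees weights.

        import numpy as np

        rng = np.random.default_rng(0)
        true_w = np.array([2.0, -3.0])

        # Private datasets stay on each client; only weights travel.
        clients = []
        for n in (50, 80, 120):
            X = rng.normal(size=(n, 2))
            y = X @ true_w + 0.1 * rng.normal(size=n)
            clients.append((X, y))

        def local_train(w, X, y, lr=0.05, epochs=10):
            for _ in range(epochs):
                w = w - lr * X.T @ (X @ w - y) / len(y)  # local gradient steps
            return w

        w_global = np.zeros(2)
        for _ in range(20):  # FedAvg round: broadcast, train locally, average
            local_ws = [local_train(w_global.copy(), X, y) for X, y in clients]
            sizes = [len(y) for _, y in clients]
            w_global = np.average(local_ws, axis=0, weights=sizes)
        print(w_global.round(2))  # close to [ 2. -3.]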

  • PrincessPiratePuppy@alien.top

    Morality training environments: designing game-theory environments so that multiple RL agents end up with a strong bias towards cooperation.
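
    The classic starting point here is the iterated prisoner’s dilemma. A minimal sketch of such an environment (textbook payoffs; the policies are hand-written rather than learned, just to show the scoring); the design question is then how to shape payoffs, repetition, and observability so that RL agents trained in it converge on cooperation:

      # Minimal iterated prisoner's dilemma. Actions: 0 = cooperate, 1 = defect.
      PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}

      def episode(policy1, policy2, steps=100):
          """Play one episode; each policy sees the opponent's previous action."""
          last1 = last2 = 0  # both start as if the other just cooperated
          total1 = total2 = 0
          for _ in range(steps):
              a1, a2 = policy1(last2), policy2(last1)
              r1, r2 = PAYOFF[(a1, a2)]
              total1, total2 = total1 + r1, total2 + r2
              last1, last2 = a1, a2
          return total1, total2

      tit_for_tat = lambda opp_last: opp_last
      always_defect = lambda opp_last: 1

      print(episode(tit_for_tat, tit_for_tat))    # (300, 300): cooperation pays
      print(episode(always_defect, tit_for_tat))  # (104, 99): one early win, then mutual 1s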

  • PhilsburyDoboy@alien.top

    I’m particularly excited about AI accelerating theorem provers and optimization problems (think: traveling salesman). These problems are NP-hard and scale very poorly. We would see huge efficiency gains in most industries if they scaled better. Recently there has been some very exciting research on using neural networks to accelerate and scale MILP and LP solvers.

    For reference, optimization problems include the following (a tiny worked example follows the list):

    • SpaceX rocket landing
    • Car navigation systems
    • Electric grid operations/markets
    • Portfolio optimization
    • Stock and options trading
    • Airline fleet operations
    • Ship/Truck logistics
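
    To make “optimization problem” concrete, here is a toy logistics LP, assuming scipy is available (all numbers made up); the research mentioned above is about making exactly this kind of solve faster and larger:

      import numpy as np
      from scipy.optimize import linprog

      # Ship goods from 2 depots to 3 stores at minimum cost.
      # cost[i*3 + j] is the per-unit cost from depot i to store j.
      cost = np.array([4, 6, 9, 5, 3, 8], dtype=float)

      # Supply: each depot ships at most its stock.
      A_ub = [[1, 1, 1, 0, 0, 0],   # depot 0 total <= 40
              [0, 0, 0, 1, 1, 1]]   # depot 1 total <= 50
      b_ub = [40, 50]

      # Demand: each store receives exactly what it needs.
      A_eq = [[1, 0, 0, 1, 0, 0],   # store 0 needs 20
              [0, 1, 0, 0, 1, 0],   # store 1 needs 30
              [0, 0, 1, 0, 0, 1]]   # store 2 needs 25
      b_eq = [20, 30, 25]

      res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                    bounds=(0, None), method="highs")
      print(res.x.reshape(2, 3), res.fun)  # shipment plan and its total cost
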
  • Exotic_Zucchini9311@alien.top

    Probabilistic programming is also an interesting topic (used for Bayesian statistics, probabilistic ML, probabilistic graphical models, cognitive AI, etc.).
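
    A minimal sketch of the core idea, assuming PyMC is available: you write the generative story as a program, and the runtime handles the inference.

      import numpy as np
      import pymc as pm

      # A coin with unknown bias, observed over 100 flips (toy data).
      flips = np.random.default_rng(0).binomial(1, 0.7, size=100)

      with pm.Model():
          theta = pm.Beta("theta", alpha=1, beta=1)      # prior: bias uniform on [0, 1]
          pm.Bernoulli("obs", p=theta, observed=flips)   # likelihood of the flips
          idata = pm.sample(1000, progressbar=False)     # MCMC does the inference

      # Full posterior over the bias, no inference code written by hand.
      print(float(idata.posterior["theta"].mean()))      # ~0.7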