I’m a data engineer who somehow ended up as a software developer. So many of my friends are now working with the OpenAI API to add generative capabilities to their products, but they lack A LOT of context when it comes to how LLMs actually work.

This is why I started writing popular-science style articles that unpack AI concepts for software developers working on real-world applications. It started kind of slow; honestly, I wrote a bit too “brainy” for them, but now I’ve found a voice that resonates with this audience much better, and I want to ramp up my writing cadence.

I would love to hear your thoughts about what concepts I should write about next.
What gets you excited that you find hard to explain to someone with a different background?

  • rejectedlesbian@alien.topB
    11 months ago

    Optimizers. OMG, no one has touched optimizers for decades.
    We basically figure it's Adam/SGD, and there hasn't really been any improvement on it.

    I tried finding an improvement to it myself for a few months but failed miserably.

    • charlesGodman@alien.topB
      11 months ago

      There has been LOADS of research on deep learning optimisation in recent years. However, TL;DR: nothing beats Adam.

    • currentscurrents@alien.topB
      11 months ago

      Learned optimizers look promising - training a neural network to train neural networks.

      Unfortunately they’re hard to train and nobody has gotten them to really work yet. The two main approaches are meta-training or reinforcement learning, but meta-training is very expensive and RL has all the usual pitfalls of RL.

    • satireplusplus@alien.topB
      11 months ago

      Because it’s super hard to build something that works better than Adam across many tasks. There’s probably no shortage of people trying to come up with something better.

    • koolaidman123@alien.topB
      11 months ago

      Nothing beats AdamW + compute. Plus, with the current data-centric approach, everything kinda converges at scale.
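
For anyone following the thread who hasn't seen these optimizers written out, here is a minimal sketch in plain Python of one SGD step and one Adam/AdamW step. Nobody in the thread posted this code. The function names (`sgd_step`, `adamw_step`, `toy_grad`, `minimize`), the toy quadratic, and the hyperparameter values are all made up for illustration. Real frameworks add many details this leaves out.

```python
import math

def sgd_step(w, g, lr=0.05):
    """Plain SGD: move each weight against its gradient."""
    return [wi - lr * gi for wi, gi in zip(w, g)]

def adamw_step(w, g, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.0):
    """One Adam step; with weight_decay > 0 it becomes AdamW-style
    (decay is applied to the weights directly, not mixed into the gradient)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Bias correction: the averages start at zero, so rescale early steps (t >= 1).
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Per-parameter scaled step plus decoupled weight decay.
    w = [wi - lr * (mh / (math.sqrt(vh) + eps) + weight_decay * wi)
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v

def toy_grad(w):
    """Gradient of the hypothetical loss (w0 - 3)^2 + 10 * (w1 + 1)^2."""
    return [2 * (w[0] - 3), 20 * (w[1] + 1)]

def minimize(optimizer, steps=500):
    """Run either optimizer on the toy loss, starting from [0, 0]."""
    w = [0.0, 0.0]
    m, v = [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        g = toy_grad(w)
        if optimizer == "sgd":
            w = sgd_step(w, g)
        else:
            w, m, v = adamw_step(w, g, m, v, t)
    return w

if __name__ == "__main__":
    print("sgd :", minimize("sgd"))
    print("adam:", minimize("adam"))
```

Both runs should end up close to the minimum at `[3, -1]`. The thing to notice is that Adam's first step has roughly size `lr` no matter how large the gradient is, which is part of why it needs less tuning across tasks than plain SGD.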