Google released T5X checkpoints for MADLAD-400 a couple of months ago, but nobody could figure out how to run them. Turns out the vocabulary was wrong, but they uploaded the correct one last week.

I’ve converted the models to the safetensors format, and I created this space if you want to try the smaller model.
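
For reference, here is a minimal sketch of running the converted safetensors weights with Hugging Face transformers. The repo id "jbochi/madlad400-3b-mt" and the "<2xx>" target-language prefix follow the MADLAD-400 model card, but treat them as assumptions and swap in whichever checkpoint you actually use:

```python
# Minimal sketch: load a converted MADLAD-400 checkpoint and translate one sentence.
# "jbochi/madlad400-3b-mt" is an assumed repo id for the smaller model.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_id = "jbochi/madlad400-3b-mt"
tokenizer = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# MADLAD-400 selects the target language with a "<2xx>" token prepended to the source text.
inputs = tokenizer("<2pt> I love pizza!", return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```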

I also published quantized GGUF weights you can use with candle. It decodes at ~15 tokens/s on an M2 Mac.

It seems that NLLB is the most popular machine translation model right now, but its license only allows non-commercial usage. MADLAD-400 is CC BY 4.0.

  • lowkeyintensity@alien.top · 1 year ago

    Meta’s NLLB is supposed to be the best translation model, right? But it’s for non-commercial use only. How does MADLAD compare to NLLB?

    • HaruSosake@alien.top · 1 year ago

      NLLB has horrible performance. I’ve done extensive testing with it and wouldn’t even translate a children’s book with it. Google Translate does a much better job, and that’s saying something. lol

    • jbochi@alien.top (OP) · 1 year ago

      The MADLAD-400 paper has a bunch of comparisons with NLLB. MADLAD beats NLLB on some benchmarks, is quite close on others, and loses on a few. But the largest MADLAD is 5x smaller than the original NLLB, and it supports more than 2x as many languages.