I'm using an A100 PCIe 80GB with the CUDA 11.8 toolkit and the 525.x driver.

But when I run inference on CodeLlama 13B with oobabooga (web UI),

I only get 5 tokens/s.

It is so slow.

Is there any config or something else I need to set for the A100?

  • hudimudi@alien.top
    1 year ago

    Uhmmm, where did you buy that A100? Was it a good deal? lol. Just kidding. You probably set something up wrong, or the drivers are acting up. Does the card otherwise work fine in benchmarks?
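
Before running a full benchmark, a quick first check is to ask the driver what state the card is in. The script below is only a rough sketch: it assumes `nvidia-smi` is on the PATH, and the helper names (`parse_gpu_csv`, `query_gpus`) are made up for this example. It prints the PCIe link generation and width, the SM clocks, utilization, and memory use. Run it while the web UI is generating text: if GPU utilization and memory use stay near zero, the model is probably not running on the GPU at all.

```python
import csv
import io
import subprocess

# Query fields supported by `nvidia-smi --query-gpu` (see `nvidia-smi --help-query-gpu`).
FIELDS = [
    "name",
    "driver_version",
    "pcie.link.gen.current",
    "pcie.link.width.current",
    "clocks.sm",
    "clocks.max.sm",
    "utilization.gpu",
    "memory.used",
    "memory.total",
]


def parse_gpu_csv(text):
    """Turn headerless nvidia-smi CSV output into a list of dicts, one per GPU."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row:  # skip blank lines
            rows.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return rows


def query_gpus():
    """Run nvidia-smi once and return the parsed per-GPU values."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            for key, value in gpu.items():
                print(f"{key:>26}: {value}")
            print()
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        # nvidia-smi is missing or the driver did not respond.
        print(f"Could not query the GPU: {exc}")
```

Note that the PCIe link generation and SM clock can drop when the card is idle, so the numbers that matter are the ones taken under load.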