ChatGPT 4 is estimated to use 700GB of “High Bandwidth Memory”.
… which will set you back about half a million dollars at current prices (which are high, because the manufacturers can’t keep up with demand). Or, you could just pay 20 bucks a month.
Doesn’t that mean RAM?
To put some numbers on it - RAM runs at tens of gigabytes per second (bytes, not bits). High Bandwidth Memory runs at several hundred gigabytes per second, or sometimes terabytes per second. OpenAI is likely using the latter, and that memory isn’t just expensive, it’s also supply constrained, so prices are astronomically high right now.
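As a rough back-of-the-envelope sketch of why that difference matters: the time to read the whole model from memory once is just size divided by bandwidth. The 700GB figure is the estimate from this thread; the three bandwidth values are illustrative picks from the ranges above, not measurements of any real system.

```python
# Rough time to stream a model's weights from memory once.
# All bandwidth figures are illustrative assumptions, not measurements.

MODEL_SIZE_GB = 700  # estimate quoted in this thread

BANDWIDTH_GB_PER_S = {
    "ordinary RAM (assumed 50 GB/s)": 50,
    "HBM, lower end (assumed 500 GB/s)": 500,
    "HBM, higher end (assumed 2000 GB/s)": 2000,
}


def seconds_to_read(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to read size_gb once at a sustained bandwidth."""
    return size_gb / bandwidth_gb_per_s


for name, bandwidth in BANDWIDTH_GB_PER_S.items():
    seconds = seconds_to_read(MODEL_SIZE_GB, bandwidth)
    print(f"{name}: {seconds:.2f} s per full pass")
```

This ignores caching, batching and everything else a real deployment does, so treat it as an order-of-magnitude comparison only.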
You can buy HBM, and you can use it as your main system RAM, but it’s painfully expensive. The available bandwidth also scales roughly linearly with the amount of memory you buy. So 500GB is about 10x faster than 50GB, because it writes to all of the chips simultaneously (and then reads from all of them when you access the data back).
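A minimal sketch of that scaling, assuming every chip adds the same capacity and the same bandwidth and that reads and writes are striped evenly across all of them. The per-chip numbers here are hypothetical, chosen only to make the arithmetic easy to follow; real parts are also limited by the memory controller and bus width.

```python
def aggregate_bandwidth(total_gb: int, gb_per_chip: int, gb_per_s_per_chip: int) -> int:
    """Total bandwidth when data is striped across every chip in parallel."""
    chips = total_gb // gb_per_chip
    return chips * gb_per_s_per_chip


# Hypothetical chip: 50 GB capacity, 100 GB/s bandwidth.
small = aggregate_bandwidth(total_gb=50, gb_per_chip=50, gb_per_s_per_chip=100)
large = aggregate_bandwidth(total_gb=500, gb_per_chip=50, gb_per_s_per_chip=100)

print(small, large, large // small)
```

Ten times the chips gives ten times the bandwidth under these assumptions, which is the 500GB vs 50GB comparison above.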
It’s pretty standard on high-end GPUs these days. Apple takes a similar approach on all their computers (if you buy a Mac with 64GB of RAM, it can run at up to 800GB/s, which isn’t quite as fast as a high-end GPU but it’s close; strictly speaking it’s very wide on-package LPDDR rather than HBM). It’s part of why Macs are so expensive (and also why the cheaper ones have very little RAM).
I highly doubt that. There are comparable models that are way smaller than that, and there’s no way they would waste that much money.
There are comparable models to GPT 3.5 “Turbo”, which is faster and 30x cheaper than GPT 4 (if you pay OpenAI’s regular API prices).
I suspect that’s because GPT-4 needs 30x more memory than 3.5.
I’m not aware of any other model that performs as well as GPT-4. In fact I suspect even 3.5 Turbo is the second best model.