New APUs close to GPU processing, but with unlimited memory?

bkm_s@alien.top · 11 months ago

New APUs close to GPU processing, but with unlimited memory?

ccbadd@alien.top · 11 months ago

If AMD would put out an APU with 3D V-Cache and quad-channel memory that lets you use all four slots at full speed (6000 MT/s or better), and not cripple it in the BIOS, they could be kicking Apple's tail.

he29@alien.top · 11 months ago

I’m not sure if 3D cache would help in this case, since there isn’t a particular small part of the model that could be reused over and over: you have to read _all_ the weights when inferring the next word, right?
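To put rough numbers on that: if every weight has to be read once per generated token, memory bandwidth sets a ceiling on tokens per second. Here is a minimal back-of-the-envelope sketch. The function names are made up for illustration, the 8-byte (64-bit) channel width is the conventional figure for a DDR5 channel, and the 40 GB model size is just an assumed example, not a measurement.

```python
def peak_bandwidth_gbs(channels: int, mts: float, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    channels  : number of 64-bit memory channels (assumption: 8 bytes each)
    mts       : transfer rate in MT/s, e.g. 6000 for DDR5-6000
    bus_bytes : bytes moved per transfer per channel
    """
    return channels * mts * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    """Upper bound on tokens/s if all weights are streamed once per token."""
    return bandwidth_gbs / model_gb


if __name__ == "__main__":
    model_gb = 40.0  # assumed size of a quantized model, for illustration only
    for channels in (2, 4):
        bw = peak_bandwidth_gbs(channels, 6000)
        print(f"{channels} channels: {bw:.0f} GB/s peak, "
              f"at most {max_tokens_per_s(bw, model_gb):.1f} tokens/s")
```

These are theoretical peaks, so real throughput will be lower, but it shows why doubling the channel count matters far more here than adding cache.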

But I’m definitely looking forward to the 8000 series, since AM5 boards should get even cheaper by the time it comes out, and support for faster DDR5 should get better as well. And I really need to move on from my 10-year-old Xeon haha…

ccbadd@alien.top · 11 months ago

I didn’t think 3D V-Cache would help either, until the article that came out a few days ago about getting 10x the performance from a ramdrive. If it works for ramdrives, then surely we can figure out a way to use that performance for inferencing.

FlishFlashman@alien.top · 11 months ago

It’s not going to help because the model data is much larger than the cache and the access pattern is basically long sequential reads.
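A toy simulation shows the effect. This is only a sketch under simplifying assumptions: it models the cache as pure LRU (real CPU caches use other replacement policies and prefetching), treats the model as numbered blocks read in order, and counts one full pass per generated token. The function name and block counts are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(num_blocks: int, cache_blocks: int, passes: int) -> float:
    """Hit rate of an LRU cache under repeated sequential sweeps.

    num_blocks   : size of the data set (the model weights), in blocks
    cache_blocks : capacity of the cache, in blocks
    passes       : number of full sequential sweeps (one per token)
    """
    cache = OrderedDict()  # keys are block ids, ordered oldest to newest
    hits = 0
    accesses = 0
    for _ in range(passes):
        for block in range(num_blocks):
            accesses += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / accesses


if __name__ == "__main__":
    # Data set 10x larger than the cache: every block is evicted before reuse.
    print(lru_hit_rate(num_blocks=1000, cache_blocks=100, passes=5))
    # Data set fits in the cache: only the first pass misses.
    print(lru_hit_rate(num_blocks=1000, cache_blocks=1000, passes=5))
```

In this model, a block is always evicted before the next sweep comes back around to it, so the cache contributes nothing once the data set is larger than the cache.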

rarted_tarp@alien.top · 11 months ago

It might help for LLMs since a lot of values are cached after each loop, but still highly unlikely to make a difference.