I want to keep my options open, and potentially have a large context, which can add up to 100GB to memory requirements.
I’m considering 1x Genoa CPU with 12 memory channels. Something like the 9354 would have more than enough cores. I might start with a cheaper DDR4 machine first though.
How was it getting the EPYC machine set up? Are you using Windows? What about a GPU?
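As a rough sanity check on why the channel count matters, here is a minimal sketch of the theoretical peak memory bandwidth and the token rate it would allow at best. It assumes DDR5-4800 on all 12 channels for Genoa, and, purely as a guess at what a "cheaper DDR4 machine" might look like, 8 channels of DDR4-3200. The 40GB model size is a made-up figure for illustration. The token rate assumes a dense model whose weights are all read once per generated token, so real throughput would be lower.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x transfers/s x bytes per transfer."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/s if every weight is read from RAM once per token."""
    return bandwidth_gb_s / model_gb


# Assumed configurations (not measured): 12x DDR5-4800 vs 8x DDR4-3200.
genoa_bw = peak_bandwidth_gb_s(channels=12, mega_transfers_per_s=4800)
ddr4_bw = peak_bandwidth_gb_s(channels=8, mega_transfers_per_s=3200)

MODEL_GB = 40.0  # hypothetical quantized model size, for illustration only

for name, bw in [("12ch DDR5-4800", genoa_bw), ("8ch DDR4-3200", ddr4_bw)]:
    print(f"{name}: {bw:.1f} GB/s peak, at most {max_tokens_per_s(bw, MODEL_GB):.1f} tokens/s")
```

These are ceilings, not predictions. Sustained bandwidth is typically well below the theoretical peak.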
Not true from what I’ve read here.
Aside from repetition, isn’t this effectively a new sampling method? You could call it Fuzzed Greedy Sampling.
I meant in total, but there do seem to be models with up to 100GB for context, like 01-ai/Yi-34B-200K.
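For anyone wanting to estimate the context memory themselves, a minimal sketch of the usual KV cache size formula is below. The layer count, KV head count, and head dimension are placeholders that I have not checked against any model's published config, so read the real values from the config file of the model you care about. It also assumes 16-bit cache entries and a batch size of 1.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_elem: int = 2,  # assumes fp16/bf16 cache entries
    batch: int = 1,
) -> int:
    """Size of the key/value cache for seq_len tokens (prompt plus generated)."""
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch


# Placeholder architecture values, for illustration only.
size = kv_cache_bytes(n_layers=60, n_kv_heads=8, head_dim=128, seq_len=200_000)
print(f"{size / 1e9:.1f} GB")
```

The result scales linearly with every argument, so doubling the number of KV heads or the context length doubles the memory, and an 8-bit cache halves it.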
A valid option. I haven’t looked into rental prices, but it could make sense unless I end up using it a lot.
Thanks. I would guess the seqlen is the sum of the input and output lengths, since the output is fed back in as input.
Thanks. Yes, a 2kW heater PC would only be welcome in the winter, and could get pricey to run.
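To put a number on "pricey", a quick sketch of the running cost is below. The electricity price and the hours of use are assumptions I picked for illustration, and it treats the machine as drawing the full 2kW the whole time, which is a worst case.

```python
def running_cost(power_kw: float, hours: float, price_per_kwh: float) -> float:
    """Energy cost: power (kW) x time (h) x unit price (currency per kWh)."""
    return power_kw * hours * price_per_kwh


PRICE_PER_KWH = 0.30  # assumed unit price in GBP, check your own tariff
HOURS_PER_DAY = 8     # assumed usage at full load

daily = running_cost(power_kw=2.0, hours=HOURS_PER_DAY, price_per_kwh=PRICE_PER_KWH)
print(f"per day: {daily:.2f} GBP, per 30 days: {daily * 30:.2f} GBP")
```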
I haven’t tried a Mac and don’t know what the software ecosystem is like. Have you tried it or seen it working?
It looks like it doesn’t have dedicated VRAM, but shared memory. I would guess this is slower than dedicated GPU memory but faster than RAM sticks on a normal PC?
Thanks. I can’t find ‘qualification samples’ on eBay in the UK, unless you just find them through a serial number or something.
The DDR5 RAM is more expensive, but it should hold its value fairly well. I’ll look for a 12-channel board.