Some of the content also seems to allude to what Q* might be…
GPT-4 Turbo only speeds things up by 3x…
This isn’t comparing with the 13B version of LLaVA. I’d be curious to see that.
In-context learning allows the model to learn new skills, to a limited degree.
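To make that concrete, here’s a minimal sketch (Python, with a made-up toy task) of what in-context learning looks like in practice: the “skill” is specified entirely by a few demonstrations in the prompt, with no weight updates, which is also why it only goes so far.

```python
# Few-shot in-context learning: the task (reversing word order, a toy
# example) is taught purely through demonstrations in the prompt.
examples = [
    ("hello world", "world hello"),
    ("one two three", "three two one"),
]
query = "red green blue"

prompt = "Reverse the word order.\n"
for src, tgt in examples:
    prompt += f"Input: {src}\nOutput: {tgt}\n"
prompt += f"Input: {query}\nOutput:"

print(prompt)  # send this to any completion endpoint; no fine-tuning involved
```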
GPT-3.5 Turbo apparently has 20 billion parameters, significantly fewer than the previous best Phind models. Given how bad GPT-3.5 is, I think it’s more likely just some other base model fine-tuned on GPT-3.5 outputs.
The original LLMZip paper mainly focused on text compression. A later work (I forget the name) used an LLM trained on byte tokens, which allowed it to compress not just text but any file format. I think it may have been Google who published that particular paper… Very impressive, though.
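For anyone curious how that works mechanically: the recurring trick in this line of work is pairing a next-byte predictor with an entropy coder, so the compressed size is roughly the sum of −log2 p(byte | context) over the file. The sketch below is my own toy illustration, not code from either paper; a simple adaptive order-1 frequency model stands in for the byte-level LLM.

```python
import math
from collections import Counter, defaultdict

def ideal_code_length_bits(data: bytes) -> float:
    """Ideal entropy-coded size of `data` under an adaptive order-1 byte model.

    A byte-level LLM would supply far sharper next-byte probabilities;
    an arithmetic coder then spends about -log2 p(byte | context) bits
    per byte, so better prediction directly means smaller output.
    """
    counts = defaultdict(Counter)  # counts[prev_byte][next_byte]
    prev = None
    total_bits = 0.0
    for b in data:
        ctx = counts[prev]
        # Laplace smoothing: every byte value keeps nonzero probability
        p = (ctx[b] + 1) / (sum(ctx.values()) + 256)
        total_bits += -math.log2(p)
        ctx[b] += 1  # update the model after coding, as the decoder would
        prev = b
    return total_bits

data = b"the quick brown fox jumps over the lazy dog " * 50
print(f"raw: {len(data) * 8} bits, model-coded: {ideal_code_length_bits(data):.0f} bits")
```

Since both sides update the model identically after each byte, the decoder can reconstruct the same probabilities and invert the coding; the LLM version just swaps in a vastly better predictor.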
LLMZip achieves SOTA compression by a large margin.
Ever heard of the term “might”?