@_Lee_B_

_Lee_B_@alien.top · 11 months ago

You should understand that Python was a leader in the data manipulation and statistics and scientific workloads and unix pipeline/glue spaces (having largely supplanted Perl and Awk and R) before becoming a leader in AI. AI was just a natural extension, because it had all the right stuff for manipulating data and running numbers, and manipulating data is really the bigger part of AI, aside from developing the NNA (neural network architecture) itself (but that is a specialised job for a handful of people, and not constantly reworked in the same way as training data. Python is not really slower for this kind of work, because of the accelerated underlying libraries for the NNA’s, and usually being I/O bound in the data manipulation part anyway. In short, Python is the right tool for the job, rather than the wrong one, if you understand the actual problems that AI researchers face.

Inference is just running the models to do useful work, rather than training them. Rust can be used for that too. I do plan to use rust for this as well, but not in abandonment of python: in a different use case, where I want to be able to build executables that just work. Since python is interpreted, it’s harder to just ship a binary that will work on any system. That matters for AI-based end-user mass-market applications far more than for AI-based training / inference. Rust can deploy almost anywhere, from servers to android to the client side of web browsers. That said, I’m concerned about the libraries that Rust might have available for AI and the other stuff my app will need, even though candle looks great so far.

Data prep is more like cleaning the input/training data before training on it.

The vector part that you’re starting to get a sense of is not a data prep thing; it’s much closer to how transformers work. They transform vectors in a hyperspace. So you throw all of the words into the space, and the AI learns the right vectors to represent the knowledge about how all those words relate.

A vector database is different: my understading is that you basically you load data, break it into chunks, project each chunk into a hyperspace (maybe the SAME shape of hyperspace by necessity, not sure), and store that (vector, data) key-value information as tokens in your LLM’s context, like giving the AI an index card for reference, and it’s the librarian, and then you ask it a question. It might know, or it can look to its card index, and dig out the information.

_Lee_B_@alien.top · 11 months ago

Go is neither as simple as python, nor as powerful. In fact, I don’t know of any modern general-purpose language that’s more limited. It’s faster, and produces native code, and it’s type-safe to an extent, but that’s about it. In almost every way, it’s a bad excuse for a modern language.

_Lee_B_@alien.top · 11 months ago

Different trade-offs. Go is not python, and Rust is not Python, nor Go.

If you want raw CPU performance or very solid, reliable, production code that’s maintainable and known-good, AND/OR you want code that is native, systems-level, and can be deployed on many devices and operating systems or even without an operating system, then some of the rust-based libraries might be the way to go.

If you’re purely obsessed with CPU performance, assembly is the way to go, but using assembly optimally for machine learning on a modern CPU is a whole heap of study and work in its own right.

Arguably, but very importantly, any time you spend obsessing over such high-performance code for months could be obsolete by the time you’re done coding it.

If you want easy, rapid development where you can focus on what the code DOES at a high level, with very cool meta-programming rather than being down in the weeds of how to move bytes around or who owns what piece of memory, python makes a lot more sense.

Honestly, I don’t see much practical reason to go with a language like Go, though. It’s a half way house that is neither one nor the other.

_Lee_B_@alien.top · 11 months ago

And voila! You work for investors now.

_Lee_B_@alien.top · 1 year ago

You DO NOT NEED TO LOAD AND RUN MODELS to use AI. Run a server like text-generation-webui, then use its API.

_Lee_B_@alien.top · 1 year ago

“World = Some_Pile + Some_SlimPajama + Some_StarCoder + Some_OSCAR + All_Wikipedia + All_ChatGPT_Data_I_can_find”

“some” as in customized.

_Lee_B_@alien.top · 1 year ago

Learn docker compose. Run ollama as one of your docker containers (it’s already available as a docker container). Run your website server as another docker container. Deploy it securely, and you’re done. If/when you want to scale it and make it more enterprisey, upgrade from docker compose to kubernetes.

_Lee_B_@alien.top · 1 year ago

That’s in the training of the model. You need a different model.

_Lee_B_@alien.top · 1 year ago

No, we’re not. Not really.

You could call this “open source”, yes, but by a very narrow and worthless definition of that, which has always been controversially narrow and abusive. What people MEAN when they say open source is “like Linux”. Linux is based on, and follows the principles of Free Software:

0) The freedom to run the program as you wish, for any purpose.
1) The freedom to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
2) The freedom to redistribute copies so you can help others.
3) The freedom to distribute copies of your modified versions to others
-- gnu.org/philosophy

When an LLM model’s weights are free, but it’s censored, you have half of freedom 0.

When an LLM model gives you the weights, but doesn’t give you the code or the data, AND it’s an uncensored model, you have freedom 0, but none of the others.

When you have the source code but no weights or data, you only have half of freedom 1 (you can study it, but not rebuild and run it, without a supercomputer and the data).

When you have the source code, the weights, AND the data, you have all four freedoms, assuming that you have the compute to rebuild the weights, or can pool resources to rebuild them.

_Lee_B_@alien.top · 1 year ago

The source is actually available (which is good), but sadly the dataset is not (which is bad, and makes it not truly open, since you can’re reliably reproduce it).

_Lee_B_@alien.top · 1 year ago

One day they will be excellent at this. Right now, I think hallucinations are too much of a concern to rely on them for education, in a language you don’t know.

_Lee_B_@alien.top · 1 year ago

Strange, I thought they would naturally be rewarding the process, by rewarding each word that’s generated by the sequence to sequence model, rather than the final words, for example. Maybe they over-optimised and skipped training on all output.

_Lee_B_@alien.top · 1 year ago

There are 30,000 on huggingface? Is that what you’re saying?

I wonder how many of those are truly open source, with open data? I only know of the OpenLlama model, and the RedPajama dataset. There are a bunch of datasets on huggingface too, but I don’t know if any of those are complete enough to train a major LLM on.

_Lee_B_@alien.top · 1 year ago

The blue text is what this method improved the speed of (I think by parallelizing the inference similarly to CPU pipelining), and so what contributed to the overall text being produced more quickly.

_Lee_B_@alien.top · 1 year ago

Probably the line number, range on the line, the CWE ID, and, to help the AI understand and link the CWE to the code, the description from the CWE too.

_Lee_B_@alien.top · 1 year ago

One GPU per layer is an interesting approach ;)

_Lee_B_@alien.top · 1 year ago

Hy is an interesting choice :)

_Lee_B_@alien.top · 1 year ago

The letter from that triggered it all is here. Nothing named Q* was mentioned.

https://www.tweaktown.com/news/94521/elon-musk-shares-letter-by-ex-openai-employees-revealing-damning-allegations/index.html

_Lee_B_@alien.top · 1 year ago

Hmm, it looks like such a standard linear algebra optimisation that I’m surprised GPUs don’t do it automatically. But yep, looks good, either way.

_Lee_B_@alien.top · 1 year ago

Open Source isn’t the same thing as “free”. Open Source means that all of the input resources (the source, or in this case, the training data and training scripts and untrained model itself are provided, if you want to change anything. We have very few models and datasets like that. Llama isn’t one of them. OpenLlama and RedPajama are.