If LLMs can be taught to write assembly (or LLVM IR) very efficiently, what would it take to create a fully or semi-automatic LLM compiler from high-level languages, or even from pseudo-code or human language?
The advantages could be monumental:
- arguably much more efficient utilization of resources on every compile target
- compilation is flexible and not rule-based. An LLM won’t complain over a missing “;” as it can “understand” the intent
- it can rewrite much of the software we have today just based on the disassembled binaries to squeeze more out of HW
- can we convert an assembly block from ARM to RISC-V, and vice versa?
- potentially, iterative compilation (à la Open Interpreter) can also understand the runtime issues and exceptions to have a “live” assembly code that changes as issues arise
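The iterative idea in the last bullet could be sketched roughly like this. It is only a toy under heavy assumptions: `fake_llm` is a hypothetical stand-in for a real model call, Python source stands in for the generated assembly, and `check_square` is a made-up test harness that feeds failures back into the next round.

```python
from typing import Callable, Optional, Tuple

# Hypothetical types: a proposer maps (source, feedback) to candidate code,
# a checker returns an error message or None when the candidate passes.
Proposer = Callable[[str, Optional[str]], str]
Checker = Callable[[str], Optional[str]]


def iterative_compile(source: str, propose: Proposer, check: Checker,
                      max_rounds: int = 3) -> Tuple[str, int]:
    """Ask for a candidate, test it, and feed any failure back in."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = propose(source, feedback)
        feedback = check(candidate)
        if feedback is None:
            return candidate, round_no
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


def fake_llm(source: str, feedback: Optional[str]) -> str:
    """Stand-in for a model: wrong on the first try, fixed once it sees feedback."""
    if feedback is None:
        return "def f(x):\n    return x * x + 1\n"  # deliberately buggy
    return "def f(x):\n    return x * x\n"


def check_square(candidate: str) -> Optional[str]:
    """Run the candidate on a few inputs and report the first mismatch."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # toy only: never exec untrusted output
        for x in (0, 1, 7, -3):
            got = namespace["f"](x)
            if got != x * x:
                return f"f({x}) returned {got}, expected {x * x}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


code, rounds = iterative_compile("square the input", fake_llm, check_square)
```

The part that matters is the checker: whatever the model emits only gets accepted after it passes tests, which is the one piece of the loop that stays rule-based.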
>> Any projects exploring this?
>> I feel it is an issue of dimensionality (i.e. “context” size), very similar to having a latent space for entire repos. Do you agree?
Can’t believe you quoted Yudkowsky at me, that’s offensive :)
I don’t mind asking my compiler nicely…
I think it would prove much harder if you try to limit the token vocabulary. We want to preserve the ability to understand English comments and potentially ask clarifying questions when there is ambiguity.
Something like:
“Dude, stop using this old AMD framework, Intel just released a new architecture and I can get you a 20% discount on Amazon. I’ll even rewrite your entire shitty code base to work with it. {Affiliate_link} click here to order and recompile.”