If LLMs can be taught to write assembly (or LLVM IR) very efficiently, what would it take to create a fully or semi-automatic LLM compiler from high-level languages, or even from pseudocode or natural language?
The advantages could be monumental:

- arguably much more efficient utilization of resources on every compile target

- compilation is flexible and not rule-based. An LLM won’t complain about a missing “;” as it can “understand” the intent

- it can rewrite much of the software we have today, based only on the disassembled binaries, to squeeze more out of the hardware

- can we convert an assembly block from ARM to RISC-V, and vice versa?

- potentially, iterative compilation (à la Open Interpreter) could also take runtime issues and exceptions into account, giving “live” assembly code that changes as issues arise
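
To make the last point concrete, here is a minimal sketch of such a feedback loop. It is only an illustration under loud assumptions: `fake_llm` is a scripted stub standing in for a model call (not a real API), the "target" is Python source rather than assembly, and `repair_loop` and `check_add` are hypothetical names.

```python
from typing import Callable, Optional

# Hypothetical stand-in for an LLM call: takes the task and the last error
# message (or None) and returns candidate source code. Not a real API.
Generator = Callable[[str, Optional[str]], str]


def repair_loop(task: str, generate: Generator,
                check: Callable[[dict], None],
                max_rounds: int = 3) -> Optional[str]:
    """Ask for code, run it, and feed any failure back to the generator."""
    error: Optional[str] = None
    for _ in range(max_rounds):
        source = generate(task, error)
        namespace: dict = {}
        try:
            # Compile and execute the candidate, then run the behaviour check.
            exec(compile(source, "<candidate>", "exec"), namespace)
            check(namespace)  # raises if the behaviour is wrong
            return source
        except Exception as exc:  # syntax, runtime, or check failure
            error = f"{type(exc).__name__}: {exc}"
    return None  # gave up after max_rounds attempts


def fake_llm(task: str, error: Optional[str]) -> str:
    # Scripted stub: the first answer has a syntax error,
    # the "repaired" answer (returned once an error is reported) is valid.
    if error is None:
        return "def add(a, b) return a + b"
    return "def add(a, b):\n    return a + b\n"


def check_add(ns: dict) -> None:
    assert ns["add"](2, 3) == 5


result = repair_loop("write add(a, b)", fake_llm, check_add)
```

The interesting design question is what goes into `error`: a real system would presumably feed back assembler diagnostics, test failures, or runtime traps rather than a Python exception string.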

>> Any projects exploring this?

>> I feel it is an issue of dimensionality (i.e. “context” size), very similar to having a latent space for entire repos. Do you agree?

  • BouncyBear2@alien.topB
    1 year ago

    If LLMs can be taught to write assembly (or LLVM) very efficiently

    That’s a big if: not compared to human-written code, but compared to optimized code.

    • arguably much more efficient utilization of resources on every compile target

    That is an interesting angle, if you could build in concerns that aren’t currently taken into consideration.

    • compilation is flexible and not rule based. an LLM won’t complain over a missing “;” as it can “understand” the intent

    I think that’s a separate issue, and is closer to code completion than compilation. I don’t know why there aren’t automatic linters for the specific problem you mentioned.

    I feel it is an issue of dimensionality (ie “context” size), very similar to having a latent space for entire repos. Do you agree?

    You could probably get the behaviour you want from fine-tuning or RAG on a specific codebase. It will still require a large context size.
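
As a rough sketch of the retrieval half of that idea, the snippet below picks which code chunks to put in the prompt under a fixed token budget. It is an assumption-heavy toy: it ranks by plain token overlap rather than the embeddings a real RAG setup would normally use, and `retrieve`, `repo`, and the budget value are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List


def tokens(text: str) -> Counter:
    """Lowercased identifier-like tokens with their counts."""
    return Counter(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def retrieve(query: str, chunks: Dict[str, str], budget: int) -> List[str]:
    """Return chunk names ranked by token overlap, within a token budget."""
    q = tokens(query)
    scored = []
    for name, body in chunks.items():
        t = tokens(body)
        overlap = sum((q & t).values())  # shared tokens, counted by min count
        if overlap:
            scored.append((overlap, name, sum(t.values())))
    # Highest overlap first; ties broken by name for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    picked, used = [], 0
    for _, name, size in scored:
        if used + size <= budget:  # skip chunks that would overflow the budget
            picked.append(name)
            used += size
    return picked


# Hypothetical toy "repository".
repo = {
    "math.c": "int add(int a, int b) { return a + b; }",
    "io.c": "void print_line(const char *s) { puts(s); }",
    "vec.c": "int dot(int *a, int *b, int n) { int s = 0; return s; }",
}
selected = retrieve("add two int values a and b", repo, budget=15)
```

With a small budget only the best-matching chunk fits, which is the context-size constraint in miniature: retrieval decides what the model sees, but the budget still caps how much of the repo that can be.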