While I have not tried this in Azure, my understanding is that you can deploy a Linux VM with an A100 in Azure (a T4 or V100 may not work for all use cases, but would be a cheaper option). Once you have a Linux VM with a GPU, you can choose how you would like to host the model(s). You can write some code and expose the LLM via an API (I like FastChat, but there are other options as well). Heck, you can even use Ooba if you like. Just make sure to check the license for whatever you use.
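To make "write some code and expose the LLM via an API" concrete, here is a minimal stdlib-only sketch of what that wrapper looks like. The `generate` stub is a placeholder I made up for whatever model runtime you end up using (FastChat, a transformers pipeline, etc. — those projects ship ready-made servers, so in practice you may not write this yourself):

```python
# Minimal sketch: expose a text-generation function over HTTP with only
# the Python standard library. Swap the `generate` stub for a real call
# into your model runtime running on the GPU VM.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(prompt: str) -> str:
    # Stub: replace with an actual model call (e.g. a transformers
    # pipeline, or forwarding to a FastChat worker).
    return f"echo: {prompt}"

class CompletionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"prompt": "hello"}.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = generate(body.get("prompt", ""))
        payload = json.dumps({"completion": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# To actually serve:
# HTTPServer(("0.0.0.0", 8000), CompletionHandler).serve_forever()
```

Real serving stacks add batching, streaming, and auth on top of this, which is exactly why projects like FastChat exist.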
I have heard that podcast, and I don’t think he explicitly said that Grok will be open sourced. He said open sourcing models with a six-month delay is perhaps a good idea (paraphrasing). Yes, I would love to try an open-source Grok, but right now it’s more closed than GPT: it’s only available to some users. I wouldn’t hold my breath for this one.