Not sure, but it seems they fine-tuned gpt-3.5-turbo-16k, which is faster than GPT-4, hence the claim of GPT-3.5 speed with a 16K context limit.
They’re dubiously naming it Phind V7. Also, they’ve ripped off WizardLM’s code in the past and rebranded it to secure seed funding.
I doubt it’s based on CodeLlama 34B, unless they trained on a specific dataset that makes the model hallucinate that it’s GPT-3.5 Turbo.
They trained their model on synthetic GPT-3.5-turbo data plus a mix of their own data. It is normal that V7 says “I am gpt-3.5”, but it is not normal that Phind uses synthetic OpenAI GPT data, because that violates OpenAI’s terms.
OpenAI’s terms only mean that they might ban your account if they catch you gathering the data. The data itself is not copyrightable in any way; OpenAI has no legal right to control its use.