PHIND V7: Red Flags

Xhehab_@alien.top · 11 months ago

PHIND V7: Red Flags

ambient_temp_xeno@alien.top · 11 months ago

If it’s not local they go in the bin anyway. Don’t worry about it.

Soc13In@alien.top · 11 months ago

There is so much investor money flowing into AI startups that it is not at all surprising that somebody would do that.

throwaway_ghast@alien.top · 11 months ago

Aren’t language models well known for bald-faced bullshitting?

donotdrugs@alien.top · 11 months ago

Yeah and it baffles me how many people, even in the tech community, take LLM output as hard facts.

Xhehab_@alien.top · 11 months ago

Yeah, but at least PHIND should have cleaned out the training dataset rows that mention the words gpt-3.5-turbo/gpt-3… lol
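For illustration, that kind of cleanup could be sketched roughly as below. This is only a minimal sketch under assumptions: the row layout (`prompt`/`response` dicts), the regex, and the `drop_leaky_rows` helper are all hypothetical, and nothing here reflects how Phind actually prepares its data.

```python
import re

# Hypothetical pattern: catches self-identifying mentions such as
# "gpt-3", "GPT-3.5-turbo", "GPT 4" or "OpenAI" (case-insensitive).
LEAK_PATTERN = re.compile(
    r"\b(?:gpt[-\s]?\d(?:\.\d)?(?:[-\s]?turbo)?|openai)\b",
    re.IGNORECASE,
)

def drop_leaky_rows(rows):
    """Keep only rows where no field mentions the teacher model."""
    return [
        row for row in rows
        if not any(LEAK_PATTERN.search(str(value)) for value in row.values())
    ]

# Made-up example rows, assuming a simple prompt/response layout.
rows = [
    {"prompt": "Who are you?", "response": "I am GPT-3.5-turbo, made by OpenAI."},
    {"prompt": "Reverse a list in Python", "response": "Use my_list[::-1]."},
]
cleaned = drop_leaky_rows(rows)
```

A keyword filter like this only removes explicit mentions; it would not change anything else the model picked up from synthetic outputs.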

lakolda@alien.top · 11 months ago

GPT-3.5 turbo apparently has 20 billion parameters, significantly fewer than the previous best Phind models. Given how bad GPT-3.5 is, I think it’s more likely they just fine-tuned some other base model on GPT-3.5 outputs.

marcus__-on-wrd@alien.top · 11 months ago

Isn’t it 175B?

silentsnake@alien.top · 11 months ago

The recent Microsoft paper on CodeFusion leaked it.

kristaller486@alien.top · 11 months ago

They trained their model using synthetic GPT-3.5-turbo data plus a mix of their own data. It is normal that V7 says “I am gpt-3.5”, but it is not normal that Phind uses synthetic OpenAI GPT data, because that violates OpenAI’s terms.

cuyler72@alien.top · 11 months ago

OpenAI’s terms only mean that they might ban your account if they catch you gathering it. The data itself is not copyrightable in any way; OpenAI has no legal right to control its use.

ciaguyforeal@alien.top · 11 months ago

You can’t ask an LLM about itself.

throwaway_ghast@alien.top · 11 months ago

People seem to forget that language models are text prediction engines, not actual intelligence.

api@alien.top · 11 months ago

If the training data contains statements to the effect that the model was extracted from the brain of a living walrus, that’s what it will tell you when you ask where it came from. These things aren’t self-aware in any sense. They don’t contemplate themselves or ask “who am I?”