I tried running this with some output from a Wizard-Vicuna-7B-Uncensored model and it returned
('Human', 0.06035491254523517)
So I don’t think this hits the mark. To be fair, I got the model to generate something really dumb, and a perfect LLM detection tool will likely never exist.
The good thing is that it didn’t flag my own words as a false positive.
Below is the output of my LLM. There’s a decent amount of swearing, so heads up.
Edit:
Tried with a more sensible question and still got a false negative
('Human', 0.03917657845587952)