For the last couple of months, my team and I have put a lot of effort into building a solution that helps users evaluate and monitor the performance of their LLM and AI apps.
If you’re a ChatGPT (or any other LLM :)) user integrating it into your apps, and the outputs you got back have ever been not quite what you were hoping for… you should find this useful 😃
Today, we are releasing it publicly and launching it on ProductHunt. I would be very thankful to hear your thoughts, and for any support you can give the launch. 🙏
https://www.producthunt.com/posts/deepchecks-llm-evaluation?r=h
Ha! Just as I started writing my own thing for that. Will def take a look! :)