

2 years ago
Sad part is that we need to train a generative model from scratch to use this one; i.e., we can’t fine-tune current models to use FFF.
Hope someone does it soon.
I could not understand it. Is this true audio (can it differentiate a helicopter sound from a fire engine, for example, or recognize a dog bark), or does it just transform speech into text and then feed that to the model?