• 0 Posts
  • 5 Comments
Joined 1 year ago
Cake day: October 30th, 2023



  • I was struggling at first and had that American twang coming through…

    But I managed to get a very clear, short clip of an English actor from an interview. There was no background noise at all. I made sure to clip out any non-speech from the start and end of the audio, then saved it as a 22050 Hz, mono, 16-bit WAV.
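
    For anyone who wants to script that prep step instead of doing it by hand in an audio editor, here is a rough sketch using only the Python standard library. It assumes the source clip is already a 16-bit PCM WAV; the function names, the amplitude threshold of 500, and the linear-interpolation resampler are all illustrative choices of mine, not anything the TTS tool requires. A proper resampler (in an audio editor or a dedicated tool) will give cleaner results.

```python
import array
import sys
import wave

TARGET_RATE = 22050  # Hz, the sample rate mentioned above


def to_mono(samples, channels):
    """Average interleaved channels into a single channel."""
    if channels == 1:
        return list(samples)
    return [sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples) - channels + 1, channels)]


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Crude linear-interpolation resampler (no anti-alias filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(int(round(samples[lo] * (1 - frac) + samples[hi] * frac)))
    return out


def trim_silence(samples, threshold=500):
    """Drop leading/trailing samples quieter than the (assumed) threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return list(samples[loud[0]:loud[-1] + 1])


def convert(src_path, dst_path, threshold=500):
    """Read a 16-bit PCM WAV, write a trimmed 22050 Hz mono 16-bit WAV."""
    with wave.open(src_path, "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM input")
        channels = src.getnchannels()
        rate = src.getframerate()
        raw = array.array("h")
        raw.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        raw.byteswap()  # WAV data is little-endian
    samples = trim_silence(resample_linear(to_mono(raw, channels), rate),
                           threshold)
    out = array.array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)  # 2 bytes = 16-bit
        dst.setframerate(TARGET_RATE)
        dst.writeframes(out.tobytes())
```

    Usage would be something like `convert("interview_clip.wav", "voice_sample.wav")`, where both file names are placeholders. The threshold only trims by raw amplitude, so it is worth listening to the result to check it has not clipped a quiet first or last syllable.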

    That seems to have done it! I get a pretty good representation of the voice, and it stays in character about 99% of the time, with the occasional slight slip.

    I also occasionally get a little gibberish, which seems to happen when my model is trying to say something like " ’ " (this occasionally slips through when it's generating text; I can see it when I look at the backend at what's being sent for audio processing). I'm guessing it's possible to filter this out with a regex or something.
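
    A regex filter along those lines might look like the sketch below. It is untested against any particular TTS backend and rests on two assumptions of mine: that the trouble comes from quote characters standing on their own (whitespace or the string edge on both sides), and that turning curly quotes into plain ASCII ones is harmless for the speech engine. The name `clean_for_tts` is made up for illustration.

```python
import re

# Map curly quotes to their ASCII equivalents (an assumed normalisation step).
_CURLY = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})

# One or more quote characters with whitespace (or the string edge) on both
# sides, i.e. not attached to any word or punctuation.
_STRAY_QUOTES = re.compile(r"""(?<!\S)['"`]+(?!\S)""")


def clean_for_tts(text):
    """Remove free-standing quote marks before text is sent for audio."""
    text = text.translate(_CURLY)
    text = _STRAY_QUOTES.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

    Apostrophes inside words such as "don't" are left alone because they are attached to letters; only quote marks floating on their own get dropped. You would call this on each chunk of generated text just before it goes to audio processing.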