Ideally, a person’s voice is recorded before speech has become affected. People are asked to read aloud around 400 sentences (which takes about an hour) whilst being recorded in a quiet room. This allows us to get the best quality recording possible.
The sentences have been chosen to capture all the speech sounds of English in all the different possible combinations. While 400 sentences is the ideal number, we can create a synthetic voice from as few as 100 sentences if people aren’t able to manage the full 400. This voice recording is then “banked” and stored, ready to create a synthetic voice for a communication aid if, and when, that person needs one. Using software developed by speech scientists, all the unique elements of the voice can be automatically analysed and synthetically reproduced in a process called “voice cloning”.
This is where “donor” voices come into play. During the voice cloning process the synthetically reproduced parameters of a patient’s voice are combined with those of healthy donor voices. Features of donor voices with the same age, sex and regional accent as the patient are pooled together to form an “average voice model”, which acts as a base on which to build the synthetic voice.
“It’s a bit like going to the paint-mixing counter in a DIY shop. You give the assistant a swatch of your personal colour of choice (Sumptuous Plum for example…), tell them what quantity and finish you require, and a small amount of colour is added to a tin of the closest matching base paint. It is the use of donor voices that means we can use just a short recording from the patient, as the bulk of the speech data has been collected in the ‘base paint’.”
Phillipa Rewaj, Speech and Language Therapist
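For readers who like to see the idea in code, here is a minimal Python sketch of the paint-mixing analogy. It is an illustration only, not the software we use: real voice cloning adapts statistical models of speech rather than short lists of numbers, and the names here (Donor, average_voice_model, adapt, age_window, patient_weight) are hypothetical.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Donor:
    age: int
    sex: str
    accent: str
    features: list[float]  # hypothetical stand-in for acoustic parameters


def average_voice_model(donors, age, sex, accent, age_window=10):
    """Pool donors who match the patient and average their features."""
    matched = [
        d for d in donors
        if d.sex == sex and d.accent == accent and abs(d.age - age) <= age_window
    ]
    if not matched:
        raise ValueError("no matching donors")
    # zip(*...) groups the i-th feature of every matched donor together.
    return [fmean(column) for column in zip(*(d.features for d in matched))]


def adapt(base, patient, patient_weight=0.7):
    """Shift the base model towards the patient's features (the 'swatch')."""
    return [
        (1 - patient_weight) * b + patient_weight * p
        for b, p in zip(base, patient)
    ]


donors = [
    Donor(60, "F", "Edinburgh", [1.0, 2.0]),
    Donor(65, "F", "Edinburgh", [3.0, 4.0]),
    Donor(30, "M", "Glasgow", [10.0, 10.0]),  # does not match, so is ignored
]
base = average_voice_model(donors, age=62, sex="F", accent="Edinburgh")
voice = adapt(base, patient=[4.0, 5.0], patient_weight=0.5)
```

In this toy example the two matching donors form the “base paint”, and the patient’s short recording supplies the colour that is mixed in.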
If a person comes in to record their voice after they have started to notice changes in their speech, it is possible for us to “repair” the voice in the synthesis process. We can use more of the donor average voice model to patch the damaged elements of the voice (adding more of the “base paint” to the personal colour), whilst still retaining the identity of the individual. Other synthetic speech methods cannot do this.
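As a rough illustration of “repair”, the Python sketch below blends each element of the voice separately, leaning more heavily on the donor average wherever the patient’s recording is flagged as affected. This is a simplified assumption about how such patching could work, not a description of our software; the function repair, the damaged flags and the two weights are all hypothetical.

```python
def repair(base, patient, damaged, healthy_weight=0.8, damaged_weight=0.2):
    """Blend per feature, using more of the donor base where speech is affected.

    base    -- features of the donor average voice model
    patient -- features measured from the patient's recording
    damaged -- one boolean per feature, True if that feature is affected
    """
    if not (len(base) == len(patient) == len(damaged)):
        raise ValueError("base, patient and damaged must be the same length")
    repaired = []
    for b, p, is_damaged in zip(base, patient, damaged):
        # The weight is the share taken from the patient's own voice.
        w = damaged_weight if is_damaged else healthy_weight
        repaired.append((1 - w) * b + w * p)
    return repaired


# The second feature is flagged as affected, so here it is taken
# entirely from the donor base (damaged_weight=0.0).
result = repair(
    base=[2.0, 4.0],
    patient=[4.0, 8.0],
    damaged=[False, True],
    healthy_weight=0.5,
    damaged_weight=0.0,
)
```

Keeping a non-zero healthy_weight for the unaffected elements is what preserves the individual’s identity in this sketch.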
Voices for all
Even if someone can’t record their own voice, we can use a blend of our donor voices, or a recording of a specially selected donor (such as a relative or friend) to create a unique voice for anyone.
To date, we have recorded the voices of over 1200 donors – old and young, male and female, and with a range of regional accents. We are always trying to expand our bank of voices, as the bigger the pool of donors, the closer we can get to creating a truly unique voice for an individual.