Ideally, we record a person’s voice before speech has become affected. People are asked to read aloud around 400 sentences (which takes about an hour) whilst being recorded in a quiet or sound-proofed room. This allows us to get the best quality recording possible.
The sentences have been chosen to capture all the speech sounds of English in all the different possible combinations. While 400 sentences is the ideal number, we can create a synthetic voice from as few as 100 sentences if people aren’t able to manage the 400. This voice recording is then “banked” and stored, ready to create a synthetic voice for a communication aid if and when that person needs one. Using software developed by speech scientists, all the parameters of that unique voice can be automatically analysed and synthetically reproduced in a process called “voice cloning”.
This is where “donor” voices come into play. During the voice cloning process the synthetically reproduced parameters of a patient’s voice are combined with those of healthy donor voices. Features of donor voices with the same age, sex and regional accent as the patient are pooled together to form an “average voice model”, which acts as a base on which to generate the synthetic voice.
“It’s a bit like going to the paint-mixing counter in a DIY shop. You give the assistant a swatch of your personal colour of choice (Sumptuous Plum for example…), tell them what quantity and finish you require, and a small amount of colour is added to a tin of the closest matching base paint. It is the use of donor voices that means we can use just a short recording from the patient, as the bulk of the speech data has been collected in the ‘base paint’.”
Phillipa Rewaj, Speech and Language Therapist
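As a rough illustration of how donor voices might be pooled into an “average voice model”, the toy sketch below selects donors who match a patient’s age band, sex and regional accent and averages their features. It is only a sketch under stated assumptions: the `Donor` class, the `average_voice_model` function, the two-number feature vectors and the sample donors are all made up for illustration, and the real software works with far richer voice parameters than this.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import List, Tuple


@dataclass(frozen=True)
class Donor:
    """Hypothetical donor record; the fields are illustrative assumptions."""
    age_band: str
    sex: str
    accent: str
    features: Tuple[float, ...]  # stand-in for the voice parameters


def average_voice_model(donors: List[Donor], age_band: str, sex: str,
                        accent: str) -> Tuple[float, ...]:
    """Average the features of donors matching age band, sex and accent."""
    matches = [
        d.features for d in donors
        if (d.age_band, d.sex, d.accent) == (age_band, sex, accent)
    ]
    if not matches:
        raise ValueError("no donors match this patient profile")
    # zip(*matches) groups the same feature across all matching donors.
    return tuple(fmean(column) for column in zip(*matches))


# Made-up donors: (pitch-like value, breathiness-like value).
donors = [
    Donor("40-59", "F", "Edinburgh", (200.0, 0.25)),
    Donor("40-59", "F", "Edinburgh", (220.0, 0.75)),
    Donor("20-39", "M", "Glasgow", (110.0, 0.50)),
]

base = average_voice_model(donors, "40-59", "F", "Edinburgh")
print(base)  # (210.0, 0.5)
```

The third donor is ignored because the profile does not match, which is the point of the sketch: a larger and more varied donor pool gives more matching voices to average for any one patient.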
If a person comes in to record his or her voice when there is already mild to moderate impairment, it is possible for us to “repair” the voice in the synthesis process by using more of the donor average voice model to patch the damaged elements of the voice (adding more of the “base paint” to the personal colour).
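The “more base paint” idea can be pictured as a weighted mix of two sets of numbers. The sketch below is a deliberately simplified illustration, not the actual voice cloning software: the `blend_voice` function, the `donor_weight` parameter and all the values are hypothetical, and the linear mix is simply one easy way to show how leaning more heavily on the donor average changes the result.

```python
from typing import List, Sequence


def blend_voice(patient: Sequence[float], base: Sequence[float],
                donor_weight: float) -> List[float]:
    """Mix patient features with the donor average (the "base paint").

    donor_weight is a hypothetical knob between 0 and 1: 0 keeps only the
    patient's own features, 1 keeps only the donor average.
    """
    if not 0.0 <= donor_weight <= 1.0:
        raise ValueError("donor_weight must be between 0 and 1")
    if len(patient) != len(base):
        raise ValueError("patient and base must have the same length")
    return [
        donor_weight * b + (1.0 - donor_weight) * p
        for p, b in zip(patient, base)
    ]


# Made-up feature values for illustration only.
patient = [180.0, 1.0]
base = [200.0, 0.5]

unimpaired = blend_voice(patient, base, donor_weight=0.25)
repaired = blend_voice(patient, base, donor_weight=0.75)

print(unimpaired)  # [185.0, 0.875]
print(repaired)    # [195.0, 0.625]
```

With the higher weight, the result sits closer to the donor average, which mirrors the way damaged elements of a voice are patched with more of the average voice model.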
To date, we have recorded the voices of around 1,200 healthy individuals – old and young, male and female, and with a range of regional accents. We are always trying to expand our bank of voices, because the bigger the pool of donors, the closer we can get to creating a truly unique voice for an individual.