Voice command testing – “Do you understand me, machine?” Part 2
Decoding the voice
It’s important to recognize that testing voice recognition clients differs from any other type of testing. Unlike with a regular mobile application, the range of input a tester could supply is practically endless. A good client does not limit a person to just 10 words the system recognizes. Modern voice recognition clients should decode as many commands as possible, which presents developers and testers with a very challenging task. Still, even the best voice recognition system doesn’t guarantee correct decoding 100% of the time. The job of the developers and testers is to make the percentage of correctly decoded input as high as possible.
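One way to put a number on "correct decoding percentage" is to compare what the tester said with what the client decoded. The sketch below is only an illustration: it uses word error rate, a common speech recognition metric that the article itself does not name, and the function names and sample phrases are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Classic Levenshtein distance, computed over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def exact_match_rate(pairs: list[tuple[str, str]]) -> float:
    """Share of (spoken, decoded) pairs that were decoded without any error."""
    if not pairs:
        raise ValueError("at least one pair is required")
    correct = sum(1 for spoken, decoded in pairs
                  if word_error_rate(spoken, decoded) == 0.0)
    return correct / len(pairs)
```

Tracking both numbers across builds shows whether the share of correctly decoded phrases is actually moving in the right direction, rather than relying on the tester's impression.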
In voice typing, mistakes are most likely to occur at the decoding stage. Let’s look at the process at the architectural level, where sounds captured by the microphone go through frequency analysis. First, the voice is converted into a graphical waveform; then it is transformed into the characters that build each word. The search tool in the mobile version of Google Chrome and predictive typing on mobile phones and tablets are good examples of this process.
However, it gets more complex with multi-functional applications, where voice recognition consists of two stages. First, the client decodes the voice and forms the whole phrase. Second, a complex algorithm kicks in and analyzes each word separately and the phrase as a whole. This is where most mistakes occur. Such voice recognition systems are pretty bulky, so they are installed on servers, while the mobile device has just a small client that records the voice, sends it to the servers and receives commands back to perform. To optimize testing and bug fixing on both the server and the client, mistakes on each side should be strictly differentiated.
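The idea of separating mistakes by stage can be sketched in a few lines. This is a minimal illustration, assuming the test harness can observe both the decoded phrase and the command that finally comes back; the `TestResult` fields, the `Stage` names and the command strings are hypothetical, not part of any real voice client.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    OK = "ok"
    DECODING = "decoding"              # stage one: voice -> phrase
    INTERPRETATION = "interpretation"  # stage two: phrase -> command


@dataclass
class TestResult:
    spoken: str            # what the tester actually said
    transcript: str        # phrase produced by the decoding stage
    expected_command: str  # command the phrase should trigger
    actual_command: str    # command that was actually returned


def normalize(text: str) -> str:
    """Ignore case and extra whitespace when comparing phrases."""
    return " ".join(text.lower().split())


def classify(result: TestResult) -> Stage:
    """Attribute a failure to the earliest stage that went wrong."""
    if normalize(result.spoken) != normalize(result.transcript):
        return Stage.DECODING
    if result.expected_command != result.actual_command:
        return Stage.INTERPRETATION
    return Stage.OK
```

A bug report tagged with the failing stage can then go straight to the team that owns it, instead of bouncing between client and server developers.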
Testing the client: with female voice and in the pub
The way we speak and the way we pronounce words are factors that affect voice recognition systems. The system may recognize voices of different pitch and timbre differently. Also, every person speaks at their own pace. This should be taken into account when choosing testing scenarios. It is recommended to choose a quality assurance engineer with average pitch, timbre and speaking pace. Ideally, the same functions are tested with both male and female voices. When testing a client for a foreign language, it’s good to have a tester who can speak without an accent, so you don’t end up like the guys in this video clip.
The tester should anticipate different environments; it’s not enough to test only in a quiet room. Noisy streets, crowded pubs and public transport: the voice client should be tuned to decode the human voice anywhere.
What else can undermine voice client performance? If supporting hardware, such as headsets, Bluetooth devices and other accessories, doesn’t function correctly, the client can fail to accomplish the task at hand. The need for an instant and reliable connection challenges developers and testers to diminish the impact of Internet connection quality. It also helps if the tester emulates other user scenarios, such as music playing on the phone, incoming calls and other interruptions.
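The combinations mentioned above (voices, environments, interruptions) multiply quickly, so it can help to enumerate them instead of keeping them in one’s head. The following is a rough sketch only; the lists are examples taken from the text, and `build_test_matrix` is a hypothetical helper rather than part of any testing tool.

```python
from itertools import product

# Example dimensions drawn from the scenarios described in the text.
VOICES = ["male", "female"]
ENVIRONMENTS = ["quiet room", "noisy street", "crowded pub", "public transport"]
INTERRUPTIONS = ["none", "music playing", "incoming call"]


def build_test_matrix(commands: list[str]) -> list[dict[str, str]]:
    """Return one test case per combination of command and conditions."""
    return [
        {
            "command": command,
            "voice": voice,
            "environment": environment,
            "interruption": interruption,
        }
        for command, voice, environment, interruption in product(
            commands, VOICES, ENVIRONMENTS, INTERRUPTIONS
        )
    ]


if __name__ == "__main__":
    for case in build_test_matrix(["call mom"])[:3]:
        print(case)
```

Even a single command yields 24 cases with these lists, which makes it easier to decide deliberately which combinations to run by hand and which to skip.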
It’s not easy to imitate a user while testing voice recognition clients. However, this is exactly the case where the “do as a user does” approach is the key to success. An experienced tester can think of many ways users might behave and test them to ensure high quality of the final product.
Currently at the peak of their popularity, voice clients still have a huge niche in which to be developed and adopted. This gives developers a lot of room to improve current software and create new products. At the same time, it is a great responsibility to be involved in this process. Every tester should keep in mind the millions of people using the voice recognition software they have tested and improved. Using correct approaches and optimal strategies in testing will allow every user to be satisfied with the communications channel you have enabled for them.
The article was published in RCRWirelessNews.