As a matter of choice, not necessity, I try from time to time to use the various speech and voice input systems in operating systems. My ideal scenario is still to be able to use the computer entirely by voice while also running a screen reader. I’ve not yet found a reliable solution that meets my needs completely.
I know there are combinations of solutions that have made great strides in this area, largely using Dragon products and screen readers, but as the basis of what I use, I try to use either Voice Access on Windows or Voice Control on the Mac. Both platforms also have solutions, as I expect many know, for strictly text input.
I no longer recall how long ago this was, but the Voice Access product on Windows did make one change that helps with using screen readers. As a start, Voice Access produces notifications of what it has heard so that screen readers can echo this back. It is fairly basic and in need of much refinement, but it’s at least a start.
I am mentioning this here because in trying Voice Access this week, I noticed a change that is another step in helping improve the experience. To be clear, I do not know when this change was made. It is just that I noticed it this week. I also run Insider builds of Windows, so if this does not work for you, that may be why.
When you’re trying to control the computer by voice, it is common to issue commands such as “click” followed by the name of the item you want to activate. The challenge is that if there is more than one item with the same name, you are usually presented with some experience to disambiguate what you want to click on.
When I first tried Voice Access, to the best of my recollection, the experience of identifying what you wanted to activate was not usable with a screen reader. It has been enhanced a bit so that now, when that list of choices comes up, the list is echoed back, similar to how what Voice Access heard is repeated. Again, this needs extensive refinement because it is kind of a one-time listen, or read in Braille, experience with no way to have the list repeated, step through the list an item at a time or otherwise understand what was said.
As an example of using the feature to identify what I want to click, here is what was read when I asked for the word “paste” to be clicked.
click paste. Which one?
There are 2 options available. (1) Paste, (2) Paste
Here is another example when I said “click login” on the Fidelity home page.
Click login. Which one?
There are 2 options available. (1) LOG IN, (2) Open login link
It is also worth noting that if you are using Braille, these disambiguation choices appear as flash messages. For those unfamiliar with how Braille displays and screen readers work, this means that the messages stick around for a set period of time and then disappear from the display.
Here is one last example, when I tried to activate the OK button with my voice after running a spell check on an email message. Note, I intentionally replaced the actual email address with email@provider.com.
Click ok. Which one?
There are 2 options available. (1) OK, (2) Sent – email@provider.com – Outlook – 2 running windows
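For anyone curious what stepping through the choices one item at a time might look like, here is a minimal Python sketch. It is not part of Voice Access or any screen reader. The function name parse_options is made up for illustration, and it assumes the notification text always follows the format seen in the three examples above.

```python
import re

# Hypothetical helper for illustration only; not a Voice Access or
# screen reader API. Assumes the notification format shown above:
#   "There are N options available. (1) First, (2) Second"
OPTION_PATTERN = re.compile(r"\((\d+)\)\s*(.*?)(?=,\s*\(\d+\)|$)")


def parse_options(notification: str) -> list[tuple[int, str]]:
    """Split a disambiguation notification into (number, label) pairs."""
    return [
        (int(number), label.strip())
        for number, label in OPTION_PATTERN.findall(notification)
    ]


if __name__ == "__main__":
    text = "There are 2 options available. (1) LOG IN, (2) Open login link"
    # Print each choice on its own line so it can be reviewed one at a time.
    for number, label in parse_options(text):
        print(f"Option {number}: {label}")
```

With the choices split out like this, a tool could in principle repeat the list on request or move through it item by item, rather than offering a single listen.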
The experiences I’ve described work independently of which screen reader is being used.
Again, the overall experience of using the computer with a screen reader and voice on Windows is far from finished. In fact, one of the key experiences, correcting words that have not been recognized correctly, does not work at all with screen readers. Voice Access gives the following notification when you try to correct something and a screen reader is running:
Alert: This experience is not optimized for use with screen readers. Say “Cancel” to exit.
Microsoft has a document on using Voice Access in general. If they have screen reader-specific documentation, I wasn’t able to find it.
If you do try Voice Access, two important hotkeys to know are Alt+Shift+B for toggling the microphone between sleep and awake and Alt+Shift+C for toggling the microphone off and on. When sleeping, the microphone remains on to listen for certain words. See the support article or say, “what can I say,” when Voice Access is running for a full list of commands.