
Category: Accessibility

A Nice Enhancement for Voice Access on Windows

As a matter of choice, not necessity, I try from time to time to use the various speech and voice input systems in operating systems. My ideal scenario is still to be able to control the computer entirely by voice while also running a screen reader. I’ve yet to find a reliable solution that meets my needs completely.

I know there are combinations of solutions that have made great strides in this area, largely using Dragon products and screen readers, but as the basis of what I use, I try to use either Voice Access on Windows or Voice Control on the Mac. Both platforms also have solutions, as I expect many know, for strictly text input.

I no longer recall how long ago this was, but the Voice Access product on Windows did make one change that helps with using screen readers. As a start, Voice Access produces notifications of what it has heard so that screen readers can echo this back. It is fairly basic and in need of much refinement, but it’s at least a start.

I am mentioning this here because in trying Voice Access this week, I noticed a change that is another step toward improving the experience. To be clear, I do not know when this change was made; it is just that I noticed it this week. I also run Insider builds of Windows, so if this does not work for you, that may be why.

When you’re trying to control the computer by voice, it is common to issue commands such as “click” followed by the name of the item you want to activate. The challenge is that if there is more than one item with the same name, you are usually presented with some experience to disambiguate what you want to click on.

When I first tried Voice Access, to the best of my recollection, the experience of identifying what you wanted to activate was not usable with a screen reader. It has been enhanced a bit so that now, when that list of choices comes up, it is echoed back, similar to how what Voice Access heard is repeated. Again, this needs extensive refinement because it is a one-time listen, or read in Braille, experience with no way to have the list repeated, step through the list one item at a time, or otherwise review what was said.

As an example of using the feature to identify what I want to click, here is what was read when I asked for the word “paste” to be clicked.

click paste. Which one?
There are 2 options available. (1) Paste, (2) Paste

Here is another example when I said “click login” on the Fidelity home page.

Click login. Which one?
There are 2 options available. (1) LOG IN, (2) Open login link

It is also worth noting that these disambiguation choices, if you are using Braille, appear as flash messages. For those unfamiliar with how Braille displays and screen readers work, this means that the messages stick around for a set period of time and then disappear from the display.

Here is one last example, when I tried to activate the OK button with my voice after running a spell check on an email message. Note, I intentionally replaced the actual email address with email@provider.com.

Click ok. Which one?
There are 2 options available. (1) OK, (2) Sent – email@provider.com – Outlook – 2 running windows

The experiences I’ve described work independently of which screen reader is being used.

Again, this overall experience of using the computer with a screen reader and voice on Windows is far from finished. In fact, one of the key experiences, correcting words that have not been recognized correctly, does not work at all with screen readers. Voice Access gives the following notification when you try to correct something while a screen reader is running:

Alert: This experience is not optimized for use with screen readers. Say “Cancel” to exit.

Microsoft has a document on using Voice Access in general. If they have screen reader-specific documentation, I wasn’t able to find it.

If you do try Voice Access, two important hotkeys to know are Alt+Shift+B for toggling the microphone between sleep and awake and Alt+Shift+C for toggling the microphone off and on. When sleeping, the microphone remains on to listen for certain words. See the support article or say, “what can I say,” when Voice Access is running for a full list of commands.


No Big Finish, Giving Microsoft’s Recall a Try

Microsoft released the Recall feature for users of Windows Insider builds. I gave the feature a try and made an audio recording with a roughly edited transcript. The short version of my experience is that I can get around Recall but am still trying to figure out how to read search results effectively with a screen reader, hence the title of No Big Finish.

Since this blog post is about an AI feature, I’ll offer an unedited version of Copilot’s summarization of my transcript as well. Overall the summary is reasonable but in my opinion misstates what I said about the Click To Do feature. I said I hadn’t figured out how it is supposed to work.

Perhaps because it was at the end, the summary also entirely missed my biggest point: that I am still trying to find an effective way to use the search functionality of Recall and read the meaningful text easily. Here is the Copilot summary:

The document is a transcript of a demonstration by Kelly on the Microsoft feature called “Recall.” This feature captures screenshots of your screen and allows you to search for information within those screenshots. Kelly explains that Recall was released as a test version for Microsoft Insiders on November 22nd. The demonstration focuses on the accessibility aspects of Recall, particularly how it works with screen readers like NVDA, Narrator, and JAWS.

Kelly describes the setup process, which includes signing in with Windows Hello, and demonstrates how to navigate the Recall interface using keyboard commands and screen reader feedback. The document details various functionalities of Recall, such as saving snapshots, opening Recall from the system tray, and navigating through captured images. Kelly also mentions some issues and limitations encountered during the demonstration, such as certain screen reader commands not working as expected and the need for further development to improve accessibility.

Kelly starts by explaining the initial setup, which involves a brief opt-in experience where the user grants permission for the computer to take screenshots. This process takes about 20 to 30 seconds and requires signing in with Windows Hello. Kelly notes that each time Recall is opened or used, the user is prompted to sign in with Windows Hello.

The demonstration continues with Kelly navigating the Recall interface. Kelly shows how to open Recall from the system tray and describes the various options available, such as saving snapshots, opening Recall, pausing the feature, and accessing more settings. Kelly highlights that the timeline feature in Recall displays the captured images, referred to as “memories,” and sessions, which represent the time periods during which the computer was used.

Kelly also points out some issues with the screen reader’s ability to read the titles of the snapshots automatically. Kelly demonstrates how to navigate through the list of snapshots and mentions that sometimes the titles do not read correctly on the first attempt. Kelly suggests using the home and end keys to navigate through the list and get the titles to read properly.

Additionally, Kelly explores the commands available for each snapshot, such as opening the associated application (e.g., Outlook), copying the snapshot, deleting it, and accessing more options. Kelly notes that some features, like the “Click to do” button, are not fully functional yet and require further development.

Kelly also demonstrates how to get more details about a captured image using Narrator’s command for image descriptions. However, the output is often gibberish due to the limitations of the OCR (Optical Character Recognition) technology used. Kelly suggests using other solutions like JAWS, Google’s Gemini, or Be My Eyes for better image descriptions.

Overall, the document provides an insightful look into the Recall feature, its current capabilities, and areas for improvement from an accessibility perspective. Kelly’s demonstration highlights the potential of Recall while also acknowledging the need for further development to enhance its functionality and accessibility.


Unlock Windows Efficiency with PowerToys Run

Sometimes you make a handy discovery that makes you wonder why you did not know about it long before. I just had that happen with PowerToys Run, one of the toys included in the full suite of items in the electronic toybox known as Windows PowerToys.

PowerToys are a set of utilities that allow you to customize different aspects of Windows and offer functionality that is not directly built in. You can learn more in general and find details on installation in a handy Microsoft Learn article.

I installed the most recent version of PowerToys because I had read about something new known as Advanced Paste. That is not the discovery though.

After installing PowerToys, I used one of my more common shortcuts in Windows, Alt+Space, to bring up the system menu for an application. That menu is where you find entries including Restore, Close, and the one I often use, Maximize. My web browsing windows in particular often get sized quite small, and with responsive design much of the content I’m expecting has disappeared, so maximizing the browser window is necessary.

Imagine my surprise when instead of what I was expecting, my screen reader communicated, “query, edit.”

It turns out this is the default shortcut for the aforementioned PowerToys Run. In short, this is like having a command line to do everything from choosing between open windows on your computer to performing quick calculations, file and web searches, browsing the Windows registry, and more.

Using PowerToys Run

Using PowerToys Run is fairly straightforward. Press Alt+Space, enter some text, and arrow through the results. You can start your text with various characters to tell PowerToys Run what you want to do. The full list of characters to use here is detailed in a Microsoft Learn article.

Some examples I have already incorporated into my daily use include the following, with a few sample entries after the list:

  • <, that is the less than symbol: Window Walker, to browse through all the open windows on your computer. Start pressing down arrow after entering the less than character, or add part of the window name and then down arrow to shorten the list.
  • $, that is the dollar sign character: Windows Settings to, as you would expect, browse through all the different settings for Windows. As with Window Walker, just start pressing down arrow after entering the dollar sign, or add some of the text from the setting you want and then press down arrow. In either case, when you reach the item you want, press enter and that settings page will open.
  • =, that’s the equals sign: Calculator for performing calculations. Just enter your calculation and the result will be displayed. If, as I do, you are using a screen reader, the fastest way I have found to read the result is to press down arrow after entering my calculation. Note that you do not need to press enter after you have input your calculation. Also, again if using a screen reader, if you are comfortable with advanced screen reading techniques such as the JAWS Touch Cursor, NVDA Object Navigation or Narrator navigation commands, the result of the calculation and more can be read with these techniques. Last, after your result is displayed, you can press enter on the result and have it put on the Windows Clipboard.
  • !!, that is two presses of the exclamation mark key: History, quickly browse through your history with PowerToys Run with presses of down arrow.
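
To make this concrete, here are a few sample entries of the kind I type after pressing Alt+Space. The specific window and setting names are simply illustrations from my own use, not anything special to PowerToys Run:

< outlook, then down arrow, to jump to an open Outlook window
$ narrator, then down arrow and Enter, to open the Narrator settings page
= 52*7, which should show 364 as the result
!!, then down arrow, to revisit recent entries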

Some Notes

PowerToys Run documentation indicates you can change the default shortcut for launching from Alt+Space.

According to PowerToys documentation, pressing Tab is supposed to move you through search results and any buttons or context menus that exist. As of now, I am not finding anything communicated by multiple screen readers when pressing Tab. I still need to figure out if this is a case of the key simply not working or of items taking focus not being communicated properly.


Accessible Entertainment in the Air

Flying home from a recent vacation, I had a first for myself. I independently used the in-flight entertainment system to track my flight, listen to music, and watch a movie with audio descriptions. I even played a bit of trivia. How fun!

I’m not sure when United Airlines added accessibility to their in-flight technology, but it was available on the return flights of my trip. The system used TalkBack and a two-finger triple-tap to start the screen reader. There was a video offered to show you how to use the system. I was in the proverbial cheap seats, so I used just the touch screen option for control. Apparently, premium seats get physical buttons in the arm of the seat as well.

Aside from the map showing you flight progress and some games, I found all the other experiences worked well with TalkBack. Those that didn’t were indicated by a message saying they were not available with TalkBack. In the case of the flight map, the alternative for tracking flight distance, elevation, and such did work with TalkBack. I do wish that display had a compass option as well, but the experience just worked, so what more can you ask for when it comes to accessibility? Picking my own movie, having audio descriptions, and being able to check on my flight independently was pretty sweet!


The Good and Bad of Accessibility in Two Minutes with the Olympics

Tuning into a bit of the Olympics this morning, I saw the reality of accessibility in 2024 on display within two minutes. Audio description for a channel showing multiple sports is impressive; the schedule view of the Olympics iOS app, not so much. The progress is appreciated here, but the gaps are still far too many.

NBC has taken a page from the NFL’s Red Zone and introduced a channel called Olympic Gold Zone. It provides whip-around coverage of events, and as an example, the live audio description of a two-box split screen showing two sports, mixed in with the live commentary, is impressive.

The Gold Zone channel is part of the coverage available on the Peacock streaming service. Scott Hanson of NFL Red Zone fame is one of the channel hosts. Coverage runs for 10 hours a day, starting at 6 a.m. Central.

Downloading the Olympics iOS app and choosing the schedule, though, shows accessibility that would not make it to the medal round. My experience with VoiceOver was a jumble of words, untagged images, and more. I had no success understanding the actual schedule.

Back to the Gold Zone, the live audio description is excellent. As you’d expect, you get details that are just not included in the standard TV broadcast. Player reactions, details about the stadiums, surroundings for events in the city and more. And all of that is mixed in with both the Gold Zone host and announcing from the sports. It will be a fun two weeks of athletic competition.


Audio Ducking Enhancements in iOS 18

If you use a screen reader, the concept of audio ducking is likely not new to you. For those unfamiliar with the term, it refers to a concept where any audio outside of a screen reader’s speech is lowered automatically when a screen reader is communicating.

Apple has made some changes in this area that for me have been quite positive. You can now adjust both when audio ducking is applied and have greater control over the level of ducking. This is done with settings changes for both audio ducking and volume in iOS 18.

To make these adjustments, ensure that both volume and audio ducking are items you have added to the VoiceOver rotor. Then use those options as described here.

Previously, audio ducking was strictly an on/off choice. In iOS 18, the on setting has been replaced with two choices. You can now have audio ducking set to always or only when speaking. This leaves you with a total of three settings: off, only when speaking, and always.

The ability to adjust the amount of ducking is a bit more subtle to discover. In fact, when I originally discovered the option to adjust volume to percentages above 100%, I thought it was a bug. Setting a volume greater than 100% was not producing any detectable change for me, and it seemed odd to be able to set volume to more than 100% in the first place.

The way this all works together is that setting a volume greater than 100% does not make anything louder; it controls how much audio is lowered when ducking is applied. For example, with a volume setting of 105%, the audio that is ducked is lowered by 5% from its original volume. Likewise, set a volume of 150% and ducked audio plays at half of its original volume.
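
If my understanding of the behavior is right, the relationship works out to the following; this is my own way of expressing it, not anything from Apple’s documentation:

ducked volume = original volume × (200% − volume setting) ÷ 100%

For example, a setting of 125% should leave ducked audio at 75% of its original level.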

The ability to adjust the amount of ducking is a welcome enhancement. Depending on the source audio, the amount you want it lowered to still be audible but not impact screen reader speech can vary greatly. The result here is like having a specific volume control for the ducked audio.

Again to use these features, ensure that both volume and audio ducking are added to the VoiceOver rotor. Go into Settings:Accessibility:VoiceOver:Rotor and select those options along with anything else you want to use.


If You Tag, I Will Read

In my experience, a fundamental disconnect exists between accessibility and the investing world when it comes to a statement that is blasted all over every investment web site when you are about to invest: that you should carefully read the prospectus before making any investment. Good luck with that, as these documents are rarely, if ever, properly tagged for accessibility.

As just one example, typically deep within these multipage documents are tables of the individual investments the mutual fund or ETF holds. Yet every prospectus I’ve tried to read, more than 50 in the last few weeks from at least 20 different companies, fails to tag these tables properly.

This is just the tip of the iceberg when it comes to accessibility of these documents. Should we talk about the charts and graphs in the same documents?

The Securities and Exchange Commission should mandate that, at least to sell securities in the U.S., all investment materials must be WCAG 2.1/2.2 AA conformant, and should give the industry one year to comply. If legislation is needed to make such a mandate enforceable, then Congress should craft and enact it.

Given the number of employers who include 401(k) programs as part of employment, every company that offers this to employees should be holding the investing world accountable for this today.

If anyone knows of an investment company or ETF or mutual fund provider who actually does these documents correctly today, I’d love to hear about it.


Web Accessibility Failures and a Basic Financial Task of ETF Screening

What often gets lost in all the talk of accessibility, conformance, WCAG standards, and more is how challenging task completion can be for an end-to-end experience. I recently celebrated another birthday and have been doing part of my overall evaluation of the financial tools and services I use in my financial life.

I did this back in 2017 when I moved back to Wisconsin and again in late 2022 when I changed jobs. This time the driving factor is nothing quite so dramatic but rather a departure from one financial services company that simply wasn’t making progress on some accessibility basics where I honestly thought they might be different.

I’m certainly not here to give anyone investment advice, but for my purposes I have been evaluating ETF screeners from several of the leading financial services companies and industry resources. I’ve yet to find one that I’d consider truly usable with a screen reader, and just about every one I tried had a blocking accessibility issue at some point.

Frankly, the interactions required to complete the task are relatively few: choose various criteria for the evaluation, get a list of results, interpret the data, and be able to take action on items of interest.

The disappointing thing is that most of the accessibility failures are still the same basic challenges of poorly named controls, custom web controls that fail accessibility, poorly constructed tables, charts that fail to address accessibility and more.

None of the tools I tried were completely broken. But kind of works doesn’t cut it for this kind of task. Nor does mix and match between providers to work around issues in one tool.

What is also disappointing here is that the biggest workaround for these sorts of challenges is often downloading the results for processing in Excel or another application. However, every experience I tried puts a restriction on the number of results you can download to the point that downloading the data is not a realistic option.

Unlike my previous forays into this task, a new wrinkle around being able to download the results has emerged, and it was present on more than one site. By this I mean that you actually have to interact with some control to choose the type of file you want, and those experiences fail web accessibility basics.

These sorts of tasks need to be models of web accessibility. I’ve long said that it is a struggle to try and accomplish a task, learn the experience and wrestle with accessibility issues all at the same time.

Add in the fact that in an area such as financial tools, it isn’t enough to be accessible. A financial tool that is very accessible but works poorly at the financial task is not helpful. Additionally, assuming there was something that stood out in this area, changing financial service providers isn’t always a straightforward option. It can be costly, has potential tax and other consequences, and may not even be possible depending on the type of investment account you are trying to move.

Finally, much like healthcare, financial management is a highly personal task. Many of the last-resort strategies of asking a friend or family member to assist are not good options here.

In just about every case, the tools I’ve tried come from industry players who have ongoing accessibility efforts, with people who I am sure are as dedicated to accessibility as I know I am. Given my own employment, and as I wrote in my post on Ethical Accessibility, I understand the challenges for everyone involved in these situations.

Several of the same companies here are listed as corporate partners for Disability:IN. Perhaps that organization needs to expand what they measure as one example.

Those more familiar with the legislative and executive processes of government may know whether an agency such as the Securities and Exchange Commission (SEC) could adopt a requirement that any organization involved in the buying and selling of financial instruments meet certain accessibility criteria in order to conduct such business. Legal requirements tend to bring a level of clarity, for better or worse, that is otherwise hard to achieve in many organizations.

Many of these organizations already make claims of acting as a fiduciary, especially if you use their advising services. I find it hard to call this an accurate statement when the cornerstone of fiduciary responsibility is putting a client’s needs ahead of your own interests. Comprehensive accessibility would seem to be a must if you were serious about this.

Writing 101 would say I should close this post with some call to action or other engagement strategy. Well, there is no great close here just now beyond a call for the financial industry to step up efforts because what is happening today is broken.


Trying Apple’s Personal Voice

Apple recently introduced Personal Voice to newer devices across their hardware lineup. I have had a little experience with the basic concept behind this sort of technology from my time at Microsoft, where I dabbled with one of Microsoft’s Azure cognitive services to do something similar.

The basic concept behind these experiences is that you record some set of known text and then software converts that into a synthetic version of your voice. In Apple’s case it is 150 phrases ranging from just a few words to maybe at most 20 words in a single phrase.

After you finish recording, there is some processing time and then your voice is ready to use. On an iPhone 15 Pro, my voice was ready in about five hours. You are not able to do anything else with the phone while this is happening. On an M1 MacBook Air from 2020, processing took about two hours and I was able to do other tasks at the same time, such as writing this blog post.

Once your voice is created, you can use it as one of the voices available with Apple’s Live Speech feature. This allows you to type in various apps where you would typically use your voice and have the synthetic voice speak for you. It complements the existing voices Apple makes available and has the added benefit that the voice used in the experience bears some relationship to your own. In situations where a person may know ahead of time that they are going to lose their voice, it does offer some ability to preserve your own speech.

Multiple factors influence the quality of the end result here: microphone, recording environment, and more, just to name a few. For short phrases it likely is not noticeable, but in my samples even the pace at which I appear to have read the phrases was different. There is a 21-second difference between the two voices reading back the same text.

I made two voices in trying this experience. The first was recorded using the default Apple headphones on an iPhone 15 Pro. The second using an Arctis 7 headset. Both samples are my Apple Personal Voice reading my blog post on Accessibility Island.

I have also made a sample of my original voice speaking three phrases, followed by Apple’s Personal Voice, created from the Arctis 7 recordings, speaking those same phrases. The Personal Voice versions are the result of my typing the phrases into an edit box and asking for them to be spoken using my newly created voice. The phrases are in this recording, with each original voice sample followed immediately by the Personal Voice speaking the same phrase. After all three phrases are played, the entire series is repeated once. The phrases are:

can you call me in an hour

Did you remember to take out the trash?

Is she going to the grocery store now or in the morning?

Creating a Personal Voice is straightforward. On whatever device you are using, go to Settings:Accessibility:Speech:Personal Voice. You’ll be prompted to record a short phrase to test your recording environment and advised of any changes you should make, such as reducing background noise. You then start the process of recording the 150 phrases. They do not all need to be recorded at once. When you are finished, you’ll be advised to lock your phone if doing this on an iPhone, or just to ensure your computer is charged if using a Mac.

When the voice is created, you can start using it with Live Speech by going to the same Speech area of Accessibility settings and going into Live Speech. Turn Live Speech on and then pick from the list of voices. Your Personal Voice should be listed.

If you are doing all of this with VoiceOver, Apple’s screen reader, as I did, the process of creating a voice works well with VoiceOver. You can use VoiceOver to read the phrase to be recorded, then activate the Record button and repeat the phrase. Recording stops when you stop speaking. If you turn on a setting for continuous recording, you will advance to the next phrase automatically and can repeat the process. I did notice that sometimes VoiceOver automatically read the next phrase, but not always. Focus seems to go to the Record button, and I suspect there is a timing issue between the phrase being spoken and VoiceOver announcing the newly focused button.

Having created two voices, I would say it is probably a good idea to take a short break during the reading of the 150 phrases from time to time. I found myself not speaking as clearly as I wanted once in a while as well as having sort of the same singsong phrasing. Listening to my voice samples and how the voice came out, I would also say the microphone used has a big impact on the voice quality. This isn’t surprising but is made apparent to me comparing the samples of what my recordings sounded like and how that turns out when the same text is spoken by Personal Voice. I don’t think either microphone that I used would be what I would recommend for creating a voice to be used permanently.

I was curious if Apple would allow the personal voice you create to be used with VoiceOver. I didn’t expect it would be possible and that does seem to be the case.

As with pretty much anything in AI, synthetic speech is a rapidly changing technology. There are certainly higher quality voices in the arena of synthesized speech but Apple has done a good job at allowing you to tap your own voice on consumer hardware in an easy to use process. Listening to my own voice, it is clear it isn’t me and I wasn’t expecting it to be. But even on the basic hardware I used, there are characteristics of my voice present and if I were in a situation where I was going to lose my physical voice permanently, this is one option I would definitely explore further.


A Simple Example, Avoid Breaking What Works in Table Functionality

Opinions may differ on this, but I am of the opinion that you should not add extra instructions to the names of column headers in tables on the web. If you are going to do so, ensure it is done in a fashion that allows a screen reader to avoid announcing those instructions if the user does not want them communicated.

I recently encountered an experience with one of the financial services I use where some excellent table functionality is ruined by breaking this simple rule. It makes getting the actual data from the tables much more difficult. The tables properly use both column and row headers and have good keyboard navigation even when not using a screen reader, as just two examples of what works well.

In the case of my financial service, an example column name is now:

Last Price, (press Enter to sort)

Because this is part of the column header, albeit hidden visually, you must now hear this or read it in braille before you get to the details for a cell when moving through a row of information with a screen reader’s table reading commands.

Instead of just hearing the column name and the value, I must hear the column name, these instructions, and then the value. This is now how the result is communicated when moving to a given cell with a screen reader’s table navigation commands.

Last Price, (press Enter to sort) $174.01-$0.81

Examining the HTML, I find this is part of the column header name.

<span class="screen-reader-only">, (press Enter to sort)</span>

I would suspect this was added in an attempt to be helpful. It is complete speculation on my part but it is even entirely possible that a usability study was done on this table and one of the questions asked was if the users knew they could sort the table. I would be willing to bet, continuing my speculation, that the answer was no and this extra text for screen reader users was added.

The problem is that this breaks the actual functionality of the table. Reading through the row, you are trying to study the details of the data. That flow is interrupted by the instructions on sorting being inserted between the column header and the data. You either have to learn to tune it out or find some other strategy for ignoring the instructions. Again, the text is inserted as part of the column name, so it isn’t as if the screen reader can ignore half the column name here.

It is also interesting that prior to the table, there is a full paragraph marked up with the same “screen-reader-only” class giving all sorts of instructions on reading the table with a screen reader.

There is a range of options to improve on this, in my opinion. At minimum, given the way the full site has been constructed, move these sorting instructions into the other instructions that already exist for getting information from the table.
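
As another sketch of what might work, and to be clear this is my own illustration rather than anything I have tested with this site, the sorting hint could be attached as a description instead of being baked into the accessible name, with the standard aria-sort attribute announcing that the column is sortable. The id used here is a placeholder for the example:

<!-- Hidden hint placed once, near the existing instructions before the table.
     The id "sort-hint" is made up for this illustration. -->
<span id="sort-hint" class="screen-reader-only">Press Enter to sort</span>

<!-- The column keeps its clean name; the hint is referenced as a description. -->
<th scope="col" aria-sort="none" aria-describedby="sort-hint">Last Price</th>

Screen readers generally announce descriptions after the name and value, and most offer verbosity settings to reduce or skip them, which would keep the table reading flow intact.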

Other solutions are possible of course, and my point is more to show how, in trying to be helpful, you can easily break what works well with screen readers and other assistive technology.
