
Variations on an Automatic Image Description

Reading through Twitter today, the following tweet showed up on the timeline of one of the people I follow as a retweet.

Doc🐕 – @DocAtCDI: A truck loaded with thousands of copies of Roget’s Thesaurus spilled its load leaving New York

Witnesses were stunned, startled, aghast, stupefied, confused, shocked, rattled, paralyzed, dazed, bewildered, surprised, dumbfounded, flabbergasted, confounded, astonished, and numbed.

I found the tweet amusing and was going to retweet it, but noticed it had a picture without any alt text. This led me to wonder what was in the picture. From the tweet text, I’m assuming it most likely shows some vehicles on a road with a bunch of books scattered about.

I suspect most reading this know that iOS has the ability to automatically describe pictures. This functionality started in iOS 14. When using VoiceOver you can have a short machine-generated description of pictures such as the one attached to the tweet here.

Newer versions of iOS extended this functionality to include a feature called Explore Image. That allows you to use VoiceOver to step through individual objects recognized in the image. It can be accessed with a rotor option when focussed on the image. Here is where the experience gets a bit interesting.

My go-to Twitter app on the iPhone is Twitterrific. The accessibility of the app has been outstanding for years and the maker has been highly responsive if issues do creep in.

I’ve also been exploring another highly accessible Twitter app named Spring. So far I’ve had a great experience with this app as well.

As one would expect, both Twitterrific and Spring offer the ability to view images included with tweets. When images are viewed in either app, the VoiceOver automatic image description and Explore Image functionality work. The differences when the same picture is viewed in two different apps, using the same automatic image description and exploration technology, are plainly obvious though.

First off, the automatic description when viewing the image in Twitterrific says:

an illustration of vehicles on a road X. VETERAN’S PLUMRNO. Rall

That same image viewed in Spring yields the following automatic description:

a group of cars driving on a highway ETERAN ‘S PLUMPING

Both descriptions mention that the picture deals with vehicles on a road in some fashion, and both include what I’d suspect is the text of a sign on a van or truck from a plumbing company. Again, the descriptions come from Apple, not the individual apps.

A picky point, but cars do not drive; people drive them. I might not know what is in the photo for certain, but I am quite confident it isn’t a bunch of Teslas with the self-driving mode engaged.

It is also interesting how the image description when using Spring is a bit more detailed. It uses the terms highway and cars, whereas the Twitterrific version is more generic in nature. The detail about cars when using Spring is even more interesting when using the Explore Image feature to review the individual objects in the picture.

Again, the newest versions of iOS added a feature called Explore Image to VoiceOver. Focus an image, change the VoiceOver rotor to Actions and one of the choices will be Explore Image. This opens a pop-over experience with individual objects from the picture. You can use VoiceOver previous and next commands to move from object to object and have them highlighted visually in the picture.

Here are the objects from the picture in the tweet I mentioned when explored with Twitterrific:

  • Automobile near left edge
  • Automobile Centered
  • Automobile near right edge

Recall how the automatic description for Spring talked about cars driving on a highway? One can only wonder where the cars went and where the train came from when using the Explore Image feature. Here is what is reported when exploring the image in Spring.

  • Van near bottom-left edge
  • Van near right edge
  • Van near bottom-left edge
  • Train near top edge

Automatic image descriptions are another helpful tool for shaping the accessibility landscape. They’ll be even more impactful if the technology continues to advance, reducing the variability that something as simple as viewing an image in a different program seems to introduce, and improving the accuracy and detail of what is described.

Published in Accessibility
