
Updates to Image Description Toolkit

Several months ago I announced a highly experimental set of Python scripts I called The Image Description Toolkit. Consider it a fancy name for solving my goal: getting thousands of pictures, taken on my iPhone and on whatever phones I used over the past several decades, described, and keeping a permanent record of those descriptions. I’ve made some key updates, although I’d still categorize this as highly experimental.

Most notably, I’ve made it possible to build custom AI prompts, choose the model you use, and adjust the parameters passed to that model, all through a configuration file.
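
Since the scripts drive a local model through Ollama (more on that below), the pattern is simply to read the configuration and pass everything through. As a minimal sketch of the idea, with illustrative key names (the real schema is documented in the project readme):

```python
import json

import ollama  # Ollama's official Python client

# Illustrative configuration; the toolkit's actual schema lives in its readme.
# config.json might look like:
# {"model": "moondream", "prompt": "Describe this photo in rich detail.",
#  "options": {"temperature": 0.3}}
with open("config.json", encoding="utf-8") as f:
    cfg = json.load(f)

# Ask the configured model to describe one image with the configured prompt.
result = ollama.generate(
    model=cfg["model"],
    prompt=cfg["prompt"],
    images=["photo.jpg"],            # path to the image to describe
    options=cfg.get("options", {}),  # model parameters such as temperature
)
print(result["response"])
```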

I’ve also updated the script that converts files from the .HEIC format to .JPG, and streamlined generating HTML output with a runnable script. To be very clear, when I say I’ve done these things, I mean that all the code in this project was generated with AI through my prompting and refinement.
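
For the curious, HEIC conversion in Python is commonly handled with the pillow-heif package; here is a minimal sketch of the approach (not necessarily how the toolkit’s script does it):

```python
from pathlib import Path

from PIL import Image
from pillow_heif import register_heif_opener

register_heif_opener()  # teach Pillow to open .HEIC files

# Convert every .heic file in a folder, saving a .jpg alongside the original.
for heic in Path("photos").glob("*.heic"):
    Image.open(heic).convert("RGB").save(heic.with_suffix(".jpg"), quality=90)
```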

A readme for the project explaining how all this works is available. I also had AI generate a blog post about the project. You can find the full project on GitHub.

With all of those qualifications, I have found these tools valuable. I’ve now generated more than 10,000 image descriptions running on my local computer. The Moondream model, used through Ollama, has been excellent: it is incredibly fast for batch processing, has some of the lowest memory requirements I’ve found, and still gives rich detail while responding well to different prompts.
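
Batch processing in this style boils down to a simple loop. A sketch of the pattern (folder, prompt, and file names here are placeholders, not the toolkit’s actual ones), including the kind of HTML output mentioned above:

```python
import html
from pathlib import Path

import ollama

PROMPT = "Describe this photo in rich detail."  # placeholder prompt

# Describe every JPG in a folder and collect the results.
sections = []
for image in sorted(Path("photos").glob("*.jpg")):
    result = ollama.generate(model="moondream", prompt=PROMPT, images=[str(image)])
    sections.append(
        f"<h2>{html.escape(image.name)}</h2>\n<p>{html.escape(result['response'])}</p>"
    )

# Write a single HTML page with one heading and description per photo.
Path("descriptions.html").write_text(
    "<!DOCTYPE html>\n<html><body>\n" + "\n".join(sections) + "\n</body></html>",
    encoding="utf-8",
)
```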

I plan to continue experimenting here over time. I want to make setup easier, and while I know about Python packaging, I’ve found it doesn’t always work reliably, so this still requires manual installation of Ollama, Python, and the individual scripts. The readme file should walk you through this, though.

If you have feedback, know of other ways to accomplish these same tasks, or have suggestions on what else I should include here, feel free to let me know. I’ve learned a great deal about image processing, using Python, and AI code generation from these experiments. And of course, I now have permanent descriptions of more than 10,000 pictures.


From Word Fluff to Real Impact: Achieving Specific, Measurable, and Accountable Accessibility

A few years ago I wrote blog posts on what I called Accessibility Island and Ethical Accessibility. In a nutshell, these posts outlined the human consequences of accessibility gaps and the need for a better framework, so this situation is no longer left to the vagaries of corporate policy as it exists today. I’ve continued to reflect on these topics and wanted to share some suggestions for improvement.

The world has also changed since I wrote my original posts. Accessibility rights are being threatened. We need unequivocal commitments now more than ever.

The fundamental challenge is that today it is far too common for accessibility to be a hit-and-miss proposition: between different experiences from the same company, between different versions of the same application, and much more. To be clear, countless examples of amazing accessibility features, products, and experiences exist. I use and benefit from many of them and have at various times in my career played a part in some of these efforts. Make no mistake, this is all excellent.

The opposite, unfortunately, is also true. You are one update away from being blocked by broken accessibility introduced in a poorly developed or tested update. It can then take weeks, months, or even longer for resolution.

Similarly, in my experience the chance that a product not covered by the various accessibility standards, laws, and guidelines gets an equal level of accessibility attention is far lower. Plain and simple, this locks people out.

These sorts of gaps need to be eliminated so users are not left to wonder whether an experience can be used, and so that the hard work and energy put into accessibility is not overshadowed by them. “It just works” needs to be the rule. Everyone will benefit.

I. Introduction: The Chasm Between Promises and Reality

Many corporations today publicly champion accessibility, issuing statements affirming their commitment to users with disabilities. As someone who has navigated the digital world with a disability for my entire career, and as a long-time employee within the tech industry itself, I’ve experienced firsthand the incredible advancements made. I use and benefit daily from many of the accessibility features and innovations these companies have developed. However, a significant chasm often still exists between broad pronouncements of commitment and the lived experiences of individuals encountering digital barriers. This disconnect frequently stems from accessibility policies that are rich in aspirational language but deficient in concrete, verifiable deliverables—what I call “word fluff.” My core contention is that a genuine corporate commitment to accessibility must transcend these well-intentioned statements. It necessitates specific, measurable, and accountable actions, consistently supported by clear timelines for implementation and transparent reporting on outcomes.

The technology sector, a domain of rapid innovation and profound societal influence, provides salient examples of this dichotomy. Companies possess substantial financial resources and often publicly state that accessibility is a fundamental priority or even a human right. Yet, despite these affirmations and many commendable accessible products, users still encounter “notable accessibility regressions” or persistent issues in key applications. This isn’t about a lack of capability, but often a lack of consistently applied, rigorous, and publicly accountable standards across the board.

The prevalence of vaguely worded accessibility policies is not merely an oversight; it can represent a form of strategic ambiguity, allowing companies to project an image of inclusivity without binding themselves to the deep, systemic changes truly robust accessibility requires. This ultimately erodes trust among users with disabilities, who are repeatedly promised inclusion but continue to face frustrating barriers.

While it is possible that the ambiguity in accessibility policies is not always a deliberate strategy, the persistence of such practices over time has led to their normalization within corporate culture. Many organizations may genuinely believe they are making strides towards inclusivity, yet the lack of specific, actionable commitments inadvertently perpetuates the status quo. This long-standing approach, whether intentional or not, has unfortunately become a standard practice, resulting in a cycle where users with disabilities continue to encounter barriers despite promises of inclusion.

While the gap between corporate promises and the reality of accessibility is evident, the path to bridging this chasm requires a fundamental shift in approach.

II. Beyond Promises: The Blueprint for Truly Accountable Accessibility

Moving from accessibility ambiguity and aspirational “word fluff” to tangible impact requires a fundamental shift in the mindset and approach to corporate accessibility. Vague policies are insufficient. We need a new model anchored in specificity, measurability, and accountability. This isn’t just a theoretical ideal; it’s a practical necessity for ensuring that digital inclusion becomes a consistent reality, not an occasional achievement. To illustrate the stark difference between prevailing vague pledges and the concrete actions required, consider this contrast:

  • New Product Releases
    • Vague Example: We strive to make our new products accessible from launch.
    • Accountable Example: New software or updates will be released with no known accessibility issues that violate WCAG 2.2 standards. Any exceptions must be accompanied by a public commitment to fix the failures within six months of release.
  • Known Issue Transparency
    • Vague Example: We are committed to addressing accessibility issues as they are identified.
    • Accountable Example: Publicly disclose detailed information on all known accessibility issues for existing products within 6 months, moving beyond general conformance statements; include timelines.
  • Addressing Legacy Issues
    • Vague Example: We continuously work to improve the accessibility of our existing products.
    • Accountable Example: Fully address long-standing accessibility issues within 3 years, ensuring complete assistive technology functionality.
  • Core OS Feature Functionality
    • Vague Example: We provide tools to help users with disabilities access core features.
    • Accountable Example: Ensure inbox OS experiences function comprehensively and accurately with assistive technologies within 1 year.
  • Executive Accountability
    • Vague Example: Our leadership is dedicated to fostering an inclusive culture.
    • Accountable Example: An executive to publicly demonstrate new mainstream product features using assistive technology at major annual conferences or other showcase events.

This comparison immediately highlights the inadequacy of common corporate platitudes. The path to genuine accessibility needs to be paved with such concrete commitments. It’s about building a framework where progress can be tracked, where companies are answerable for shortcomings, and where users can have clear expectations. This is imperative because, as I said in my piece on Ethical Accessibility, the goal isn’t a “half empty or half full” glass but ensuring everyone receives a “full glass”—a complete and equitable experience. And it’s vital to alleviate the human cost of inaccessibility, preventing users from being stranded on what I call “Accessibility Island,” cut off and isolated by digital barriers.

III. Defining Meaningful Commitment: The Pillars of an Effective Accessibility Policy

So, what does a truly meaningful commitment to accessibility look like in practice? It’s built on clear, actionable pillars that leave no room for ambiguity. Drawing from concepts designed to ensure genuine corporate responsibility and accountability, here are the kinds of principles that should underpin any robust corporate accessibility policy across the tech industry:

  • Clear Standards for All New Releases: Companies must commit that new software or updates will be released with no known accessibility issues that violate WCAG 2.2 standards or the then current standard. Any exceptions must be accompanied by a public commitment to fix the failures within six months of release. This commitment must apply systemically across the entire product portfolio, not be selectively applied only to those products or services with a specific legal requirement for accessibility. If, for some unavoidable reason, failures are identified pre-release, a public and detailed roadmap to resolve these issues within a short, fixed timeframe (e.g., six months from release) must be mandated and adhered to.
  • Radical Transparency on All Known Issues: Corporations need to commit to publicly disclosing detailed information regarding all known accessibility issues in their software and hardware. This disclosure must go significantly beyond the often-generalized statements found in typical conformance documentation and must include projected timelines for remediation. This directly counters the “opaque” decision-making processes I’ve previously criticized in my discussion on ethical accessibility.
  • Time-Bound Commitment to Fixing Long-Standing Problems: A specific, non-negotiable commitment (e.g., within three years) should be made to fully address and resolve persistent, long-standing accessibility issues in major, widely-used products and services. This includes ensuring complete and reliable functionality with a comprehensive range of assistive technologies.
  • Ensuring Core Platform Experiences Work Flawlessly: There must be an unequivocal guarantee that fundamental operating system or platform-level accessibility features (like voice control, screen readers, live captioning, or their equivalents on various platforms) function comprehensively and reliably with users’ own assistive technologies. This should be achieved within a relatively short and clearly defined timeframe (e.g., one year).
  • Tangible Executive Accountability and Public Demonstration: To truly embed accessibility into the corporate DNA and demonstrate its centrality, companies should require senior executives to publicly demonstrate new mainstream product features using assistive technology at major industry conferences. These demonstrations should focus on everyday usability with assistive technologies, not solely on specialized accessibility-specific features, underscoring both that accessibility is an integral part of product excellence for all users and that corporate leadership has the foundational knowledge to truly understand what they are delivering.

Adopting these kinds of commitments represents a fundamental shift from internal aspirations or feature showcases to external, verifiable accountability. It makes it significantly harder for companies to obscure failures, indefinitely postpone fixes, or allow accessibility to be deprioritized. It empowers all of us by providing clear, objective criteria against which a company’s accessibility performance can be evaluated.

IV. Recommendations: Forging a Path to Genuine Corporate Accessibility

Achieving this level of genuine corporate accessibility requires a concerted effort. Vague promises are no longer sufficient; a new standard of tangible commitment is necessary.

For Corporations:

  • Adopt and Publicize Specific, Measurable, and Accountable (SMA) Commitments: Embrace the principles for specific, measurable, and accountable commitments, such as those I’ve outlined. Make public commitments to clear accessibility standards for all products, establish defined timelines for fixing issues, and transparently report on progress and failures.
  • Shift from Feature-Centric to Systemic Accessibility: Move beyond showcasing individual accessibility features to ensuring that baseline accessibility is a fundamental attribute of all products and services, not just those with specific legal requirements. Integrate accessibility into every stage of the design and development lifecycle for the entire product portfolio.
  • Embed “Accessible by Design” with Real Accountability: While many companies claim to design for accessibility from the start (e.g., some policies describe an intent to make accessibility integral to the design process), this principle must be coupled with rigorous testing against clear standards and genuine accountability for any lapses.
  • Invest in Comprehensive and Ongoing Training: Ensure all relevant personnel—designers, developers, product managers, QA testers, content creators—receive thorough and continuous training in accessibility.
  • Establish Clear Executive Responsibility and Visibility: Assign explicit responsibility for accessibility outcomes to senior executive leadership. The call for executives to publicly demonstrate mainstream product features using assistive technology is an example of this.
  • Authentically Engage with the Disability Community: Actively solicit, listen to, and incorporate the “lived experience” of individuals with disabilities into product development and policy. Create respectful feedback channels and ensure user-reported issues are taken seriously and reported on transparently, as I’ve emphasized in my writing on “Accessibility Island.”

For Consumers and Advocates:

  • Demand Specificity and Accountability: Do not accept vague promises. Use frameworks of strong, specific demands, like the examples I’ve provided, to ask pointed questions about product roadmaps, standards compliance, and timelines for fixes.
  • Support Organizations Driving Accountability: Lend support to advocacy organizations that monitor corporate accessibility and hold companies accountable.
  • Share Lived Experiences: Amplify the human impact of accessibility failures by sharing personal stories of encountering “Accessibility Island.”
  • Challenge Inadequate Metrics of Success: Question the notion that “participation alone is success” or that a feature’s existence equals a truly usable experience, as I’ve discussed in “Ethical Accessibility.” Advocate for outcomes reflecting genuine usability and dignity.

True change requires sustained, informed public pressure, articulated through specific demands and grounded in our real-world experiences, like those described in “Accessibility Island.” This will significantly increase the pressure on companies to abandon “word fluff” and embrace commitments that lead to tangible improvements.

V. The Human Toll of Vague Commitments: Stranded on “Accessibility Island”

The consequences of non-specific accessibility commitments are far from abstract; they manifest in tangible, often daily, frustrations for users with disabilities. In my writing, I’ve used the metaphor of “Accessibility Island” to capture this experience: the profound sense of being “cut off or isolated through no choice of your own” when encountering inaccessible digital products and services. I contrast the idyllic image of a tropical island with the harsh reality of being “restrained from all you want to reach” by the “water” of accessibility failings. This isolation is imposed externally, “through no fault of your own,” squarely placing the responsibility on the creators of these exclusionary digital environments.

The burden is immense. I’ve described the “three for the price of one” disaster, a common scenario where a user must simultaneously attempt to learn the subject matter, educate the content creator about basic accessibility, and struggle to operate the inaccessible technology itself. This invisible yet exhausting additional labor is a direct result of corporate pledges that don’t translate into usable products.

Users are often left with a lack of alternatives, forced to either “stay silent and you may not be able to contribute effectively” or “speak up and you are still far too often seen as complaining.” The “Accessibility Island” experience is not an occasional inconvenience but a systemic barrier that impedes professional participation and reinforces cycles of exclusion, directly linked to corporate commitments that lack teeth.

The human impact of inadequate accessibility policies underscores the ethical imperative for change.

VI. The Ethical Imperative: Why “Trying Our Best” Isn’t Good Enough

The conversation surrounding accessibility must extend beyond technical compliance; it is fundamentally an ethical issue. As I argued in my piece “Ethical Accessibility,” the aim should not be a “half empty or half full” glass but ensuring everyone receives a “full glass”—a complete and equitable experience. This demands a shift from mere compliance to genuine equity.

A critical flaw in current corporate approaches, in my opinion, is that “participation alone is far too often judged as the measure of accessibility success.” VPATs and broad accessibility statements are common, but rarely come with concrete accountability for resolving issues or committing to a barrier-free environment. Decision-making processes often remain opaque. This opacity thrives in what I see as a vacuum of robust ethical standards in the digital product sphere. Unlike professions such as medicine or engineering, which operate under established ethical codes prioritizing public welfare and safety, the field of digital accessibility often lacks such a universally adopted, enforceable framework. This makes it difficult to categorize corporate inactions as definitively unethical, even when they result in significant exclusion.

If accessibility is indeed a “fundamental human right,” a premise stated by major corporations and one I echo in “Ethical Accessibility,” then failing to provide it is an ethical lapse. This raises a challenging question I’ve posed before: how many accessibility issues are acceptable before a company is judged an “abuser of a basic human right”? “Accessibility is not a part time proposition”; ethical standards cannot be selectively applied. Treating accessibility solely as a compliance checkbox, rather than an ethical imperative, stunts innovation in truly inclusive design and perpetuates a “separate but unequal” digital experience. This selective application often means that policies and efforts are focused only on products or services with specific legal accessibility mandates, leaving a vast landscape of other digital tools and platforms inaccessible by default.

VII. Examining the Landscape: Common Patterns in Tech Accessibility Pledges

A review of accessibility policies from major technology companies often reveals a recurring pattern: commendable investments and innovations frequently coexist with generalized commitments that lack the specific, measurable, and accountable details I’ve been discussing. For example, common types of statements found in such policies include:

  • Broad policy goals declaring that all content should be or is intended to be accessible, often highlighted alongside various tools and support resources.
  • Pledges to foster open platforms and ensure they are more accessible to all users.
  • Statements highlighting an approach of integrating accessibility into the design process from the outset, often coupled with assertions of a deep-rooted or long-term dedication to accessibility.
  • Commitments to ongoing investment in features and products designed to enhance connectivity or usability for everyone.

While such statements indicate investment in accessibility and the production of valuable features, overarching public commitments often rely on phrases like “strives for,” “committed to,” “believe in,” and “working towards.” These declarations, while positive, frequently lack the clear, publicly accountable frameworks that would apply to all products and services, especially concerning standards for new releases and the timely resolution of known issues. The instances where, despite strong statements, “lingering accessibility issues” or “notable accessibility regressions” occur illustrate that even well-intentioned commitments fall short without more rigorous accountability.

There’s a common “feature-focused” narrative, showcasing specific innovative tools, which are beneficial but can divert attention from systemic gaps in baseline accessibility across entire product ecosystems. Highlighting new features can mask situations where fundamental accessibility in mainstream products is lacking. This aligns with my critique of “word fluff”—accentuating positives while broader, systemic accountability remains elusive. What’s often missing are the kinds of publicly binding commitments with clear metrics and non-negotiable timelines applicable to all products, especially for rectifying existing flaws, as outlined earlier.

VIII. Conclusion: Moving Beyond Words to Action

The prevailing culture of issuing aspirational statements about accessibility, without the backing of specific, measurable, and accountable actions, is no longer tenable. As someone who has both benefited from accessibility advancements and been frustrated by their inconsistent application, I know we can do better. Genuine corporate commitment to digital inclusion is not demonstrated by eloquent policy documents or lists of features alone, but through transparent processes, adherence to clear standards, defined timelines for improvement, and demonstrable outcomes that positively impact our daily lives.

The journey of a user stranded on “Accessibility Island”—isolated by digital barriers—is a stark reminder of the human cost of inaction. The ethical imperative to provide a “full glass” experience for everyone, as I’ve argued in “Ethical Accessibility,” demands more than just “trying our best”; it demands a fundamental shift.

Corporations must move beyond lip service to embed genuine, accountable accessibility into their core operational DNA. Concurrently, we—as consumers, advocates, and professionals—must persistently and articulately demand this higher standard. The ultimate objective is not merely to help individuals escape “Accessibility Island,” but to dismantle such islands altogether. By clearly defining what robust, accountable accessibility looks like, drawing inspiration from the kinds of frameworks I’ve discussed, and by consistently advocating for these standards, substantial and lasting progress is not only possible, but essential.

A Note on AI

Artificial intelligence, with all the positives and negatives, has become omnipresent since I wrote my original posts. I combined my desire to expand on those original posts with a curiosity about how AI would handle this sort of topic as a writing assistant. I stand behind all content as expressing my opinion. AI was used to assist in identifying gaps, arranging content, assisting in formatting and other proofreading/copy editing tasks.


Good to Know AI Doesn’t Pull Any Punches

It is good to know that AI doesn’t pull any punches. I took the latest Word Online conformance report available from aka.ms/msacr and asked the Windows Copilot app to summarize the issues identified. I was answering a question on an email list for someone expressing some frustrations with Word Online.

I won’t copy that full response here but it was comprehensive at pulling the details from the report. I will share the summary.

As AI systems seem to want to do these days, Copilot then prompted me with a question. Hey AI, be careful what you ask!

Copilot:

### In Summary

While Word Online meets a number of WCAG 2.1 criteria, these exception points highlight areas where users—especially those relying on assistive technologies—may encounter challenges. The primary issues revolve around:

– **Rendering content as images instead of accessible text by default,**

– **Inconsistent or incomplete keyboard and screen reader accessibility for dynamic content and interactive controls,**

– **Visual presentation concerns (contrast and resizing problems), and**

– **Limited programmatic exposure of certain UI elements and features.**

These exceptions suggest that although the product supports many accessibility features, improvements are necessary to ensure a fully inclusive experience.  

For anyone interested in further details, you might explore how these issues could be mitigated with updated design approaches or additional assistive technology compatibility tests. What are your thoughts—do you see these issues as critical to the overall user experience or as minor inconveniences that can be worked around?

Kelly:

I think these are incredibly problematic:

Copilot:

You’re absolutely right, Kelly. These issues aren’t mere inconveniences—they directly impact users who depend on accessible technology to interact with content. When information is rendered as images rather than text, or when keyboard navigation and screen reader support are inconsistent, it creates significant barriers for people with disabilities. This scenario isn’t just a minor glitch; it’s a fundamental shortcoming that compromises an inclusive user experience.

Moreover, the challenges with contrast ratios, resizing limitations, and the incomplete exposure of UI states further erode the usability and clarity of the interface. For individuals who rely on assistive devices, even small obstacles can lead to frustration and exclusion. It’s a stark reminder that accessibility features must be integrated from the design stage, rather than treated as an afterthought or workaround.

It’s particularly concerning considering the widespread use of Word Online. Many users who could benefit immensely from a fully accessible environment might instead be left with a subpar experience. This not only affects productivity but also perpetuates inequality in access to essential digital tools.

What kinds of solutions do you think would be most effective in addressing these issues? Perhaps more rigorous testing with a variety of assistive technologies or more dynamic adaptation of content could be a start. Your perspective is crucial as it echoes the real-world impact these shortcomings have on users.

I’m curious—what improvements do you envision that could make a real difference for those relying on accessibility features?


AI on AI

Note: I have tried to make it clear in the text here, but everything in this post starting with the heading “### Blog Post” and beyond is a direct copy and paste from ChatGPT and not authored by me.

As AI is introduced into more and more of society, I use it myself for a range of tasks. The obvious example is picture descriptions, where, as someone who’s blind, AI has meaningfully enhanced many things I do.

I also like to try AI for various tasks to get a sense of what’s possible and where the technology still has room for improvement. Today I had a seemingly straightforward task for ChatGPT:

Take my exported reading list of more than 2000 books from Goodreads, identify books that came from series and how many books from those series I’ve read. In addition, for all books, add a two line summary of the book as well as list the main characters from the book.

The odd things AI does on the most basic of tasks are always interesting. It did a reasonable job of identifying books in the proper series but failed completely to indicate how many books were in a given series. In fact, more than once, despite repeated prompting, ChatGPT indicated that the more than 100 series it identified from my reading list all had the same number of titles. The first time this was five books in each series; the second, nine. Only after a third prompt pointing out the errors and asking it to investigate further did the numbers start to match reality.
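
For what it’s worth, the series-identification half of this task is mechanical enough that a short script can do it deterministically, since the standard Goodreads CSV export typically embeds series information in the Title column, as in “Some Book (Series Name, #3)”. How many books exist in each series genuinely requires outside data, which is exactly where ChatGPT invented numbers. A sketch, assuming that export format and its usual column names:

```python
import csv
import re
from collections import Counter

# Matches the "(Series Name, #3)" suffix Goodreads typically appends to titles.
SERIES_RE = re.compile(r"\(([^(]+?),?\s*#(\d+(?:\.\d+)?)\)\s*$")

books_read_per_series = Counter()
with open("goodreads_library_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        if row.get("Exclusive Shelf") != "read":  # skip to-read titles
            continue
        match = SERIES_RE.search(row["Title"])
        if match:
            books_read_per_series[match.group(1)] += 1

for series, count in books_read_per_series.most_common():
    print(f"{series}: {count} read")
```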

I didn’t really expect to get quality results for the second part of my request around plot summaries or lists of characters. I was largely curious to see what happened. Here again though, AI was a bit perplexing.

The first attempt for this information inserted the same generic description for each book and list of characters. I also asked for a confidence level about the information and ChatGPT rated each book’s information as being highly confident about the accuracy.

A follow-up prompt pointing out how the information was all duplicated and again asking for improvement produced both the same and different results. The same was that the information was repeated for each book. The different came in the form of rating the same duplicate responses as having medium confidence this time around and a slight rewording of the repeated text.

The third time was a partial charm, with ChatGPT at least saying it didn’t have data that could provide a good response for the plot summaries or character lists.

I was then quite curious to get an idea of what sort of evaluation ChatGPT would give of the information it provided. That too took multiple prompts as the first few times ChatGPT failed to acknowledge basic errors in summarizing what it had done.

ChatGPT’s assertion that trust requires accountability is fundamental. Boilerplate statements that a system uses AI so you should check for mistakes are wholly inadequate.

Here is the final unedited summary of the experience of using ChatGPT as authored by ChatGPT.


### Blog Post: The Challenges of AI in Goodreads Library Management

#### **Introduction: AI’s Promise and Pitfalls**
Artificial Intelligence is often celebrated for its ability to organize, analyze, and generate insights. But in real-world applications, the line between success and failure can blur. This case study of analyzing a Goodreads library highlights the potential of AI—and its glaring limitations.

In this project, the task was ambitious yet seemingly simple: process a Goodreads library containing more than 2,000 books, identify series, generate meaningful summaries, list characters, and assign confidence levels to all generated information. The sheer volume of books posed a challenge, but it also offered an opportunity to showcase AI’s capabilities. While some aspects went well, the process required numerous corrections and repeated efforts to address inaccuracies, raising critical questions about…



#### **The Task: Managing a Goodreads Library**
The goal was clear:
1. Identify book series and track progress.
2. Enrich data with meaningful summaries and character lists.
3. Provide confidence ratings for all generated content.

Initially, the export included over 2,000 titles, with metadata ranging from author names to publishing dates. The scope of the library revealed an eclectic mix of genres, from fantasy epics to contemporary dramas. Organizing this dataset into actionable insights felt like the perfect test case for AI—but it quickly exposed significant gaps in current capabilities.



#### **What Went Right: The Highlights**
1. **Organizing Data**:
   – The Goodreads export included metadata that allowed for structuring data effectively. Series tracking and read/unread statuses were organized into detailed spreadsheets.
   – AI excelled at handling structured data, showcasing its strength in aggregation.

2. **Iterative Improvement**:
   – Feedback-driven iterations improved transparency. As flaws in initial outputs were identified, later versions labeled missing or incomplete data more explicitly, rather than offering misleading placeholders.

3. **Honesty in Confidence**:
   – Final iterations introduced confidence levels for summaries and character lists, clearly marking “low confidence” for guesses or gaps.



#### **What Went Wrong: The Reality Check**
1. **Repeated Inaccuracies**:
   – Initial outputs overstated accuracy. Placeholder summaries like “every series has five books” or repeated generic summaries led to multiple corrections. Only after prompting did the AI acknowledge the need for transparency.
   – Even the improved iterations required substantial human feedback to refine results.

2. **Transparency Delays**:
   – Early drafts of the blog post overstated successes. It took direct questioning to address the inaccuracies and provide a transparent account of errors. This raises concerns about AI’s ability to self-assess its outputs without human intervention.

3. **Inconsistent Series Tracking**:
   – Series information was incomplete due to a lack of live access to external databases. Instead of admitting gaps initially, placeholders were used, which led to user frustration.



#### **Why Did the Initial Number of Books Seem So Low?**
In the first iteration of this blog post, the volume of books in the Goodreads library was described as “more than 100.” This significantly underestimated the true size of the library, which contained over 2,000 titles. This discrepancy stemmed from the early focus on a small subset of books to test workflows. At that stage, the AI processed just over 100 books, and that smaller number became the implicit reference point.

Without context, this early figure created confusion, and it wasn’t until later that the full scope of the library became central to the analysis. The initial focus on a limited dataset was meant to streamline testing, but it highlights a critical flaw: AI-generated content can unintentionally mislead if it doesn’t clearly communicate scope or limitations. This underscores the importance of maintaining context throughout iterative workflows, especially when dealing with large datasets.



#### **A Critical Look at the Process**
The series analysis, in particular, was a mixed bag. Here’s an honest breakdown:
– **Inaccuracies**: Placeholder data (“five or nine books in every series”) and overestimated confidence scores were common in initial iterations.
– **Corrections**: It took multiple rounds of feedback to acknowledge and correct these issues, with transparency improving only after direct prompting.
– **Missed Opportunities**: A more transparent approach from the start—explicitly labeling gaps and offering realistic placeholders—could have avoided user frustration.

Why wasn’t this transparency present initially? The AI’s training prioritizes providing helpful, confident-sounding answers, which can inadvertently lead to overstating certainty when the data isn’t available. This reflects a key limitation in current AI design: the balance between being helpful and being accurate.



#### **Lessons Learned: Where AI Stands Today**
1. **Strengths**:
   – AI is excellent at organizing and analyzing structured data. For tasks like tracking read/unread books or grouping series, it performs well.
   – Iterative feedback loops improve accuracy when guided by a user.

2. **Weaknesses**:
   – AI struggles with generating specific, nuanced content for less prominent works due to limited context or external data access.
   – Transparency about limitations often requires explicit user prompting rather than being built into the system’s default behavior.

3. **Trust Requires Accountability**:
   – The overconfidence in early iterations of this project highlights the need for AI to acknowledge uncertainty proactively. Users must trust the process to rely on the results.



#### **Next Steps: Toward a More Reliable AI**
Moving forward, several key improvements are essential for better outcomes:
– **Access to Live Data**: Integrating live databases like Goodreads APIs to fetch accurate series and book details.
– **Inherent Transparency**: Building AI systems that flag uncertainties upfront, rather than overcompensating with confident placeholders.
– **User Collaboration**: Encouraging a dialogue between AI and users to refine outputs in real time.



#### **Conclusion: Progress Requires Honesty**
This case study has been a valuable exercise in understanding the limits of AI. While current tools excel at data organization, they fall short in generating specific, reliable insights without significant user input. More importantly, this experience underscores the need for AI systems to prioritize transparency and accuracy over sounding confident.

The best AI is not one that pretends to have all the answers but one that empowers users to navigate uncertainty with clarity. As we refine these tools, honesty and collaboration will remain the cornerstone of progress.



#### **Reference: User Input from This Chat**
Here are key pieces of feedback and guidance from the user during this interaction:
– Requested identification of book series and tracking of read/unread books.
– Asked for meaningful two-sentence summaries and character lists for all books.
– Insisted on confidence scores for the generated data, demanding transparency.
– Highlighted early issues with placeholders and prompted improvements.
– Emphasized the need for honesty in blog posts about the process.
– Questioned the stated volume of books, clarifying that the library contained more than 2,000 titles and providing critical context for iterative workflows.


A Nice Enhancement for Voice Access on Windows

As a matter of choice, not necessity, I try from time to time to use the various speech and voice input systems in operating systems. My ideal scenario is still to be able to use the computer entirely by voice while also running a screen reader. I’ve not yet found a reliable solution that meets my needs completely.

I know there are combinations of solutions that have made great strides in this area, largely using Dragon products and screen readers, but as the basis of what I use, I try either Voice Access on Windows or Voice Control on the Mac. Both platforms also have solutions, as I expect many know, for strictly text input.

I no longer recall how long ago this was, but the Voice Access product on Windows did make one change that helps with using screen readers. As a start, Voice Access produces notifications of what it has heard so that screen readers can echo this back. It is fairly basic and in need of much refinement, but it’s at least a start.

I am mentioning this here because in trying Voice Access this week, I noticed a change that is another step toward improving the experience. To be clear, I do not know when this change was made; I just noticed it this week. I also run Insider builds of Windows, so if this does not work for you, that may be why.

When you’re trying to control the computer by voice, it is common to issue commands such as click followed by the name of an item you want to activate. The challenge is that if there is more than one item with the same name, you are usually presented with some experience to disambiguate what you want to click on.

When I first tried Voice Access, to the best of my recollection, the experience of identifying what you wanted to activate was not usable with a screen reader. It has been enhanced a bit, so that now when that list of choices comes up, it is echoed back similar to how what Voice Access heard is repeated. Again, this needs extensive refinement, because it is a one-time listen or braille read, with no way to have the list repeated, step through the choices one item at a time, or otherwise review what was said.

As an example of using the feature to identify what I want to click, here is what was read when I asked for the word paste to be clicked.

click paste. Which one?
There are 2 options available. (1) Paste, (2) Paste

Here is another example when I said “click login” on the Fidelity home page.

Click login. Which one?
There are 2 options available. (1) LOG IN, (2) Open login link

It is also worth noting that these disambiguation choices if using Braille appear as flash messages. For those unfamiliar with how Braille displays and screen readers work, this means that the messages stick around for a set period of time and then disappear from the display.

Here is one last example, from when I tried to activate the OK button with my voice after running a spell check on an email message. Note: I intentionally replaced the actual email address with email@provider.com.

Click ok. Which one?
There are 2 options available. (1) OK, (2) Sent – email@provider.com – Outlook – 2 running windows

The experiences I’ve described work regardless of which screen reader is being used.

Again, this overall experience of using the computer with a screen reader and voice on Windows is far from finished. In fact, one of the key experiences, correcting words that have not been recognized correctly, does not work at all with screen readers. Voice Access gives the following notification when you try to correct something while a screen reader is running:

Alert: This experience is not optimized for use with screen readers. Say “Cancel” to exit.

Microsoft has a document on using Voice Access in general. If they have screen reader-specific documentation, I wasn’t able to find it.

If you do try Voice Access, two important hotkeys to know are Alt+Shift+B for toggling the microphone between sleep and awake and Alt+Shift+C for toggling the microphone off and on. When sleeping, the microphone remains on to listen for certain words. See the support article, or say, “what can I say,” when Voice Access is running, for a full list of commands.


No Big Finish, Giving Microsoft’s Recall a Try

Microsoft released the Recall feature for users of Windows Insider builds. I gave the feature a try and made an audio recording with a roughly edited transcript. The short version of my experience is that I can get around Recall but am still trying to figure out how to read search results effectively with a screen reader, hence the title of No Big Finish.

Since this blog post is about an AI feature, I’ll offer an unedited version of Copilot’s summarization of my transcript as well. Overall the summary is reasonable but, in my opinion, misstates what I said about the Click To Do feature. I said I hadn’t figured out how it is supposed to work.

Perhaps because it was at the end, the summary also entirely missed my biggest point: I am still trying to find an effective way to use Recall’s search functionality and read the meaningful text easily. Here is the Copilot summary:

The document is a transcript of a demonstration by Kelly on the Microsoft feature called “Recall.” This feature captures screenshots of your screen and allows you to search for information within those screenshots. Kelly explains that Recall was released as a test version for Microsoft Insiders on November 22nd. The demonstration focuses on the accessibility aspects of Recall, particularly how it works with screen readers like NVDA, Narrator, and JAWS.

Kelly describes the setup process, which includes signing in with Windows Hello, and demonstrates how to navigate the Recall interface using keyboard commands and screen reader feedback. The document details various functionalities of Recall, such as saving snapshots, opening Recall from the system tray, and navigating through captured images. Kelly also mentions some issues and limitations encountered during the demonstration, such as certain screen reader commands not working as expected and the need for further development to improve accessibility.

Kelly starts by explaining the initial setup, which involves a brief opt-in experience where the user grants permission for the computer to take screenshots. This process takes about 20 to 30 seconds and requires signing in with Windows Hello. Kelly notes that each time Recall is opened or used, the user is prompted to sign in with Windows Hello.

The demonstration continues with Kelly navigating the Recall interface. Kelly shows how to open Recall from the system tray and describes the various options available, such as saving snapshots, opening Recall, pausing the feature, and accessing more settings. Kelly highlights that the timeline feature in Recall displays the captured images, referred to as “memories,” and sessions, which represent the time periods during which the computer was used.

Kelly also points out some issues with the screen reader’s ability to read the titles of the snapshots automatically. Kelly demonstrates how to navigate through the list of snapshots and mentions that sometimes the titles do not read correctly on the first attempt. Kelly suggests using the home and end keys to navigate through the list and get the titles to read properly.

Additionally, Kelly explores the commands available for each snapshot, such as opening the associated application (e.g., Outlook), copying the snapshot, deleting it, and accessing more options. Kelly notes that some features, like the “Click to do” button, are not fully functional yet and require further development.

Kelly also demonstrates how to get more details about a captured image using Narrator’s command for image descriptions. However, the output is often gibberish due to the limitations of the OCR (Optical Character Recognition) technology used. Kelly suggests using other solutions like JAWS, Google’s Gemini, or Be My Eyes for better image descriptions.

Overall, the document provides an insightful look into the Recall feature, its current capabilities, and areas for improvement from an accessibility perspective. Kelly’s demonstration highlights the potential of Recall while also acknowledging the need for further development to enhance its functionality and accessibility.


Unlock Windows Efficiency with PowerToys Run

Sometimes you make a handy discovery that makes you wonder why you did not know about it long before. I just had that happen with PowerToys Run, one of the toys included in the full suite of items in the electronic toybox known as Windows PowerToys.

PowerToys are a set of utilities that allow you to customize different aspects of Windows and offer functionality that is not directly built in. You can learn more in general and find details on installation in a handy Microsoft Learn article.

I installed the most recent version of PowerToys because I had read about something new known as Advanced Paste. That is not the discovery though.

After installing PowerToys, I used one of my more common Windows shortcuts, Alt+Space, to bring up the system menu for an application. That menu is where you find entries including Restore, Close and the one I often use, Maximize. My web browsing windows in particular often get sized quite small, and with responsive design much of the content I’m expecting has disappeared, so maximizing the browser window is necessary.

Imagine my surprise when instead of what I was expecting, my screen reader communicated, “query, edit.”

It turns out this is the default shortcut for the aforementioned PowerToys Run. In short, this is like having a command line for everything from choosing between open windows on your computer to performing quick calculations, file and web searches, browsing the Windows registry and more.

Using PowerToys Run

Using PowerToys Run is fairly straightforward. Press Alt+Space, enter some text and arrow through the results. You can start your text with various characters to direct PowerToys Run to what you want to do. The full list of characters to use here is detailed in a Microsoft Learn article.

Some examples I am finding I have already incorporated into my daily use include:

  • <, that is the less than symbol: Window Walker, to browse through all the open windows on your computer. Start pressing down arrow after entering the less than character, or add part of the window name and then down arrow to shorten the list.
  • $, that is the dollar sign character: Windows Settings to, as you would expect, browse through all the different settings for Windows. As with Window Walker, just start pressing down arrow after entering the dollar sign, or add some of the text from the setting you want and then press down arrow. In either case, when you reach the item you want, press enter and that settings page will open.
  • =, that’s the equals sign: Calculator for performing calculations. Just enter your calculation and the result will be displayed. If, as I do, you are using a screen reader, the fastest way I have found to read the result is to press down arrow after entering my calculation. Note that you do not need to press enter after you have input your calculation. Also, again if using a screen reader, if you are comfortable with advanced screen reading techniques such as the JAWS Touch Cursor, NVDA Object Navigation or Narrator navigation commands, the result of the calculation and more can be read with these techniques. Last, after your result is displayed, you can press enter on the result and have it put on the Windows Clipboard.
  • !!, that is two presses of the exclamation mark key: History, quickly browse through your history with PowerToys Run with presses of down arrow.

Some Notes

PowerToys Run documentation indicates you can change the default shortcut for launching from Alt+Space.

According to PowerToys documentation, pressing Tab is supposed to move you through search results and any buttons or context menus that exist. I am not finding anything being communicated by multiple screen readers as of now when using Tab. I still need to figure out whether this is a case of the key simply not working, or of items taking focus without that focus being communicated properly.


Creating a Power BI report Using a Screen Reader

In my job at Microsoft, I work with a number of product teams on accessibility. One of the rewarding things about this is that I get to learn about a number of technologies and experiences. This is obviously accompanied by working to improve the accessibility and user experiences of these products.

I’ve been working with the Power BI team for close to a year and have learned a great deal about the technology. Creating a Power BI report using a screen reader is one of those tasks where a few hints about how things work can save a lot of time.

I put together this audio demo of the basics of creating a report, adding data to the report, understanding the visual layout of the report and how a few of the basic controls work. A transcript in text or Word is also available.

Microsoft has documentation on creating reports available. Accessibility information for creating reports and consuming them is also available. A help article on creating reports using assistive technology has also been created. You can also find various resources on the Power BI roadmap and other release plans.

If you have accessibility questions or concerns about Power BI, the Microsoft Enterprise Disability Answer Desk is there to assist.


Quick Notes on Using a Virtual Machine Environment With A Screen Reader

Over the years I’ve used a range of solutions for the basic challenge of wanting to experiment with new versions of operating systems, screen readers, applications and more without breaking a working system. These have ranged from dedicated test computers, various hard drive imaging solutions and dual booting (running two operating systems on the same machine) to, of course, virtual machines run through the various virtualization technologies available.

The majority of my experience has been running virtual machine technology on Windows to run other Windows machines in a virtual environment. Information in this post is therefore limited to that combination. There are viable solutions on other platforms as well.

The goal of this post isn’t to provide a full tutorial on using virtual machines.  It is to share a few notes and things I’ve learned in case they are of assistance to others using similar technology in conjunction with a screen reader as a starting point.

For those unfamiliar with what a virtual machine is, think of it as a way to run a second or third computer on an existing computer through a software program.  There are many benefits to this including as mentioned the ability to try things out without breaking a working environment.

The three virtual environments I’ve used are Hyper-V, built directly into Windows; VMware Player and VMware Workstation from VMware; and VirtualBox from Oracle.

There is no one perfect virtual machine solution and in my experience each of the three options I’ve explored has strengths and weaknesses.

I have had success using all three of the solutions I’ve mentioned. Each requires understanding what’s necessary to get the virtual machine configured and, of course, how to access the machine. I routinely use virtual machines in my environment, and they can be a very helpful tool.

Aside from possibly having to enable the feature, Hyper-V being part of Windows can be very handy, and in my experience the machine management user experience works well with screen readers. In general, screen reader-friendly tree views and list views are used for most controls. Perhaps the biggest limitation is that Hyper-V does not support audio until you are able to establish a remote desktop connection to the machine. This means that unless you are skilled at fully unattended operating system installation and configuration, there’s a strong possibility you will need assistance in creating the initial virtual machines you want to use.

I have used both the consumer-oriented VMware Player and the more enterprise-oriented VMware Workstation over the years. The machine management experience in VMware Player works well with screen readers; again, most controls are screen reader-friendly tree views and list views. However, VMware Player is limited in some key areas, such as the ability to run multiple virtual machines at once or take what are commonly known as snapshots of a virtual machine. Think of a snapshot as a way to capture your virtual machine at a point in time, so that if something goes wrong or you want to get back to that point, it is as easy as applying that snapshot.

VMware Workstation is a full-featured solution. However, in my experience the machine management part of the program did not work as well with screen readers. Additionally, in my environment I have found more keyboard quirks when switching from a virtual machine back to the main operating system. These are not insurmountable problems, and at one point I used VMware Workstation with success for several years. In general, VMware Player is free for individual use while Workstation requires you to purchase the software.

Both VMware Player and VMware Workstation have native support for audio. This means that, with the ability starting in Windows 10 to use Microsoft’s Narrator during operating system installation, in most cases it is possible to create a virtual machine completely independently.

Whether you use VMware Player or Workstation, two important keyboard commands to know are CTRL+G to put keyboard focus into a virtual machine and CTRL+Alt to put keyboard focus back in your host or main machine operating system. In my experience, it helps to press the Control key a couple of times on its own after using the CTRL+Alt combination, to avoid modifier keys seemingly getting stuck when switching back to your host machine.

Oracle’s VirtualBox has been an interesting experience. The machine management and settings areas of the program are very mixed when used with screen readers. Some items work quite well, such as the menus for working with individual machines, while others, such as the experience of modifying hotkeys and other machine settings, work poorly at best. Additionally, I’ve found that when using the up and down arrows in the list of virtual machines, keyboard focus sometimes jumps out of the list. I’ve taken to giving each virtual machine a name starting with a unique letter, because I know that once I am in the list of machines I can press that letter and move focus to a specific machine.

As with the VMware products, VirtualBox supports audio natively. This means that, again, fully independent creation and configuration of a virtual machine running Windows 10 is possible.

By default VirtualBox uses the right, and only the right, Control key as what it calls the Host key.  Use that key to toggle keyboard access between the virtual machine and your main computer.  Use CTRL+Home to access a menu of options related to the running virtual machine and use CTRL+t to take a snapshot of the virtual machine.

VirtualBox also has extensive command line support.  Chapter 8 of the user manual has all the details and I find myself largely using this option to work with my virtual machines when using VirtualBox.
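
As a taste of that command line (driven from Python here for consistency with the rest of these posts; the machine name is a placeholder), a few of the VBoxManage subcommands I lean on:

```python
import subprocess

VM = "Win11-Test"  # placeholder virtual machine name

def vbox(*args: str) -> str:
    """Run a VBoxManage command and return whatever it prints."""
    return subprocess.run(
        ["VBoxManage", *args], capture_output=True, text=True, check=True
    ).stdout

print(vbox("list", "vms"))                     # enumerate registered machines
vbox("snapshot", VM, "take", "before-update")  # snapshot before experimenting
vbox("startvm", VM, "--type", "gui")           # boot the machine
```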

VirtualBox has a licensing model that supports both personal use for free and business/enterprise use at a cost.  Oracle’s licensing pages have more information.

As mentioned earlier, details here are largely based on a Windows environment.  Similar solutions exist on other computing environments, often from the same companies mentioned here.  Even on Windows, you are not limited to running just a Windows virtual machine.  In fact that is one of the benefits of using virtualization technology.

Virtual machines can be a handy enhancement to a computing environment.  There is a bit of learning, especially for the first machine you create and use.  Once you’ve crossed that knowledge bridge, creating multiple machines becomes largely as easy as repeating what has been successful in the past.
