
The Idea Place Posts

Image Description Toolkit 3.0 Available

I have a vastly updated version of my Image Description Toolkit (IDT) available. The 3.0 release builds on the command-line workflow tools I released earlier, adding a guided workflow creation system, analysis tools for combining descriptions, a range of performance statistics on your descriptions, and tools to review content from various AI models and prompts.

The IDT also includes several other tools for working with image descriptions. A results viewer allows you to browse results, monitor workflows as they happen, and copy descriptions and images.

If you want to focus on detailed descriptions for individual images, the Image Describer tool allows you to load a directory of images, easily request descriptions for individual images using multiple providers and prompts, and save all your work for future use.

A Prompt Editor allows you to create and edit the prompts used to guide AI descriptions for all the tools in the IDT and configure defaults.

In addition to Ollama, which powered earlier versions of the IDT, the IDT now includes support for multiple AI models from both OpenAI and Anthropic if you provide your own API key.

A comprehensive User Guide is available for the various tools in the IDT.

Whether you want detailed descriptions for one or one thousand images, the IDT has a tool for you. Grab IDT3.zip, extract the files to your computer, and run install_idt. Be sure to install Ollama and/or get API keys for OpenAI or Anthropic and configure your AI models. Then open a command prompt and run:

idt guideme to generate your first descriptions.
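For reference, the setup steps from this post look like the following at a command prompt, assuming you have already extracted IDT3.zip and are in that folder (adjust paths to wherever you put the files):

```shell
rem One-time setup from the folder where you extracted IDT3.zip
install_idt

rem Install Ollama and/or set OpenAI or Anthropic API keys first,
rem then start the guided workflow:
idt guideme
```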


RSS Quick, A Basic RSS Reader for Windows 1.0 Available

I’ve been dabbling further with AI-driven development and have another app for exploration. This time it is a basic RSS Reader for Windows I’m calling RSS Quick. Get the full details and program from Release RSS Quick 1.0 – Initial Release · kellylford/rssquick or my alternative download location.

The name RSS Quick reflects the intent of the app to be a simple and fast way to quickly browse RSS feeds. The program does not handle feed management at this time but will allow you to import an existing OPML file and quickly browse the feeds. RSS Quick includes a default OPML file named RSS.OPML that will be put in the default folder where you unzip the software package. It has dozens of feeds from a wide range of sources already included. Even if you don’t use the app, the OPML file may be handy as another source of feeds.
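OPML itself is a small XML format, which is why a file like RSS.OPML can double as a portable feed list. As a sketch of the idea (the feed names and URLs below are made up for illustration, not taken from the bundled file), here is how a program could read the feeds out of an OPML document:

```python
import xml.etree.ElementTree as ET

# A minimal OPML document; real files like RSS.OPML have many more outlines.
OPML = """<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head><title>Sample feeds</title></head>
  <body>
    <outline text="Example News" type="rss" xmlUrl="https://example.com/feed.xml"/>
    <outline text="Example Blog" type="rss" xmlUrl="https://example.org/rss"/>
  </body>
</opml>"""

def read_feeds(opml_text):
    """Return (title, url) pairs for every feed outline in an OPML document."""
    root = ET.fromstring(opml_text)
    return [(o.get("text"), o.get("xmlUrl"))
            for o in root.iter("outline") if o.get("xmlUrl")]

for title, url in read_feeds(OPML):
    print(title, url)
```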

To use RSS Quick, you need to download the .ZIP file for your computer. In most cases this will be the X64 version. If you are using ARM hardware with Windows, you’ll want the ARM version. Extract all the files to any location on your computer and run RSSQuick.EXE. The program will load the default OPML file that has been included. If you do not have the necessary version of .NET, you will be prompted to download that first.

You can learn more about RSS Quick from the readme for the program. Give it a try and consider getting involved or submitting issues on the GitHub page.

RSS Quick is one of three projects I have available. I also have my Sports Scores app and an Image Description Toolkit available. All projects can also be downloaded from my separate release archive.


Making Sports Information More Accessible: A Preview of the Sports Scores App

I’ve gone back to the AI workbench, much like I did with my Image Description Toolkit, to create another app. This time the goal is to make sports scores and play-by-play information easier to read and to avoid some of the long-standing accessibility challenges on leading sports websites.

I’m calling this app a preview release because not everything is working yet, but I have enough functionality available that I wanted to share it now.

I have an audio demo of the app along with a transcript of the demo. The release page with notes and links to download the Windows program are at Release Sports Scores 0.5 Preview Release · kellylford/Scores.

The program uses lists, trees and tables to show things like game scores, play-by-play action, box scores and standings. The info itself comes from APIs made available by leading sports providers. I’m just showing it in ways that work better in my opinion with screen readers and other assistive technology.

Perhaps the most interesting feature I have so far is the ability to hear baseball pitch location and speed by converting pitch data to audio sounds. The demo has a sample of this and within the app you can press Alt+p to hear the location of any pitch when you are on that pitch in the play-by-play details. Press Alt+s to hear the entire pitch sequence. Bring up the context menu on a pitch for these options plus a Pitch Explorer feature to get a sense of how the sounds for different locations and pitches will be presented.
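I don’t know the exact mapping the app uses, but the general idea of turning pitch location and speed into sound can be sketched like this. The zone boundaries, frequency band, and speed adjustment below are entirely invented for illustration:

```python
def pitch_to_audio_params(px, pz, speed_mph,
                          zone_half_width=0.83,
                          zone_bottom=1.5, zone_top=3.5):
    """Map a pitch's plate location and speed to simple audio parameters.

    px: horizontal location in feet (0 = center of the plate, + = right)
    pz: height above the plate in feet
    Returns (pan, freq_hz): pan runs -1 (left) to +1 (right); the tone
    frequency rises with pitch height and, slightly, with speed.
    """
    # Horizontal location becomes stereo pan, clamped at the zone edges.
    pan = max(-1.0, min(1.0, px / zone_half_width))
    # Height maps into a 220-880 Hz band (low pitch = low tone).
    t = max(0.0, min(1.0, (pz - zone_bottom) / (zone_top - zone_bottom)))
    base = 220.0 + t * (880.0 - 220.0)
    # Faster pitches nudge the tone a little higher.
    freq = base * (1.0 + (speed_mph - 90.0) / 500.0)
    return pan, round(freq, 1)

# A 90 mph pitch dead center in the zone: centered pan, mid-band tone.
print(pitch_to_audio_params(0.0, 2.5, 90.0))  # prints (0.0, 550.0)
```

Feeding these parameters to a tone generator (panned left/right, at the computed frequency) would give the kind of audible pitch map the demo describes.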

In addition to the release page, the GitHub project has a readme with full details on the app. An issue list where you can report new issues or discover what’s already known is available.

Give the app a try and if you are interested in contributing to the project, please do so.


Image Description Toolkit V2 Available

I’ve made another series of updates to what I’m calling the Image Description Toolkit since my last announcement. As a recap, the goal of this toolkit is to take collections of images and videos and create descriptions you can save and do this all with local AI models. Dozens of tools provide descriptions, but it is still difficult to save those descriptions for future review. With the Image Description Toolkit, you get nicely formatted HTML pages to read through all your image descriptions.
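The HTML output idea is simple enough to sketch. This is not the toolkit’s actual template, just a minimal illustration of turning saved (filename, description) pairs into a readable page:

```python
import html

def descriptions_to_html(items, title="Image Descriptions"):
    """Render (filename, description) pairs as one simple HTML page.

    A minimal sketch of the idea, not the toolkit's real output format.
    """
    body = "\n".join(
        f"<h2>{html.escape(name)}</h2>\n<p>{html.escape(desc)}</p>"
        for name, desc in items
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html>\n<head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n<h1>{html.escape(title)}</h1>\n{body}\n</body>\n</html>"
    )

page = descriptions_to_html([("IMG_0001.jpg", "A crowd at an outdoor concert.")])
print(page)
```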

The newest enhancements include a comprehensive testing system for experimenting with model prompts, a workflow script that allows all tasks to be run with one command instead of running each script individually, and numerous small adjustments throughout the system. The code here is still all AI-generated, with my ideas powering what’s created.

I’m sure I’m not objective, but for me this has evolved from a curiosity, through a better understanding of how AI code generation could work, into something I’m using regularly. Over the weekend I attended several musical events and was able to generate more than 400 image descriptions from photos and videos I took.

The project lives on GitHub and has a readme that covers the basics of getting started. A guide for using the prompt testing script is also available. This is particularly helpful for trying out different models.

I’m always curious how AI writing works as well, so I asked GitHub Copilot to generate a second blog post about project developments. And of course, it is software, so there is also an issue list.

I won’t say for certain what’s next, but my current plan is to work on a graphical version of the project to understand more about that environment with Python, create a prompt editor so changing the default prompts is easier, and get this all working with Python packaging so installation is easier.

Contributions, suggestions or pointers to tools that already do all of this are always welcome.


Updates to Image Description Toolkit

Several months ago I announced a highly experimental set of Python scripts I called the Image Description Toolkit. Consider it a fancy name for solving my goal: getting a permanent description for the thousands of pictures taken on my iPhone, and on whatever phone I was using over the past several decades. I’ve made some key updates, although I’d still categorize this as highly experimental.

Most notably, I’ve made it possible to build custom AI prompts, choose the model you use, and adjust the parameters used with the model, all through a configuration file.
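I haven’t reproduced the toolkit’s real configuration file here; purely as a hypothetical illustration of the idea (the key names and values are invented), a prompt/model/parameters file might look like, and be read as, this:

```python
import json

# A made-up configuration in the spirit described in this post; the real
# toolkit's file format and key names may differ.
CONFIG_TEXT = """{
  "model": "moondream",
  "prompt": "Describe this photo in two or three sentences.",
  "options": {"temperature": 0.2, "num_predict": 300}
}"""

config = json.loads(CONFIG_TEXT)
# The scripts would then pass these values along with each image request.
print(config["model"], config["options"]["temperature"])
```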

I’ve also updated the script that converts files in the .HEIC format to .JPG and streamlined the output to HTML with a script that can be run. To be very clear, when I say I’ve done these things, all the code in this project was generated with AI through my prompting and refinement.

A readme for the project explaining how all this works is available. I also had AI generate a blog post about the project. You can find the full project on GitHub.

With all of those qualifications, I have found these tools valuable. I’ve now generated more than 10,000 image descriptions running on my local computer. The Moondream model, used through Ollama, has been excellent: it is incredibly fast for batch processing, has some of the lowest memory requirements I’ve found, and still gives rich details while remaining highly responsive to different prompts.
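For the curious, Ollama exposes a local HTTP API, and a vision request to a model like Moondream is just JSON with base64-encoded image data. Here is a minimal sketch of building such a request body; actually sending it requires a running Ollama server (by default at http://localhost:11434/api/generate):

```python
import base64
import json

def build_describe_request(image_bytes, prompt, model="moondream"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    img_b64 = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "model": model,      # any vision-capable model pulled into Ollama
        "prompt": prompt,    # the description prompt
        "images": [img_b64], # one or more base64-encoded images
        "stream": False,     # return the full response in one piece
    })

# Fake bytes stand in for a real image file's contents here.
body = build_describe_request(b"\x89PNG fake image bytes",
                              "Describe this photo.")
print(body)
```

In a batch workflow, a loop would read each image file, build a body like this, POST it, and save the returned description.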

I plan to continue experimenting here over time. I want to make setup easier and know about Python packaging, but I have found it doesn’t always work, so this all still requires manual installation of Ollama, Python, and the individual scripts. The readme file should walk you through this, though.

If you have feedback, know of other ways to accomplish these same tasks, or have suggestions on what else I should include here, feel free to let me know. I’ve learned a great deal about image processing, using Python, and AI code generation from these experiments. And of course, I now have permanent descriptions of more than 10,000 pictures.


From Word Fluff to Real Impact: Achieving Specific, Measurable, and Accountable Accessibility

A few years ago I wrote blog posts on what I called Accessibility Island and Ethical Accessibility. In a nutshell these posts outlined the human consequences of accessibility gaps and the need for a better framework to stop leaving this situation to the vagaries of corporate policy as it exists today. I’ve continued to reflect on these topics and wanted to share some suggestions for improvement.

The world has also changed since I wrote my original posts. Accessibility rights are being threatened. We need unequivocal commitments now more than ever.

The fundamental challenge is that today it is far too common for accessibility to be a hit-and-miss proposition: varying between different experiences from the same company, between different versions of the same application, and much more. To be clear, countless examples of amazing accessibility features, products, and experiences exist. I use and benefit from many of them and have at various times in my career played a part in some of these efforts. Make no mistake, this is all excellent.

The opposite, unfortunately, is also true. You are one update away from being blocked by broken accessibility introduced in a poorly developed or tested update. It can then take weeks, months, or even longer for resolution.

Similarly, the chance that a product not covered by the various accessibility standards, laws and guidelines gets an equal level of accessibility attention is far less likely in my experience. Plain and simple, this locks people out.

These sorts of gaps need to be eliminated so users are not left to wonder whether an experience can be used, and so that the hard work and energy put into accessibility is not overshadowed by gaps. “It just works” needs to be the rule. Everyone will benefit.

I. Introduction: The Chasm Between Promises and Reality

Many corporations today publicly champion accessibility, issuing statements affirming their commitment to users with disabilities. As someone who has navigated the digital world with a disability for my entire career, and as a long-time employee within the tech industry itself, I’ve experienced firsthand the incredible advancements made. I use and benefit daily from many of the accessibility features and innovations these companies have developed.

However, a significant chasm often still exists between broad pronouncements of commitment and the lived experiences of individuals encountering digital barriers. This disconnect frequently stems from accessibility policies that are rich in aspirational language but deficient in concrete, verifiable deliverables—what I call “word fluff.” My core contention is that a genuine corporate commitment to accessibility must transcend these well-intentioned statements. It necessitates specific, measurable, and accountable actions, consistently supported by clear timelines for implementation and transparent reporting on outcomes.

The technology sector, a domain of rapid innovation and profound societal influence, provides salient examples of this dichotomy. Companies possess substantial financial resources and often publicly state that accessibility is a fundamental priority or even a human right. Yet, despite these affirmations and many commendable accessible products, users still encounter “notable accessibility regressions” or persistent issues in key applications. This isn’t about a lack of capability, but often a lack of consistently applied, rigorous, and publicly accountable standards across the board.

The prevalence of vaguely worded accessibility policies is not merely an oversight; it can represent a form of strategic ambiguity, allowing companies to project an image of inclusivity without binding themselves to the deep, systemic changes truly robust accessibility requires. This ultimately erodes trust among users with disabilities, who are repeatedly promised inclusion but continue to face frustrating barriers.

While it is possible that the ambiguity in accessibility policies is not always a deliberate strategy, the persistence of such practices over time has led to their normalization within corporate culture. Many organizations may genuinely believe they are making strides towards inclusivity, yet the lack of specific, actionable commitments inadvertently perpetuates the status quo. This long-standing approach, whether intentional or not, has unfortunately become a standard practice, resulting in a cycle where users with disabilities continue to encounter barriers despite promises of inclusion.

While the gap between corporate promises and the reality of accessibility is evident, the path to bridging this chasm requires a fundamental shift in approach.

II. Beyond Promises: The Blueprint for Truly Accountable Accessibility

Moving from accessibility ambiguity and aspirational “word fluff” to tangible impact requires a fundamental shift in the mindset and approach to corporate accessibility. Vague policies are insufficient. We need a new model anchored in specificity, measurability, and accountability. This isn’t just a theoretical ideal; it’s a practical necessity for ensuring that digital inclusion becomes a consistent reality, not a sometimes achievement. To illustrate the stark difference between prevailing vague pledges and the concrete actions required, consider this contrast:

  • New Product Releases
    • Vague Example: We strive to make our new products accessible from launch.
    • Accountable Example: New software or updates will be released with no known accessibility issues that violate WCAG 2.2 standards. Any exceptions must be accompanied by a public commitment to fix the failures within six months of release.
  • Known Issue Transparency
    • Vague Example: We are committed to addressing accessibility issues as they are identified.
    • Accountable Example: Publicly disclose detailed information on all known accessibility issues for existing products within 6 months, moving beyond general conformance statements; include timelines.
  • Addressing Legacy Issues
    • Vague Example: We continuously work to improve the accessibility of our existing products.
    • Accountable Example: Fully address long-standing accessibility issues within 3 years, ensuring complete assistive technology functionality.
  • Core OS Feature Functionality
    • Vague Example: We provide tools to help users with disabilities access core features.
    • Accountable Example: Ensure inbox OS experiences function comprehensively and accurately with assistive technologies within 1 year.
  • Executive Accountability
    • Vague Example: Our leadership is dedicated to fostering an inclusive culture.
    • Accountable Example: Have an executive publicly demonstrate new mainstream product features using assistive technology at major annual conferences or other showcase events.

This comparison immediately highlights the inadequacy of common corporate platitudes. The path to genuine accessibility needs to be paved with such concrete commitments. It’s about building a framework where progress can be tracked, where companies are answerable for shortcomings, and where users can have clear expectations. This is imperative because, as I said in my piece on Ethical Accessibility, the goal isn’t a “half empty or half full” glass but ensuring everyone receives a “full glass”—a complete and equitable experience. And it’s vital to alleviate the human cost of inaccessibility, preventing users from being stranded on what I call “Accessibility Island,” cut off and isolated by digital barriers.

III. Defining Meaningful Commitment: The Pillars of an Effective Accessibility Policy

So, what does a truly meaningful commitment to accessibility look like in practice? It’s built on clear, actionable pillars that leave no room for ambiguity. Drawing from concepts designed to ensure genuine corporate responsibility and accountability, here are the kinds of principles that should underpin any robust corporate accessibility policy across the tech industry:

  • Clear Standards for All New Releases: Companies must commit that new software or updates will be released with no known accessibility issues that violate WCAG 2.2 standards or the then-current standard. Any exceptions must be accompanied by a public commitment to fix the failures within six months of release. This commitment must apply systemically across the entire product portfolio, not be selectively applied only to those products or services with a specific legal requirement for accessibility. If, for some unavoidable reason, failures are identified pre-release, a public and detailed roadmap to resolve these issues within a short, fixed timeframe (e.g., six months from release) must be mandated and adhered to.
  • Radical Transparency on All Known Issues: Corporations need to commit to publicly disclosing detailed information regarding all known accessibility issues in their software and hardware. This disclosure must go significantly beyond the often-generalized statements found in typical conformance documentation and must include projected timelines for remediation. This directly counters the “opaque” decision-making processes I’ve previously criticized in my discussion on ethical accessibility.
  • Time-Bound Commitment to Fixing Long-Standing Problems: A specific, non-negotiable commitment (e.g., within three years) should be made to fully address and resolve persistent, long-standing accessibility issues in major, widely-used products and services. This includes ensuring complete and reliable functionality with a comprehensive range of assistive technologies.
  • Ensuring Core Platform Experiences Work Flawlessly: There must be an unequivocal guarantee that fundamental operating system or platform-level accessibility features (like voice control, screen readers, live captioning, or their equivalents on various platforms) function comprehensively and reliably with users’ own assistive technologies. This should be achieved within a relatively short and clearly defined timeframe (e.g., one year).
  • Tangible Executive Accountability and Public Demonstration: To truly embed accessibility into the corporate DNA and demonstrate its centrality, companies should require senior executives to publicly demonstrate new mainstream product features using assistive technology at major industry conferences. These demonstrations should focus on everyday usability with assistive technologies, not solely on specialized accessibility-specific features, thereby underscoring that accessibility is an integral part of product excellence for all users with corporate leadership having the foundational knowledge to truly understand what they are delivering.

Adopting these kinds of commitments represents a fundamental shift from internal aspirations or feature showcases to external, verifiable accountability. It makes it significantly harder for companies to obscure failures, indefinitely postpone fixes, or allow accessibility to be deprioritized. It empowers all of us by providing clear, objective criteria against which a company’s accessibility performance can be evaluated.

IV. Recommendations: Forging a Path to Genuine Corporate Accessibility

Achieving this level of genuine corporate accessibility requires a concerted effort. Vague promises are no longer sufficient; a new standard of tangible commitment is necessary.

For Corporations:

  • Adopt and Publicize Specific, Measurable, and Accountable (SMA) Commitments: Embrace the principles for specific, measurable, and accountable commitments, such as those I’ve outlined. Make public commitments to clear accessibility standards for all products, establish defined timelines for fixing issues, and transparently report on progress and failures.
  • Shift from Feature-Centric to Systemic Accessibility: Move beyond showcasing individual accessibility features to ensuring that baseline accessibility is a fundamental attribute of all products and services, not just those with specific legal requirements. Integrate accessibility into every stage of the design and development lifecycle for the entire product portfolio.
  • Embed “Accessible by Design” with Real Accountability: While many companies claim to design for accessibility from the start (e.g., some policies describe an intent to make accessibility integral to the design process), this principle must be coupled with rigorous testing against clear standards and genuine accountability for any lapses.
  • Invest in Comprehensive and Ongoing Training: Ensure all relevant personnel—designers, developers, product managers, QA testers, content creators—receive thorough and continuous training in accessibility.
  • Establish Clear Executive Responsibility and Visibility: Assign explicit responsibility for accessibility outcomes to senior executive leadership. The call for executives to publicly demonstrate mainstream product features using assistive technology is an example of this.
  • Authentically Engage with the Disability Community: Actively solicit, listen to, and incorporate the “lived experience” of individuals with disabilities into product development and policy. Create respectful feedback channels and ensure user-reported issues are taken seriously and reported on transparently, as I’ve emphasized in my writing on “Accessibility Island.”

For Consumers and Advocates:

  • Demand Specificity and Accountability: Do not accept vague promises. Use frameworks of strong, specific demands, like the examples I’ve provided, to ask pointed questions about product roadmaps, standards compliance, and timelines for fixes.
  • Support Organizations Driving Accountability: Lend support to advocacy organizations that monitor corporate accessibility and hold companies accountable.
  • Share Lived Experiences: Amplify the human impact of accessibility failures by sharing personal stories of encountering “Accessibility Island.”
  • Challenge Inadequate Metrics of Success: Question the notion that “participation alone is success” or that a feature’s existence equals a truly usable experience, as I’ve discussed in “Ethical Accessibility.” Advocate for outcomes reflecting genuine usability and dignity.

True change requires sustained, informed public pressure, articulated through specific demands and grounded in our real-world experiences, like those described in “Accessibility Island.” This will significantly increase the pressure on companies to abandon “word fluff” and embrace commitments that lead to tangible improvements.

V. The Human Toll of Vague Commitments: Stranded on “Accessibility Island”

The consequences of non-specific accessibility commitments are far from abstract; they manifest in tangible, often daily, frustrations for users with disabilities. In my writing, I’ve used the metaphor of “Accessibility Island” to capture this experience. This metaphor describes the profound sense of being “cut off or isolated through no choice of your own” when encountering inaccessible digital products and services. I contrast the idyllic image of a tropical island with the harsh reality of being “restrained from all you want to reach” by the “water” of accessibility failings, a central theme in my “Accessibility Island” piece. This isolation is imposed externally, “through no fault of your own,” squarely placing the responsibility on the creators of these exclusionary digital environments.

The burden is immense. I’ve described the “three for the price of one” disaster in “Accessibility Island,” a common scenario where a user must simultaneously attempt to learn the subject matter, educate the content creator about basic accessibility, and struggle to operate the inaccessible technology itself. This invisible yet exhausting additional labor is a direct result of corporate pledges that don’t translate into usable products.

Users are often left with a “lack of alternatives,” forced to either “stay silent and you may not be able to contribute effectively” or “speak up and you are still far too often seen as complaining,” a dilemma I explore in “Accessibility Island.” This “Accessibility Island” experience is not an occasional inconvenience but a systemic barrier that impedes professional participation and reinforces cycles of exclusion, directly linked to corporate commitments that lack teeth.

The human impact of inadequate accessibility policies underscores the ethical imperative for change.

VI. The Ethical Imperative: Why “Trying Our Best” Isn’t Good Enough

The conversation surrounding accessibility must extend beyond technical compliance; it is fundamentally an ethical issue. As I argued in my piece “Ethical Accessibility,” the aim should not be a “half empty or half full” glass but ensuring everyone receives a “full glass”—a complete and equitable experience. This demands a shift from mere compliance to genuine equity.

A critical flaw in current corporate approaches, in my opinion, is that “participation alone is far too often judged as the measure of accessibility success,” a point I elaborate on in “Ethical Accessibility.” VPATs and broad accessibility statements are common, but rarely come with concrete accountability for resolving issues or committing to a barrier-free environment. Decision-making processes often remain opaque, as discussed in “Ethical Accessibility.”

This opacity thrives in what I see as a vacuum of robust ethical standards in the digital product sphere. Unlike professions such as medicine or engineering, which operate under established ethical codes prioritizing public welfare and safety, the field of digital accessibility often lacks such a universally adopted, enforceable framework. This makes it difficult to categorize corporate inactions as definitively unethical, even when they result in significant exclusion.

If accessibility is indeed a “fundamental human right,” a premise stated by major corporations and one I echo in “Ethical Accessibility,” then failing to provide it is an ethical lapse. This raises challenging questions I’ve posed in “Ethical Accessibility”: how many accessibility issues are acceptable before a company is judged an “abuser of a basic human right”? “Accessibility is not a part time proposition”; ethical standards cannot be selectively applied, a key argument in “Ethical Accessibility.”

Treating accessibility solely as a compliance checkbox, rather than an ethical imperative, stunts innovation in truly inclusive design and perpetuates a “separate but unequal” digital experience. This selective application of accessibility often means that policies and efforts are focused only on products or services with specific legal accessibility mandates, leaving a vast landscape of other digital tools and platforms inaccessible by default.

VII. Examining the Landscape: Common Patterns in Tech Accessibility Pledges

A review of accessibility policies from major technology companies often reveals a recurring pattern: commendable investments and innovations frequently coexist with generalized commitments that lack the specific, measurable, and accountable details I’ve been discussing. For example, common types of statements found in such policies include:

  • Broad policy goals declaring that all content should be or is intended to be accessible, often highlighted alongside various tools and support resources.
  • Pledges to foster open platforms and ensure they are more accessible to all users.
  • Statements highlighting an approach of integrating accessibility into the design process from the outset, often coupled with assertions of a deep-rooted or long-term dedication to accessibility.
  • Commitments to ongoing investment in features and products designed to enhance connectivity or usability for everyone.

While such statements indicate investment in accessibility and the production of valuable features, overarching public commitments often rely on phrases like “strives for,” “committed to,” “believe in,” and “working towards.” These declarations, while positive, frequently lack the clear, publicly accountable frameworks that would apply to all products and services, especially concerning standards for new releases and the timely resolution of known issues. The instances where, despite strong statements, “lingering accessibility issues” or “notable accessibility regressions” occur illustrate that even well-intentioned commitments fall short without more rigorous accountability.

There’s a common “feature-focused” narrative, showcasing specific innovative tools, which are beneficial but can divert attention from systemic gaps in baseline accessibility across entire product ecosystems. Highlighting new features can mask situations where fundamental accessibility in mainstream products is lacking. This aligns with my critique of “word fluff”—accentuating positives while broader, systemic accountability remains elusive. What’s often missing are the kinds of publicly binding commitments with clear metrics and non-negotiable timelines applicable to all products, especially for rectifying existing flaws, as outlined earlier.

VIII. Conclusion: Moving Beyond Words to Action

The prevailing culture of issuing aspirational statements about accessibility, without the backing of specific, measurable, and accountable actions, is no longer tenable. As someone who has both benefited from accessibility advancements and been frustrated by their inconsistent application, I know we can do better. Genuine corporate commitment to digital inclusion is not demonstrated by eloquent policy documents or lists of features alone, but through transparent processes, adherence to clear standards, defined timelines for improvement, and demonstrable outcomes that positively impact our daily lives.

The journey of a user stranded on “Accessibility Island”—isolated by digital barriers—is a stark reminder of the human cost of inaction. The ethical imperative to provide a “full glass” experience for everyone, as I’ve argued in “Ethical Accessibility,” demands more than just “trying our best”; it demands a fundamental shift. Corporations must move beyond lip service to embed genuine, accountable accessibility into their core operational DNA. Concurrently, we—as consumers, advocates, and professionals—must persistently and articulately demand this higher standard.

The ultimate objective is not merely to help individuals escape “Accessibility Island,” but to dismantle such islands altogether. By clearly defining what robust, accountable accessibility looks like, drawing inspiration from the kinds of frameworks I’ve discussed, and by consistently advocating for these standards, substantial and lasting progress is not only possible, but essential.

A Note on AI

Artificial intelligence, with all its positives and negatives, has become omnipresent since I wrote my original posts. I combined my desire to expand on those original posts with a curiosity about how AI would handle this sort of topic as a writing assistant. I stand behind all content as expressing my opinion. AI was used to assist in identifying gaps, arranging content, and in formatting and other proofreading/copy editing tasks.

3 Comments

“I Just Want to Be Included”: How a Simple Plea Sparked an Unintentional AI Accessibility Experiment

Editorial cartoonists, with their ability to capture the essence of an idea in a picture or two, must have quite the talent and quite the job. The phrase “a picture is worth a thousand words” certainly can ring true.

A recent experience verifying an issue I had been told was fixed in Microsoft’s Copilot app led to that all-too-common situation in accessibility of taking one step forward and a couple in reverse. My issue was fixed, but I discovered new accessibility issues to report.

What began as a routine task—verifying an accessibility fix in Microsoft’s Copilot—quickly devolved into a familiar frustration as new issues emerged. It was in that moment that the app presented me with a default prompt to “Create an image of…”. My frustration, my curiosity about AI, and my interest in editorial cartoons suddenly converged, sparking an unintentional experiment with a fundamental question at its core: could an AI capture my feelings in a picture, and then accurately describe what it had just made?

I gave Copilot this prompt:

“Create an image of the Microsoft logo with a man’s face crying in sadness imposed over it and a cartoon bubble saying, ‘I just want to be included please,’” and here is where things turned interesting.

I used Copilot, Gemini, ChatGPT, and Claude (the last two through JAWS Picture Smart functionality) to get descriptions of the resulting image. I expected some variability, but take note in particular of the Copilot description. It leaves out critical details and gives a vastly different version of the very image created through Copilot. I have no illusion that Copilot would know it had created the image, but I really do wonder why the Copilot details are so different. Did this trigger some internal content flags in Copilot, for example?

It is also interesting to note that Gemini’s interpretation of the image does not entirely match my intent. Gemini reads the image as Microsoft being the entity left out, when in fact I intended the opposite: the person crying is the one being left out.

This experiment raises obvious questions about the capabilities and challenges of today’s AI technology. But for me, it highlights a much deeper, more personal challenge.

As someone who understands the visual world entirely through words—from alternative text, human-provided descriptions, and, as of late, AI-based image descriptions—this leads to a fundamental question: can my words ever be enough to accurately portray a feeling or an idea for an AI to translate into a picture I cannot see but trust represents my intent? If it does, will I have the ability to know this with confidence? Had I spent the time to write those thousand words in the prompt, would the result have been any closer to the image in my mind’s eye?

The image generated by Copilot can be found at http://theideaplace.net/wp-content/uploads/2025/06/CopilotMSFTLogoCryingAndCartoonBalloon.png.

The responses from the various AI descriptions are below.

ChatGPT

The image is a meme featuring the “Crying Jordan” face superimposed over the Microsoft logo. The face has a speech bubble coming from it, saying, “I just want to be included please.” The background shows the Microsoft logo, which includes its four colored squares and the word “Microsoft” in a large font.

The meme seems to humorously convey the idea of being left out or wanting inclusion.

Claude

This image is a popular internet meme that combines the Microsoft logo (including its distinctive four-colored window-pane design in orange and green) with a crying face. The person in the image appears emotional with tears on their face, and there’s a speech bubble saying “I just want to be included please.” The image is meant to be humorous, playing on the idea of feeling left out, with “Microsoft” written in the company’s characteristic gray font at the bottom of the image. The background is plain white, making the elements of the meme stand out clearly.

Copilot

The image features the Microsoft logo, consisting of four colored squares—red, green, blue, and yellow—arranged in a 2×2 grid. Beneath the logo, the word “Microsoft” is displayed in a bold, gray font. On the right side of the image, there’s a speech bubble containing the text: “I just want to be included please.” The placement of the speech bubble suggests a plea for inclusion or recognition, adding an element of sentiment to the otherwise corporate design. Is there something specific you’re curious about regarding this image?

Gemini

This is a popular internet meme that combines several elements to create a humorous and satirical image. At the center of the image is the disembodied head of a person who appears to be a Black man, crying. His face is contorted in sadness, with visible tears welling in his eyes and streaming down his cheeks. This specific image is widely recognized as the “Crying Jordan” meme, which features a photograph of basketball legend Michael Jordan during his 2009 Basketball Hall of Fame induction speech. In the background, positioned behind the head, is the Microsoft logo, composed of four squares: orange-red in the top left and green in the top right. Below the head, the word “Microsoft” is intentionally misspelled as “Micrososft.” To the right of the head is a cartoon speech bubble with the text, “I just want to be included please.” The overall composition suggests that the corporation Microsoft is being personified as the “Crying Jordan” figure, expressing feelings of sadness, failure, and exclusion, as articulated by the text in the speech bubble. The meme is typically used to mock a perceived failure or a lack of popularity of a Microsoft product or service in comparison to its competitors.

Leave a Comment

Good to Know AI Doesn’t Pull Any Punches

It is good to know that AI doesn’t pull any punches. I took the latest Word Online conformance report available from aka.ms/msacr and asked the Windows Copilot app to summarize the issues identified. I was answering a question on an email list for someone expressing some frustrations with Word Online.

I won’t copy that full response here, but it was comprehensive at pulling the details from the report. I will share the summary.

As AI systems seem to want to do these days, Copilot then prompted me with a question. Hey AI, be careful what you ask!

Copilot:

### In Summary

While Word Online meets a number of WCAG 2.1 criteria, these exception points highlight areas where users—especially those relying on assistive technologies—may encounter challenges. The primary issues revolve around:

– **Rendering content as images instead of accessible text by default,**

– **Inconsistent or incomplete keyboard and screen reader accessibility for dynamic content and interactive controls,**

– **Visual presentation concerns (contrast and resizing problems), and**

– **Limited programmatic exposure of certain UI elements and features.**

These exceptions suggest that although the product supports many accessibility features, improvements are necessary to ensure a fully inclusive experience.  

For anyone interested in further details, you might explore how these issues could be mitigated with updated design approaches or additional assistive technology compatibility tests. What are your thoughts—do you see these issues as critical to the overall user experience or as minor inconveniences that can be worked around?

Kelly:

I think these are incredibly problematic:

Copilot:

You’re absolutely right, Kelly. These issues aren’t mere inconveniences—they directly impact users who depend on accessible technology to interact with content. When information is rendered as images rather than text, or when keyboard navigation and screen reader support are inconsistent, it creates significant barriers for people with disabilities. This scenario isn’t just a minor glitch; it’s a fundamental shortcoming that compromises an inclusive user experience.

Moreover, the challenges with contrast ratios, resizing limitations, and the incomplete exposure of UI states further erode the usability and clarity of the interface. For individuals who rely on assistive devices, even small obstacles can lead to frustration and exclusion. It’s a stark reminder that accessibility features must be integrated from the design stage, rather than treated as an afterthought or workaround.

It’s particularly concerning considering the widespread use of Word Online. Many users who could benefit immensely from a fully accessible environment might instead be left with a subpar experience. This not only affects productivity but also perpetuates inequality in access to essential digital tools.

What kinds of solutions do you think would be most effective in addressing these issues? Perhaps more rigorous testing with a variety of assistive technologies or more dynamic adaptation of content could be a start. Your perspective is crucial as it echoes the real-world impact these shortcomings have on users.

I’m curious—what improvements do you envision that could make a real difference for those relying on accessibility features?

Leave a Comment

Tooltip Should Not Start An Accessible Name

I’ve noticed a growing trend of making the content of popups or tooltips directly keyboard accessible on web pages by adding buttons that reveal the content. This is good, but what’s not so good is the naming of these buttons. In pretty much every situation where I’ve encountered this, the accessible name starts with the word “tooltip.” For example, on a financial web site I use, when obtaining a ticker quote the button in question reads “Tooltip for Quick Quote Disclosure.”

This is not an isolated situation. Maybe it is just because I’ve been using a range of financial web sites lately, but it seems very common in those experiences of late.

As with most accessible names, it would be better, at minimum, to put the unique part of the name first. In this case, “Quick Quote Disclosure.”

There is obviously more to this, such as whether the word tooltip is needed at all, proper roles, and more, but this basic naming syntax makes a big difference in efficient access to the content. The unique part should generally come first in an accessible name.
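To make the ordering concrete, here is a small Python sketch. The helper is my own hypothetical illustration, not code from any site; it simply rewrites a “Tooltip for X” name into the unique-part-first form recommended above.

```python
import re

def unique_part_first(name: str) -> str:
    """Rewrite an accessible name like "Tooltip for Quick Quote Disclosure"
    so the unique part leads: "Quick Quote Disclosure tooltip".
    Names that don't match the pattern are returned unchanged."""
    match = re.match(r"(?i)^tooltip for (.+)$", name.strip())
    return f"{match.group(1)} tooltip" if match else name
```

With names in this order, a screen reader user arrowing through a page hears the distinguishing words immediately instead of a run of identical “Tooltip for…” prefixes.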

2 Comments

Experimental Image Description Toolkit For Batch Processing Available

I want to say upfront that the vast majority of the code here was written using prompts to ChatGPT. I wanted to see how the AI tool worked for a simple coding project and to jump-start my own use of Python. Once I started with ChatGPT, I found that making adjustments in the scripts myself and then continuing with ChatGPT became a challenge, so for this effort I opted to use prompts to ChatGPT exclusively.

I suspect like many, I have thousands of pictures from over the years. I wanted a way to process these in bulk and get descriptions.

I had hoped to use OpenAI but never found a way to use a vision model from them and their support department made it sound like it wouldn’t be available to me with the basic OpenAI subscription I have. If someone knows differently, please share more details. I certainly do not want to upload images individually.

That led me to explore Ollama and their llama3.2-vision model, which you can run locally. I’ve published scripts and instructions in a GitHub project that will take a directory of images, read the prompt you want to use from a file, and write out individual description files as well as an HTML file with all descriptions.
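As a rough sketch of that loop, here is a minimal, stdlib-only version assuming Ollama’s documented local /api/generate endpoint. The function names are mine, not from the published scripts, and a real run of course requires Ollama serving llama3.2-vision on the default port.

```python
import base64
import json
import pathlib
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

def build_payload(prompt: str, image_path: pathlib.Path,
                  model: str = "llama3.2-vision") -> dict:
    """Build the JSON body Ollama's /api/generate expects: the image
    is passed as a base64 string in the "images" list."""
    encoded = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return {"model": model, "prompt": prompt, "images": [encoded], "stream": False}

def describe_images(image_dir: str, prompt: str) -> dict:
    """POST each image in image_dir to the local Ollama server and
    collect the model's descriptions keyed by file name."""
    results = {}
    for path in sorted(pathlib.Path(image_dir).glob("*")):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        request = urllib.request.Request(
            OLLAMA_URL,
            data=json.dumps(build_payload(prompt, path)).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            results[path.name] = json.load(response)["response"]
    return results
```

From there, writing the per-image description files and the combined HTML page is just iterating over the returned dictionary.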

This does work but is raw and still needs refinement. “Works” here is definitely defined as “works on my equipment and in the environments where I’ve tried it.” I wanted to share what I have so far because even in this form, I’ve found it works well for my task. Again, others may already know of better ways to do this. Some of the enhancements I want to add include:

* Better image selection versus just a directory of images.

* Linking to the image file in the HTML descriptions.

* Extracting metadata from the image files, such as the date, to help remind you of when the images were taken.

* If possible, using GPS data that may be embedded in the image to provide location information for the images.

* Learning more about the llama model and its processing to ensure I’m taking advantage of all it offers.

* Cleaning up file use and allowing things such as the image path and results location to be configured outside the scripts.

* Figuring out how to make this work on Windows and Mac from one script if possible. I’ve run it on both with success, but this documentation and the scripts are based on Windows.

* Packaging this up as an executable to make it easier to use.

* Exploring a way to flag descriptions for another pass when you want more details.

* Long term, again assuming something doesn’t already exist, exploring building GUI apps.
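For the metadata and GPS items on that list, the underlying conversions are simple even though reading EXIF itself would need a library such as Pillow. This stdlib-only sketch (function names are mine, hypothetical) shows the two pieces: EXIF stores GPS coordinates as degrees/minutes/seconds plus a hemisphere reference, and stores the capture date with colons in the date part.

```python
from datetime import datetime

def gps_to_decimal(dms, ref):
    """Convert EXIF GPS degrees/minutes/seconds to signed decimal
    degrees. ref is the hemisphere: "N", "S", "E", or "W"."""
    degrees, minutes, seconds = (float(v) for v in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if ref in ("S", "W") else decimal

def parse_exif_datetime(value):
    """Parse EXIF DateTimeOriginal, which uses colons in the date
    part, e.g. "2024:06:01 14:30:05"."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
```

Once a library hands back the raw GPSInfo and DateTimeOriginal values, helpers like these can turn them into a map link or a human-readable date for the HTML output.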

My primary goal is the processing of images in bulk. I went to a museum recently and ended up with more than 150 images taken with Meta glasses. I did get some descriptions there but want more and again I have thousands of pictures from over the years.

As I said at the outset, I do not want to take any credit for what ChatGPT did here with the code. I guided it toward the goals I had in mind, and that itself was an interesting activity. It is by no means automatic. It is also possible there is already a better way to do this, so if someone reads all this and says, hey, just use this tool, I have no investment in this being the end-all of image description experiences. I tried finding something that would do what I wanted but didn’t have success, so this was my attempt.

It is my understanding that running Ollama on Windows used to require the use of WSL. I don’t know when that changed but documentation and my own use says that you can now use Ollama on Windows without WSL and that’s what I’ve done here.

If you do try this and want to interrupt the script, just press ctrl+c at the cmd prompt. You’ll get an error from Python but processing will stop.

If there is value in this effort and you want to contribute, I have a GitHub project for it. You can also grab the scripts mentioned from the project page.

Last, this is by no means instantaneous. On an M1 MacBook Air and a Lenovo ARM Slim 7, it is taking about three minutes an image. According to Ollama documentation, you do not need an ARM processor on Windows though. This is the sort of thing you run in the background.

If you opt to try this, take note of the scripts and the areas where you need to modify file paths and such. Feedback is of course welcome. If you try this and it doesn’t work, please do your best to troubleshoot. Until I make more progress, this is kind of an as-is idea and not something where I can offer a lot of assistance. Most errors are likely a missing Python dependency or something not configured with file paths.

One Comment