
Month: January 2025

Experimental Image Description Toolkit For Batch Processing Available

I want to say upfront that the vast majority of the code here was written using prompts to ChatGPT. I wanted to see how the AI tool worked for a simple coding project and to jump-start my own use of Python. Once I started with ChatGPT, I found that editing the scripts myself and then returning to ChatGPT became a challenge, so for this effort I opted to use prompts to ChatGPT exclusively.

I suspect like many, I have thousands of pictures from over the years. I wanted a way to process these in bulk and get descriptions.

I had hoped to use OpenAI but never found a way to use a vision model from them, and their support department made it sound like it wouldn't be available with the basic OpenAI subscription I have. If someone knows differently, please share more details. I certainly do not want to upload images individually.

That led me to explore Ollama and its llama3.2-vision model, which you can run locally. I've published scripts and instructions in a GitHub project that will take a directory of images, read the prompt you want to use from a file, and write out individual description files as well as an HTML file with all descriptions.
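In rough outline, the flow looks like this. This is a minimal sketch, not the project's actual code: it assumes a local Ollama server on the default port with the llama3.2-vision model pulled, and all the names here (find_images, describe_image, build_html) are illustrative.

```python
# Sketch of the batch-description flow: gather images, ask the local
# Ollama server to describe each, and combine results into HTML.
import base64
import json
import urllib.request
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def find_images(folder):
    """Collect image files from a directory, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.suffix.lower() in IMAGE_EXTS)

def describe_image(path, prompt, url="http://localhost:11434/api/generate"):
    """Ask the local Ollama server for a description of one image."""
    payload = {
        "model": "llama3.2-vision",
        "prompt": prompt,
        "images": [base64.b64encode(path.read_bytes()).decode()],
        "stream": False,
    }
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def build_html(descriptions):
    """Combine {filename: description} pairs into one HTML page."""
    items = "\n".join(f"<h2>{name}</h2>\n<p>{text}</p>"
                      for name, text in descriptions.items())
    return f"<html><body>\n{items}\n</body></html>"
```

The prompt itself would be read from a text file before the loop, matching the prompt-from-a-file approach described above.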

This does work but is raw and still needs refinement. "Works" here is definitely defined as works on my equipment and in the environments where I've tried it. I wanted to share what I have so far because even in this form, I've found it handles my task well. Again, others may already know of better ways to do this. Some of the enhancements I want to add include:

* Better image selection versus just a directory of images.
* Linking to the image file in the HTML descriptions.
* Extracting metadata from the image files, such as dates, to help recall when the images were taken.
* If possible, using GPS data that may be embedded in an image to provide location information.
* Learning more about the llama model and its processing to ensure I'm taking advantage of all it offers.
* Cleaning up file use and allowing configuration outside the scripts for things such as the image path and results location.
* Figuring out how to make this work on Windows and Mac from one script if possible. I've run this on both with success, but this documentation and script are based on Windows.
* Packaging this up as an executable to make it easier to use.
* Exploring a way to flag descriptions for another pass when you want more details.
* Long term, again assuming something doesn't already exist, exploring building GUI apps here.
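The GPS enhancement mostly comes down to converting EXIF's degrees/minutes/seconds values, plus the N/S/E/W hemisphere reference, into decimal coordinates. A minimal sketch of that conversion; actually pulling the raw GPSInfo tags out of an image would be done with an imaging library such as Pillow, which is not shown here.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds plus a hemisphere
    reference ("N", "S", "E", or "W") to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if ref in ("S", "W") else value
```

The resulting decimal pair could then be included in each description file or turned into a map link in the HTML output.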

My primary goal is processing images in bulk. I went to a museum recently and ended up with more than 150 images taken with Meta glasses. I got some descriptions there but want more, and again, I have thousands of pictures from over the years.

As I said at the outset, I do not want to take any credit for what ChatGPT did here with the code. I guided it toward the goals I had in mind, and that itself was an interesting activity; it is by no means automatic. It is also possible there is already a better way to do this, so if someone reads all this and says, hey, just use this tool, I have no investment in this being the end-all of image description experiences. I tried finding something that would do what I wanted but didn't have success, so this was my attempt.

It is my understanding that running Ollama on Windows used to require WSL. I don't know when that changed, but both the documentation and my own use show that you can now run Ollama on Windows without WSL, and that's what I've done here.

If you do try this and want to interrupt the script, just press Ctrl+C at the command prompt. You'll get an error from Python, but processing will stop.
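That Python error is the uncaught KeyboardInterrupt that Ctrl+C raises. One possible refinement (not how the current scripts behave) is to catch it around the processing loop so the script reports progress and exits cleanly; process_one below is a stand-in for the real per-image work.

```python
def process_all(images, process_one):
    """Process images in order; on Ctrl+C, report progress instead of
    dumping a traceback. Returns the count of images completed."""
    done = 0
    try:
        for image in images:
            process_one(image)
            done += 1
    except KeyboardInterrupt:
        print(f"Interrupted; {done} of {len(images)} images processed.")
    return done
```

A nice side effect is that the descriptions already written to disk survive the interruption, so a later run could skip them.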

If there is value in this effort and you want to contribute, I have a GitHub project for it. You can also grab the scripts mentioned from the project page.

Last, this is by no means instantaneous. On an M1 MacBook Air and a Lenovo ARM Slim 7, it takes about three minutes per image. According to the Ollama documentation, you do not need an ARM processor on Windows, though. This is the sort of thing you run in the background.

If you opt to try this, review the scripts and note areas where you need to modify file paths and such. Feedback is of course welcome. If you try this and it doesn't work, please do your best to troubleshoot. Until I make more progress, this is kind of an as-is idea and not something where I can offer a lot of assistance. Most errors are likely a missing Python dependency or a misconfigured file path.


Clearing the Chaos: Using A Library Card and U.S. Newsstream for an Improved Reading Experience

While the digital world offers an abundance of online news sources, accessibility is still a work in progress far too often. It is commonplace to spend more time navigating to content than reading it when using screen reading technology. Poor heading structure, inaccessible ads mixed in with the news story, multimedia that plays silently and grabs focus, and much more take away from the reading experience.

A public library card and a resource known as U.S. Newsstream, offered by many libraries, is one solution to add to your reading toolkit.

ProQuest’s U.S. Newsstream is a gateway to an improved reading experience. The full text of hundreds of publications is available, and helpfully, through a URL for each publication you can easily access a specific issue of a newspaper or other resource with a very screen reader-friendly view of all the article headlines. In addition, when full text is available, you can read the article free of ads or other distractions mixed in with the news.

To use this method of accessing content requires a few preparation steps. First, you need to have a library card for a library that subscribes to this service, and you will need to know your library barcode.

Second, and this is critical, you need to sign into the service through your library’s access point. You can typically find this on your library’s home page under an area called subscription databases, online resources or some other link pointing to the various databases available from your library.

For example, my local library is the Madison Public Library and their list of resources is available under the eResources link.

Following the U.S. Newsstream link, you are prompted to provide library information. Typically this involves some variation of providing your library barcode and, at times, indicating your library. Again, it is vital you start with this path before going to the next step.

Once you are authenticated to U.S. Newsstream, you can search and use the database directly. However, what has worked well for me is accessing publications directly.

U.S. Newsstream has a list of all their publications you can download. I took the liberty of downloading the file, turning it into a table within Excel, and using Excel’s First Column feature to make the file fairly screen reader friendly and ready to use.

To access a publication, open the file I’ve created and locate the publication you want to read. Titles are in the first column.

Next, locate the URL for the publication. Hint: with focus in the cell containing the title, press Ctrl+Right Arrow and you will jump to the last column in the table row, which contains the URL. Press Ctrl+C to copy the URL and return to your web browser.

Move focus to the address bar in whatever browser you are using, paste the URL you copied, and press Enter. This will take you to the publication page for the resource of interest. Combo boxes allow you to select the year and date for an issue of the publication, and a Show Issue Content button brings up the content from that publication for the chosen day.

Article headlines are marked up as headings and links. Pressing enter will load the article.

The article content, when the publication has full text, starts under a heading aptly named Full Text. At this point you can simply start reading the article. Use whatever method you prefer to navigate back to the list of articles for the publication when finished reading.

As mentioned earlier, it is key that you are signed into the general U.S. Newsstream service before accessing the URL for the publication. If you are not, the URL will not work as described. You will be told to sign in through a library but without options for doing so directly.

The Excel file listing publications has multiple columns of information. These include items such as the dates for which content is available, gaps in coverage, and more.

U.S. Newsstream, other ProQuest databases and resources from your library offer much more functionality and information than outlined here. This is just a starting point.

Finally, I am a firm supporter of a robust and independent news media. Even though I access many articles in the way I’ve outlined here, I do also support publications through subscriptions, donations or other financial contributions. I urge anyone who is able to do so to do the same. Those working in the media have the same bills and needs in life as we all do, and we’ve already seen dramatic losses in meaningful careers in the profession.
