
Month: July 2025

Image Description Toolkit V2 Available

I’ve made another series of updates to what I’m calling the Image Description Toolkit since my last announcement. As a recap, the goal of this toolkit is to take collections of images and videos and create descriptions you can save, all using local AI models. Dozens of tools provide descriptions, but it is still difficult to save those descriptions for future review. With the Image Description Toolkit, you get nicely formatted HTML pages to read through all your image descriptions.

The newest enhancements include a comprehensive testing system for experimenting with model prompts, a workflow script that runs all tasks with one command instead of running each script individually, and numerous small adjustments throughout the system. All the code here is still AI-generated, with my ideas driving what gets created.

I’m sure I’m not objective, but for me this started as a curiosity, grew into a better understanding of how AI code generation could work and is now something I use regularly. Over the weekend I attended several musical events and was able to generate more than 400 image descriptions from photos and videos I took.

The project lives on GitHub and has a readme that covers the basics of getting started. A guide for using the prompt testing script is also available. This is particularly helpful for trying out different models.

I’m always curious how AI writing works as well, so I asked GitHub Copilot to generate a second blog post about project developments. And of course, it is software, so there is also an issue list.

I won’t say for certain what’s next, but my current plan has three parts: work on a graphical version of the project to learn more about that environment in Python, create a prompt editor so changing the default prompts is easier, and get this all working with Python packaging so installation is easier.

Contributions, suggestions or pointers to tools that already do all of this are always welcome.


Updates to Image Description Toolkit

Several months ago I announced a highly experimental set of Python scripts I called the Image Description Toolkit. Consider it a fancy name for solving my goal: getting the thousands of pictures I’ve taken with my iPhone, and over the past several decades with whatever phone I was using, described, and keeping a permanent description of each photo. I’ve made some key updates, although I’d still categorize this as highly experimental.

Most notably, I’ve made it possible to build custom AI prompts, choose the model you use and adjust the parameters passed to the model, all through a configuration file.
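To give a feel for what a configuration-driven setup like this can look like, here is a minimal sketch in Python. It is not the toolkit's actual code: the key names (`model`, `prompt`, `options`), the default values and the `load_config` helper are all assumptions made up for illustration, so check the project readme for the real file format.

```python
import json

# Hypothetical defaults; the toolkit's real configuration keys may differ.
DEFAULT_CONFIG = {
    "model": "moondream",
    "prompt": "Describe this image in detail.",
    "options": {"temperature": 0.2},
}


def load_config(text):
    """Merge user settings (given as a JSON string) over the defaults."""
    user = json.loads(text) if text.strip() else {}
    merged = {**DEFAULT_CONFIG, **user}
    # Merge nested model options so a partial override keeps the other defaults.
    merged["options"] = {**DEFAULT_CONFIG["options"], **user.get("options", {})}
    return merged
```

With something like this, a user only has to write the settings they want to change, for example a different prompt, and everything else falls back to the defaults.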

I’ve also updated the script that converts files from the .HEIC format to .JPG and streamlined the output to HTML with a script you can run. To be very clear, when I say I’ve done these things, all the code in this project was generated with AI through my prompting and refinement.
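For readers curious about the HTML step, the idea can be sketched with nothing but the Python standard library. This is an illustration under my own assumptions, not the project's script: the `render_descriptions` function and the page layout (one heading per file, one paragraph per description) are hypothetical.

```python
from html import escape


def render_descriptions(title, items):
    """Build one HTML page from (filename, description) pairs."""
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>',
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for filename, description in items:
        # A heading per image lets screen reader users jump between descriptions.
        parts.append(f"<h2>{escape(filename)}</h2>")
        parts.append(f"<p>{escape(description)}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

Escaping every value matters here because model output can contain characters such as `<` or `&` that would otherwise break the page.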

A readme for the project explaining how all this works is available. I also had AI generate a blog post about the project. You can find the full project on GitHub.

With all of those qualifications, I have found these tools of value. I’ve now generated more than 10,000 image descriptions running on my local computer. The Moondream model used through Ollama has been excellent. It is incredibly fast when used for batch processing, has some of the lowest memory requirements I’ve found and still gives rich details and is highly responsive to different prompts.
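If you want to try Moondream through Ollama yourself, a single request can be sketched as follows. This assumes Ollama is running locally on its default port with the model already pulled, and it uses Ollama's `/api/generate` endpoint. The helper names `build_request` and `describe_image` and the default prompt are mine, not the toolkit's.

```python
import base64
import json
import urllib.request

# Ollama's default local address; adjust if your install listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes, prompt, model="moondream"):
    """Prepare a non-streaming generate request with one base64-encoded image."""
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def describe_image(path, prompt="Describe this image."):
    """Send one image file to the local model and return its description."""
    with open(path, "rb") as f:
        request = build_request(f.read(), prompt)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Batch processing is then a loop over image files that calls `describe_image` for each one and saves the returned text.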

I plan to continue experimenting here over time. I want to make setup easier. I know about Python packaging but have found it doesn’t always work, so this all still requires a manual install of Ollama, Python and the individual scripts. The readme file should walk you through this, though.

If you have feedback, know of other ways to accomplish these same tasks or have suggestions on what else I should include here, feel free to let me know. I’ve learned a great deal about image processing with AI, using Python and AI code generation from these experiments. And of course, I now have permanent descriptions of more than 10,000 pictures.
