Blog post for interest. Thoughts on the New AI Picture Smart in JAWS 2024.


 

This post comes from my personal blog rather than our company blog.

The March update of JAWS brings considerable enhancements to the Picture Smart functions. In my view this is one of the very best feature enhancements within JAWS for some time. The system uses both the Gemini and ChatGPT services to describe images from websites, files, or a screenshot of an application.
You don't need a special account to use it. You can just enjoy it.

I have been using this for a few weeks now. Here are some practical use cases I frequently work through.
1. I am a very heavy user of X/Twitter as it contains a great deal of content in which I am interested and keeps me up-to-date with events. Many posts on X contain images. I am easily able to focus on the image and get an excellent description in return.
It also works very well on social media sites where, for example, a user has scanned in a page from a TV listings magazine of long ago: the text on the page is rendered correctly.

Here is a description of an image from X:
The image is a collage of four photos of the Swedish pop group ABBA.

The first photo shows the group in a recording studio, with Benny Andersson, Anni-Frid Lyngstad, Björn Ulvaeus, and Agnetha Fältskog all standing around
a microphone.

The second photo shows the group in a more candid moment, with Björn and Agnetha sitting on a couch and Benny and Anni-Frid standing behind them.

The third photo shows the group at a press conference, with all four members sitting at a table and answering questions from reporters.

The fourth photo shows the group at an awards ceremony, with all four members holding awards.
The ChatGPT output gives more detail about the outfits they are wearing.

2. To prepare videos for upload to my wife's YouTube channel, I use a program called HandBrake. When the video is loaded into HandBrake, I can do a quick scan of the window. If it gives me a good description of the video, then I know the visual appeal is of a high standard. I can carry out the same action once the video is uploaded to YouTube, which gives me confidence that it has been uploaded correctly and is looking good.

Descriptions are like this:
The image shows an indoor setting with two individuals. On the left, there is a woman who appears to be smiling and has short, light-colored hair. She
is wearing a bright red top and is seated on a dark-colored couch. The woman has her head slightly tilted and is laughing or speaking. She is holding a
baby with both hands.

The baby, on the right side of the image, is lying on a cushion or support that is partially visible. The baby appears relaxed and is dressed in a white
onesie with various animal prints on it. The onesie has full sleeves, and the baby's left hand is resting on the onesie, showing only the fingers. The
baby has a pale complexion and eyes partially closed, possibly sleeping or resting comfortably.

3. We have a large library of pictures taken over the years. Scanning these and obtaining descriptions brings back memories. Here is one such description:

In the image, you see a person dressed elegantly in a red outfit with a tiara on their head, indicating a festive or formal occasion. This individual is
wearing jewelry like earrings, a necklace, and a bracelet, all showing a glittering appearance. They seem to be biting into a piece of red and white dessert,
possibly red velvet cake with cream cheese frosting or a similarly styled cupcake, given the size and how it's held.

In the background, you can see a table laid out with drinks, such as wine and champagne, and it seems there is a gathering or celebration taking place.
There's another person visible in the background, sitting at the table, looking towards the camera with a glass in front of them. The room has a warm and
joyous atmosphere, with a vibrant red wall that adds to the festive ambiance. The setting appears to be a home dining area, indicated by the presence of
bookshelves and domestic furnishings.

Useful tips:
By default, JAWS will just give you a brief description of the image. You need to activate the More Results link to obtain descriptions from both services.
The Gemini summary is presented initially because the response time is faster and you may just want a brief picture description.

The list of keystrokes is:
JAWS Key+Space then P then:
C for control.
F for file, for example when a file is selected in File Explorer.
W for window.
S for screen.

Note that when you use this feature on social media sites, the Results Viewer does not always gain focus once the analysis is retrieved. This is a concern. You may need to Alt+Tab over to it and possibly press Down Arrow afterwards for the description to be read. It does not happen every time, but it is frequent.

Note to J-Say users.

We do not have voice commands for screen and window. But we do have:

Picture with Control. This is used a great deal on social media sites.

Picture with file. This will scan an image when the file is highlighted in File Explorer.

We have a very big J-Say update coming later this year and I will be sure to include those other commands which we do not have at the moment.

Brian Hartgen

Hartgen Consultancy.

Our usual opening hours are 9 AM to 5 PM UK time, Monday to Friday.

Telephone (in the UK) 02921-051325.

Telephone (in the United States of America) 239-256-7779.
