Image recognition isn’t a new feature in the Jieshuo screen reader, but it used to be slow and short on detail. Recently, Jieshuo integrated a new, more detailed image description feature powered by a Chinese AI model. It is available exclusively to paid subscribers.
How to Use the New Detailed Image Recognition:
The new image recognition, like the old one, works on the focused item. It is not possible to share an image to Jieshuo for description. Instead, you have to focus the image, or better, open the image and focus on the image view. In some apps, like Google Photos or Samsung Gallery, locating the image is easy. In others, like WhatsApp, it’s enough to open the image and access the feature right away. Avoid moving the focus after opening the image, since it can be difficult to refocus on the actual image.
To start recognition, open the “Functions menu,” which by default is accessed using the swipe up then right gesture. Then, select the “Recognition menu.” Currently, to initiate recognition, you need to choose the option labeled in Chinese, which cannot be translated at the moment. Alternatively, after entering the recognition menu, you can activate this item directly using the right shortcut function, provided that you have enabled the “Use shortcut gestures to click the OK/cancel button in dialog boxes” option in the shortcut gestures settings.
When the result is ready, it will be announced automatically. The wait may range from 30 seconds to 2 minutes. If the result takes too long to be spoken, try the recognition again.
To view recognition results, return to the recognition menu, which is also accessible from the main menu, and tap “view recognition results.” There, you’ll find the results of image recognition, translations, OCR on the focused item, and the scene recognition feature.
The current limit is 100 recognitions per day, and there is no indication of how many attempts remain.
My Impressions About the Feature:
Based on my testing, the new detailed image description is inconsistent. At times, it offers an accurate, detailed description. At other times, it omits a lot of detail or even fabricates details. In one case, it hallucinated a man in a photo of a living room where no person was present, and it also claimed he was holding a cup of coffee.
The service can detect people and describe them, but the descriptions are usually not very detailed.
When text is detected in an image, the description may mention its presence, but usually it will not read the text in full, or it will read only a short portion. For accurate text recognition, use the dedicated text recognition features, such as “Recognize text of the current focused element” or “Recognize text on the current screen.”
Compared to GPT-4 or Google Bard, the AI model used lags behind.
There is no way to ask follow-up questions. Repeating recognition on the same image might yield additional details, but at times it produces even poorer results.
Despite these issues, the feature itself performs well, particularly when it provides accurate descriptions.
Note that the older, less detailed image description remains available. It can be accessed from the recognition menu or directly from the main menu, where it is labeled “Image recognition.” You can also assign a gesture to it if desired.
Some users may have privacy concerns, because there is no explicit information about the AI service and model used for image recognition or about its policies on image storage and analysis. More details about this service would be beneficial, particularly since the recognition is conducted by a third-party service rather than by Jieshuo itself.