
Video Description and Icon Detection Functions in Jieshuo Screen Reader

Following a recent collaboration between the developer of the Jieshuo screen reader and VIVO to integrate the Blue Heart AI large language model into Jieshuo, powering the AI image description, icon detection, and inquire-by-voice functions, the daily limit on the number of recognitions has been lifted. The collaboration also introduced a new feature to Jieshuo: the video description function. Let’s explore how this feature works and briefly look at another AI-powered Jieshuo function: icon detection.

Video Description

How It Works

Despite its name, the “video description” feature in Jieshuo doesn’t function as a typical video description service might. Unlike apps such as Seeing AI or PiccyBot, which allow you to share a video directly with the AI for recognition, this feature operates more as a continuous, detailed image description tool. When video description is enabled, Jieshuo repeatedly captures screenshots of the entire screen and sends each one to the AI service for analysis. The results are read aloud as soon as they’re received. In practice, if a video is playing, the AI will describe stills from the video.

The current implementation does not connect or link consecutive images, so each screenshot is analyzed independently. This is noticeable when results arrive, as later images aren’t treated as related to the content of previous images. While the recognition speed is fairly good, it may struggle to stay in sync with videos that have rapid scene changes, potentially leading to delayed or mismatched descriptions.
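As a rough mental model only (this is not Jieshuo’s actual code, and every function name below is a made-up stand-in), the behavior described above amounts to a simple loop: capture a screenshot, send it off for description, speak the result, and repeat, with each frame handled in isolation:

```python
# Hypothetical stand-ins for the real components: in Jieshuo these would be
# Android screen capture, the Blue Heart AI service, and the screen reader's
# speech output. Stubbed here purely for illustration.
def capture_screenshot(frame):
    return f"frame-{frame}"

def describe_image(image):
    # Each screenshot is analyzed on its own; no context from
    # earlier frames is carried over to later ones.
    return f"description of {image}"

def speak(text):
    # A newly arrived result interrupts any description still being read.
    print(text)

def video_description_loop(total_frames, stop_after):
    """Capture, describe, and speak, frame by frame, until stopped."""
    spoken = []
    for frame in range(total_frames):
        if frame >= stop_after:  # the back gesture (or restarting Jieshuo) stops the loop
            break
        image = capture_screenshot(frame)
        result = describe_image(image)  # analyzed independently of prior frames
        speak(result)
        spoken.append(result)
    return spoken
```

Because each pass through the loop is independent, a fast-moving video can outpace the round trip to the AI service, which is exactly why descriptions can arrive delayed or mismatched.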

How to Enable/Disable Video Description

The video description function can be accessed from Jieshuo’s main menu, assigned to a gesture, or found in the recognition menu. Note that the function’s name currently appears in Chinese; an English translation should be added in the next Jieshuo beta update. Activating the function starts recognition, which works on any screen, regardless of whether a video is playing. The feature continues recognizing even when the screen content is static.

Since the recognition works on the entire screen rather than a specific focused area, other onscreen elements will be included in the description results if the video isn’t in full-screen mode.

To stop the continuous recognition, you can use the back function or gesture, or restart Jieshuo. Currently, tapping the function name doesn’t stop recognition, so it doesn’t function as a toggle.

Notes:

  1. Video description requires Jieshuo to have the “display over other apps” or “appear on top” permission.
  2. As far as I know, descriptions are initially generated in Chinese and then translated by Jieshuo’s translation feature into the target language set in the voice assistant and translation settings.
  3. When new results arrive, Jieshuo interrupts the reading of the previous result if it hasn’t finished reading it.
  4. Description quality depends on the capabilities of the Blue Heart model. Comments on recognition quality are not included here, as this post is meant to demonstrate how the feature works rather than to assess the quality of the descriptions.

Icon Detection

The icon detection feature aims to identify the icons of onscreen elements, which can be helpful when interacting with unlabeled items. To use it, focus on the element you want to describe, then activate the “recognize focused icon” function. This can be found in the main menu, assigned to a gesture, or triggered through the voice assistant. Once recognition completes, the result will be read aloud. If you miss the automatic reading, you can retrieve the results from the recognition menu. Note that recognition results are cleared upon restarting Jieshuo.

Audio Demonstration

About Author

Kareen Kiwan

Since her introduction to Android in late 2012, Kareen Kiwan has been a fan of the operating system, devoting some of her time to clearing up misconceptions about Android among blind people. She wrote articles about its accessibility and features for the Arabic website Blindtec.net, where she was a team member. Kareen gained her experience by following Android-related communities, fueled by her love for technology and her desire to test new innovations. She enjoys writing Android-related articles and believes that proper communication with both blind Android screen reader users and app developers plays a key role in building a more accessible and inclusive Android. Kareen is a member of the Blind Android Users podcast team and the Accessible Android editorial staff.

Published in Tutorials


