
The Inquire by Voice Functions in the Jieshuo Screen Reader

Last updated on 25 January 2025

The Jieshuo screen reader offers several functions to recognize and describe the currently focused item or the entire screen. However, these functions do not let users attach custom questions or specific prompts to the captured image. To address this, Jieshuo has introduced the Inquire by Voice functions, which capture a screenshot of the entire screen or the current focus, or use the camera to capture a scene, then send it to an AI service along with the text recognized from the user’s voice prompt. This enables users to ask specific questions about the captured images.
Please note that this post is not intended to review the functions’ results or compare them with other Jieshuo image recognition functions, as I haven’t thoroughly tested the new functions yet. Instead, it aims to demonstrate how these functions work and highlight their availability.
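For readers curious about the underlying pattern, these functions essentially pair one captured image with one transcribed voice prompt in a single request to an AI service. The sketch below is not Jieshuo’s actual code; the function name and field names are hypothetical, shown only to illustrate the image-plus-prompt idea.

```python
import base64
import json

def build_inquiry_payload(image_bytes: bytes, prompt_text: str) -> str:
    """Pair a captured screenshot with a transcribed voice prompt
    in one request body (hypothetical field names)."""
    payload = {
        # Images are commonly sent to AI services as base64 text
        "image": base64.b64encode(image_bytes).decode("ascii"),
        # The text recognized from the user's spoken question
        "prompt": prompt_text,
    }
    return json.dumps(payload)

# Example: a (fake) screenshot plus a spoken question
body = build_inquiry_payload(b"\x89PNG...", "What does this button do?")
```

The AI service would then answer the prompt in the context of the attached image, which is why the response can address the user’s specific question rather than giving a generic description.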

How to Use the Inquire by Voice Functions

As the names imply, these functions are used to ask questions via voice. Currently, it is not possible to type the question or prompt.
The available functions are:

  • Inquire by Voice about Current Focus: Captures a screenshot of the focused item and sends it to the AI service with the recognized text from the voice prompt you speak. If no item is focused, the screenshot will be of the entire screen.
  • Inquire by Voice about the Entire Screen: Takes a screenshot of the entire screen, not only a specific item.

Both functions can be found in the main menu, assigned to gestures, activated by the Voice Assistant, and accessed from the Recognition menu. The Recognition menu is available in both the main menu and the functions menu, which can be accessed by swiping up and then right with one finger (default gesture).

When activating either function, you will hear a chime signaling that you can speak your message. Once you’re done, the text is sent with the captured image, followed by a “recognition in progress” message. The response will be read out automatically when ready. If you want to review any responses, go to the Recognition menu and select Recognition Results. Note that the responses are available, but the corresponding questions are not shown. Also, both questions and responses cannot be viewed in the Chat History for now.

Touch and Hold to Inquire About Current Focus

This method allows the user to touch and hold a focused element to invoke the Inquire by Voice about Current Focus function. The item must be focused either by touching it or by swiping to it. After focusing on the item, place your finger on the item’s position on the screen and hold it until you hear the prompt to start speaking. If you do not wish to speak a prompt, you can use the back gesture to stop the listening service. However, this does not cancel recognition; instead, it provides a description of the item.
To activate this method, go to Advanced Settings, then Voice Assistant and Translation Settings, and check the relevant option, which is still labeled in Chinese.

Asking About a Scene Using the Jieshuo Camera

To capture a scene and ask a question about it, open the Jieshuo Camera from the main menu and long-press the volume down key. Speak your prompt and wait for the camera shutter sound, which indicates that the recognition process has started. I noticed during testing that there is a slight delay between asking the question and taking the photo.

Typing Prompts Instead of Using Voice

If you prefer to type your inquiries instead of speaking them, navigate to Screen Reader Settings, select Advanced Settings, and then open Voice Assistant and Translation Settings. There, check the option: “Switch to Text for Voice Assistant and Inquire by Voice Functions.” After it is checked, a text field will appear when you activate any of the inquire functions where you can type your prompt. Tap OK to send the text along with the captured image.

Final Remarks

The Inquire by Voice functions use the Vivo BlueHeart AI model, like the other image recognition-related functions. While those functions rely on Jieshuo’s translation feature to translate results into the user’s target language, as specified in Voice Assistant and Translation Settings, my tests show that spoken prompts are sent as English text and that responses are returned in English. For instance, I asked the system to identify a grammar mistake in an English text, and it provided a response. Note that I only tested this in English and did not try other languages.

Adding the ability to ask custom questions when recognizing a photo or scene is certainly a welcome addition. This feature could be further improved by offering follow-up questions.

Additionally, since these functions are linked to the AI service they use, any changes in the service will directly impact both the outcome and the functions’ usability. Although Chinese AI models might be considered less advanced than the more well-known international models, AI’s natural language processing and image description capabilities are continuously evolving. Lesser-known services are catching up and showing improvements.

Audio Demonstration

About Author

Kareen Kiwan

Since her introduction to Android in late 2012, Kareen Kiwan has been a fan of the operating system, devoting some of her time to clearing misconceptions about Android among blind people. She wrote articles about its accessibility and features on the Blindtec.net Arabic website, where she was a member of the team. Kareen’s experience was gained by following Android-related communities, fueled by her love for technology and her desire to test new innovations. She enjoys writing Android-related articles and believes in the role of proper communication between blind Android screen reader users and app developers in building a more accessible and inclusive Android. Kareen is a member of the Blind Android Users podcast team and the Accessible Android editorial staff.

Published in Tutorials
