A screen reader is a piece of software, specifically an accessibility service, that enables blind users to read, identify, and interact with onscreen elements on their computers and mobile devices. Although screen readers may cater to other users’ needs, their main user base, and the one that Accessible Android is built for, is blind people. Android includes a built-in screen reader developed by Google called TalkBack, as well as third-party screen readers. Every blind Android user should be familiar with some screen reader basics. In this article, I will cover some of these basics, not only to inform blind beginners but also to educate sighted users. Although most sighted users will probably never need to work with a screen reader, understanding its basics contributes to a more informed and inclusive society.
Looking at the screen, tapping an item to activate it, and swiping with one finger to scroll and show other elements or to perform item-specific actions. This is how a sighted person interacts with their Android device. For a blind screen reader user, things are different.
Exploring and activating onscreen elements:
Because a blind user cannot see what is on the screen and consequently cannot tap exactly on the element they want to activate, screen readers allow blind people to explore their screens. As the finger moves across the screen, the elements it touches are announced. Lifting the finger from an element does nothing by default; the element simply stays focused.
Alternatively, the user can use navigation gestures, mainly swiping right or left with one finger, to move through onscreen elements and focus them one by one. To activate any element, the user double taps anywhere on the screen. It is important to understand that the finger does not need to be on the element; a double tap anywhere is enough. This is made possible by accessibility focus, the screen reader feature that keeps track of which element is focused and lets a blind user activate it in this way. An item is focused by touching it or by swiping to it, and it remains focused until the user double taps anywhere on the screen to activate it or moves to another element.
I still remember times when I was swiping toward a certain item while a sighted person watched. Just as my finger was about to double tap to activate the focused item, the person would anxiously say that this was not the item I needed to activate. In some cases, the sighted person’s finger would tap where the actual item was at the same moment I was double tapping, so the item wasn’t activated and we lost time.
In summary: when using a screen reader, tapping an element focuses it and reads it aloud. Moving a finger across the screen reads elements as they are touched and focuses the element being read. Swiping right or left with one finger moves to the next or previous element on the screen and focuses it. Double tapping anywhere on the screen activates the item that was focused using any of the above methods.
TalkBack and third-party screen readers also offer a single-tap activation mode, which in practice works only when an item has been focused by touching it; tapping again then activates it. In this mode, the next tap must land on the element’s actual position, not anywhere else, because tapping elsewhere is interpreted as touching and focusing a different item. This method also doesn’t work when elements are focused with navigation gestures; in that case, double tapping must still be used to activate them.
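For readers curious about the technical side, the accessibility focus described above is delivered to the screen reader through Android’s AccessibilityService API. The following is a rough, hypothetical sketch of a minimal service reacting to focus events; it is nothing like TalkBack’s actual implementation, and the required manifest declaration and service configuration are omitted:

```kotlin
import android.accessibilityservice.AccessibilityService
import android.util.Log
import android.view.accessibility.AccessibilityEvent

// A minimal, illustrative accessibility service (not TalkBack): the system
// reports events such as "an element gained accessibility focus", and the
// service reacts, for example by reading the element's label.
class TinyScreenReader : AccessibilityService() {

    override fun onAccessibilityEvent(event: AccessibilityEvent) {
        if (event.eventType == AccessibilityEvent.TYPE_VIEW_ACCESSIBILITY_FOCUSED) {
            // Prefer the developer-provided label; fall back to visible text.
            val label = event.contentDescription ?: event.text.joinToString(" ")
            // A real screen reader would pass this label to a TTS engine here.
            Log.d("TinyScreenReader", "Focused: $label")
        }
    }

    override fun onInterrupt() {
        // Called when the system wants feedback (speech) to stop.
    }
}
```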
Swiping and scrolling
As a general rule, swipes that a sighted user performs with one finger are done with two fingers when using a screen reader. For example, moving between home screen pages, which is normally done with a swipe right/left, is accomplished using two fingers instead of one with a screen reader active.
Applying this rule is not always straightforward, though. Sometimes a screen reader user can double tap and hold the item they want to swipe and then drag their finger in the desired direction. An example is swiping up to lock the recording in WhatsApp or Telegram. A blind user performs the action like sighted people do, the difference being that holding the button requires a double tap and hold instead of a tap and hold. Typically, the double tap and hold doesn’t need to be on the button’s actual position if the item was focused by swiping, since swiping moves between onscreen elements without touching them. Another example is sliding down to reveal the notification shade, which can be done by double tapping and holding a status bar item and then dragging the finger downwards before lifting it off the screen.
In other situations, the user must focus on the item by touching its actual position and then, without lifting the finger, placing another finger on the screen and moving them together in the desired direction. This is true when replying to messages on WhatsApp.
To keep things simple, the general rule is to replace any one-finger swipe with a two-finger swipe when using a screen reader, with exceptions mainly when the user wants to swipe a specific element: there, double tapping and holding the element and then dragging it may work, or the element may need to be focused by touching it first and then swiped with two fingers.
Screen reader gestures:
The screen reader has its own focus and is capable of monitoring and altering the behavior of onscreen touches. This brings us to screen reader gestures.
As stated earlier, swiping right or left with one finger is the default way to move to the next or previous onscreen element. These two swipes are part of a set of screen reader gestures that the user can customize. Gestures can involve drawing one or two lines with one finger, swiping in any of the four directions with three or four fingers, or multi-finger taps. They can be assigned to various actions, such as moving between granularities like characters, words, or headings, opening the notification shade or quick settings, going back a window, or returning to the home screen, among others. The range of actions the user can assign depends on the screen reader in use; third-party screen readers offer more choices than TalkBack, the default screen reader.
To make it simple, if you are sighted and you see a blind person doing multiple swipes, drawing an angular gesture like the letter “L,” or double tapping or triple tapping with three fingers on the screen, don’t rush to offer help or assume that the person doesn’t know what they are doing. Just remember screen reader gestures.
Screen reader labels:
Buttons and other elements are represented visually in many applications, whether system or third-party. The three-dot button, for example, is commonly understood to mean “more options.” Other buttons may be arrows or other shapes.
For a screen reader user, buttons are defined by their labels. A label is a description that app developers give to a button to convey its function to the screen reader. The three dots, for me as a screen reader user, are simply “more options.” This difference between what the screen reader user hears and what the sighted person sees can lead to confusion when one tries to guide the other to the button that should be pressed. Understanding how labels work helps sighted users explain things to a blind user more effectively. For example, instead of describing the button’s appearance, the sighted user could say what it does and, if possible, navigate to it with the screen reader to hear how it is announced.
Unfortunately, labeling is absent in many apps, leaving the blind person confused about what a certain button or control does. Screen readers can read a button’s ID when no label exists, but this doesn’t always help, because the ID may say nothing about what the button does. TalkBack offers a form of icon description that can hint at a button’s function. However, the most important step remains educating app developers to give onscreen buttons and controls a clear description that the screen reader can read.
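For developers, adding a label is usually a one-line change. Here is a minimal, hypothetical sketch in Kotlin; the activity, layout, and button ID are invented for illustration:

```kotlin
import android.os.Bundle
import android.widget.ImageButton
import androidx.appcompat.app.AppCompatActivity

class ExampleActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.example_screen) // hypothetical layout with an icon-only button

        // Visually, the button is just three dots; the content description is
        // the label a screen reader announces when the button gains focus.
        findViewById<ImageButton>(R.id.more_options_button).contentDescription = "More options"
    }
}
```

The same label can also be set directly in the layout XML via android:contentDescription, so the screen reader announces a meaningful name instead of silence or a raw ID.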
To make it simple: what a screen reader user is hearing could be different from what you are seeing when dealing with a control or button. Proper button labeling is essential for a convenient user experience because not every unlabeled control can be labeled or described properly by the screen reader, despite the progress and improvements in this regard.
The screen reader and text-to-speech engine combination
The screen reader speaks element names, received messages, interaction results, and other text aloud. In reality, however, the screen reader itself cannot speak. It passes the text to another piece of software, the text-to-speech (TTS) engine, which converts the received text into audible speech.
TTS engines are not used only by screen readers; they are also used in navigation apps to give directions, in read-aloud apps that read webpage or document content, and more. TTS engines differ in quality and responsiveness, with some offered for free, like the built-in Google TTS, and others paid. They also differ in the number of languages they support. TalkBack doesn’t include any TTS engine; it relies on whatever the user selects as the system-preferred TTS. Other screen readers, like Jieshuo, may bundle a TTS engine, but the engine remains a separate component that helps the screen reader convert displayed information into speech.
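To make the division of labor concrete, here is a rough sketch using Android’s public TextToSpeech API, the kind of engine interface a screen reader hands its text to. The Speaker wrapper below is invented for illustration and is not taken from TalkBack or any other screen reader:

```kotlin
import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

// Illustrative wrapper: the caller hands over plain text, and the
// system-preferred TTS engine turns it into audible speech.
class Speaker(context: Context) : TextToSpeech.OnInitListener {

    private val tts = TextToSpeech(context, this) // uses the system-preferred engine
    private var ready = false

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            // Fails with LANG_MISSING_DATA or LANG_NOT_SUPPORTED if no voice
            // data is installed, which is exactly when a screen reader goes silent.
            tts.setLanguage(Locale.US)
            ready = true
        }
    }

    fun speak(text: String) {
        if (ready) {
            // QUEUE_FLUSH interrupts current speech, much like a screen reader
            // interrupting itself when focus moves to a new element.
            tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "utterance-1")
        }
    }

    fun shutdown() = tts.shutdown()
}
```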
If no TTS engine is active (typically, phones and tablets come with one or two preinstalled), the user cannot hear what is on the screen. If the engine crashes for any reason, the screen reader is affected. Unfortunately, TalkBack cannot switch to another speech engine on the fly when the current engine crashes or fails to speak because no voice data is installed.
It is important to note that screen readers on Android don’t offer any automatic language-switching mechanism, leaving that work to TTS engines and third-party tools that try to detect the language of the text and select the voice that should speak it. Additionally, emoji reading is left to the TTS engine when using TalkBack, whereas Jieshuo, another well-known screen reader, takes a different approach and handles emoji reading within the screen reader itself rather than leaving it to the TTS.
TalkBack also supports braille, the method that enables a blind person to read using their fingers. In this case, the text is sent to a compatible refreshable braille display. This is useful for people who own braille displays and prefer braille over speech, or for people who have hearing difficulties in addition to their visual impairment.
The screen reader is not a magical solution
Generally, via a screen reader, a blind person can read, focus, activate, and interact with the system and apps’ screens. It also offers workarounds to perform certain tasks, like a method to move sliders easily. However, the screen reader cannot do everything on its own. It relies on the accessibility APIs to interact with the system and receive information, and it relies on app developers’ adherence to accessibility standards to be able to interact with apps. An app developer can ruin the screen reader user experience by making a totally or partially inaccessible UI. Whether intentionally or not, some app developers leave their apps with unlabeled controls, make some necessary elements unreachable by screen readers, use inaccessible sliders and time/date pickers, or even make the whole app’s UI inaccessible.
Although screen readers are improving at detecting some elements and offering icon descriptions, with one third-party screen reader able to recognize window content using OCR to make it more reachable, these techniques don’t always give the best results and may not help at all in certain apps.
With some accessibility-respecting actions, developers enable the screen reader to do its work, allowing blind people to use their devices on an equal footing with their sighted counterparts.
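As one small illustration of such an action, here is a hypothetical sketch that gives a slider a label and keeps its state description in sync, so a screen reader can announce something like “Volume, 40 percent” instead of an unnamed control. The volume slider and helper function are invented for this example:

```kotlin
import android.widget.SeekBar
import androidx.core.view.ViewCompat

// Illustrative helper: label a slider and update its state description
// whenever the value changes, so the screen reader announces the value.
fun makeSliderAccessible(slider: SeekBar) {
    slider.contentDescription = "Volume"
    slider.setOnSeekBarChangeListener(object : SeekBar.OnSeekBarChangeListener {
        override fun onProgressChanged(bar: SeekBar, progress: Int, fromUser: Boolean) {
            ViewCompat.setStateDescription(bar, "$progress percent")
        }
        override fun onStartTrackingTouch(bar: SeekBar) {}
        override fun onStopTrackingTouch(bar: SeekBar) {}
    })
}
```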
Final thoughts
Most blind Android screen reader users are familiar with the basics of screen reader usage. However, it is necessary to make sighted people aware of these basics as well. It is also essential for trainers to give blind beginners enough information on how to interact with a screen reader and how to customize it to their liking. The screen reader is the core element of a blind user’s experience. Its stability, features, and options are the foundation of using an Android device’s features reliably. However, the screen reader cannot do everything on its own. It needs a text-to-speech engine to convert displayed text into audio, and it needs accessible system and third-party apps, to do its work properly. A blind screen reader user needs to feel that their close circles and society as a whole understand screen reader basics and limitations, which helps prevent awkward and embarrassing situations.
In my life, I deal with people who think a blind person using a touchscreen is unbelievable science fiction, people who know a little about screen readers but hold many myths about them, and others who worry when their phone’s manufacturer leaves the accessibility shortcut of holding both volume keys to enable TalkBack on by default. They trigger it by mistake, find themselves unable to use their phones afterward, and try to reach me as quickly as possible to help with the disaster, or even go to a phone repair shop where the owner is likely to be more confused than they are. And finally, I have a sighted young nephew who can handle a phone with a screen reader turned on and his eyes closed.
Educating people about screen reader basics has a major impact on integrating blind people into society. It could determine whether the future holds a wider gap between sighted and blind people or a brighter, more accessible future.