Possibilities for intelligent mobile screenreaders

30th September, 2010, E. M. Rogers

Introduction

Why couldn't a screenreader do more than layer itself over an inappropriate GUI? Why, in this age of the smartphone, couldn't it intelligently manage the content within a flexible, independent container of its own creation?

Screenreaders

The Pachyderms' Picture BCD will, when built, rely entirely on a software package for the blind and visually impaired known as a screenreader. It is the same for all BCDs. The screenreader interprets the textual and graphical output produced by other programs and by the OS itself, matches this against the user's key commands and then either speaks it aloud or feeds the resulting line of Grade 1 or Grade 2 Braille to the BCD. Some screenreaders are built into systems (such as the sonorously voiced Microsoft Narrator on all versions of Windows post ME), some are free (such as Orca) and some are stand-alone commercial products (such as the highly popular JAWS).

A screenreader's function is to provide an efficient way of navigating around the modern GUI by keyboard whilst simultaneously reducing the information therein down to the bare textual content. Perhaps screenreaders are missing an opportunity to do more than this. Or rather, perhaps the current trends within smartphone operating systems present an opportunity for them.

A smartphone trend

Increasingly noticeable through the haze of the iPhone's two-finger-stroke-operated glamour, which has so thoroughly enveloped the rejuvenated smartphone market, is the trend towards seeing computing devices as a way of efficiently bringing together individual packets of information. Not necessarily information as it is to be found on a web browser, a sprawling mass of chaotic links through which the user floats; neither as a series of large lumpen files, dotted across the folder metaphor of a hard disc. Rather, it is a stream of distinct nuggets of information which, though quite possibly as closely related to each other in content as the paragraphs in a website's homepage or Microsoft Word file, are intended to be far more flexible in the mode of their delivery.

The rapid patter of incoming texts, tweets, RSS updates, weather reports, Facebook messages, pokes, GPS directions, pictures of cats and, of course, e-mails from the Managing Director of the Bank of Nigeria is what defines modern smartphone usage. Smartphones running iOS, Android, QNX et al. are not so much tools to aid the creation and consumption of complete and relatively lengthy forms of media (films, word-processed documents, PC games &c.), as traditional desktop computers usually are; they focus more on manipulating this stream to constantly present the user with whatever vital or utterly trivial nuggets they most want to see at that very moment. God forbid that your phone should do anything but fall over itself to present you with instant updates on the latest everything.

Getting to the point

The smartphone category, which has risen so rapidly in significance and popularity over the last few years with the renewed focus on slick GUIs, makes much use of this very modular approach to information. And as this approach is highly suitable for screenreaders, which of course put absolutely no focus on slick GUIs, it may be that the two types of product, superficially with little similarity between them, could in fact be well suited to each other.

Could it be possible for screenreaders on mobiles to take advantage of this modularity? Perhaps so, by using it to populate a unique, dynamic document which represents everything the blind user is interested in and is participating in. This would be a kind of XML file: the screenreader inputs information into it, the user reads that information and enters their own, and the screenreader sends the new or altered information back to the app or system call responsible for that communication. This could sit on top of the file system and the network access, not obliging the user to interact with either but mirroring them, so that any changes are instantly relayed back from the dynamic document to the underlying system and vice versa.
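
Purely as an illustration of that two-way mirroring, here is a minimal sketch in Python. Everything in it is an assumption made for the sake of the example: the DynamicDocument class, the "item" elements, the "sms" source name and the write-back callbacks are all hypothetical and do not correspond to any existing screenreader or smartphone API. It only shows the shape of the idea: the system pushes nuggets into one XML document, the user reads and edits them there, and each edit is relayed back to whichever source owns it.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Optional

# Hypothetical callback type: receives (item_id, text) and hands the
# text back to the app or system call that owns that kind of item.
WriteBack = Callable[[str, str], None]


class DynamicDocument:
    """Hypothetical mirror document: one <item> per nugget of information."""

    def __init__(self) -> None:
        self.root = ET.Element("document")
        self._writers: Dict[str, WriteBack] = {}

    def register_source(self, source: str, write_back: WriteBack) -> None:
        """Remember how to relay user edits back to a given source."""
        self._writers[source] = write_back

    def _find(self, source: str, item_id: str) -> Optional[ET.Element]:
        for item in self.root.findall("item"):
            if item.get("source") == source and item.get("id") == item_id:
                return item
        return None

    def _store(self, source: str, item_id: str, text: str) -> None:
        # Create the item on first sight, otherwise refresh its text.
        item = self._find(source, item_id)
        if item is None:
            item = ET.SubElement(self.root, "item",
                                 {"source": source, "id": item_id})
        item.text = text

    def update_from_system(self, source: str, item_id: str, text: str) -> None:
        """Underlying system -> document."""
        self._store(source, item_id, text)

    def edit_by_user(self, source: str, item_id: str, text: str) -> None:
        """User -> document -> underlying system."""
        write_back = self._writers[source]  # KeyError if source is unknown
        self._store(source, item_id, text)
        write_back(item_id, text)

    def lines(self) -> List[str]:
        """The plain text a speech or Braille back end would be fed."""
        return [f"{item.get('source')}: {item.text}"
                for item in self.root.findall("item")]


if __name__ == "__main__":
    outbox: Dict[str, str] = {}  # stands in for a real messaging app
    doc = DynamicDocument()
    doc.register_source("sms", lambda item_id, text: outbox.update({item_id: text}))
    doc.update_from_system("sms", "1", "Running late")  # incoming text
    doc.edit_by_user("sms", "2", "No problem")          # the user's reply
    for line in doc.lines():
        print(line)
```

The callbacks are the only point at which the document touches the underlying system, which is what would allow the user to work entirely within the document whilst the file system and network stay out of sight.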

By virtue of its absolute reliance on text (whether spoken or rendered in Braille) a screenreader of this ilk could make hay of current trends and introduce its users to an experience which may be so efficient in its focus and flexibility that for some forms of creation and consumption it surpasses even the methods available to the sighted user.

Caveat lector

How, exactly, would this work? I don't know. I don't even know generally, let alone exactly. It was a thought which struck me whilst I was being especially anal about the mark-up of an HTML5 website I'm halfway through writing, a thought which I've tried to put on paper in essentially the same form that it occurred to me. It is therefore largely unsubstantiated and unresearched, but, nevertheless, one which I feel might have some promise.

I'd be interested if anyone else has been thinking along similar lines, knows of an existing solution or even (especially, in fact) has spotted some flaw, some unfounded assumption in this article which has eluded me. I can be contacted in the PM studio or via pachpict@sdf.lonestar.org.