Assistive Tech at the End of Sight



Seeing his words on the printed page is a big deal to Andrew Leland, as it is to all writers. But the sight of his thoughts in written form is far more precious to him than to most scribes. Leland is steadily losing his vision as a result of a congenital condition called retinitis pigmentosa, which slowly kills off the rods and cones that are the eyes’ light receptors. There will come a point when the biggest type, the faces of his loved ones, and even the sun in the sky won’t be visible to him. So, who better to have written the newly released book The Country of the Blind: A Memoir at the End of Sight, which presents a history of blindness that touches on events and advances in social, political, artistic, and technological realms? Leland has beautifully woven in the gleanings from three years of deteriorating sight. And, to his credit, he has done so without being the least bit doleful or self-pitying.

Leland says he began the book project as a thought experiment that would allow him to figure out how he could best manage the transition from the world of the sighted to the community of the blind and visually impaired. IEEE Spectrum spoke with him about the role technology has played in helping the visually impaired navigate the world around them and enjoy the written word as much as sighted people can.

IEEE Spectrum: What are the bread-and-butter technologies that most visually impaired people rely on for carrying out the activities of daily living?

Andrew Leland: It’s not electrons like I know you’re looking for, but the fundamental technology of blindness is the white cane. That is the first step of mobility and orientation for blind people.

A book cover shows illustrations of sightless individuals in different action poses. The text reads The Country of the Blind, A Memoir at the End of Sight, Andrew Leland.

It’s funny…. I’ve heard from blind technologists who will often be pitched new technology that’s like, “Oh, we came up with this laser cane and it’s got lidar sensors on it.” There are tools like that which are really useful for blind people. But I’ve heard super techy blind people say, “You know what? We don’t need a laser cane. We’re just as good with the ancient technology of a really long stick.”

That’s all you need. So, I would say that’s No. 1. No. 2 is about literacy. Braille is another old-school technology, but there is, of course, a modern version of it in the form of a refreshable Braille display.

How does the Braille display work?

Leland: So, if you imagine a Kindle, where you turn the page and all the electronic ink reconfigures itself into a new page of text. The Braille display does a similar thing. It’s got anywhere between like 14 and 80 cells. So, I guess I need to explain what a cell is. The way a Braille cell works is there are as many as six dots arranged on a two-by-three grid. Depending on the permutation of those dots, that’s what the letter is. So, if it’s just a single dot in the upper left space, that’s the letter a. If it’s dots one and two, which appear in the top two spaces on the left column, that’s the letter b. And so, in a Braille cell on the refreshable Braille display there are little holes that are drilled in, and each cell is the size of a finger pad. When a line of text appears on the display, different configurations of little soft dots will pop up through the drilled holes. And then when you’re ready to scroll to the next line, you just hit a panning key and they all drop down and then pop back up in a new configuration.
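The cell layout Leland describes can be sketched in a few lines of Python. This is an illustration only, not how any particular display's firmware works: the function names (`cell`, `to_cells`, `pan`) are made up for this sketch, only the letters a through j are included, and the 14-cell width is simply the low end of the range he mentions. It relies on the Unicode Braille Patterns block, where dot n of a cell corresponds to bit n-1 above code point U+2800.

```python
# Dots 1-3 run down the left column of a cell, dots 4-6 down the right column.
BRAILLE_BASE = 0x2800  # Unicode code point of the blank Braille cell

# Dot numbers for the first ten letters (illustrative subset only).
LETTER_DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4), "j": (2, 4, 5),
}


def cell(dots):
    """Return the Unicode Braille character with the given dots raised."""
    mask = 0
    for dot in dots:
        mask |= 1 << (dot - 1)  # dot n maps to bit n-1
    return chr(BRAILLE_BASE + mask)


def to_cells(text):
    """Convert text to Braille cells; unknown characters become blank cells."""
    return "".join(cell(LETTER_DOTS.get(ch, ())) for ch in text.lower())


def pan(text, width=14):
    """Split the text into display-width lines, one per press of the panning key."""
    cells = to_cells(text)
    return [cells[i:i + width] for i in range(0, len(cells), width)]


if __name__ == "__main__":
    for line in pan("a bad cab faded beige", width=14):
        print(line)
```

Each string returned by `pan` stands for one refresh of the display: the pins drop, then rise again in the next configuration.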

They call it a Braille display because you can hook it up to a computer so that any text that’s appearing on the computer screen, and thus in the screen reader, you can read in Braille. That’s a really important feature for deafblind people, for example, who can’t use a screen reader with audio. They can do all of their computing through Braille.

And that brings up the third really important technology for blind people, which is the screen reader. It’s a piece of software that sits on your phone or computer and takes all of the text on the screen and turns it into synthetic speech, or in the example I just mentioned, text to Braille. These days, the speech is a good synthetic voice. Imagine the Siri voice or the Alexa voice; it’s like that, but rather than being an AI that you’re having a conversation with, it moves all the functionality of the computer from the mouse. If you think about the blind person, having a mouse isn’t very useful because they can’t see where the pointer is. The screen reader pulls the page navigation into the keyboard. You have a series of hot keys, so you can navigate around the screen. And wherever the focus of the screen reader is, it reads the text aloud in a synthetic voice.

So, if I’m going into my email, it might say, “112 messages.” And then I move the focus with the keyboard or with the touch screen on my phone with a swipe, and it’ll say “Message 1 from Willie Jones, sent 2 p.m.” Everything that a sighted person can see visually, you can hear aurally with a screen reader.
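The focus model described here can be sketched as a toy program. This is a minimal illustration under stated assumptions, not the design of any real screen reader: the `Element` and `ScreenReaderSketch` names are hypothetical, "speech" is just a returned string, and the fallback of announcing an element's role when it has no label is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Element:
    role: str        # e.g. "heading", "listitem", "button"
    label: str = ""  # accessible name; empty when the page author left it unlabeled


class ScreenReaderSketch:
    """Toy focus model: key presses move the focus, and the focused element is announced."""

    def __init__(self, elements):
        self.elements = list(elements)
        self.index = 0  # focus starts on the first element

    def announce(self):
        """Return the text a speech synthesizer would be asked to read aloud."""
        element = self.elements[self.index]
        # With no label, all that can be announced is the role.
        return element.label if element.label else element.role.capitalize()

    def next(self):
        """Move focus forward (a hot key or a swipe) and announce the new element."""
        self.index = min(self.index + 1, len(self.elements) - 1)
        return self.announce()

    def previous(self):
        """Move focus backward and announce the new element."""
        self.index = max(self.index - 1, 0)
        return self.announce()


inbox = [
    Element("heading", "112 messages"),
    Element("listitem", "Message 1 from Willie Jones, sent 2 p.m."),
    Element("button"),  # unlabeled control
]
reader = ScreenReaderSketch(inbox)
spoken = [reader.announce(), reader.next(), reader.next()]

if __name__ == "__main__":
    for phrase in spoken:
        print(phrase)
```

The unlabeled button shows why labeling matters: with nothing but a role to read, the sketch can only say "Button".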

You rely a great deal on your screen reader. What would the effort of writing your book have been like with your present level of sightedness if you had been trying to do it in the technological world of, say, the 1990s?

Leland: That’s a good question. But I would maybe suggest pulling back even further and say, like, the 1960s. In the 1990s, screen readers were around. They weren’t as powerful as they are now. They were more expensive and harder to find. And I would have had to do a lot more work to find specialists who would install it on my computer for me. And I would probably need an external sound card that would run it rather than having a computer that already had a sound card in it that could handle all the speech synthesis.

There was screen-magnification software, which I also rely a lot on. I’m also really sensitive to glare, and black text on a white screen doesn’t really work for me anymore.

All that stuff was around by the 1990s. But if you had asked me that question in the 1960s or ’70s, my answer would be completely different because then I might have had to write the book longhand with a really big magic marker and fill up hundreds of notebooks with giant print, basically making my own DIY 30-point font instead of having it on my computer.

Or I might have had to use a Braille typewriter. I’m so slow at Braille that I don’t know if I actually would have been able to write the book that way. Maybe I could have dictated it. Maybe I could have bought a really expensive reel-to-reel recorder (or if we’re talking 1980s, a cassette recorder) and recorded a verbal draft. I would then have to have that transcribed and hire someone to read the manuscript back to me as I made revisions. That’s not too different from what John Milton [the 17th-century English poet who wrote Paradise Lost] had to do. He was writing in an era even before Braille was invented, and he composed lines in his head overnight when he was on his own. In the morning, his daughters (or his cousin or friends) would come and, as he put it, they would “milk” him and take down dictation.


What were the important breakthroughs that made the screen reader you’re using now possible?

Leland: One really important one touches on the Moore’s Law phenomenon: the work done on optical character recognition, or OCR. There have been versions of it stretching back shockingly far, even to the early 20th century, like the 1910s and ’20s. They used a light-sensitive material, selenium, to create a device in the ’20s called the optophone. The technique was known as musical print. In essence, it was the first scanner technology where you could take a piece of text and put it under the eye of a machine with this really sensitive material and it would convert the ink-based letter forms into sound.

I imagine there was no Siri or Alexa voice coming out of this machine you’re describing.

Leland: Not even close. Imagine the capital letter V. If you passed that under the machine’s eye, it would sound musical. You would hear the tones descend and then rise. The reader could say “Oh, okay. That was a V.” And they would listen for the tone combination signaling the next letter. Some blind people read entire books that way. But that’s extremely laborious and a strange and difficult way to read.
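The descend-then-rise pattern for a capital V can be illustrated with a toy column scan. This is a sketch of the general idea only, not a model of the historical optophone: the glyph bitmap, the five pitch values in `ROW_TONES_HZ`, and the function name `scan` are all invented for illustration.

```python
# Hypothetical pitch for each row of the glyph, top row highest (values in Hz).
ROW_TONES_HZ = [784, 659, 587, 523, 392]

# A crude 5x5 bitmap of a capital V; "#" marks ink.
LETTER_V = [
    "#...#",
    "#...#",
    ".#.#.",
    ".#.#.",
    "..#..",
]


def scan(glyph, row_tones=ROW_TONES_HZ):
    """Sweep the glyph left to right; each column sounds the tones of its inked rows."""
    width = len(glyph[0])
    chords = []
    for col in range(width):
        chord = [row_tones[row] for row, line in enumerate(glyph) if line[col] == "#"]
        chords.append(chord)
    return chords


chords = scan(LETTER_V)
# The highest tone heard in each column traces the shape of the letter.
contour = [max(chord) for chord in chords]

if __name__ == "__main__":
    print(contour)  # [784, 587, 392, 587, 784]
```

In this toy model the contour falls toward the point of the V and then climbs again, which is the cue a trained listener would have had to recognize for every letter.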

Researchers, engineers, and scientists were pushing this sort of proto-scanning technology forward and it really comes to a breakthrough, I think, with Ray Kurzweil in the 1970s when he invented the flatbed scanner and perfected this OCR technology that was nascent at the time. For the first time in history, a blind person could pull a book off the shelf, [not just what’s] printed in a specialized typeface designed in a [computer science] lab but any old book in the library. The Kurzweil Reading Machine that he developed was not instantaneous, but in the course of a couple minutes, converted text to synthetic speech. This was a real game changer for blind people, who, up until that point, had to rely on manual transcription into Braille. Blind college students would have to hire somebody to record books for them (first on a reel-to-reel, then later on cassettes) if there wasn’t a special prerecorded audiobook.

Black and white photo of a young dark-haired girl with her eyes closed, and her fingers resting on a rectangular machine with buttons on it. Audrey Marquez, 12, listens to a taped voice from the Kurzweil Reading Machine in the early 1980s. Dave Buresh/The Denver Post/Getty Images

So, with the Kurzweil Reading Machine, suddenly the entire world of print really starts to open up. Granted, at that time the machine cost like a quarter million dollars and wasn’t widely available, but Stevie Wonder bought one, and it started to appear in libraries at schools for the blind. Then, with a lot of the other technological advances of which Kurzweil himself was a popular sort of prophet, these machines became more efficient and smaller. To the point where now I can take my iPhone and snap a picture of a restaurant menu, and it will OCR that restaurant menu for me automatically.

So, what’s the next logical step in this progression?

Leland: Now you have ChatGPT machine vision, where I can hold up my phone’s camera and have it tell me what it’s seeing. There’s a visual interpreter app called Be My Eyes. The eponymous company that produced the app has partnered with OpenAI, so now a blind person can hold their phone up to their fridge and say “What’s in this fridge?” and it will say “You have three-quarters of a 250 milliliter jug of orange juice that expires in two days; you have six bananas and two of them look rotten.”

So, that’s the sort of capsule version of the progression of machine vision and the power of machine vision for blind people.

What do you think or hope advances in AI will do next to make the world more navigable by people who can’t rely on their eyes?

Hands hold a phone with a chat open. The user has posted a photo, and asked the AI to describe the clothes in detail. Virtual Volunteer uses OpenAI’s GPT-4 technology. Be My Eyes

Leland: [The next big breakthrough will come from] AI machine vision like we see with the Be My Eyes Virtual Volunteer that uses OpenAI’s GPT-4 technology. Right now, it’s only in beta and only available to a few blind people who have been serving as testers. But I’ve listened to a couple of demos that they posted on a podcast, and to a person, they talk about it as an absolute watershed moment in the history of technology for blind people.

Is this virtual interpreter scheme a completely new idea?

Leland: Yes and no. Visual interpreters have been available for a while. But the way Be My Eyes traditionally worked is, let’s say you’re a totally blind person, with no light perception, and you want to know if your shirt matches your pants. You would use the app and it would connect you with a sighted volunteer who could then see what’s on your phone’s camera.

So, you hold the camera up, you stand in front of a mirror, and they say, “Oh, those are two different kinds of plaids. Maybe you should pick a different pair of pants.” That’s been amazing for blind people. I know a lot of people who love this app, because it’s super helpful. For example, if you’re on an accessible website, but the screen reader’s not working [as intended] because the checkout button isn’t labeled. So you just hear “Button button.” You don’t know how you’re going to check out. You can pull up Be My Eyes, hold your phone up to your screen, and the human volunteer will say “Okay, tab over to that third button. There you go. That’s the one you want.”

And the breakthrough that’s happened now is that OpenAI and Be My Eyes have rolled out this technology called the Virtual Volunteer. Instead of having you connect with a human who says your shirt doesn’t match your pants, you now have GPT-4 machine vision AI, and it’s incredible. And you can do things like what happened in a demo I recently listened to. A blind man had visited Disneyland with his family. Obviously, he couldn’t see the photos, but with the iPhone’s image-recognition capabilities, he asked the phone to describe one of the images. It said, “Image may contain adults standing in front of a building.” Then GPT did it: “There are three adult men standing in front of Disney’s princess castle in Anaheim, California. All three of the men are wearing t-shirts that say blah blah.” And you can ask follow-up questions, like, “Did any of the men have mustaches?” or “Is there anything else in the background?” Getting a taste of GPT-4’s image-recognition capabilities, it’s easy to understand why blind people are so excited about it.
