As the internet becomes dominated by images, Facebook is launching a system which can “read” photos and tell visually impaired people what appears in them.
The internet is changing. From a medium based almost entirely on text, it is now becoming increasingly picture-led. An estimated 1.8 billion images are uploaded every day to social networks such as Twitter, Instagram and Facebook.
Good news for aspiring photographers, bad news for blind or partially sighted users who often have no way of telling what is in an image – despite the available modern assistive technologies.
But a new service from Facebook, being launched on Tuesday, is attempting to remedy that.
Blind people use sophisticated navigation software called screenreaders to make computers usable. They turn the contents of the screen into speech output or braille. But they can only read text and can’t “read” pictures.
Using artificial intelligence (AI), Facebook’s servers can now decode and describe images uploaded to the site and provide them in a form that can be read out by a screenreader.
The man behind the development is Matt King, a Facebook engineer who lost his sight as a result of retinitis pigmentosa – a condition which destroys the light-sensitive cells in the retina.
“On Facebook, a lot of what happens is extremely visual,” King says. “And, as somebody who’s blind, you can really feel like you’re left out of the conversation, like you’re on the outside.”
The technology that King and his team have developed uses Facebook’s in-house object-recognition software to decipher what an image contains. It has been trained to recognise items such as food and vehicles.
“Our artificial intelligence has advanced to the point where it’s practical for us to try to get computers to describe pictures in a meaningful way,” King says.
“This is in its very early stages, but it’s helping us move in the direction of that goal of including every single person who wants to participate in the conversation.”
The system currently describes images in fairly basic terms such as: “There are two people in this image and they are smiling.”
Facebook says it has now trained its software to recognise about 80 familiar objects, from cars, trains and food to settings such as mountain, water and beach, and sports such as tennis, swimming and golf. It adds the descriptions to each photo as alternative text, or alt text, which a screenreader can read out. The more images it scans, the more sophisticated the software will become.
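For readers curious how recognised objects could end up as alt text, here is a minimal sketch in Python. It is not Facebook's code: the function names, the confidence scores, the 0.8 threshold and the "Image may contain" wording are all assumptions made for illustration.

```python
from html import escape


def build_alt_text(detections, threshold=0.8):
    """Turn confident detections into a short description.

    `detections` is a hypothetical mapping of label -> confidence score
    in [0, 1]; a real system would get these from a recognition model.
    """
    # Most confident labels first, keeping only those at or above the threshold.
    ranked = sorted(detections.items(), key=lambda item: -item[1])
    confident = [label for label, score in ranked if score >= threshold]
    if not confident:
        return "Image"  # assumed fallback when nothing is recognised
    return "Image may contain: " + ", ".join(confident)


def img_tag(src, detections, threshold=0.8):
    """Build an HTML <img> element whose alt attribute carries the description."""
    alt = build_alt_text(detections, threshold)
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    example = {"beach": 0.95, "water": 0.9, "golf": 0.3}
    print(img_tag("photo.jpg", example))
```

A screenreader that reaches the resulting element reads the alt attribute aloud in place of the picture. Dropping low-confidence labels is one possible design choice here: it trades completeness for fewer wrong descriptions.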
Last month, Twitter added a similar function which enables users to manually add their own descriptive text to images. Although manually written descriptions may be better, Twitter’s approach requires users to actively choose to add them, whereas Facebook’s new system automatically tags every photo.
King and Facebook would like the system to go one step further and use face recognition to identify people in a picture by name with help from their database of users, but others are resisting the idea on privacy grounds.
For King, it is a matter of principle – he says sighted and visually impaired people should have equal access to the content posted online. Sighted people know who is in many of the photos they see, so blind people should also be allowed that same privilege, he believes.
“I feel I have a right to that information,” he says. “I am asking for information that is already available to other people to be revealed to me. So I see it as a matter of fairness.”
Jeff Wieland, head of the Facebook accessibility team, says the social networking site is investing in accessibility and devising strategies for different communities, to allow them to engage with it.
He says the site is “going to have dedicated teams thinking about how to get all these different communities on-board and connecting with each other. That is the chance for us to be equalisers and to really empower the world.”