TTS/Screenreader support for blind players

Re: TTS/Screenreader support for blind players

by Guest » Fri Jul 09, 2021 4:49 pm

Howdy folks! I think the menus can be read with Capture 2 Text and Minispeaker, which are both free software. The issue is that the fancy font in the main menu should probably be replaced with a font similar to that of the settings menu and brightened up, while reducing the background to something like a solid black color. I've tried to read the menus with Capture 2 Text, but it wants to read the background as well and won't recognize dark text against the busy background. It did work... kinda... but I imagine it MIGHT be improved with these changes. Also, OCR with the NVDA screen reader MAY work as well... but I have not yet tested it. Lion OCR is an add-on to NVDA that works in a different way while still using Windows 10's OCR engine, so it is yet another option.

Capture 2 Text reads whatever you highlight with a hotkey and the mouse, then puts it in the clipboard as well as in a popup window, which I'd recommend turning off. Minispeaker simply reads everything copied to the clipboard, so using them both together can create a sort of rough screen reader.
It is not perfect but does work in other games.

Capture 2 Text
https://sourceforge.net/projects/capture2text/

Minispeaker
http://speakcomputer.com/text-to-speech ... eaker.aspx
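
For anyone curious how the clipboard half of that pairing works, here is a minimal Python sketch of the idea: poll the clipboard and speak any text that is new. It is not how Minispeaker is actually implemented. The function name and both callables are hypothetical stand-ins, so the clipboard reader and the speech engine are passed in rather than tied to any particular library.

```python
from typing import Callable, List, Optional


def speak_new_clipboard_text(
    read_clipboard: Callable[[], Optional[str]],
    speak: Callable[[str], None],
    polls: int,
) -> int:
    """Poll the clipboard `polls` times and speak each new, non-empty text.

    Returns how many times `speak` was called.
    """
    last_spoken: Optional[str] = None
    spoken = 0
    for _ in range(polls):
        text = read_clipboard()
        # Skip empty reads and text that was already spoken last time.
        if text and text.strip() and text != last_spoken:
            speak(text.strip())
            last_spoken = text
            spoken += 1
    return spoken


# Stand-ins for a real clipboard and a real speech engine (assumptions).
fake_clipboard = iter(["New Game", "New Game", "", "Options", None, "Options"])
heard: List[str] = []
count = speak_new_clipboard_text(lambda: next(fake_clipboard), heard.append, polls=6)
```

A real loop would pause briefly between polls and run until the user quits; the fixed poll count here only keeps the sketch short and testable.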

Re: TTS/Screenreader support for blind players

by Alando1 » Mon Jun 07, 2021 7:14 pm

Graf Zahl wrote:
objectinspace wrote: ???

If empowering the 35 million plus people with sight loss around the world to play Doom has "very little payoff in the end," what is the point of this project? I don't imagine, for example, adding OpenGL/DX11 support to a game that's some 30 years old was trivial either, but it was done, because people wanted it.
How many of these play Doom? You seem to be the first one to come around.
I'm really sorry but this is a hobby project - I simply do not have the time to invest into a feature that may ultimately be used by a handful of users. I surely won't say "no" if someone stepped up to implement it, but as things stand, I see zero chance for it to happen.
@Graf Zahl - Back in 2017, a gentleman named Toby (who the wad is named after) reached out to me. There's a documentary called "Gaming Through New Eyes" that he showed me. Here's the link to the video: https://youtu.be/P7n9s7yBlGw

In fact, there are many blind/visually impaired gamers out in the world. I've teamed up with not just Toby, but SightlessKombat and IllegallySighted. These gentlemen have been helping me and my programmer, Jarewill, to further improve the accessibility features. A handful of blind/visually impaired people comment on some of my YouTube videos about the mod. If they have any questions or concerns, I do my best to help them. All in all, there are many people who are blind and want to play games such as Doom. I'm willing to do what I can to give them the chance to have fun with an amazing game.

I am very grateful there are many people who support my project and I'm very grateful I get to work with a group of amazing individuals who helped shape this project into what it is today. As I have pointed out on the main project thread regarding the accessibility mod, if anyone would like to contribute something to the project, they are more than welcome.

-Alando1

Re: TTS/Screenreader support for blind players

by wildweasel » Sun Jun 06, 2021 1:47 pm

Proydoha wrote:Menu problem can be attacked from a different angle. You can load into any map bypassing menus by providing suitable launch parameters. Usage of various launchers is not an unknown concept for Doom players.

Maybe somebody can make a launcher that would compose all launch parameters needed as well as editing an .ini file if settings need to be changed while providing all sounds needed in the launcher and not in the game.
This doesn't necessarily mean the menu problem shouldn't be tackled at some point; what if the player needs to change game options, or save the game?

Re: TTS/Screenreader support for blind players

by Proydoha » Sun Jun 06, 2021 12:30 pm

Menu problem can be attacked from a different angle. You can load into any map bypassing menus by providing suitable launch parameters. Usage of various launchers is not an unknown concept for Doom players.

Maybe somebody can make a launcher that would compose all launch parameters needed as well as editing an .ini file if settings need to be changed while providing all sounds needed in the launcher and not in the game.
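
As a rough illustration of the launcher idea, the Python sketch below only composes the command line; it does not start the game or touch any .ini file. The `-iwad`, `-file`, `-skill` and `+map` parameters are standard ZDoom-family options as far as I know, but check the engine's own documentation before relying on them. The function name, executable name and file names are hypothetical.

```python
from typing import List, Optional, Sequence


def build_launch_command(
    executable: str,
    iwad: str,
    map_name: str,
    skill: Optional[int] = None,
    mod_files: Sequence[str] = (),
) -> List[str]:
    """Compose a command line that starts directly on a map, skipping the menus."""
    command = [executable, "-iwad", iwad]
    if mod_files:
        # -file takes one or more mod archives.
        command.append("-file")
        command.extend(mod_files)
    if skill is not None:
        command.extend(["-skill", str(skill)])
    # +map runs the "map" console command at startup.
    command.extend(["+map", map_name])
    return command


# Hypothetical file names, for illustration only.
command = build_launch_command(
    "gzdoom", "doom2.wad", "MAP01", skill=3, mod_files=["accessibility_mod.pk3"]
)
```

A launcher built this way could speak each choice itself (game, mod, map, skill) and then hand the finished list to something like `subprocess.run`.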

Re: TTS/Screenreader support for blind players

by SanyaWaffles » Sat Jun 05, 2021 8:35 pm

@Alando1 - I appreciate the work you're doing. I don't think I ever have said that.

There are ways of doing menus with custom sounds in the menus in theory to add such a feature. Check and see how ZScript implements it.

Re: TTS/Screenreader support for blind players

by Alando1 » Sat Jun 05, 2021 1:42 pm

A screen reader would be helpful, indeed. However, my team and I, who are working on the Toby Accessibility Mod, have been trying to work out whether it is possible to have specific audio cues based on what the Doom skull cursor is highlighting. It would be nice if we could define unique sounds for the Doom menu - Perhaps something with ZScript?

If anyone wants to check out the Toby Accessibility Mod, you can find it here: viewtopic.php?f=4&t=71349

-Alando1

Re: TTS/Screenreader support for blind players

by Enjay » Sun May 30, 2021 5:37 am

Fair enough. My experience with OCR is limited.

I tried it many years ago to convert scanned paper documents to text and it had so many errors that it was actually easier to re-type the documents in most cases. More recently, however, at work we needed to do the same thing but the OCR worked well enough that the errors were minimal and all that was needed was a quick proofread. So because things had improved for that job, I wondered if it might be possible elsewhere. However, I can certainly understand how the low-res pixelated graphical fonts from a 320x200 game would be difficult.

Perhaps one day AI will be intelligent enough to look at shapes and make sense of them in a similar way to how humans can - but not yet I guess.

Re: TTS/Screenreader support for blind players

by Graf Zahl » Sun May 30, 2021 4:51 am

Enjay wrote: I'm just wondering about this from the opposite perspective. Does some sort of overlay software exist that can do OCR and then do a TTS conversion on the fly? Would it take too much processing power? Would it be too confused by "fancy" fonts?
You may do something like that if all you use is simple-to-read stock fonts - and even then OCR is still hit and miss. Doing it with graphical fonts - especially extremely low res ones blown up 5-6 times - is not going to work. Having done some processing of books into a digital format I can assure you that it just won't work. Most OCR software will already bail out if you feed it some blackletter printing from early 20th century Germany, and these are known fonts, not some random collections of pixels.

Re: TTS/Screenreader support for blind players

by Enjay » Sun May 30, 2021 4:11 am

objectinspace wrote:what is the point of this project?
Primarily to get the Doom engine running on modern hardware with a modern renderer and to (massively) expand the modding capabilities while remaining compatible with as many legacy mods as possible. Anything else, including accessibility for disabled users, is a bonus - but not a main goal. GZDoom is already more accessible than the original Doom games.
objectinspace wrote:I'm actually extremely offended by this answer, on a deep personal level. Your attitude is ignorant, hateful, bigoted, ableist--and most importantly, wrong.
You are offended - well, that's disappointing, but that's all it is. (I tend to agree with the Stephen Fry quote about people being offended. (Though perhaps not so hard-over as that quote implies, and I certainly have sympathy in this particular case. There are times, IMO, when being offended is an appropriate response.)) However, I feel that your response could be described using a very similar set of words as the ones you use.

I get that being ignored, marginalised and dismissed for a disability (or for any reason for that matter) is hugely frustrating, hurtful, sometimes even dangerous and a daily occurrence for many people. It sucks, I really do get that (which is why I feel that the Stephen Fry position is too harsh in this case). But GZDoom is a hobby project; anything and everything that has been done with it is as a free gift to the community. Coming out swinging at the people who invest thousands of hours to make it really isn't a good way forward. Demanding action from someone who has no obligation to you and who does not have the time/skills/whatever to do what you want and attacking them by calling them names for it isn't going to help either. Surely the irony in you using language likely to be interpreted as offensive to describe someone who offended you will not be lost on you.

Now that is out of the way...

I'm just wondering about this from the opposite perspective. Does some sort of overlay software exist that can do OCR and then do a TTS conversion on the fly? Would it take too much processing power? Would it be too confused by "fancy" fonts?

I agree with the views expressed by most of the contributors to this thread. I think that they are realistic and correct in the context of GZDoom. I'll not rehash them because they have been well covered many times already. But it strikes me that if some sort of software could sit on a person's computer and convert text on images to speech in real time it would be the answer to this.

I'm not suggesting that this would be something for the GZDoom devs to do (far from it) but it strikes me as the kind of thing that probably should already be in development (or already exist) somewhere and, if it does exist, then the people responsible would be the ones best placed to make it more universal to allow it to be used in games - any games - and a whole bunch of other programs too. It wouldn't just enable blind and visually impaired people to play a nearly 30 year old niche retro game, but any game! Does such a thing exist?

Re: TTS/Screenreader support for blind players

by SanyaWaffles » Sun May 30, 2021 3:32 am

MartinHowe wrote:As a disabled person myself, I would have had a much better life if I'd been born in 1995 instead of 1965, but there's nothing I can do about it; only hope (and, in my job, work) for a better life for people like me in 50 years' time.
Yeah, I can understand. I have a form of autism myself and have a hard time working in America, but I do my best. I do my best to work daily. It's tough, can't deny it.

You made some good points there MartinHowe.

Re: TTS/Screenreader support for blind players

by MartinHowe » Sun May 30, 2021 2:35 am

I think the problem here is the issue of bolting on a modern feature to a game architecture written before even the 1995 Disability Discrimination Act, let alone the 2010 Equality Act (using UK legislation for illustration here). Back in 1993, awareness among the general population - presumably including Id's programmers in their capacities as such - was low. As time has progressed, our understanding as a society has improved a bit; we now have laws to help disabled people and computer programming standards are more aligned to the needs of diverse people. The problem is, I feel, two-fold.

Firstly, the architecture of the Doom Engine is fundamentally 8-bit ASCII with images used to contain text (e.g., M_JKILL). Heretic improved that with the ability to write these as text from a doom-laden font (pardon the pun :)). Graf Zahl and others have added things like Unicode support. The engine has, generally speaking, evolved over time. However, to do EQD properly requires software to be designed from the ground up with it in mind, with all the hooks into the engine to support it. In the case of software with user mods, modders need to support it as well. Since Doom is 28 years old, it's simply too late. There are literally thousands of mods out there that were written by kids who had never even heard of EQD, at a time when, depending on their location and first language, the term EQD didn't even exist.

Secondly, the vast amount of effort and formality needed to support EQD properly is way beyond the resources of the average hobbyist or small team. Big megacorps and larger regular companies can afford teams of people who do nothing but oversee this sort of thing, compliance officers who check software and procedures to ensure they are EQD friendly, etc. Small companies and hobbyists simply don't have the time to do this as there are so many different disabilities, health conditions, races, languages, etc., to consider.

So this is really a resource issue, plus a historical legacy issue. As a disabled person myself, I would have had a much better life if I'd been born in 1995 instead of 1965, but there's nothing I can do about it; only hope (and, in my job, work) for a better life for people like me in 50 years' time. It's harsh, but that's the reality of it. I think better support for visually, or other, impaired people is possible to do in an old game, but it will never be as complete as a game designed with today's sensibilities in mind, and more importantly, it needs enough volunteers with time on their hands to do it.

Re: TTS/Screenreader support for blind players

by SanyaWaffles » Sat May 29, 2021 8:49 pm

I'm just going to say, this is probably a post that is going to ruffle some feathers. Mainly because I have plenty of experience with this.

First, as Phantombeta said, the issue is that there are multiple ways text is rendered and processed. We have to account for mods, and since GZDoom is a game that supports a plethora of modded content - we have to make backwards compatibility a paramount effort. This would singlehandedly, given the scope of the project, be the biggest undertaking of the engine - probably bigger than getting it GPLv3 compliant.

One of my partners is blind and she contributes to the work we do extensively at our studio. I haven't asked her about this specific topic (GZDoom hooking into screen readers), but she seems to enjoy just having a more auditory experience when we stream the game. Our project makes extensive use of voice acting, rather than relying solely on text-related prompts like most games do.

I know it devastated her to lose her sight and put her into a deep depression, which sometimes hits her like a ton of bricks, and she wishes she could engage in gaming again. This is why I do my best to work within the GZDoom engine to make it so people like Jenna don't have to rely on just a reading experience. And she's completely blind - only able to see basic colors and light. She can't make out shapes. She's been like this since 2017. It's one of the reasons SHDX had such extensive use of voice acting, and we're continuing that for Project Absentia.

My opinion on the matter - mainly from talking to Jenna extensively about her being blind and what she needs from a user friendly experience - is this is a problem with a ton of open source projects - there's not enough funding and not enough dedicated people, and people tend to get it wrong even in an open source environment. And it can be even worse in a corporate environment. You think it's bad in the open source community? Look at Twitch - a company owned by a gigantic monolithic corporation that bleeds their workers dry. Their idea of accessibility? Fucking censoring people because "blind playthrough" is somehow ableist. She doesn't agree with that ruling - and I'm going to listen to Jenna more than some random Twitterite on the matter, frankly. And you got corporations rather taking the issue all the way to the supreme court than funneling money to make their goddamn app accessible to NVDA or Talkback or whatever you use.

That said, she also knows small hobby engines like GZDoom don't have the manpower to implement something like this - it's a community effort, but none of us, myself included, are experienced with how to hook something almost entirely graphical (with no windows form UI to speak of) into something like a screen reader. The best way to do it would be to either fork the engine yourself with people who know what needs to be implemented, or do as I'm doing - work within the framework of the engine and use stuff like ZScript (which powers most of the menus and most game logic) to make the user experience less reliant on text alone.

The thing is, it can be done if people were willing to put effort into it. People who know what they're doing. People who'll listen to people with poor sight - and do more than just polish up mere language. However, it's something that's hard to do when you have a ton of dedicated time and there's all sorts of ways it could be implemented - and it could end up wrong.

I do want to add something though. And this might come across harsh.
I'm actually extremely offended by this answer, on a deep personal level. Your attitude is ignorant, hateful, bigoted, ableist--and most importantly, wrong.
I know I'm not the devs thus it was not directed at me, but assuming everyone here is all these "isms" is fucking stupid, frankly - and I don't care if the devs or the mod teams get angry at me, I'm not going to stand for this.

There are people here who know this shit - I deal with Jenna's condition daily, and she's been having issues with her hearing on top of all this - and I'm deeply worried I'm going to lose her. It breaks my heart. She's never seen my face, and I've visited her twice! And it's a fucking pain to work with apps that aren't coded properly - so it's a crapshoot if NVDA will even work with whatever we have to use for collaboration. I was on a podcast with Jenna recently, and it was 50/50 that NVDA would work with the website we used to do the podcast.

And yes, in WIS, I do tailor the apps we use to accommodate her lack of sight. Even then it's not 100%. It's making me seriously consider moving to Office once we get enough funding just so she can have accessibility - Google Docs ain't cutting it as much as I'd like it to.

So I get it - I fucking get it so goddamn much. It makes me really goddamn sad to see her go through this shit and the only thing I can do is, within my knowhow and what she tells me, try to make her contributions at Waffle Iron Studios count, and to do my best to make my projects accessible to her so she can "see" the fruits of her labor rather than never be able to experience what we've made.

I don't consider it a hindrance. She's my best friend and I love her dearly. And I'm doing the best I can within my means to make things accessible for her. However, it pains me there's only so much I can fucking do with a limited budget. If I knew someone and had billions to spare, you bet your britches I'd point them towards GZDoom and go "hey, make this hook into NVDA please"

I'm getting quite tired of people just assuming that because I can't just snap my fingers and make shit happen, personally dragging out Jeff Bezos to seize his means of production myself, or make everything accessible when I have no goddamn money yet and my programming skills aren't the best aside from working with a fork of an engine older than most of my friends, I'm somehow a bigoted monster who wants minorities to fry or some shit. I have fucking limits. I'm just a single person. And I'm tired of this goddamn mentality being prevalent everywhere I go. People assuming the worst out of everyone. It frankly needs to stop.

EDIT: Talking to Jenna about this just now. She says the best way would be to hook a TTS into GZDoom itself, but that'd be such massive overhead from what she's describing to me, and still would be a monumental task. Not to mention, it'd make the software massive unless you choose a proper library for it, or use dedicated speech APIs per system. However, that might be horrible to code given we target tons of OSes - and they all probably have their own.
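
To make the "dedicated speech APIs per system" point concrete, here is a small Python sketch that picks a speech command per platform instead of bundling a TTS library. It assumes the usual external tools: `say` on macOS, the `System.Speech` .NET assembly driven through PowerShell on Windows, and `espeak` on Linux (which has to be installed separately). The function name is hypothetical, and this is only one possible approach, not something GZDoom does.

```python
import sys
from typing import List


def speech_command(text: str, platform: str = sys.platform) -> List[str]:
    """Return an external command that would speak `text` on the given platform."""
    if platform == "darwin":
        # macOS ships the `say` command-line tool.
        return ["say", text]
    if platform.startswith("win"):
        # PowerShell single-quoted strings escape a quote by doubling it.
        escaped = text.replace("'", "''")
        script = (
            "Add-Type -AssemblyName System.Speech; "
            "(New-Object System.Speech.Synthesis.SpeechSynthesizer)"
            f".Speak('{escaped}')"
        )
        return ["powershell", "-NoProfile", "-Command", script]
    # Assumption: everything else is treated as Linux with espeak installed.
    return ["espeak", text]


mac_cmd = speech_command("New Game", "darwin")
linux_cmd = speech_command("Save Game", "linux")
windows_cmd = speech_command("Don't quit", "win32")
```

Even this tiny dispatch shows the maintenance cost being described: every branch is a different tool with its own quoting rules and failure modes, and none of them talks to a screen reader the user already has running.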

I also want to state, there is UI scaling available, and I'm working to make sure my project works with this properly.

Re: TTS/Screenreader support for blind players

by peewee_RotA » Sat May 29, 2021 8:17 pm

The menus are all read from Language files right? Wouldn't this just require sending that language token off to some other method when certain events happen? (i.e. click or arrow key)

I see that the keys are handled through this CurrentMenu->CallResponder method. That CurrentMenu object has got to have a reference to the i18n token. I'm sure with just a little bit more familiarity, it would be a good item for a community member to look into.

Totally understandable to not try to take it on as a main development feature, but for a curious programmer from the community, this would be a great task.
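
The shape of that idea, sketched in Python rather than engine code: each menu item keeps its language token, and whenever the selection moves the token is looked up and handed to a speech callback. Everything here is hypothetical, including the class, the token names and the callback; none of it is GZDoom's actual menu API, which I have not verified beyond what is described above.

```python
from typing import Callable, Dict, List


class SpokenMenu:
    """Toy menu that announces the localized label of the selected item."""

    def __init__(
        self,
        item_tokens: List[str],
        language: Dict[str, str],
        speak: Callable[[str], None],
    ) -> None:
        self.item_tokens = item_tokens
        self.language = language
        self.speak = speak
        self.index = 0

    def announce(self) -> None:
        token = self.item_tokens[self.index]
        # Fall back to the raw token if no translation is available.
        self.speak(self.language.get(token, token))

    def move(self, step: int) -> None:
        """Handle an arrow key: move the selection (wrapping) and announce it."""
        self.index = (self.index + step) % len(self.item_tokens)
        self.announce()


# Hypothetical tokens and strings, for illustration only.
language = {"MENU_NEW_GAME": "New Game", "MENU_OPTIONS": "Options", "MENU_QUIT": "Quit Game"}
heard: List[str] = []
menu = SpokenMenu(list(language), language, heard.append)
for _ in range(3):
    menu.move(1)
```

This only covers menus whose items carry a token; menu entries drawn as images, which come up elsewhere in the thread, would need a label added by hand.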

Re: TTS/Screenreader support for blind players

by Rachael » Sat May 29, 2021 3:48 pm

I'm going to add to what Graf said here, speaking only for myself: (and this was all typed before phantombeta posted her response, so this post is ignorant of anything she said, sorry)

If it's as simple as hooking up an API to a ZScript function so that a modder can take over from there, then I can do that and am willing to do it.

But let's get down to brass tacks here: I am not visually impaired, and neither is anyone I live with. Yes, I take my sight for granted. I have no idea what I would do without it. And with that in mind, I am not unsympathetic to people who have vision loss, it's just that I have no idea what they need, and quite frankly it would be more insulting for me to try and guess at it than it would be to let someone who's much more capable and knowledgeable than me offer such assistance to said folks in a mod.

So that would be my answer. In the end it is a hobby project, and no I am not going to spend a week researching this topic (especially because I am fairly sure I'll get it wrong anyway if I do - I am just too lacking in knowledge on this subject, that's all there is to it).

Re: TTS/Screenreader support for blind players

by phantombeta » Sat May 29, 2021 3:41 pm

objectinspace wrote:If empowering the 35 million plus people with sight loss around the world to play Doom has "very little payoff in the end," what is the point of this project? I don't imagine, for example, adding OpenGL/DX11 support to a game that's some 30 years old was trivial either, but it was done, because people wanted it.
GZDoom has many, many ways to display text, a couple of which the engine can't even recognize as being text.
And there's also how doing this naïvely would make older mods (and pretty much any map that wants to draw an image to the screen) constantly spam the letter "A" at you, because the only way to draw images up until pretty recently was by telling it to use a non-font image as a font, then telling it to draw the letter "A". And that's not even going into how none of us (the developers) would really be able to test this properly.

Ultimately, there would indeed be "very little payoff in the end", because most mods wouldn't work properly with it. At best, a couple of mods might work well, but everything else would have issues working with it.
The only way to really have such a thing work acceptably would be to add generic functions for it and have mods implement this themselves, which wouldn't immediately solve the problem, as it'd require modders to make use of it... Which would probably take quite a while to be used in most mods, if ever.
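
A toy Python model of the two approaches described above may help. In it, a draw call records the "font" and the characters drawn, so images drawn through the old fake-font trick show up as the letter "A". The `spoken_label` field stands in for the generic, opt-in function a mod would have to call; it and all the names below are hypothetical, not an existing GZDoom feature.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DrawCall:
    font: str                           # a real font, or an image used as a one-glyph font
    text: str                           # the characters the engine was asked to draw
    spoken_label: Optional[str] = None  # hypothetical opt-in text supplied by the mod


def naive_speech(calls: List[DrawCall]) -> List[str]:
    """Speak everything that was drawn as text, with no way to tell images apart."""
    return [call.text for call in calls]


def opt_in_speech(calls: List[DrawCall]) -> List[str]:
    """Speak only what a mod explicitly marked as meant to be read aloud."""
    return [call.spoken_label for call in calls if call.spoken_label is not None]


# One frame of an older mod: two images drawn as the letter "A", one real message.
frame = [
    DrawCall("HUD_FACE_IMAGE", "A"),
    DrawCall("AMMO_ICON_IMAGE", "A"),
    DrawCall("SMALLFONT", "Picked up a shotgun", spoken_label="Picked up a shotgun"),
]
naive = naive_speech(frame)
opted_in = opt_in_speech(frame)
```

The naive version reads "A, A, Picked up a shotgun" every frame, while the opt-in version is silent for any mod that never sets a label, which is exactly the adoption problem raised above.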
objectinspace wrote:I'm actually extremely offended by this answer, on a deep personal level. Your attitude is ignorant, hateful, bigoted, ableist--and most importantly, wrong.
There's nothing hateful or bigoted about this. No one ever said blind people shouldn't play games, the issue is with implementing it into GZDoom specifically.
objectinspace wrote:I also think the implementation will be easier than you think, as there are plenty of other games and emulators--Skullgirls, GTAV, RetroArch, and ScummVM come to mind--that have added some form of speech output for blind players, either by the developers or after the fact via modding.
All of which were certainly developed by groups of people with existing knowledge of how to do this, and most likely without supporting other mods being used on top of the accessibility mods. GZDoom is developed by an extremely small team, none of whom even receive donations, and who have zero knowledge of how to do anything like this.

Skullgirls is a commercial game, developed by people who get paid for it, and can even spend money to get it done. GTA V was done by modders who specifically set out to do this, and who most likely already knew what they were doing beforehand. RetroArch and ScummVM are huge projects with dozens or hundreds of contributors. ScummVM even gets assisted with this by having things standardized in the engine.
And unlike GZDoom, these projects aren't a hodgepodge of 20+ years of code (a not-insignificant amount of which was pretty badly designed) that's sometimes only held together by constant maintenance.

If you were willing to find or pay someone else to add and maintain this, sure, we'll try to take it in. But if we have to add and maintain it ourselves, or it causes problems, it simply won't happen, because it isn't feasible for us to do so. Please don't think we don't want this; there are simply technical difficulties with supporting such a thing in this specific engine.

[Edit]: Added the OP's username to the quote headers.
