[Can't fix] No carriage returns in WSL console

Re: No carriage returns in WSL console

Postby Chris » Sat Mar 16, 2019 2:53 pm

Graf Zahl wrote:But the biggest issue here is not that it theoretically works but that in order to get full Unicode support you have to use a special API to access it because at least in Windows 8 there is no UTF-8 console code page so the standard file writing API cannot be used to send UTF-8 to the console.

Can't you use MultiByteToWideChar to convert UTF-8 to UCS-2 (or whatever it is MS is using), and write using the Unicode-aware *W functions? That's even what UTF-8 Everywhere suggests for Windows. Internally work with UTF-8, then at the point of interaction with the system APIs, convert to/from wide-chars and explicitly use the system's Unicode-aware functions.
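
The boundary conversion described here can be illustrated without any Windows calls. This is a minimal Python sketch, assuming only that the wide-character side is UTF-16 little-endian. The helper names `to_wide` and `from_wide` are made up for illustration; they stand in for what MultiByteToWideChar and WideCharToMultiByte do with CP_UTF8 on Windows.

```python
def to_wide(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16LE bytes (hypothetical helper)."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def from_wide(wide_bytes: bytes) -> bytes:
    """Convert UTF-16LE bytes back to UTF-8 bytes (hypothetical helper)."""
    return wide_bytes.decode("utf-16-le").encode("utf-8")


internal = "Gr\u00fc\u00dfe".encode("utf-8")   # "Grüße", kept as UTF-8 internally
wide = to_wide(internal)                       # what a *W function would receive
restored = from_wide(wide)                     # what comes back from the system

print(len(internal), len(wide), restored == internal)
```

The program only ever sees UTF-8; the wide form exists just long enough to cross the API boundary, and the round trip loses nothing.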

Re: No carriage returns in WSL console

Postby Graf Zahl » Sat Mar 16, 2019 3:00 pm

That's what I am already doing - and it is the cause of the problems here - WSL apparently doesn't handle WriteConsole output correctly. But since WriteConsole can only write to the console, there is no unified way to write Unicode text to stdout. None of the functions can universally handle all cases because none of them is UTF-8 capable.
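
The split described here, one path for a real console and another for redirected output, might look like this in outline. It is a portable Python sketch, not GZDoom's code: `pick_output_path` is a hypothetical name, and `isatty()` stands in for whatever check a Windows program would make (GetConsoleMode is one commonly used option) before choosing between WriteConsoleW and a plain byte write.

```python
import io
import sys


def pick_output_path(stream) -> str:
    """Return "console" for an interactive terminal, "bytes" otherwise."""
    try:
        is_console = stream.isatty()
    except (AttributeError, ValueError):
        is_console = False   # not a real stream, or already closed
    return "console" if is_console else "bytes"


redirected = io.BytesIO()   # stands in for a pipe or a file
if pick_output_path(redirected) == "bytes":
    # On the redirected path the program has to pick an encoding itself.
    redirected.write("caf\u00e9\n".encode("utf-8"))

print(pick_output_path(sys.stdout))
```

The point of the sketch is only that two code paths are needed; which encoding the redirected path should use is exactly the open question in this thread.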

Re: No carriage returns in WSL console

Postby dpJudas » Sat Mar 16, 2019 3:25 pm

From a strictly platform point of view, writing UTF-8 unconditionally to stdout is incorrect. The console uses the system "OEM" code page (not to be mistaken for the "ANSI" code page used by the Windows subsystem). The WriteConsole function most likely converts the text to the OEM character set before writing it out.

Yes, it is stupid, but essentially you're trying to do something the system officially does not support (*). That WSL on top of this doesn't handle newlines correctly is a different story of its own.

*) Strictly speaking, Windows 10 has an experimental feature where you can specify that the local character set is UTF-8, as modern Linux does. However, it is experimental because Microsoft doesn't seem to be willing to break the endless number of applications that do not support it - just as happened with Linux about 10-15 years ago when the major distros switched.
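
To make the OEM/ANSI distinction concrete, here is a small Python sketch. It assumes CP 437 as the OEM code page and CP 1252 as the ANSI code page, which is the common pairing on US English systems but is not universal. It shows that one character has three different byte representations, and what a CP 437 console would make of UTF-8 bytes.

```python
text = "\u00e9"   # é, LATIN SMALL LETTER E WITH ACUTE

oem = text.encode("cp437")     # assumed OEM code page
ansi = text.encode("cp1252")   # assumed ANSI code page
utf8 = text.encode("utf-8")

print(oem.hex(), ansi.hex(), utf8.hex())

# The same UTF-8 bytes, interpreted by something that assumes CP 437.
misread = utf8.decode("cp437")
print(ascii(misread))
```

The single character turns into two box-drawing and symbol characters when misread, which is the kind of garbage that unconditional UTF-8 output produces on a code-page console.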

Re: No carriage returns in WSL console

Postby Graf Zahl » Sat Mar 16, 2019 3:52 pm

The console is well capable of handling Unicode, just in the Windows flavour of UCS-2. That archaic old code page mapping is just the same as with the ANSI API, with the added stupidity of defaulting to a useless code page. This all wouldn't be a problem if Windows supported the UTF-8 code page 65001 here.

Regarding setting the local character set to UTF-8, I think that you have to really search for old apps that don't support Unicode yet and would get caught in this dilemma. Any semi-modern software already uses the Unicode API because it needs internationalization features working properly.
What I haven't been able to figure out yet is whether switching the ANSI API to UTF-8 is only a global switch or can be done by each app on its own - but right now it's mostly academic anyway because it'd break Windows 7 and 8 support for no real gain. It's just a necessary first step to sanitize Windows for the future and eventually get rid of those godforsaken code pages.

Re: No carriage returns in WSL console

Postby dpJudas » Sat Mar 16, 2019 5:56 pm

I think you'll be surprised just how many apps are written so poorly they use the legacy family of input/output. In fact, any C++ program using FILE and iostream stuff will be subject to this as they use the god awful C locale junk. Yes, the mainstream applications that cover all continents don't do this, but the bottom of the barrel line of business apps? Oh you can bet on it.

The internals of the console may very well be able to handle UCS-2, but the streams are a different matter. Even to this day, the official output format if you pipe a console application to disk is the active OEM character set. Which in turn means that if you then open the file in Notepad (which assumes the ANSI character set) you won't get the right characters, even though technically nothing went wrong anywhere. The whole system has been broken for 25 years.
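
The redirect-then-open scenario can be reproduced in a few lines. This is a hedged Python sketch that again assumes CP 437 for OEM and CP 1252 for ANSI; the file name is arbitrary and the two `open` calls merely play the roles of the console program and the editor.

```python
import os
import tempfile

text = "caf\u00e9"   # "café"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "redirected.txt")

    # The console program's redirected output arrives in the OEM code page.
    with open(path, "w", encoding="cp437") as f:
        f.write(text)

    # The editor reads the same bytes assuming the ANSI code page.
    with open(path, "r", encoding="cp1252") as f:
        shown = f.read()

print(ascii(text), ascii(shown))
```

No exception is raised at any point. The é simply comes back as a low quotation mark (U+201A), so both programs behave "correctly" and the text is still wrong.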

Graf Zahl wrote: This all wouldn't be a problem if Windows supported the UTF-8 code page 65001 here.

That's exactly what they added in Windows 10. But it's not the system default OEM code page and they called it experimental because they know it will break programs left and right.

Re: No carriage returns in WSL console

Postby Graf Zahl » Sat Mar 16, 2019 6:37 pm

dpJudas wrote:I think you'll be surprised just how many apps are written so poorly they use the legacy family of input/output. In fact, any C++ program using FILE and iostream stuff will be subject to this as they use the god awful C locale junk. Yes, the mainstream applications that cover all continents don't do this, but the bottom of the barrel line of business apps? Oh you can bet on it.


I wouldn't bet on them not being junk. Just in the last few days I was at a developer conference of my employer's parent company where several teams presented their software. The oldest ones are still using tools like VB6 and Visual FoxPro. I do not really want to know what the internals of their software look like...

As for typical business people and computers I have little hope. A friend of mine recently told me that his employer is having trouble with some customers - the reason is that they rolled out some new 3D presentation feature on their website which of course requires WebGL. And these customers expect it to run on some ancient IE version, but when told to update their browser they just get angry. These people are so stupid it is unbelievable. Our own graphics developer had similar complaints that he cannot do our own web app as he likes because of the need to support old browsers. It doesn't take rocket science to guess that the software in such outfits - where even the web browser is junk - is normally on the shitty side of things.

dpJudas wrote: The internals of the console may very well be able to handle UCS-2, but the streams are a different matter. Even to this day, the official output format if you pipe a console application to disk is the active OEM character set. Which in turn means that if you then open the file in Notepad (which assumes the ANSI character set) you won't get the right characters, even though technically nothing went wrong anywhere. The whole system has been broken for 25 years.


Of course it is broken. Haven't I said that the entire time? The entire console was a typical victim of Steve Ballmer's ignorance - it had no impact on business so it never got fixed. Let's hope that the recent work for Windows 10 ultimately results in something better. At least it looks like now they started to realize that they need a working console and not a piece of outdated garbage. The real irony here is that you cannot actually develop "proper" console apps with any development tool because they do not know CP 437 and most third party stuff I come across just uses CP 1252 instead, which creates broken text in the console if they forget to set the code page, but at least the redirected output is working.

dpJudas wrote: That's exactly what they added in Windows 10. But it's not the system default OEM code page and they called it experimental because they know it will break programs left and right.


It would be fully sufficient if they allowed setting this code page as the output code page programmatically; every app could then work properly with one added line of code.
But that little tidbit doesn't seem to exist, rendering the entire UTF-8 thing totally useless. Why use it if the existing way of converting to the W API is more robust and more portable?
You can bet that this will end up a failure if the only way to set it is globally. Why do they even have manifests on Windows if they cannot be used to tell the system "Hey, I am aware of this, let me in"?

Re: No carriage returns in WSL console

Postby dpJudas » Sat Mar 16, 2019 7:17 pm

Graf Zahl wrote: It would be fully sufficient if they allowed setting this code page as the output code page programmatically; every app could then work properly with one added line of code.
But that little tidbit doesn't seem to exist, rendering the entire UTF-8 thing totally useless. Why use it if the existing way of converting to the W API is more robust and more portable?
You can bet that this will end up a failure if the only way to set it is globally. Why do they even have manifests on Windows if they cannot be used to tell the system "Hey, I am aware of this, let me in"?

In my opinion what they should do is abandon the old console subsystem completely. Then add a manifest rule to normal Windows subsystem executables that makes stdin/stdout go to a console if launched from one. While they are at it they could ditch WinMain as the entry point and change the process ANSI character set to UTF-8 as well. The codepage is UTF-8 for the new stdin/stdout/stderr streams. Voila! All new applications have a first class console.

The console window process itself could then support processes from the old console subsystem by doing character conversion to/from the old OEM character set if things are piped between the two worlds. This way all the promises in the old console subsystem are kept for old executables, while all new software sees it as a clean UTF-8 world.

Fat chance of this happening.

Re: No carriage returns in WSL console

Postby Gez » Sat Mar 16, 2019 7:21 pm

Graf Zahl wrote:Visual FoxPro.

Now that's a blast from the past!

Re: No carriage returns in WSL console

Postby Graf Zahl » Sun Mar 17, 2019 2:37 am

dpJudas wrote:In my opinion what they should do is abandon the old console subsystem completely. Then add a manifest rule to normal Windows subsystem executables that makes stdin/stdout go to a console if launched from one.


That part is not really the problem. Normally it is the calling process setting these things up. So surely, if you start an app from, say, bash, it will properly set everything up and receive the redirected output.

dpJudas wrote:While they are at it they could ditch WinMain as the entry point and change the process ANSI character set to UTF-8 as well. The codepage is UTF-8 for the new stdin/stdout/stderr streams. Voila! All new applications have a first class console.


My guess is that they consider ANSI apps deprecated anyway. But obviously they should do something to make the ANSI API useful to Unicode apps - and for me that doesn't mean setting UTF-8 as a system-global code page but allowing each app to change its ANSI code page to UTF-8 itself if so desired. If they did it that way, all the problems would resolve themselves without any compatibility concerns whatsoever. The ANSI code pages could remain as a fallback for legacy software, and compiler support for them could then slowly be phased out over time. Any app doing this would automatically change the console's code page to UTF-8 along with it, and suddenly we'd be in a state where the only remaining issue is old legacy software that cannot be fixed retroactively. But this software is an issue no matter what, so nothing would be lost.

But apparently that's a bit too simple for people who need to think bigger.



Gez wrote:
Graf Zahl wrote:Visual FoxPro.

Now that's a blast from the past!


Heh, yes. You know how it goes: If it ain't broke, don't fix it! The software we are talking about here has been in development for nearly 20 years.
But the term "ticking time bomb" was actually mentioned...

Re: No carriage returns in WSL console

Postby dpJudas » Sun Mar 17, 2019 3:59 am

Graf Zahl wrote:That part is not really the problem. Normally it is the calling process setting these things up. So surely, if you start an app from, say, bash, it will properly set everything up and receive the redirected output.

I mostly wanted it controlled by the manifest to avoid whatever compatibility issue might arise from some garbage program that actually writes to stdout without expecting it to go anywhere. The important thing here is letting go of the old idiotic design dictated by the old console subsystem. There's no point in having a different subsystem for console apps at all.

Graf Zahl wrote: My guess is that they consider ANSI apps deprecated anyway.

I actually meant exactly what you are saying here - that the manifest will dictate that the process' ANSI character set is UTF-8. It would immediately make anything using the "A" family of Win32 functions behave exactly like Linux. fopen and everything else would Just Work. I prefer the manifest over a SetACP function as it would allow them to fix the pointless WinMain construct while they are at it.

Graf Zahl wrote: But apparently that's a bit too simple for people who need to think bigger.

And this is why we got such "beauties" as PowerShell the last time they decided to improve the console. Oh and the first time, too, when they decided to make it an NT subsystem. That was only really done because it was cool that the NT kernel could do such a thing. :roll:

Re: No carriage returns in WSL console

Postby Graf Zahl » Sun Mar 17, 2019 5:51 am

dpJudas wrote:fopen and everything else would Just Work


If they didn't consider fopen and any other kind of synchronous file I/O deprecated. Try using this in UWP apps and you'll be surprised how nasty things will get with timeouts and similar shit.
Trying to make GZDoom a UWP app would be doomed to fail because it'd constantly run afoul of responsiveness tests run by the system.

But that's not really that much different from other "modern" OSs, UWP is just the worst in a shitty bunch. They all try to steer the developers away from the tried and true (and working) solutions of the past.

In hindsight it's really too bad that Windows had Unicode support this early, before all the kinks could be ironed out. Now we will be saddled with that baggage for all eternity.
And still, none of this is such a disaster as the C locale system which seems to have been designed to break software, so you do not need Microsoft to screw things up beyond repair.

Re: No carriage returns in WSL console

Postby dpJudas » Sun Mar 17, 2019 6:04 am

When it comes to UWP, I am deliberately not writing any such apps. It could very well be that Microsoft is still trying to push it, but I will do my part making sure it will not replace Win32. As for asynchronous I/O, I find this to be a solution causing more problems than it solves. Same thing with system thread pools for that matter. They can try to push it all they want, but that won't be enough to get me to use it. :)

When it comes to Windows and Unicode, I find it works perfectly fine converting between UTF-8 and UTF-16 for everything except the console. Yes it would be nice if the ANSI functions in the API could be changed to accept UTF-8 directly, but I'll survive without it. As for the console, eventually they will do something as their Azure stuff means they really could use a better console themselves.

Re: No carriage returns in WSL console

Postby Graf Zahl » Sun Mar 17, 2019 6:25 am

dpJudas wrote:When it comes to UWP, I am deliberately not writing any such apps. It could very well be that Microsoft is still trying to push it, but I will do my part making sure it will not replace Win32.


Seeing that UWP is a system made for toy apps it will never replace Win32. Even on macOS all the sandboxing rules basically mean that some real productivity apps cannot be distributed via the app store, and here it's even worse because the API is so crippled.
I am sure that UWP will be popular for the kind of shit stuff aimed at unsuspecting customers but the entire system basically ensures that everything with some cross-platform design or things targeted at power users would be too confined by the strict rules and limitations. And if that kind of shit software now gets made for UWP, nobody will really miss it.
The only contact I had with it was some years ago when I had to port a game for the Windows Store.

dpJudas wrote: As for asynchronous I/O, I find this to be a solution causing more problems than it solves. Same thing with system thread pools for that matter. They can try to push it all they want, but that won't be enough to get me to use it. :)


What's the point of an asynchronous API anyway? The iOS app I am working on is doing a lot of asynchronous I/O - but that is being done by launching a worker thread that controls the entire thing from top to bottom without such nonsense as having to wait for futures for single operations, etc. Since it is a single continuous thread it's easy to debug, unlike the asynchronous API, and the end result is the same - I/O runs in the background without stalling the app - once it is done I call a completion function to wrap it all up, instead of some insane construct where each I/O call results in a completion callback starting the next slice, and so on, and so on...
The web access in the app needs to be done with an asynchronous API on the other hand - and calling that code "messy" would be an understatement. I'd rather do it the same way as the file I/O: Fire off a worker thread that synchronously executes the request so that I can keep full control over it instead of having to rely on a badly designed callback system. But no such luck here...
About system thread pools, that's something hard to avoid on iOS, though. The entire system is built on this concept. But on Windows the entire system for the thread pools is a medium-sized disaster where the boilerplate to get things running exceeds all sane conventions, so why use it...?
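
The worker-thread pattern described above, one background thread that runs the whole job with ordinary blocking calls and reports once at the end, can be sketched as follows. This is illustrative Python, not code from the app in question: `load_all`, `read_file` and `finished` are hypothetical names, and the lambda stands in for a real blocking read.

```python
import threading


def load_all(paths, read_file, on_done):
    """Run the whole job on one worker thread, then report once."""
    def worker():
        # Plain blocking calls, executed in order on a single thread.
        results = [read_file(p) for p in paths]
        # One completion call for the whole job (runs on the worker thread).
        on_done(results)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


done = threading.Event()
collected = []


def finished(results):
    collected.extend(results)
    done.set()


handle = load_all(["a", "b", "c"], lambda p: p.upper(), finished)
done.wait()     # the caller stays free to do other work until this point
handle.join()
print(collected)
```

Because the job is one continuous function on one thread, a debugger can step through it from top to bottom, which is the advantage claimed over per-operation callbacks.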


dpJudas wrote:When it comes to Windows and Unicode, I find it works perfectly fine converting between UTF-8 and UTF-16 for everything except the console. Yes it would be nice if the ANSI functions in the API could be changed to accept UTF-8 directly, but I'll survive without it. As for the console, eventually they will do something as their Azure stuff means they really could use a better console themselves.


I think they are already working on something here, but as that WSL issue suggests, it's not quite finished.

Re: No carriage returns in WSL console

Postby dpJudas » Sun Mar 17, 2019 7:33 am

Graf Zahl wrote:What's the point of an asynchronous API anyway?

Theoretically it is more efficient as a single thread can do more things without needing to do a context switch. Thread pools are the same thing, except here thread management is added into the mix.
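
As a rough illustration of that claim, the following Python sketch runs three overlapping waits on a single thread using `asyncio`. The function `fake_request` and its delays are invented for the example; `asyncio.sleep` stands in for real network or disk waits.

```python
import asyncio
import threading


async def fake_request(name, delay, seen_threads):
    """Pretend to wait on I/O, recording which thread runs the code."""
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(delay)   # yields to the event loop instead of blocking
    return name


async def main():
    seen_threads = set()
    results = await asyncio.gather(
        fake_request("a", 0.02, seen_threads),
        fake_request("b", 0.01, seen_threads),
        fake_request("c", 0.03, seen_threads),
    )
    return results, seen_threads


results, seen_threads = asyncio.run(main())
print(results, len(seen_threads))
```

All three waits overlap, yet only one thread identifier is recorded, so no thread switches are needed between the requests. Whether that saving matters for a given application is the trade-off questioned in the next paragraph.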

In both cases I think a generic solution is an awful trade off for an inefficiency that only really applies to a very limited set of applications. But it sounds fancy and makes some developers feel extra clever, so it got promoted to the OS and language level. Even in the cases where the extra performance is truly needed, in most situations it would be better to just buy an extra server a little earlier than you'd otherwise have to.

The iOS thread pool may be a little nicer to work with, but ultimately it still builds on this fire-and-forget illusion that haunts any generalized thread pool solution. I also use it there because the platform is built around it. And my applications will crash hard if you exit while one of those async jobs is running. Just like I'm sure half of Apple's own shit will do. :)

Re: No carriage returns in WSL console

Postby Graf Zahl » Sun Mar 17, 2019 8:04 am

dpJudas wrote: And my applications will crash hard if you exit while one of those async jobs is running. Just like I'm sure half of Apple's own shit will do. :)



Tell me about it. In the app I inherited at work there were spurious crashes every now and then, and when I analyzed them it all came down to a fatal design pattern where it was assumed that background threads do not need to be synchronized and monitored. When I found that out, the first reaction from my superiors was that adding synchronization would negate all the benefits of asynchronous execution, and they asked if there was another way. It turned out there wasn't, and since the code depended on some non-shareable state, adding some mutexes and short waits along with them was the only solution.
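
The kind of fix described, a mutex around state that several background threads touch, looks roughly like this. It is a generic Python sketch with a made-up `Counter` class, not the code from the app mentioned above.

```python
import threading


class Counter:
    """Shared state that is only safe to update under its lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, n):
        # Read-modify-write must not interleave with another thread.
        with self._lock:
            current = self.value
            self.value = current + n


def run(counter, times):
    for _ in range(times):
        counter.add(1)


counter = Counter()
threads = [threading.Thread(target=run, args=(counter, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # monitor the workers instead of assuming they finished

print(counter.value)
```

The lock does serialize the updates, which is the cost the superiors worried about, but it only covers the few lines that touch the shared value; the rest of each thread's work still runs in parallel.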

But in the end, adding this asynchronous stuff to the operating systems and language runtimes will be inevitable because it's the only way to make software run faster with the CPU clock speeds hitting a concrete wall. But then please do it in a way that is less prone to poor usage which seems to be the norm with Apple's approach where you fire off something and do not even get a handle to the task that just got started. Whoever designed this must have been a complete moron. I couldn't have done the multithreading in GZDoom's hardware renderer with such an implementation because in the end the main thread needs to wait for the worker to finish.
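
For contrast with fire-and-forget, here is a short Python sketch of an API that does hand back a handle for each started task, so the main thread can wait for the result. `render_slice` is a placeholder for real work and has nothing to do with GZDoom's actual renderer code.

```python
from concurrent.futures import ThreadPoolExecutor


def render_slice(index):
    """Placeholder for a unit of background work."""
    return index * index


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future: a handle to the task that was just started.
    handles = [pool.submit(render_slice, i) for i in range(4)]

    # The main thread blocks here until each worker result is available.
    results = [h.result() for h in handles]

print(results)
```

With a handle in hand, waiting, error propagation and orderly shutdown are all possible, which is what a start-and-hope interface leaves out.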