"gzdoom" received signal SIGFPE, Arithmetic exception

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Tue Oct 15, 2019 2:01 pm

The problem here is that they dared to print this piece of nonsense in the first place. It's ok trying to lock memory but considering it warning worthy if it can't be done is another story.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Chris » Tue Oct 15, 2019 1:45 pm

_mental_ wrote:
mwnn wrote:The fluidsynth warnings which occasionally appear in the terminal i.e:

fluidsynth: warning: Failed to pin the sample data to RAM; swapping is possible.
Look like nonsense - how can that be an issue with 8GB of RAM + 2GB swapfile?
It’s normal to get this warning on Linux. Locking memory pages to RAM requires special privileges that executables don’t have by default.
Individual processes can also be limited in the amount of memory they can lock to RAM, to prevent a single app from hogging too much.
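The per-process limit mentioned above can be inspected directly. The following is a minimal sketch, assuming a Linux system where Python's `resource` module exposes `RLIMIT_MEMLOCK`; the helper names `format_limit` and `memlock_limits` are made up for illustration. It reports the same limit that `ulimit -l` shows in a shell. If the soundfont is larger than the soft limit, an unprivileged process cannot pin it, no matter how much physical RAM is installed.

```python
import resource


def format_limit(value: int) -> str:
    """Render an rlimit value given in bytes as a short readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"soft: {format_limit(soft)}, hard: {format_limit(hard)}")
```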

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by _mental_ » Tue Oct 15, 2019 1:01 pm

mwnn wrote:The fluidsynth warnings which occasionally appear in the terminal i.e:

fluidsynth: warning: Failed to pin the sample data to RAM; swapping is possible.
Look like nonsense - how can that be an issue with 8GB of RAM + 2GB swapfile?
It’s normal to get this warning on Linux. Locking memory pages to RAM requires special privileges that executables don’t have by default.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by mwnn » Tue Oct 15, 2019 11:44 am

Graf Zahl wrote:I hope it's fixed with my recent changes. Waiting for confirmation.
So I've built a debug build of the zmusic_interface branch.
Copied the soundfonts to .config/gzdoom/soundfonts and run the game.
Set Fatboy as the active soundfont and let it run the main menu loop for about 5 minutes without hitting a key and it didn't blow up on me.
I'll continue to try that a few more times.
I think that's 3/3 solved.

The fluidsynth warnings which occasionally appear in the terminal i.e:

fluidsynth: warning: Failed to pin the sample data to RAM; swapping is possible.
Look like nonsense - how can that be an issue with 8GB of RAM + 2GB swapfile?

EDIT: Not very exciting viewing but here you go:
https://drive.google.com/file/d/1LDuF_c ... sp=sharing
Skip to 4:40 to see the settings in use.
Before these changes it was lucky to last more than a minute or two.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Tue Oct 15, 2019 10:57 am

I hope it's fixed with my recent changes. Waiting for confirmation.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Tue Oct 15, 2019 5:11 am

Correct, but in this case the client side pointer needs to be nulled before taking down the object.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by _mental_ » Tue Oct 15, 2019 4:26 am

ZMusic_Close() tries to delete a locked mutex. I think there is no need to lock here as long as the song is stopped.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Tue Oct 15, 2019 2:53 am

I just committed a refactor of the music playback interface and used the new top-level functions to add proper synchronization. It turned out this was never done properly: there were various mutexes deeper in the music code, but they didn't cover everything that was needed. It also turned out to be better for all interface functions to lock the mutex, because with the exception of two simple getters they can all change state, and that must not overlap with producing output data.

This needs a bit of testing, so far it's in a work branch.
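The idea of locking at the top-level interface can be sketched as follows. This is a language-neutral illustration in Python, not the actual C++ code from the work branch; every name in it (`synchronized`, `music_start`, `music_stop`, `music_fill_buffer`, `music_get_volume`) is hypothetical. Leaving the getter unlocked is a choice made for this sketch; the post does not say how the two getters are handled.

```python
import functools
import threading

# One lock for the whole outward-facing interface (hypothetical names throughout).
# RLock is re-entrant, so one wrapper may safely call another.
_music_lock = threading.RLock()
_state = {"playing": False, "volume": 1.0}


def synchronized(func):
    """Run func while holding the interface lock."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _music_lock:
            return func(*args, **kwargs)
    return wrapper


@synchronized
def music_start() -> None:
    _state["playing"] = True


@synchronized
def music_stop() -> None:
    _state["playing"] = False


@synchronized
def music_fill_buffer(frames: int) -> bytes:
    """Called by the streaming thread; cannot overlap start or stop."""
    if not _state["playing"]:
        return bytes(frames)      # silence
    return b"\x01" * frames       # stand-in for rendered audio


def music_get_volume() -> float:
    """A simple getter that changes no state, left unlocked in this sketch."""
    return _state["volume"]
```

Because the lock lives in the wrappers rather than in the worker code, every state change and every buffer fill goes through the same single point of synchronization.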

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Mon Oct 14, 2019 11:21 am

Well, in that case the only solution is to use a mutex to synchronize Stop() with ServiceStream() because obviously, Stop() may not be called when the device is busy doing stuff.
But I think this needs to be done outside the worker functions, it may also cause problems with non-MIDI music.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by _mental_ » Mon Oct 14, 2019 8:17 am

Graf Zahl wrote:But looking at the code, where does it crash? It looks to me that at some point the MIDI streamer fails to validate its internal state. If its internal player state got deleted it should play silence instead.
When the title music stops, the main thread deletes the MIDI device when Stop() is called from IsPlaying().
The corresponding stream inside the backend is still registered. The background streaming thread is still processing it regularly.
Spoiler: Callstack of main thread
Spoiler: Callstack of streaming thread
You can comment out this 100 ms wait to trigger the crash much faster.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Mon Oct 14, 2019 7:36 am

Yes, right. Obviously, IsPlaying may not do this thing - it's nasty. The function may stop the playback, but it must not, under any circumstances, delete it.
It should merely return the current status and let the caller perform the shutdown instead.
I must have overlooked this particular piece of nastiness; it was things like this that made me do the refactor in the first place. Thanks to such internal antics the code was basically not reusable.

But looking at the code, where does it crash? It looks to me that at some point the MIDI streamer fails to validate its internal state. If its internal player state got deleted it should play silence instead.
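The split described here, a status query that changes nothing plus a caller that performs the shutdown, might look like the sketch below. It is an illustration in Python with hypothetical names (`Song`, `advance`, `update_music`), not the GZDoom implementation.

```python
class Song:
    """Hypothetical song object; the names here are not GZDoom's."""

    def __init__(self, length: int):
        self._remaining = length
        self.closed = False

    def advance(self, frames: int) -> None:
        """Consume some of the song, as playback would."""
        self._remaining = max(0, self._remaining - frames)

    def is_playing(self) -> bool:
        """Pure query: reports the status and changes nothing."""
        return not self.closed and self._remaining > 0

    def close(self) -> None:
        self.closed = True


def update_music(song):
    """Caller-side polling: the caller, not is_playing(), performs the shutdown."""
    if song is not None and not song.is_playing():
        song.close()
        return None
    return song
```

With this shape, the only code path that tears the song down is one the client explicitly runs, so the client always knows when the object is gone.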

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by _mental_ » Mon Oct 14, 2019 7:05 am

Graf Zahl wrote:I think the best option would be a mutex on the outward facing interface of the music code - of course this means that the interface should not involve any direct virtual calls but go through wrapper functions instead. Otherwise the synchronization needs to be in too many places.
Probably, I didn't explain the problem well. Like I said, it's quite hard to grasp it without a debugger.
Everything is fine with synchronization in the backend and client code.

The main issue is with responsibility. MIDIStreamer can shut down the MIDI device by itself, and client code can do this too via the S_StopMusic() function.
In the former case, the client has no indication that the device is gone and that the stream should be closed. Actually, the stream needs to be closed before the device's destruction.
That's what I meant when blaming MIDIStreamer::IsPlaying() as a source of race condition.
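The ordering requirement (close the stream before the device is destroyed) can be sketched as follows. This is a simplified Python model with hypothetical classes `Backend` and `MidiSong`; it assumes the backend services streams while holding the same lock that guards registration, which may not match the real backend.

```python
import threading


class Backend:
    """Hypothetical audio backend that services every registered stream."""

    def __init__(self):
        self._lock = threading.Lock()
        self._streams = set()

    def register(self, stream) -> None:
        with self._lock:
            self._streams.add(stream)

    def unregister(self, stream) -> None:
        # Once this returns, no fill() call on the stream is in progress,
        # because service_all() holds the same lock while it calls fill().
        with self._lock:
            self._streams.discard(stream)

    def service_all(self, frames: int) -> list:
        """Stand-in for the streaming thread's periodic work."""
        with self._lock:
            return [s.fill(frames) for s in self._streams]


class MidiSong:
    """Hypothetical song that owns a device and a backend stream."""

    def __init__(self, backend: Backend):
        self._backend = backend
        self._device_open = True
        backend.register(self)

    def fill(self, frames: int) -> bytes:
        assert self._device_open, "stream serviced after device teardown"
        return b"\x01" * frames

    def shutdown(self) -> None:
        # Order matters: detach the stream first, then take the device down.
        self._backend.unregister(self)
        self._device_open = False
```

If the two lines in `shutdown()` were swapped, the streaming thread could call `fill()` on a song whose device is already gone, which is the situation described in the post.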

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by mwnn » Mon Oct 14, 2019 4:34 am

_mental_ wrote:Despite a completely misleading title of the topic, it reports a real problem
You lot are always telling me off! :lol:
The topic title made perfect sense when I wrote the topic.
You've solved the serious crashing in the previous (closed) topic with the exit_cleanup changes.
The SIGFPE Arithmetic exception which occurs after exiting the game (topic title) was solved with the SDL error + Linux crash handler changes.

The fluidsynth issue only seems to manifest itself with the soundfont.
You did ask for the fluidsynth version I had installed and it's the final release of fluidsynth 1.xx.
We've just carried on and have not changed the topic title or made a new topic.

If the fluidsynth related bugs are solved then gzdoom 4.2.2 / 4.3 (?) will be as robust as a German panzer.
I was able to play through the entire first episode of Heretic yesterday using Vulkan without too much trouble so I must be finding fringe issues - the sort of bug that affects 1 in 100 or triggers under certain rare conditions.
I've noticed you've already added changes for fluidsynth 2.xx.
I'm confident that you'll come up with a solution in any case.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by Graf Zahl » Mon Oct 14, 2019 4:12 am

I think the best option would be a mutex on the outward facing interface of the music code - of course this means that the interface should not involve any direct virtual calls but go through wrapper functions instead. Otherwise the synchronization needs to be in too many places.

I cannot really say this surprises me because the entire self-driving design in the old code was like a house of cards.

Re: "gzdoom" received signal SIGFPE, Arithmetic exception

by _mental_ » Mon Oct 14, 2019 3:54 am

Despite a completely misleading title of the topic, it reports a real problem.

There is a race condition between the main and streaming threads.
The MIDI device can be in the process of destruction on the main thread while the streaming thread asks this MIDI device to fill a buffer with new data.
It's hard to understand the issue without seeing all the related code in the debugger.

Previously, it worked because the stream was closed by the MIDI device. Now, it is closed by the backend without any synchronization with the device.
So, calling MIDIStreamer::Stop() from MIDIStreamer::IsPlaying() is no longer correct.
Moreover, any call to MusInfo::Stop() with its stream still registered in the backend is potentially dangerous.

Removal of these calls from MIDIStreamer::IsPlaying() will fix this particular crash, but this doesn't fix the problem in general.
