To ensure 100% compatibility, you'd have to include a copy of the map in the demo (to avoid running the wrong map), or at least a checksum that ensures the proper map is loaded (and then it would "break" if the map wasn't present...)
This happens already. A checksum of loaded wads at record time would obviously be the ideal choice.
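To make the checksum idea concrete, here's a rough sketch of how a recorder might fingerprint the loaded wads and how playback could refuse a mismatch. This is not ZDoom code: wad_set_checksum and can_play_demo are made-up names, SHA-256 is just a convenient choice, and each wad is assumed to be available as a (name, bytes) pair in load order.

```python
import hashlib

def wad_set_checksum(wads):
    """Hash every loaded wad, in load order, into one hex digest.

    `wads` is a list of (name, data) pairs; this layout is an assumption
    made for the sketch, not how any real engine stores its wad list.
    """
    h = hashlib.sha256()
    for name, data in wads:
        encoded = name.lower().encode("utf-8")
        # Length prefixes keep the name/data boundaries unambiguous.
        h.update(len(encoded).to_bytes(4, "little"))
        h.update(encoded)
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()

def can_play_demo(recorded_checksum, loaded_wads):
    """True only if the currently loaded wads match the recorded set."""
    return wad_set_checksum(loaded_wads) == recorded_checksum
```

Because the hash is fed in load order, the same wads loaded in a different order also count as a mismatch, which seems like the safer default since load order changes which lumps win.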
then you'd have to include all the actual physics.
Not really. You could just keep track of everything that moves between tics, gets created/removed, etc. It'd basically act like a very precise script of actions, leaving as little to chance (read: engine behaviour) as possible.
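As a rough illustration of that "script of actions" idea, the sketch below diffs two per-tic snapshots and replays the result. It assumes the world can be reduced to a dict mapping an object id to a position tuple, which is a big simplification, and diff_tic and apply_tic are hypothetical names, not anything from the ZDoom source.

```python
def diff_tic(prev, curr):
    """Record what changed between two tics.

    `prev` and `curr` map an object id to its state (here an (x, y, z)
    tuple). The snapshot format is an assumption made for this sketch.
    """
    created = {k: curr[k] for k in curr if k not in prev}
    removed = [k for k in prev if k not in curr]
    moved = {k: curr[k] for k in curr if k in prev and curr[k] != prev[k]}
    return {"created": created, "removed": removed, "moved": moved}

def apply_tic(state, delta):
    """Replay one recorded tic on top of `state` without running physics."""
    new_state = dict(state)
    for k in delta["removed"]:
        del new_state[k]
    new_state.update(delta["created"])
    new_state.update(delta["moved"])
    return new_state
```

Playback would then just call apply_tic once per tic with the stored delta, so the result no longer depends on how the engine would have moved things itself. The price is that every tic stores a delta instead of a few bytes of player input.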
Things activating, speed of movement of floors/ceilings/polyobjs
All of which could lead to a break in compatibility should something drastic change in the code. That's what I'm proposing this for: to prevent breakage, obviously at a cost (bigger DEMO files).
Is this really worth it?
That's ultimately up to Randy and how he'd want to implement such a change. Given enough knowledge of the ZDoom source, it would be something I would at least look into.