Hi. This is an ambitious piece. Readability is an issue, yes. You've already made great progress. Here are some more ideas.
My edit is dirty as all hell (used Photoshop to alter colours and make the three levels separate, then I output a 64-colour composite and fucked around in Pro Motion with every paint modifier there is: blend, darkness/lightness, smooth, whatever). It's just to give you a mockup of what you could be heading towards if you want.
The tools I'm using here to achieve readability are the following:
1. More colors. I do not believe what you're trying to do here can be accomplished with so few colours, unless you're willing to sacrifice detail and simplify the three 'zones' of the perspective to the point where they're more symbolic. I'll explain both points of view here. If you want this to be complex and realistic (to a degree) you need to look at what demoscene artists do. The first thing a demoscene artist would do is steal from an established real-world non-digital painter. That's bad, don't do that. But the second thing they'd do is establish a large palette with which they can convey what the real-world, non-digital painter did. The real-world artist probably knows about composition and depth and stuff like that, so it trickles down through the demoscene even through their bad practices. This is why you should study the demoscene regardless of their problems, which you can avoid with honesty and hard work.
So it's either a big palette that gives you a lot of options, or a small palette and very simple shapes and constructs. For example, where there are figures, the brick pattern on the walls should not touch their lineart. That's a very simple example. Or another example: characters could use straight-up black outlines to be readable even with NES colors or whatever.
You can go either way, but the middle ground is more uncomfortable, and it requires you to be more crafty to achieve a balance.
2. Fewer focal points. I'm no master of composition but even I know that what you've got going right now is too dense and waaaay the hell too bright, to the point that it hurts the eyes. If you want flashing lights, give us some darkness. There's a question of mood here: you're trying to convey one mood and my edit conveys a different, darker one, but I thought 'dungeonify a kiddie cartoon' calls for some darkness. Ominous clouds and a castle in the distance instead of caramel clouds and a blue castle in the distance. However, with more colours and smart outlining and simplifying of shapes here and there I don't think you necessarily need the black far-ground in the distance like I chose. But it's a trick that works, no?
3. Less sharpness on less important, distant or otherwise disconnected points of interest. Pixel art is a hard medium in which to convey blurry flashes of light, but you can go SOME of the way in that direction without horrible banding (like the banding my use of the blend brush introduced in my mockup). If you set your mind to it, good pixel technique can get you somewhere in the ballpark. I also toned down the digital dungeon master screen in the distance, and I think it sorta works. You can do without that, but you'd need more colour separation between the DM_screen and blue_dragon layers.
4. Warming the colors in the foreground, making them colder in the distance. Not much to say about this, it's a very common trick. Also pushing saturation on the digital aspects of the playfield (the green vectorish grid, most of all. It's not the future unless it's saturated).
5. Fancy FX to signify the digitalness of this setup, which you should absolutely do if it helps with readability as well. Mostly around the DM_screen.
6. Dramatic lighting. Well, it's cheap. But you've got spotlights on your crew after all! I did this last, so I know the edit works without the contour lighting as well, but it looks much better like this in my opinion.
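If it helps to see point 4 as numbers rather than words, here's a rough Python sketch of the warm-near/cold-far idea plus the saturation push. Everything in it is my own assumption for illustration (the tint values, the `strength` and `factor` defaults, the function names, the example green), and it is not what I actually did in the mockup, which was all done by hand. Treat it as a way to generate starting points for palette ramps, not finished colours.

```python
import colorsys

# Hypothetical tint targets, picked by eye; adjust to taste.
WARM_TINT = (255, 170, 110)  # pulled towards orange for the foreground
COOL_TINT = (110, 160, 255)  # pulled towards blue for the distance


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def tint_by_depth(rgb, depth, strength=0.25):
    """Nudge a colour warm (depth=0.0, foreground) or cold (depth=1.0, far away).

    `strength` is how far the colour is blended towards the tint.
    """
    depth = min(max(depth, 0.0), 1.0)
    tint = tuple(lerp(w, c, depth) for w, c in zip(WARM_TINT, COOL_TINT))
    return tuple(int(round(lerp(ch, t, strength))) for ch, t in zip(rgb, tint))


def push_saturation(rgb, factor=1.5):
    """Multiply HSV saturation by `factor`, clamped to 1.0."""
    r, g, b = (ch / 255.0 for ch in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, s * factor)
    return tuple(int(round(ch * 255)) for ch in colorsys.hsv_to_rgb(h, s, v))


if __name__ == "__main__":
    grey = (128, 128, 128)
    for depth in (0.0, 0.5, 1.0):
        print(depth, tint_by_depth(grey, depth))

    grid_green = (60, 180, 90)  # made-up stand-in for the vector grid colour
    print(push_saturation(grid_green))
```

Run the same base colour through `tint_by_depth` at a few depths and you get a near/mid/far version of it; you'd still want to clean the results up by hand so they sit well in the palette.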