update document on Windows ME. The INT 33h disable and fix to mouse
movement code resolved one of the BSODs that can happen during the setup
phase. It turns out the other crash, related to the dynamic core, can be
overcome if you're patient and willing to run the setup phase with
core=normal (it takes a LONG time though!).

Added document on what DOSBox did wrong and why it affected Windows NT
3.1 and Windows ME the way it did.
Jonathan Campbell 2014-04-17 10:58:07 -07:00
parent e832ae9b31
commit 9f65f164b4
4 changed files with 277 additions and 7 deletions

@ -0,0 +1,95 @@
Assumptions made in DOSBox that caused crashes in Windows NT and Windows ME.

You may remember on the forums or from the build log my complaint that in
Windows NT 3.1, moving the mouse caused an immediate system crash.

Well, one day, I noticed that at the BOOTLDR screen, the protected mode
memory manager would also report a page fault on mouse movement. Even if
BIOS PS/2 emulation was turned off. Even if the AUX port was turned off.

At some point a value that showed up in the page fault dump jumped out
at me: Linear address 0x449 AKA the "current video mode" value of the BIOS
data area.

Now what did that have to do with the mouse?

It turns out that when DOSBox processes mouse movement from the SDL library,
it calls INT10_SetCurMode() to ensure that the CurMode pointer reflects the
current (BIOS) video mode so that INT 33h emulation can properly map the
mouse cursor. One of the things that function does is read the BIOS data
area using real_readb(). Now, real_readb(), like most memory I/O functions
in DOSBox, is beholden to the emulated page tables, so if the page is NOT
PRESENT, that action causes a page fault instead! This is why moving the
mouse, even when AUX emulation and INT 15h mouse emulation were disabled,
caused Windows NT to crash!

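To make the failure mode concrete, here is a toy C++ model of a byte read that is routed through emulated page tables. It is a sketch, not DOSBox's code: ToyPaging, GuestPageFault, toy_readb and toy_real_readb are hypothetical names, and the model assumes an identity mapping (linear == physical) with 4KB pages. The only point is that a host-side read of the BIOS data area turns into a guest page fault once the guest marks that page not present.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the guest's paging state (not DOSBox's real structures).
struct ToyPaging {
    bool enabled = false;                  // guest has turned paging on
    std::unordered_set<uint32_t> present;  // page numbers (linear >> 12) that are mapped
};

// Thrown where a real emulator would raise a page fault in the guest.
struct GuestPageFault : std::runtime_error {
    uint32_t linear;
    explicit GuestPageFault(uint32_t lin)
        : std::runtime_error("guest page fault"), linear(lin) {}
};

// Toy paged read: consult the page tables first, then touch memory.
// Assumes an identity mapping, so the linear address indexes RAM directly.
uint8_t toy_readb(const ToyPaging& pg, const std::vector<uint8_t>& ram, uint32_t linear) {
    if (pg.enabled && pg.present.count(linear >> 12) == 0) {
        throw GuestPageFault(linear);      // page NOT PRESENT
    }
    assert(linear < ram.size());
    return ram[linear];
}

// Real-mode style segment:offset read, mirroring how real_readb(seg, off) is used above.
uint8_t toy_real_readb(const ToyPaging& pg, const std::vector<uint8_t>& ram,
                       uint16_t seg, uint16_t off) {
    return toy_readb(pg, ram, (static_cast<uint32_t>(seg) << 4) + off);
}
```

With paging off, toy_real_readb(pg, ram, 0x40, 0x49) just returns the byte at 0x449. With paging on and page 0 missing from the present set, the same call throws, which is the toy equivalent of the crash described above.
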
So it was at this point that I made several key modifications to DOSBox.
When booting a guest OS, DOSBox now disables INT 33h emulation entirely
(INT 33h is an API provided by DOS or DOS TSRs and is never expected to be
there at boot time). Then, I modified the mouse movement handler to call
INT10_SetCurMode() if and only if INT 33h is being emulated.

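The shape of that change can be sketched in a few lines of C++. This is an illustration under assumptions, not the actual patch: ToyMouse, toy_on_guest_boot and toy_on_mouse_move are hypothetical names, and a counter stands in for the real call to INT10_SetCurMode().

```cpp
#include <cassert>

// Hypothetical mouse state; the real DOSBox structures differ.
struct ToyMouse {
    bool int33_enabled = true;    // INT 33h emulation is active (plain DOS session)
    int  cur_mode_refreshes = 0;  // counts stand-in calls to INT10_SetCurMode()
    int  x = 0, y = 0;            // movement accumulated for the guest
};

// Hypothetical hook run when a guest OS is booted from a disk image.
void toy_on_guest_boot(ToyMouse& m) {
    m.int33_enabled = false;      // INT 33h is never expected to exist at boot time
}

// Host mouse movement: only touch BIOS video mode state when INT 33h needs it.
void toy_on_mouse_move(ToyMouse& m, int dx, int dy) {
    if (m.int33_enabled) {
        ++m.cur_mode_refreshes;   // where the real handler would call INT10_SetCurMode()
    }
    m.x += dx;                    // movement itself is still delivered either way
    m.y += dy;
}

// Small usage example: no BIOS data area access happens after a guest boot.
void toy_demo_mouse_guard() {
    ToyMouse m;
    toy_on_mouse_move(m, 1, 1);
    assert(m.cur_mode_refreshes == 1);
    toy_on_guest_boot(m);
    toy_on_mouse_move(m, 1, 1);
    assert(m.cur_mode_refreshes == 1);
}
```

The design point is that the guard sits in front of the memory read, so a booted guest with restrictive page tables never sees an access it did not ask for.
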
Having made these modifications, Windows NT is now able to boot into its
graphical desktop without crashing, and the mouse cursor is usable.

Apparently this fix also resolved one of the major BSOD crashes at setup time
in Windows ME. The initial System Configuration setup phase no longer BSODs
with a hung mouse cursor (but it still crashes DOSBox at some point when the
dynamic core issues a floating point instruction that causes a page fault).

To explain why Windows NT and ME had this problem while DOSBox was able to
get away with doing this in Windows 3.1, 95, and 98, you need to understand
how these Windows environments mapped the page tables, which I learned from
testing that region of memory with my experimental DOSLIB and DOSLIB2 code
on Hackipedia.org.

Linear address mapping (first 64KB of virtual memory) and how it's mapped:

Environment          | 0x0000-0xFFFF are mapped as | Comments
---------------------+-----------------------------+----------------------
Windows 3.0          | Read/write                  |
                     |                             |
Windows 3.1          | Read/write                  | Win32s 32-bit subsystem catches NULL pointer errors
                     |                             | regardless by creating 32-bit code & data segment
                     |                             | with base=0xFFFF0000 and limit=0xFFFFFFFF in other
                     |                             | words shifting virtual vs. linear up by 64KB to
                     |                             | help cover it up.
                     |                             |
Windows 95           | Read/write                  | Apparently Win95 removed the 64KB offset for Win32
Windows 95 SP1       |                             | programs. Both Win16 and Win32 can address VMA 0
Windows 95 OSR1      |                             | as valid memory! In other words, NULL is a valid
                     |                             | pointer in Win32, and it's even writeable! Yikes!
                     |                             |
Windows 95 OSR2      | Read only                   | Finally, writing to NULL causes an exception as it
Windows 98           |                             | should in Win32 land. But reading through a NULL
Windows 98 SE        |                             | pointer still works.
                     |                             |
Windows ME           | Not present                 | Finally, right at the end of the 9x line, Microsoft
                     |                             | got around to making sure NULL reads or writes cause
                     |                             | an exception (like they would in any normal OS). But
                     |                             | the restriction is apparently only applied to Win32.
                     |                             | Win16 can still read linear memory address 0.

NOTE: Despite linear memory 0x0000-0xFFFF being read/write, Win16 applications will crash as expected
dereferencing a NULL pointer because in Win16, a NULL pointer 0x0000:0x0000 (segment=0 offset=0)
contains an invalid segment value. BUT, if the same program were to allocate a protected mode
selector that refers to linear memory address 0x00000000, that pointer would be readable/writeable
without any exceptions whatsoever.

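The Win32s trick in the table is plain segment arithmetic: the linear address is the segment base plus the offset, truncated to 32 bits. The sketch below works through it with the base quoted in the table. The names to_linear, kWin32sBase and kFlatBase are made up for illustration, and kFlatBase simply models a segment with the 64KB offset removed.

```cpp
#include <cassert>
#include <cstdint>

// Linear address = segment base + offset, wrapping modulo 2^32.
constexpr uint32_t to_linear(uint32_t base, uint32_t offset) {
    return static_cast<uint32_t>(base + offset);
}

constexpr uint32_t kWin32sBase = 0xFFFF0000u;  // base quoted in the Windows 3.1 row
constexpr uint32_t kFlatBase   = 0x00000000u;  // segment with no 64KB offset

// With the Win32s base, a NULL pointer does not land on linear address 0 ...
static_assert(to_linear(kWin32sBase, 0x00000000u) == 0xFFFF0000u, "NULL is shifted away");
// ... and linear 0 is only reached at offset 64KB.
static_assert(to_linear(kWin32sBase, 0x00010000u) == 0x00000000u, "64KB offset wraps to 0");
// The BIOS "current video mode" byte would sit at offset 0x10449.
static_assert(to_linear(kWin32sBase, 0x00010449u) == 0x00000449u, "BDA byte via Win32s base");
// Without the offset, NULL is linear address 0 and hits the low 64KB mapping directly.
static_assert(to_linear(kFlatBase, 0x00000000u) == 0x00000000u, "flat NULL is linear 0");
```

So the low 64KB stays mapped read/write for the DOS-oriented parts of the system, while 32-bit code sees its address 0 somewhere else entirely.
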
So when DOSBox's mouse movement code issues real_readb(0x40,0x49), the read works its way through the
guest OS page tables and can cause a page fault. The only reason it didn't in Win 3.1 and 9x was because
those versions of Windows were DOS-oriented and left the mapping open for performance reasons. And with
Windows 98, DOSBox reading the BIOS data area doesn't clash with Win98's write protecting the pages.
It's not until Windows ME maps out the page entirely that DOSBox has a problem with causing page
faults in the way described.

Also stating the obvious: Windows NT doesn't map anything down there at all 99% of the time (the exception
being the NTVDM.EXE process for virtual 8086 mode), which is why moving the mouse caused the crash.

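The whole argument can be condensed into a small lookup. The sketch below encodes the mappings from the table and asks whether a read or write in the low 64KB would fault. LowMap, real_to_linear, read_faults, write_faults and the k* constants are hypothetical names, the Windows NT entry models the usual case outside NTVDM.EXE, and privilege levels are ignored to keep it short.

```cpp
#include <cassert>
#include <cstdint>

// How the first 64KB of linear memory is mapped by the guest.
enum class LowMap { ReadWrite, ReadOnly, NotPresent };

// Real-mode segment:offset to linear address.
constexpr uint32_t real_to_linear(uint16_t seg, uint16_t off) {
    return (static_cast<uint32_t>(seg) << 4) + off;
}

// Simplified fault rules (no privilege level checks).
constexpr bool read_faults(LowMap m)  { return m == LowMap::NotPresent; }
constexpr bool write_faults(LowMap m) { return m != LowMap::ReadWrite; }

// Mappings as listed in the table above.
constexpr LowMap kWin31 = LowMap::ReadWrite;
constexpr LowMap kWin95 = LowMap::ReadWrite;
constexpr LowMap kWin98 = LowMap::ReadOnly;
constexpr LowMap kWinME = LowMap::NotPresent;
constexpr LowMap kWinNT = LowMap::NotPresent;  // outside NTVDM.EXE

// real_readb(0x40,0x49) targets the BIOS "current video mode" byte.
static_assert(real_to_linear(0x40, 0x49) == 0x449u, "BDA video mode byte");
static_assert(!read_faults(kWin31) && !read_faults(kWin95), "open mapping: read is fine");
static_assert(!read_faults(kWin98) && write_faults(kWin98), "read-only: reads still work");
static_assert(read_faults(kWinME) && read_faults(kWinNT), "not present: the read faults");
```

DOSBox only ever reads at 0x449 from the mouse path, which is why the read-only mapping in Windows 98 never showed the problem.
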
In Windows ME, this problem made itself obvious when, after installing itself, it booted up
for the first time and ran the Setup phase that sets up the Control Panel, Programs on the
Start menu, etc., and then finally the System Configuration phase. Prior to the bugfix, this
phase would eventually get to a driver load or test that caused a BSOD, one that was obviously
related to the mouse. You could hit ENTER to recover from the BSOD, but the cursor would no longer
respond to movement (the setup phase at that point would be stuck as well).


@ -1,10 +1,45 @@
-The most likely reason Windows ME setup will crash in DOSBox-X
-is that the Setup program wants to initialize the shell and
-system configuration on startup. It will always crash (so far).
+It is possible to install Windows ME properly without crashing. It requires
+some workarounds and changing parameters per boot, but it is possible.

-One way around that is to boot into Safe Mode, wait for the
-shell to appear, and then pull up RegEdit.
+1. Boot into a disk image with a base DOS OS on it (MS-DOS 6.22 or Win 9x
+   DOS mode). Your dosbox.conf should have core=dynamic and IDE emulation
+   enabled. Make sure the v86io hack is enabled so that Windows ME can
+   use the same INT 13h detection method to enable its IDE driver.
+   Also make sure that apmbios=false at this stage so that Windows ME
+   does not install the APM driver (which is incompatible with DOSBox
+   and can be a major source of BSODs).

-Locate the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce
-key, then look for the string value InitShell. Delete the string value. Reboot.
+2. Run the first setup phase of the Windows ME setup program. When it
+   reboots, shut down the DOSBox emulator. Do NOT let it boot.
+3. Change your dosbox.conf to core=normal or core=full. Start DOSBox
+   and let Windows ME run through its setup and autodetection.
+   This will take a long time because you are not using the dynamic
+   core. But because of some issues with the dynamic core and some
+   floating point tests carried out by Setup, you cannot use the
+   dynamic core at this stage.
+4. When the setup and detection phase is complete, let Windows ME
+   reboot. It will begin the normal desktop, but may insist on
+   running the Setup and autodetection phase again before allowing
+   the desktop to show. When the desktop finally appears, shut
+   down Windows ME.
+5. Edit your dosbox.conf to set core=dynamic. You may also
+   set apmbios=true at this point if you want Windows shutdown
+   to close DOSBox automatically, but make sure you do not allow
+   WinME to install the APM driver!
+Alternative method (faster): Run Windows ME setup with core=dynamic.
+Let it go as far as it can. When you can no longer proceed without
+crashing, reboot WinME into safe mode, force your way to the desktop,
+and use regedit.exe to delete all startup keys from registry location
+HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce
+This will allow Windows ME to boot, though with a lot of missing
+driver functionality. It is very likely in this mode that you will
+forever be stuck in 640x480 16-color mode, and every time you boot,
+Windows ME will complain the display is not configured correctly and
+any attempt to remedy that will fail (evil voice) MWAHAHAHAHAHAHAAAA!
+THAT'S WHAT YOU GET FOR BEING SO IMPATIENT, SUCKER!
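
The per-boot settings from the numbered steps can be summarized as dosbox.conf fragments. This is only a sketch: the option names core and apmbios and their values come from the steps above, but the section headers ([cpu], [dosbox]) are assumptions about where those options live in your build, and the IDE and v86io settings are left out because their exact option names are not given here.

```ini
# Steps 1-2: first setup phase, run from DOS
[cpu]
core=dynamic
[dosbox]
apmbios=false

# Steps 3-4: setup and autodetection after the first reboot (slow)
# [cpu]
# core=normal      (or core=full)
# [dosbox]
# apmbios=false

# Step 5: normal use once the desktop has appeared
# [cpu]
# core=dynamic
# [dosbox]
# apmbios=true     (optional; do not let WinME install the APM driver)
```

Only one phase should be active at a time, so the later phases are shown commented out; swap the comments as you move from one step to the next.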
