COMMAND

    TCP/IP

SYSTEMS AFFECTED

    Most unices

PROBLEM

    Wietse Venema found the following. This note is about a subtle
    data corruption problem with TCP data streams that may bite people
    as more and more (LINUX) systems send network traffic with
    TCP-level options turned on.

    Several  Postfix  users  reported  mail  delivery failures because
    sequences of control characters  (for example, ^A^A^H) were  being
    inserted into their SMTP  connections, resulting in SMTP  protocol
    errors and non-delivery of email.  These data corruption  problems
    are  not  host  specific:  they  are  observed with both Linux and
    BSD/OS systems, and with mail sent to and/or received from systems
    running Postfix, Sendmail and qmail.

    Over the  weekend of  March 18,  2000, a  few people  left tcpdump
    running  on  their  machines,  in  order  to  record some of these
    corrupted SMTP  sessions.   This note  is based  on an analysis of
    that data.

    The  corruption  appears   to  be  caused   by  a  buggy   traffic
    manipulation scheme  that plays  games with  TCP acknowledgements.
    It sounds  like a  great argument  for more  deployment of  IPSEC,
    which is designed to prevent modification or insertion of  traffic
    in transit; but  it also illustrates  the conflict that  some have
    with  IPSEC,  because  it  prevents  them  from  doing any traffic
    manipulation at all.

    See   also   draft-ietf-pilc-pep-02.txt   (performance   enhancing
    proxies)   for   a   discussion   of   well-intended  TCP  traffic
    manipulation techniques.

    The problem is with "extra" ACK packets that are generated by some
    helpful intermediate routers. Under some conditions involving
    retransmission and/or packets arriving out of order, such routers
    copy a real ACK packet from an end system, turn the copy around by
    swapping the source and destination fields (among others), and
    send it off.

    The problem happens  when, by mistake,  TCP option bytes  from the
    original  ACK  packet  are  sent  as  DATA bytes in the copied ACK
    packet.   This corrupts  the TCP  data stream,  because the  bogus
    data  is  sent  in  a  packet  with  correct  IP  and  TCP  header
    checksums.   The  fact  that  the  next  TCP data will overlay the
    bogus data does  not prevent the  bogus data from  being passed up
    to the application.

    What follows  is a  fragment of  a corrupted  SMTP session, one of
    several dozen sessions that were recorded at both endpoints of the
    connections.  The recordings  are available via FTP  (see pointers
    at the end).

    The first figure shows an ACK packet sent by the SMTP server.  The
    figure shows one line of tcpdump output (folded for  readability),
    followed by  an annotated  version of  the packet.  The annotation
    identifies 20 bytes  of IP header  fields, 20 bytes  of TCP header
    fields, and 12 bytes of TCP header options.

        12:28:37.051883 195.52.11.4.25 > 194.25.134.80.1730: . ack 86 win
        32120 <nop,nop,timestamp 1105397 766737219> (DF)

        IP_HDR   45  00  00  34  52  2f  40  00  40  06
	        vhl tos len len id  id  off off ttl pro
        IP_HDR   d1  f2  c3  34  0b  04  c2  19  86  50
	        sum sum src src src src dst dst dst dst
        TCP_HDR  00  19  06  c2  f5  22  60  dd  f4  ce
	        src src dst dst seq seq seq seq ack ack
        TCP_HDR  fc  e1  80  10  7d  78  0d  1a  00  00
	        ack ack off flg win win sum sum urp urp
        TCP_OPT  01  01  08  0a  00  10  dd  f5  2d  b3
	        opt opt opt opt opt opt opt opt opt opt
        TCP_OPT  7b  43
	        opt opt
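
    The annotation above can be reproduced mechanically. The following
    is a minimal Python sketch, not one of the tools from the original
    analysis; the helper name parse_options is made up for this
    example. It reads the header length fields from the bytes of the
    first figure and decodes the option bytes, which should match the
    "nop,nop,timestamp" triple in the tcpdump line.

```python
import struct

# Bytes of the first figure, copied from the annotated dump above.
packet = bytes.fromhex(
    "45000034522f40004006d1f2c3340b04c2198650"   # 20 bytes of IP header
    "001906c2f52260ddf4cefce180107d780d1a0000"   # 20 bytes of TCP header
    "0101080a0010ddf52db37b43"                   # 12 bytes of TCP options
)

ip_hdr_len = (packet[0] & 0x0F) * 4              # IHL counts 32-bit words
ip_total_len = struct.unpack("!H", packet[2:4])[0]
tcp = packet[ip_hdr_len:]
tcp_hdr_len = (tcp[12] >> 4) * 4                 # data offset counts 32-bit words
options = tcp[20:tcp_hdr_len]
payload_len = ip_total_len - ip_hdr_len - tcp_hdr_len


def parse_options(opts):
    """Walk the TCP option list and return (kind, value_bytes) pairs."""
    out, i = [], 0
    while i < len(opts):
        kind = opts[i]
        if kind == 0:                # end of option list
            break
        if kind == 1:                # NOP: a single padding byte
            out.append((1, b""))
            i += 1
            continue
        length = opts[i + 1]
        if length < 2 or i + length > len(opts):
            break                    # malformed option, stop parsing
        out.append((kind, opts[i + 2:i + length]))
        i += length
    return out


parsed = parse_options(options)
# The third option is kind 8 (timestamp): two 32-bit values.
ts_val, ts_ecr = struct.unpack("!II", parsed[2][1])
print(ip_hdr_len, tcp_hdr_len, payload_len)
print(parsed[2][0], ts_val, ts_ecr)
```

    The TCP header length of 32 bytes (data offset 8) accounts for the
    12 option bytes, so this packet carries no data at all.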

    The second figure shows an  "extra ACK" packet that was  generated
    by an intermediate router, not by an end system (it shows up  only
    in the tcpdump recording of the receiving system).  Note that  the
    "extra ACK" has the  same 0x522f IDENT field  in the IP header  as
    the preceding packet.   The "extra ACK" has  the same 12 bytes  of
    TCP options as the preceding packet.  However, the TCP options are
    by mistake sent as  data, so they are  read by the application  as
    ^A^A^H...

        12:28:37.056438 194.25.134.80.1730 > 195.52.11.4.25: . 86:98(12)
        ack 112 win 2920 (DF)

        IP_HDR   45  00  00  34  52  2f  40  00  3c  06
	        vhl tos len len id  id  off off ttl pro
        IP_HDR   d5  f2  c2  19  86  50  c3  34  0b  04
	        sum sum src src src src dst dst dst dst
        TCP_HDR  06  c2  00  19  f4  ce  fc  e1  f5  22
	        src src dst dst seq seq seq seq ack ack
        TCP_HDR  60  d5  50  10  0b  68  af  32  00  00
	        ack ack off flg win win sum sum urp urp
        DATA     01  01  08  0a  00  10  dd  f5  2d  b3
	         ^A  ^A  ^H  ^J  ^@  ^P  dd  f5  -   b3
        DATA     7b  43
	         {   C
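
    The same arithmetic shows why the receiver treats the option bytes
    as data, and why checksums do not catch it. The sketch below is
    illustrative only (the helper names are invented for this example):
    it decodes the "extra ACK" from the figure above, recomputes the IP
    header checksum and the TCP checksum, and renders the payload in
    caret notation.

```python
import struct

# Bytes of the "extra ACK", copied from the annotated dump above.
extra_ack = bytes.fromhex(
    "45000034522f40003c06d5f2c2198650c3340b04"   # IP header
    "06c20019f4cefce1f52260d550100b68af320000"   # TCP header
    "0101080a0010ddf52db37b43"                   # bytes after the TCP header
)


def ones_complement_sum(data):
    """16-bit one's complement sum; 0xFFFF means the checksum verifies."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def caret(data):
    """Render bytes roughly the way they show up in a mail log."""
    parts = []
    for b in data:
        if b < 0x20:
            parts.append("^" + chr(b + 0x40))
        elif b < 0x7F:
            parts.append(chr(b))
        else:
            parts.append("\\x%02x" % b)
    return "".join(parts)


ip_hdr_len = (extra_ack[0] & 0x0F) * 4
ip_total_len = struct.unpack("!H", extra_ack[2:4])[0]
segment = extra_ack[ip_hdr_len:ip_total_len]
tcp_hdr_len = (segment[12] >> 4) * 4             # data offset is 5 here, not 8
payload = segment[tcp_hdr_len:]

# TCP checksum covers a pseudo-header: addresses, zero, protocol 6, length.
pseudo = extra_ack[12:20] + struct.pack("!BBH", 0, 6, len(segment))
ip_ok = ones_complement_sum(extra_ack[:ip_hdr_len]) == 0xFFFF
tcp_ok = ones_complement_sum(pseudo + segment) == 0xFFFF

print(tcp_hdr_len, len(payload), ip_ok, tcp_ok)
print(caret(payload))
```

    The IP total length is still 52 bytes, but the TCP data offset says
    the header is only 20 bytes long, so the 12 trailing bytes count as
    payload. Both checksums verify over those bytes, which is why the
    receiving TCP stack has no reason to discard the packet.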

    Note that the ACK  with bogus data is  sent towards the host  that
    sent the  original ACK  with TCP  option bytes.   Turning off  TCP
    options  would  prevent  this  corruption from happening. However,
    turning off TCP options in the local system would solve only  half
    the problem.  When a  remote system connects to the  local system,
    and the remote  system has TCP  options turned on,  the connection
    can still suffer from the type of corruption shown above.

    As discussed above, some  intermediate systems generate an  "extra
    ACK" by cloning a real ACK  packet. They modify the cloned ACK  by
    swapping source and destination fields etc., then send it off.

    By measuring the time differences between sending the original ACK
    and receiving  the cloned  ACK it  is possible  to narrow down the
    router responsible for the data corruption.  By playing games with
    tools such as traceroute, ping  and mtr it is possible  to further
    identify the source  of a problem.   Getting the problem  fixed is
    another matter, of course.
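
    As a worked example of the timing argument, the sketch below
    subtracts the two tcpdump timestamps shown in the figures. It
    assumes that both lines come from the same recording, and therefore
    from the same clock; timestamps taken on different hosts cannot be
    compared this way.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Timestamps copied from the two tcpdump lines in the figures.
original_ack_sent = datetime.strptime("12:28:37.051883", FMT)
extra_ack_seen = datetime.strptime("12:28:37.056438", FMT)

delta = extra_ack_seen - original_ack_sent
delta_us = delta.seconds * 10**6 + delta.microseconds
print("%d microseconds" % delta_us)
```

    Under that assumption the cloned ACK comes back about 4.6 ms after
    the original ACK left, which bounds the round-trip time to the
    device that generated it. Any hop whose measured round-trip time is
    clearly larger than this cannot be the source.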

    A more extensive version of this note, with tcpdump recordings of
    corrupted SMTP sessions and with the tools used to analyze those
    recordings, is available via FTP:

        ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz
        ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz.sig

SOLUTION

    Apparently, one instance of this data corruption problem is caused
    by an unnamed bandwidth management system. It runs as a bridge,
    and does not show up in the output of traceroute and similar
    tools. Testers were able to estimate its location (at 5 ms
    round-trip time from one endpoint) by analyzing packet arrival
    times.

    Until now, this TCP data corruption problem has been observed only
    when one of the connection endpoints runs a recent LINUX  version.
    Sightings have been reported by sites in Germany and in France.

    Only recent LINUX  versions request the  use of timestamp  options
    that cause the tell-tale patterns of "01 01 08 0a" in TCP packets,
    and that end up being regurgitated as ^A^A^H^J data.
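
    The tell-tale pattern can be searched for in captured application
    data. The following Python sketch is a heuristic, not a definitive
    detector: the function name and the sample stream are made up for
    illustration, and the same four bytes could in principle occur in
    legitimate binary data.

```python
SIGNATURE = bytes.fromhex("0101080a")   # NOP, NOP, kind 8, length 10
BLOCK_LEN = 12                          # 2 NOPs + 10-byte timestamp option


def find_regurgitated_options(stream):
    """Return offsets where a complete 12-byte option block may start."""
    hits, start = [], 0
    while True:
        pos = stream.find(SIGNATURE, start)
        if pos < 0:
            return hits
        if pos + BLOCK_LEN <= len(stream):
            hits.append(pos)
        start = pos + 1


# Hypothetical SMTP byte stream with the option bytes from the figures
# inserted between two commands.
prefix = b"MAIL FROM:<a@example.com>\r\n"
bogus = bytes.fromhex("0101080a0010ddf52db37b43")
stream = prefix + bogus + b"RCPT TO:<b@example.com>\r\n"

hits = find_regurgitated_options(stream)
print(hits, stream[hits[0]:hits[0] + BLOCK_LEN].hex())
```

    A hit in a text protocol such as SMTP is a strong hint; in binary
    protocols it would need to be confirmed against a packet trace.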