COMMAND
TCP/IP
SYSTEMS AFFECTED
Most unices
PROBLEM
Wietse Venema found the following. This note is about a subtle data
corruption problem with TCP data streams that may bite people as
more and more (LINUX) systems are sending network traffic with
TCP-level options turned on.
Several Postfix users reported mail delivery failures because
sequences of control characters (for example, ^A^A^H) were being
inserted into their SMTP connections, resulting in SMTP protocol
errors and non-delivery of email. These data corruption problems
are not host specific: they are observed with both Linux and
BSD/OS systems, and with mail sent to and/or received from systems
running Postfix, Sendmail and qmail.
Over the weekend of March 18, 2000, a few people left tcpdump
running on their machines, in order to record some of these
corrupted SMTP sessions. This note is based on an analysis of
that data.
The corruption appears to be caused by a buggy traffic
manipulation scheme that plays games with TCP acknowledgements.
It sounds like a great argument for more deployment of IPSEC,
which is designed to prevent modification or insertion of traffic
in transit; but it also illustrates the conflict that some have
with IPSEC, because it prevents them from doing any traffic
manipulation at all.
See also draft-ietf-pilc-pep-02.txt (performance enhancing
proxies) for a discussion of well-intended TCP traffic
manipulation techniques.
The problem is with "extra" ACK packets that are generated by some
helpful intermediate routers. Under some conditions involving
retransmission and/or packets arriving out of order, such routers
copy a real ACK packet from an end system, turn the copied ACK
around by swapping the source and destination fields (addresses,
ports, sequence and acknowledgement numbers), and send it off.
The problem happens when, by mistake, TCP option bytes from the
original ACK packet are sent as DATA bytes in the copied ACK
packet. This corrupts the TCP data stream, because the bogus
data is sent in a packet with correct IP and TCP header
checksums. The fact that the next TCP data will overlay the
bogus data does not prevent the bogus data from being passed up
to the application.
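The claim that the bogus packet carries correct checksums can be
checked by hand. The following is a minimal Python sketch, not part
of the original analysis tools: it recomputes the IP header checksum
and the TCP checksum over the bytes of the "extra ACK" packet that is
shown in the second figure of this note. The names internet_checksum
and checksums_ok are made up for this illustration.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Complemented one's-complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Bytes of the "extra ACK" packet, copied from the second figure.
EXTRA_ACK = bytes.fromhex(
    "45000034522f40003c06d5f2c2198650c3340b04"   # IP header
    "06c20019f4cefce1f52260d550100b68af320000"   # TCP header
    "0101080a0010ddf52db37b43"                   # 12 bytes sent as data
)

def checksums_ok(packet: bytes):
    """Return (ip_ok, tcp_ok) for an IPv4 packet that carries TCP."""
    ihl = (packet[0] & 0x0F) * 4
    ip_header, segment = packet[:ihl], packet[ihl:]
    # TCP pseudo header: source, destination, zero, protocol, TCP length.
    pseudo = ip_header[12:20] + struct.pack(
        "!BBH", 0, ip_header[9], len(segment))
    # A checksum computed over data that includes a valid checksum is 0.
    return (internet_checksum(ip_header) == 0,
            internet_checksum(pseudo + segment) == 0)

print(checksums_ok(EXTRA_ACK))
```

Both values come out true for the recorded packet, so the receiving
TCP stack has no reason to discard it.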
What follows is a fragment of a corrupted SMTP session, one of
several dozen sessions that were recorded at both endpoints of the
connections. The recordings are available via FTP (see pointers
at the end).
The first figure shows an ACK packet sent by the SMTP server. The
figure shows one line of tcpdump output (folded for readability),
followed by an annotated version of the packet. The annotation
identifies 20 bytes of IP header fields, 20 bytes of TCP header
fields, and 12 bytes of TCP header options.
12:28:37.051883 195.52.11.4.25 > 194.25.134.80.1730: . ack 86 win
32120 <nop,nop,timestamp 1105397 766737219> (DF)
IP_HDR  45  00  00  34  52  2f  40  00  40  06
        vhl tos len len id  id  off off ttl pro
IP_HDR  d1  f2  c3  34  0b  04  c2  19  86  50
        sum sum src src src src dst dst dst dst
TCP_HDR 00  19  06  c2  f5  22  60  dd  f4  ce
        src src dst dst seq seq seq seq ack ack
TCP_HDR fc  e1  80  10  7d  78  0d  1a  00  00
        ack ack off flg win win sum sum urp urp
TCP_OPT 01  01  08  0a  00  10  dd  f5  2d  b3
        opt opt opt opt opt opt opt opt opt opt
TCP_OPT 7b  43
        opt opt
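The annotation above can be reproduced mechanically. The sketch
below (Python, written for this note; split_tcp_packet and
parse_options are illustrative names, not existing tools) uses the
IP header length and the TCP data offset field to split the packet,
then decodes the 12 option bytes. It handles only the option kinds
that occur here and treats anything malformed as end of list.

```python
import struct

# Bytes of the original ACK packet, copied from the first figure.
ORIGINAL_ACK = bytes.fromhex(
    "45000034522f40004006d1f2c3340b04c2198650"   # IP header
    "001906c2f52260ddf4cefce180107d780d1a0000"   # TCP header
    "0101080a0010ddf52db37b43"                   # TCP options
)

def split_tcp_packet(packet: bytes):
    """Return (ip_header, tcp_header, tcp_options, data)."""
    ip_len = (packet[0] & 0x0F) * 4               # low nibble of "vhl"
    total_len = struct.unpack("!H", packet[2:4])[0]
    tcp_len = (packet[ip_len + 12] >> 4) * 4      # "off" field, in bytes
    return (packet[:ip_len],
            packet[ip_len:ip_len + 20],
            packet[ip_len + 20:ip_len + tcp_len],
            packet[ip_len + tcp_len:total_len])

def parse_options(options: bytes):
    """Decode TCP options into a list of tuples."""
    result, i = [], 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                             # end of option list
            break
        if kind == 1:                             # no-operation, 1 byte
            result.append(("nop",))
            i += 1
            continue
        if i + 1 >= len(options):
            break                                 # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                                 # malformed option
        body = options[i + 2:i + length]
        if kind == 8 and length == 10:            # timestamp option
            result.append(("timestamp",) + struct.unpack("!II", body))
        else:
            result.append((kind, body))
        i += length
    return result

ip_hdr, tcp_hdr, opts, data = split_tcp_packet(ORIGINAL_ACK)
print(len(ip_hdr), len(tcp_hdr), len(opts), len(data))
print(parse_options(opts))
```

The decoded timestamp values agree with the tcpdump line above
(1105397 and 766737219), and the packet carries no data bytes.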
The second figure shows an "extra ACK" packet that was generated
by an intermediate router, not by an end system (it shows up only
in the tcpdump recording of the receiving system). Note that the
"extra ACK" has the same 0x522f IDENT field in the IP header as
the preceding packet. The "extra ACK" has the same 12 bytes of
TCP options as the preceding packet. However, the TCP options are
mistakenly sent as data, so they are read by the application as
^A^A^H...
12:28:37.056438 194.25.134.80.1730 > 195.52.11.4.25: . 86:98(12)
ack 112 win 2920 (DF)
IP_HDR  45  00  00  34  52  2f  40  00  3c  06
        vhl tos len len id  id  off off ttl pro
IP_HDR  d5  f2  c2  19  86  50  c3  34  0b  04
        sum sum src src src src dst dst dst dst
TCP_HDR 06  c2  00  19  f4  ce  fc  e1  f5  22
        src src dst dst seq seq seq seq ack ack
TCP_HDR 60  d5  50  10  0b  68  af  32  00  00
        ack ack off flg win win sum sum urp urp
DATA    01  01  08  0a  00  10  dd  f5  2d  b3
        ^A  ^A  ^H  ^J  ^@  ^P  dd  f5  -   b3
DATA    7b  43
        {   C
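The relationship between the two figures can be confirmed with a
few lines of Python. This is an illustrative sketch only; tcp_parts
and caret are names invented here. It shows that both packets share
the same IP IDENT field, that the data offset of the "extra ACK"
leaves no room for options, and that its 12 data bytes are identical
to the option bytes of the original ACK.

```python
ORIGINAL_ACK = bytes.fromhex(
    "45000034522f40004006d1f2c3340b04c2198650"
    "001906c2f52260ddf4cefce180107d780d1a0000"
    "0101080a0010ddf52db37b43"
)
EXTRA_ACK = bytes.fromhex(
    "45000034522f40003c06d5f2c2198650c3340b04"
    "06c20019f4cefce1f52260d550100b68af320000"
    "0101080a0010ddf52db37b43"
)

def tcp_parts(packet: bytes):
    """Return (option_bytes, data_bytes) of an IPv4/TCP packet."""
    ip_len = (packet[0] & 0x0F) * 4
    total_len = int.from_bytes(packet[2:4], "big")
    tcp_len = (packet[ip_len + 12] >> 4) * 4      # TCP data offset
    return (packet[ip_len + 20:ip_len + tcp_len],
            packet[ip_len + tcp_len:total_len])

def caret(data: bytes) -> str:
    """Render bytes the way the figure does: ^X, printable, or hex."""
    out = []
    for b in data:
        if b < 0x20:
            out.append("^" + chr(b + 0x40))
        elif b < 0x7F:
            out.append(chr(b))
        else:
            out.append("%02x" % b)
    return " ".join(out)

orig_opts, orig_data = tcp_parts(ORIGINAL_ACK)
extra_opts, extra_data = tcp_parts(EXTRA_ACK)

print(ORIGINAL_ACK[4:6] == EXTRA_ACK[4:6])   # same IP IDENT field
print(extra_opts == b"", extra_data == orig_opts)
print(caret(extra_data))
```

The last line prints the same ^A ^A ^H ^J ... sequence as the
annotation in the second figure.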
Note that the ACK with bogus data is sent towards the host that
sent the original ACK with TCP option bytes. Turning off TCP
options would prevent this corruption from happening. However,
turning off TCP options in the local system would solve only half
the problem. When a remote system connects to the local system,
and the remote system has TCP options turned on, the connection
can still suffer from the type of corruption shown above.
As discussed above, some intermediate systems generate an "extra
ACK" by cloning a real ACK packet. They modify the cloned ACK by
swapping source and destination fields etc., then send it off.
By measuring the time differences between sending the original ACK
and receiving the cloned ACK it is possible to narrow down the
router responsible for the data corruption. By playing games with
tools such as traceroute, ping and mtr it is possible to further
identify the source of a problem. Getting the problem fixed is
another matter, of course.
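As a rough illustration of the timing measurement, the following
Python sketch computes the delay between the two tcpdump timestamps
shown in the figures above. It assumes that both lines come from the
same host's recording, so that they share one clock, and it does not
handle a pair of timestamps that straddles midnight. The function
name delay_ms is made up for this example.

```python
from datetime import datetime

def delay_ms(sent: str, received: str) -> float:
    """Milliseconds between two tcpdump HH:MM:SS.uuuuuu timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(received, fmt) - datetime.strptime(sent, fmt)
    return delta.total_seconds() * 1000.0

# Original ACK sent, cloned ACK received (timestamps from the figures).
print("%.3f ms" % delay_ms("12:28:37.051883", "12:28:37.056438"))
```

The result is an upper bound on the round-trip time between the
recording host and the device that cloned the packet. Comparing it
with the per-hop round-trip times reported by traceroute or mtr
suggests which hop the device sits near.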
A more extensive version of this note, with tcpdump recordings of
corrupted SMTP sessions and with the tools used for the analysis of
those recordings, is available via FTP:
ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz
ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz.sig
SOLUTION
Apparently, one instance of this data corruption problem is
caused by an unnamed bandwidth management system. It runs as a
bridge, and does not show up in traceroute etc. output. Testers
were able to estimate its location (at 5 ms round-trip time from
one endpoint) by analyzing packet arrival times.
Until now, this TCP data corruption problem has been observed only
when one of the connection endpoints runs a recent LINUX version.
Sightings have been reported by sites in Germany and in France.
Only recent LINUX versions request the use of timestamp options
that cause the tell-tale patterns of "01 01 08 0a" in TCP packets,
and that end up being regurgitated as ^A^A^H^J data.
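The tell-tale pattern lends itself to a simple check on captured
application data. The sketch below is Python and is not taken from
the analysis tools mentioned above; the function name and the sample
SMTP text are hypothetical. It searches a byte stream for
"01 01 08 0a" followed by eight more bytes and decodes those as the
two timestamp values. In a text protocol such as SMTP a hit is a
strong hint; in binary data the four bytes may occur by chance.

```python
import struct

# nop, nop, timestamp kind (8), timestamp length (10)
SIGNATURE = bytes([0x01, 0x01, 0x08, 0x0A])

def find_regurgitated_options(stream: bytes):
    """Return (offset, tsval, tsecr) for each 12-byte option block found."""
    hits = []
    start = stream.find(SIGNATURE)
    while start != -1:
        block = stream[start:start + 12]
        if len(block) == 12:                  # skip a truncated tail
            tsval, tsecr = struct.unpack("!II", block[4:])
            hits.append((start, tsval, tsecr))
        start = stream.find(SIGNATURE, start + 1)
    return hits

# Hypothetical SMTP data with the option bytes from this note inserted.
prefix = b"MAIL FROM:<a@example.com>\r\n"
sample = (prefix
          + bytes.fromhex("0101080a0010ddf52db37b43")
          + b"RCPT TO:<b@example.com>\r\n")
print(find_regurgitated_options(sample))
```

For the sample above this reports one hit, directly after the first
SMTP command, with the timestamp values 1105397 and 766737219 seen in
the recorded packets.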