COMMAND
kernel
SYSTEMS AFFECTED
Solaris
PROBLEM
Ofir Arkin found the following. RFC 791 defines a three-bit field
used for various control flags in the IP header.
Bit 0 is the reserved flag, and must be zero.
Bit 1 is called the Don't Fragment (DF) flag and can have two
values. A value of zero (not set) is equivalent to May Fragment,
and a value of one is equivalent to Don't Fragment. If this flag
is set, then fragmentation of the packet at the IP level is not
permitted; otherwise it is.
Bit 2 is called the More Fragments (MF) bit. It can have two values.
A value of zero is equivalent to (this is the) Last Fragment, and
a value of 1 is equivalent to More Fragments (are coming).
The next field in the IP header is the Fragment Offset field,
which identifies the fragment location relative to the beginning
of the original un-fragmented datagram (RFC 791, bottom of page
23).
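As a minimal sketch of the layout described above, the 16-bit word
that holds the three flags and the 13-bit Fragment Offset can be
split as follows. The function name parse_flags_and_offset is a
hypothetical helper introduced only for illustration; the masks
follow the RFC 791 bit positions, and the offset is converted from
8-octet units to bytes.

```python
def parse_flags_and_offset(word: int) -> dict:
    """Split the 16-bit flags/fragment-offset word of an IPv4 header.

    Hypothetical helper for illustration; bit numbering follows
    RFC 791, where bit 0 is the most significant bit of the word.
    """
    return {
        "reserved": bool(word & 0x8000),         # bit 0, must be zero
        "dont_fragment": bool(word & 0x4000),    # bit 1 (DF)
        "more_fragments": bool(word & 0x2000),   # bit 2 (MF)
        "fragment_offset": (word & 0x1FFF) * 8,  # stored in 8-octet units
    }


if __name__ == "__main__":
    # 0x4000 is the value seen in replies that tcpdump marks "(DF)".
    print(parse_flags_and_offset(0x4000))
    print(parse_flags_and_offset(0x0000))
```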
A close examination of ICMP Query replies reveals that some
operating systems set the DF bit in their replies. The tcpdump
trace below shows the reply a Sun Solaris 2.7 box produced for an
ICMP Echo Request:
17:10:19.538020 if 4 > y.y.y.y > x.x.x.x : icmp: echo request (ttl 255, id 13170)
4500 0024 3372 0000 ff01 9602 yyyy yyyy
xxxx xxxx 0800 54a4 8d04 0000 cbe7 bc39
8635 0800
17:10:19.905254 if 4 < x.x.x.x > y.y.y.y : icmp: echo reply (DF) (ttl 233, id 24941)
4500 0024 616d 4000 e901 3e07 xxxx xxxx
yyyy yyyy 0000 5ca4 8d04 0000 cbe7 bc39
8635 0800
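The fields tcpdump prints in parentheses can be read directly from
the hex dump. The sketch below decodes the first 12 bytes of an IP
header (everything before the masked addresses) from a line of the
trace. The helper name parse_ipv4_prefix is an assumption made for
this example and is not part of tcpdump or any library.

```python
import struct


def parse_ipv4_prefix(hex_words: str) -> dict:
    """Decode the first 12 bytes of an IPv4 header from tcpdump hex.

    Hypothetical helper: expects at least the six 16-bit words that
    precede the source and destination addresses.
    """
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) < 12:
        raise ValueError("need at least 12 header bytes")
    (ver_ihl, tos, total_len, ident,
     flags_frag, ttl, proto, checksum) = struct.unpack("!BBHHHBBH", raw[:12])
    return {
        "version": ver_ihl >> 4,
        "total_length": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),  # Don't Fragment flag
        "ttl": ttl,
        "protocol": proto,                # 1 is ICMP
    }


if __name__ == "__main__":
    # Header words copied from the Solaris 2.7 trace above.
    request = parse_ipv4_prefix("4500 0024 3372 0000 ff01 9602")
    reply = parse_ipv4_prefix("4500 0024 616d 4000 e901 3e07")
    print("request:", request)
    print("reply:  ", reply)
```

Run against the two header lines of the trace, this yields id 13170
with DF clear for the request and id 24941 with DF set for the
reply, matching the tcpdump summary lines.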
An operating system's ICMP Query replies follow a consistent
behavioral pattern: either the DF bit is set on all ICMP Query
reply types or it is set on none of them.
The DF bit is set by default in ICMP Query replies from Sun
Solaris. With HP-UX 10.30 and 11.0x, and with AIX 4.3.x, whether
the DF bit is set varies from one queried host to another
(explained below). It may be set from the first ICMP Query reply
onwards, or only after a number of ICMP Query replies. This
detail helps us distinguish between Sun Solaris, HP-UX 10.30 and
11.0x, and AIX 4.3.x operating systems.
Why do HP-UX 10.30 / 11.0 and AIX 4.3.x operating systems act this
way? HP claims to have a proprietary method for determining the
PMTU with HP-UX v10.30 and HP-UX v11.0x using ICMP Echo requests.
AIX 4.3.x does exactly the same.
The next trace helps in understanding the process followed by
HP-UX 10.30 and 11.0x and by AIX 4.3.x. Here we sent an ICMP Echo
request to an HP-UX 11.0 machine:
00:27:56.884147 ppp0 > y.y.y.y > x.x.x.x: icmp: echo request (ttl 255, id 13170)
4500 0024 3372 0000 ff01 7c51 yyyy yyyy
xxxx xxxx 0800 5238 6d04 0000 dce5 c339
8b7d 0d00
00:27:57.165620 ppp0 < x.x.x.x > y.y.y.y : icmp: echo reply (ttl 236, id 41986)
4500 0024 a402 0000 ec01 1ec1 xxxx xxxx
yyyy yyyy 0000 5a38 6d04 0000 dce5 c339
8b7d 0d00
The first pair of ICMP Echo request and ICMP Echo reply was
unremarkable. Our computer sent an ICMP Echo request and received
an ICMP Echo reply from the probed machine. One notable detail:
the DF bit is not set in the ICMP Echo reply.
Then something unexpected happened:
00:27:57.435620 ppp0 < x.x.x.x > y.y.y.y : icmp: echo request (DF) (ttl 236, id 41985)
4500 05dc a401 4000 ec01 d909 xxxx xxxx
yyyy yyyy 0800 7e52 9abc def0 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
...
00:27:57.435672 ppp0 > y.y.y.y > x.x.x.x: icmp: echo reply (ttl 255, id 53)
4500 05dc 0035 0000 ff01 a9d6 yyyy yyyy
xxxx xxxx 0000 8652 9abc def0 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
...
The queried machine pinged us back. The ICMP Echo request was
1500 bytes long (total length 0x05dc), the maximum transfer unit
our Internet connection could handle. The request was sent with
the DF bit set. Any router along the way that needed to fragment
the request, because the MTU of the next network was smaller than
the datagram's size, would fail and send back an ICMP error
message stating that fragmentation was required but the Don't
Fragment bit was set. That would let the sending machine retry
with a smaller datagram according to its PMTU discovery algorithm.
If an ICMP Echo reply is received for this ICMP Echo request, then
the PMTU has been discovered.
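The probing loop described above can be sketched as a small
simulation. HP's actual algorithm is proprietary and is not
documented here, so this is only an illustration under stated
assumptions: the path is modeled as a list of link MTUs, and the
router that drops a probe is assumed to report the MTU of the link
that was too small (as in RFC 1191). The names send_df_probe and
discover_pmtu are hypothetical.

```python
def send_df_probe(size, link_mtus):
    """Simulate one DF-marked probe of `size` bytes along the path.

    Returns None when the probe fits every link (an echo reply
    would come back), otherwise the MTU of the first link that is
    too small (the value an ICMP "fragmentation needed" error is
    assumed to carry).
    """
    for mtu in link_mtus:
        if size > mtu:
            return mtu
    return None


def discover_pmtu(link_mtus, first_size=1500):
    """Shrink the probe until it passes; return (pmtu, sizes tried)."""
    size = first_size
    tried = [size]
    while True:
        reported = send_df_probe(size, link_mtus)
        if reported is None:
            return size, tried
        size = reported  # retry at the size the router reported
        tried.append(size)


if __name__ == "__main__":
    # Assumed example path: Ethernet, then two narrower links.
    pmtu, tried = discover_pmtu([1500, 1006, 576])
    print("PMTU:", pmtu, "probe sizes:", tried)
```

In the trace above the very first 1500-byte probe was answered, so
under this model the loop would stop after a single iteration.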
00:27:57.885662 ppp0 > y.y.y.y > x.x.x.x : icmp: echo request (ttl 255, id 13170)
4500 0024 3372 0000 ff01 7c51 yyyy yyyy
xxxx xxxx 0800 5832 6d04 0100 dde5 c339
8383 0d00
00:27:58.155627 ppp0 < x.x.x.x > y.y.y.y : icmp: echo reply (DF) (ttl 236, id 41987)
4500 0024 a403 4000 ec01 debf xxxx xxxx
yyyy yyyy 0000 6032 6d04 0100 dde5 c339
8383 0d00
The next ICMP Echo request was sent from my machine to the queried
HP-UX 11.0 host just milliseconds after my reply to the HP-UX
host's query. It resulted in an ICMP Echo reply coming back from
the queried machine, and this time the DF bit was set in the ICMP
Echo reply. Rather than sending an ICMP datagram that will be
fragmented somewhere along the way to the destination machine, it
is better for performance to fragment the datagram at the sender.
Setting the DF bit on subsequent replies helps maintain the PMTU
between the two systems if, for any reason, the PMTU decreases,
for example because the datagrams take another route to the
destination system.
Immediately sending another ICMP Query message of any type to this
particular HP-UX 11.0x machine will not cause the PMTU discovery
process to be repeated; the DF bit will be set in the ICMP Query
reply. HP-UX 11.0x can be expected to maintain a threshold. Once
it is reached, the next time we query this host with any type of
communication, the process of determining the PMTU using ICMP Echo
requests will begin again.
This gives us the ability to distinguish between Sun Solaris
machines, HP-UX 11.0x/10.30, and AIX 4.3.x based machines.
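The decision logic implied by these observations could be sketched
as follows. This is an illustration only, not a tool from the
advisory: the function name, its parameters and the returned labels
are all assumptions. Note that a host with DF set on every reply is
labelled "solaris-like" even though, as described above, an HP-UX
or AIX host that has already determined the PMTU would look the
same.

```python
def classify_df_pattern(df_bits, probed_back=False):
    """Guess an OS family from DF flags on successive ICMP query replies.

    df_bits:     list of booleans, one per reply, in arrival order.
    probed_back: True if the target sent its own large DF-marked
                 ICMP Echo request to us during the exchange.
    All labels are hypothetical names used for this sketch.
    """
    if not df_bits:
        return "inconclusive"
    if probed_back:
        return "hpux-aix-like"
    if all(df_bits):
        return "solaris-like"
    if not any(df_bits):
        return "df-never-set"
    first_set = df_bits.index(True)
    # DF absent at first, then present on every later reply.
    if all(df_bits[first_set:]):
        return "hpux-aix-like"
    return "inconclusive"


if __name__ == "__main__":
    print(classify_df_pattern([True, True, True]))
    print(classify_df_pattern([False, True], probed_back=True))
    print(classify_df_pattern([False, False]))
```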
Sun Solaris sets the DF bit in the ICMP Query replies it sends in
order to support its global PMTU discovery process. If a network
link will not let the ICMP Query reply get back to the querying
host, because the datagram is larger than the link's MTU and
fragmentation is not allowed (the DF bit is set), then the MTU in
use should be lowered.
This is a simple operating system fingerprinting method, which
does not require additional or unusual patterns to be set.
The following operating systems were queried and checked for this
kind of behavior: Linux Kernel 2.4 test 2, 4, 5, 6; Linux Kernel
2.2.x; FreeBSD 4.0, 3.4; OpenBSD 2.7, 2.6; NetBSD 1.4.1, 1.4.2;
BSDI BSD/OS 4.0, 3.1; Solaris 2.6, 2.7, 2.8; HP-UX 10.20, 11.0x;
Compaq Tru64 5.0; AIX 4.1, 3.2; Irix 6.5.3, 6.5.8; Ultrix 4.2 &
4.5; OpenVMS v7.1-2; Novell NetWare 5.1 SP1, 5.0, 3.12; Microsoft
Windows 98/98SE/ME, Microsoft Windows NT WRKS SP6a, Windows NT
Server SP4, Microsoft Windows 2000 Family.
SOLUTION
With HP-UX 10.30 and 11.0, one of the ndd command options is
ip_pmtu_strategy. The settings for this option are either 1 or 2.
If the value is 2, then the Path MTU discovery process is used
with ICMP Echo requests. This is the default value. If the value
is 1, then the HP-UX machine will not use the ICMP echo-request
PMTU discovery strategy and will not set the DF bit after
determining the accurate PMTU.
To turn off ip_path_mtu_discovery on a Sun Solaris machine use the
following command as root:
# ndd -set /dev/ip ip_path_mtu_discovery 0
Then, when the ICMP Echo Reply is sent (as in this example), the
DF bit is not set:
# SING v1.0beta7 initiated on Host_Address at Thu Sep 14 10:01:02 2000
# Command line:
# -> sing -c 1 -L Host_Address
SINGing to Host_Address (IP_Address): 16 data bytes
16 bytes from 10.13.57.20: icmp_seq=0 ttl=254 TOS=0 time=1.578 ms
--- Host_Address sing statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.578/1.578/1.578 ms
# SING finished at Thu Sep 14 10:01:02 2000
This was tested against Solaris 2.5.1, Solaris 2.6 and Solaris
2.7, all SPARC boxes. Note that on Sun Solaris, turning this
option off also turns off the PMTU discovery process for TCP.
This is not recommended because of performance issues.
You can disable both ICMP address mask request and ICMP Timestamp
(broadcast and unicast) under Solaris with ndd. The commands are:
ndd -set /dev/ip ip_respond_to_address_mask_broadcast 0
ndd -set /dev/ip ip_respond_to_timestamp_broadcast 0
ndd -set /dev/ip ip_respond_to_timestamp 0
These are recommended by Sun (along with other fun ndd commands)
in "Solaris Operating Environment Network Settings for Security"
by Alex Noordergraaf and Keith Watson, a Sun Blueprint available
at
http://www.sun.com/blueprints