COMMAND

    kernel

SYSTEMS AFFECTED

    Solaris

PROBLEM

    Ofir Arkin found the following.  RFC 791 defines a three-bit field
    used for various control flags in the IP header.

    Bit 0 is the reserved flag, and must be zero.

    Bit 1 is called the Don't Fragment (DF) flag and can have two
    values.  A value of zero (not set) means May Fragment, and a value
    of one means Don't Fragment.  If this flag is set, then
    fragmentation of the packet at the IP level is not permitted;
    otherwise it is.

    Bit 2 is called the More Fragments (MF) bit.  It can have two
    values.  A value of zero means Last Fragment (this is the last
    one), and a value of one means More Fragments (are coming).

    The next  field in  the IP  header is  the Fragment  Offset field,
    which identifies the fragment  location relative to the  beginning
    of the original  un-fragmented datagram (RFC  791, bottom of  page
    23).
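
    As an illustration of the layout described above, the following
    is a minimal Python sketch that splits the 16-bit word holding the
    three flag bits and the 13-bit Fragment Offset.  The function and
    constant names are made up for this example, and the second sample
    value is hypothetical.

```python
# Bit masks for the 16-bit word that carries the three flag bits and
# the 13-bit Fragment Offset (RFC 791).  Bit 0 is the most significant.
RESERVED = 0x8000        # bit 0, must be zero
DONT_FRAGMENT = 0x4000   # bit 1, DF
MORE_FRAGMENTS = 0x2000  # bit 2, MF
OFFSET_MASK = 0x1FFF     # remaining 13 bits


def decode_flags_word(word):
    """Split the flags/offset word into its named parts."""
    return {
        "reserved": bool(word & RESERVED),
        "df": bool(word & DONT_FRAGMENT),
        "mf": bool(word & MORE_FRAGMENTS),
        # RFC 791 counts the fragment offset in units of 8 octets.
        "offset_bytes": (word & OFFSET_MASK) * 8,
    }


# 0x4000 is a word with only DF set.
print(decode_flags_word(0x4000))
# Hypothetical middle fragment: MF set, offset field 185.
print(decode_flags_word(0x2000 | 185))
```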

    A close examination of ICMP Query replies reveals that some
    operating systems set the DF bit in their replies.

    The tcpdump trace  below illustrates the  reply a Sun  Solaris 2.7
    box produced for an ICMP Echo Request:

        17:10:19.538020 if 4  > y.y.y.y > x.x.x.x : icmp: echo request (ttl 255, id 13170)
			         4500 0024 3372 0000 ff01 9602 yyyy yyyy
			         xxxx xxxx 0800 54a4 8d04 0000 cbe7 bc39
			         8635 0800
        17:10:19.905254 if 4  < x.x.x.x > y.y.y.y : icmp: echo reply (DF) (ttl 233, id 24941)
			         4500 0024 616d 4000 e901 3e07 xxxx xxxx
			         yyyy yyyy 0000 5ca4 8d04 0000 cbe7 bc39
			         8635 0800
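
    The header words in the trace can be checked by hand or with a
    short script.  The sketch below, in Python, unpacks only the first
    12 bytes of each IP header (the address words are masked in the
    trace and are left out); the helper name parse_ip_prefix is
    invented for this example.

```python
import struct

# First 12 bytes of each IP header, copied from the trace above.
REQUEST = "4500 0024 3372 0000 ff01 9602"
REPLY = "4500 0024 616d 4000 e901 3e07"


def parse_ip_prefix(hex_words):
    """Decode the fixed IP header fields that precede the addresses."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    (ver_ihl, tos, total_len, ident,
     flags_off, ttl, proto, cksum) = struct.unpack("!BBHHHBBH", raw)
    return {
        "version": ver_ihl >> 4,
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_off & 0x4000),  # bit 1 of the flags field
        "ttl": ttl,
        "protocol": proto,               # 1 is ICMP
    }


for name, words in (("request", REQUEST), ("reply", REPLY)):
    print(name, parse_ip_prefix(words))
```

    The decoded id and ttl values agree with the tcpdump summary lines
    (id 13170, ttl 255 for the request; id 24941, ttl 233 for the
    reply), and only the reply has DF set.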

    The ICMP Query replies of a given operating system follow the same
    behavioral pattern: either the DF bit is set on all ICMP Query
    reply types or it is set on none.

    Sun Solaris sets the DF bit in its ICMP Query replies by default.
    With HP-UX 10.30 and 11.0x, and with AIX 4.3.x, whether the DF bit
    is set varies from one queried host to another (explained below).
    It may be set from the first ICMP Query reply onwards, or only
    after a number of ICMP Query replies.  This detail helps us to
    distinguish between the Sun Solaris, HP-UX 10.30 and 11.0x, and
    AIX 4.3.x operating systems.

    Why do HP-UX 10.30 / 11.0 and AIX 4.3.x act this way?  HP claims
    to have a proprietary method for determining the PMTU with HP-UX
    v10.30 and HP-UX v11.0x using ICMP Echo requests.  AIX 4.3.x does
    exactly the same.

    The next trace helps explain the process followed by HP-UX 10.30
    and 11.0x and by AIX 4.3.x.  Here we have sent an ICMP Echo
    request to an HP-UX 11.0 machine:

        00:27:56.884147 ppp0 > y.y.y.y > x.x.x.x: icmp: echo request (ttl 255, id 13170)
			         4500 0024 3372 0000 ff01 7c51 yyyy yyyy
			         xxxx xxxx 0800 5238 6d04 0000 dce5 c339
			         8b7d 0d00
        00:27:57.165620 ppp0 < x.x.x.x > y.y.y.y : icmp: echo reply (ttl 236, id 41986)
			         4500 0024 a402 0000 ec01 1ec1 xxxx xxxx
			         yyyy yyyy 0000 5a38 6d04 0000 dce5 c339
			         8b7d 0d00

    The first ICMP Echo request and ICMP Echo reply pair was quite
    ordinary.  Our computer sent an ICMP Echo request and received an
    ICMP Echo reply from the probed machine.  One notable detail: the
    DF bit is not set in the ICMP Echo reply.

    Then something unexpected happened:

        00:27:57.435620 ppp0 < x.x.x.x > y.y.y.y : icmp: echo request (DF) (ttl 236, id 41985)
			         4500 05dc a401 4000 ec01 d909 xxxx xxxx
			         yyyy yyyy 0800 7e52 9abc def0 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         ...
        
        00:27:57.435672 ppp0 > y.y.y.y > x.x.x.x: icmp: echo reply (ttl 255, id 53)
			         4500 05dc 0035 0000 ff01 a9d6 yyyy yyyy
			         xxxx xxxx 0000 8652 9abc def0 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         0000 0000 0000 0000 0000 0000 0000 0000
			         ...

    The queried machine pinged us back.  Its ICMP Echo request was
    1500 bytes long, the maximum transfer unit our Internet connection
    could handle, and it was sent with the DF bit set.  Any router
    along the way that needed to fragment the request, because the MTU
    of the next network was smaller than the datagram, would fail and
    send back an ICMP error message stating that fragmentation was
    required but the Don't Fragment bit was set.  That error would let
    the sending machine retry with a smaller datagram, according to
    its ICMP-based PMTU discovery algorithm.  If an ICMP Echo reply is
    received for this ICMP Echo request, then the PMTU has been
    discovered.
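
    The retry loop described above can be sketched as a small Python
    simulation.  This is not HP's proprietary algorithm, which is not
    documented here; it only models the general idea under one loud
    assumption: each refusing router reports the MTU of the link it
    could not use, as routers following RFC 1191 do.  The link MTU
    values and the function names are hypothetical.

```python
def probe(size, link_mtus):
    """Send one DF datagram of `size` bytes across the links in order.

    Returns None if the datagram arrives.  Otherwise returns the MTU
    of the link that refused it, standing in for the ICMP
    "fragmentation needed but DF set" error (assumed to report the
    MTU, as in RFC 1191).
    """
    for mtu in link_mtus:
        if size > mtu:
            return mtu
    return None


def discover_pmtu(start_size, link_mtus):
    """Shrink the probe until it crosses every link unfragmented."""
    size = start_size
    attempts = [size]
    while True:
        blocked_by = probe(size, link_mtus)
        if blocked_by is None:
            # An Echo reply came back, so this size is the PMTU.
            return size, attempts
        size = blocked_by
        attempts.append(size)


# Hypothetical path: the probe starts at 1500 bytes, as in the trace.
print(discover_pmtu(1500, [1500, 1006, 576, 1500]))
```

    On a path where every link carries 1500 bytes, as in the trace,
    the first probe is answered at once and no retry is needed.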

        00:27:57.885662 ppp0 > y.y.y.y > x.x.x.x : icmp: echo request (ttl 255, id 13170)
			         4500 0024 3372 0000 ff01 7c51 yyyy yyyy
			         xxxx xxxx 0800 5832 6d04 0100 dde5 c339
			         8383 0d00
        00:27:58.155627 ppp0 < x.x.x.x > y.y.y.y : icmp: echo reply (DF) (ttl 236, id 41987)
			         4500 0024 a403 4000 ec01 debf xxxx xxxx
			         yyyy yyyy 0000 6032 6d04 0100 dde5 c339
			         8383 0d00

    The next ICMP Echo request, sent from my machine to the queried
    HP-UX 11.0 host just milliseconds after my reply to the HP-UX
    query, resulted in an ICMP Echo reply from the queried machine.
    This time the DF bit was set in the ICMP Echo reply.  Rather than
    sending an ICMP datagram that will be fragmented somewhere along
    the way to the destination, it is better for performance to
    fragment the datagram on sending.  Setting the DF bit on the
    subsequent replies helps to maintain the PMTU between the two
    systems if, for any reason, the PMTU decreases, for example
    because a datagram takes another route to the destination system.

    Immediately sending another ICMP Query message of any type to this
    particular HP-UX 11.0x machine will not cause the PMTU discovery
    process to be repeated; the DF bit will be set in the ICMP Query
    reply.  Expect HP-UX 11.0x to maintain a threshold.  Once it is
    reached, the next time we query this host with any type of
    communication, the process of determining the PMTU using ICMP Echo
    requests will begin again.

    This  gives  us  the  ability  to  distinguish between Sun Solaris
    machines, HP-UX 11.0x/10.30, and AIX 4.3.x based machines.
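
    The observations so far can be turned into a rough decision rule.
    The Python sketch below is only a heuristic built from the
    behavior described in this advisory; the function name, its inputs
    and the label strings are invented for illustration.  Note that DF
    set on every reply with no probe seen is ambiguous, since an HP-UX
    or AIX host that has already determined the PMTU also sets DF from
    the first reply.

```python
def classify(reply_df_bits, saw_reverse_probe):
    """Guess the OS family from a series of ICMP Query replies.

    reply_df_bits     -- DF bit of each reply, in arrival order
    saw_reverse_probe -- True if the target sent us a large ICMP Echo
                         request with DF set (the PMTU probe)
    """
    if not reply_df_bits:
        return "no replies"
    if saw_reverse_probe:
        # Only HP-UX 10.30/11.0x and AIX 4.3.x were seen probing back.
        return "HP-UX 10.30/11.0x or AIX 4.3.x"
    if all(reply_df_bits):
        return "Sun Solaris (or HP-UX/AIX with the PMTU already known)"
    if any(reply_df_bits):
        # DF was set on some replies but not all of them.
        return "HP-UX 10.30/11.0x or AIX 4.3.x"
    return "other"


# The HP-UX 11.0 session above: DF clear, probe seen, then DF set.
print(classify([False, True], True))
# The Solaris 2.7 reply above: DF set, no probe.
print(classify([True], False))
```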

    Sun Solaris sets the DF bit in the ICMP Query replies it sends in
    order to support its global PMTU discovery process.  If a network
    link will not let the ICMP Query reply get back to the querying
    host, because the reply is larger than the link's MTU and
    fragmentation is not allowed (the DF bit is set), then the MTU in
    use should be lowered.

    This is  a simple  operating system  fingerprinting method,  which
    does not require additional or unusual patterns to be set.

    The following operating systems were queried and checked for this
    kind of behavior: Linux Kernel 2.4 test 2,4,5,6; Linux Kernel
    2.2.x; FreeBSD 4.0, 3.4; OpenBSD 2.7, 2.6; NetBSD 1.4.1, 1.4.2;
    BSDI BSD/OS 4.0, 3.1; Solaris 2.6, 2.7, 2.8; HP-UX 10.20, 11.0x;
    Compaq Tru64 5.0; AIX 4.1, 3.2; IRIX 6.5.3, 6.5.8; Ultrix 4.2 and
    4.5; OpenVMS v7.1-2; Novell NetWare 5.1 SP1, 5.0, 3.12; Microsoft
    Windows 98/98SE/ME; Microsoft Windows NT WRKS SP6a; Windows NT
    Server SP4; Microsoft Windows 2000 Family.

SOLUTION

    With HP-UX 10.30 and 11.0, one of the ndd options is
    ip_pmtu_strategy.  This variable can be set to either 1 or 2.  If
    the value is 2, then the Path MTU discovery process is used with
    ICMP Echo requests.  This is the default value.  If the value is
    1, then the HP-UX machine will not use the ICMP echo-request PMTU
    discovery strategy and will not set the DF bit after determining
    the accurate PMTU.

    To turn off ip_path_mtu_discovery on a Sun Solaris machine use the
    following command as root:

        # ndd -set  /dev/ip  ip_path_mtu_discovery 0

    Then, when the ICMP Echo reply is sent (as in this example), the
    DF bit is not set:

        # SING v1.0beta7 initiated on Host_Address at Thu Sep 14 10:01:02 2000
        # Command line:
        # -> sing -c 1 -L Host_Address
        SINGing to Host_Address (IP_Address): 16 data bytes
        16 bytes from 10.13.57.20: icmp_seq=0 ttl=254 TOS=0 time=1.578 ms

        --- Host_Address sing statistics ---
        1 packets transmitted, 1 packets received, 0% packet loss
        round-trip min/avg/max = 1.578/1.578/1.578 ms
        # SING finished at Thu Sep 14 10:01:02 2000

    This was tested against Solaris 2.5.1, Solaris 2.6 and Solaris
    2.7, all SPARC boxes.  On Sun Solaris, turning this option off
    also turns off the PMTU discovery process for TCP.  This is not
    recommended because of performance issues.

    You can disable both ICMP address mask request and ICMP  Timestamp
    (broadcast and unicast) under Solaris with ndd.  The commands are:

        ndd -set /dev/ip ip_respond_to_address_mask_broadcast 0
        ndd -set /dev/ip ip_respond_to_timestamp_broadcast 0
        ndd -set /dev/ip ip_respond_to_timestamp 0

    These are recommended by Sun (along with other fun ndd commands)
    in "Solaris Operating Environment Network Settings for Security"
    by Alex Noordergraaf and Keith Watson, a Sun Blueprint available
    at

        http://www.sun.com/blueprints