COMMAND
Passive System Fingerprinting using Network Client Applications
SYSTEMS AFFECTED
Many Unices (and other systems running common network client applications)
PROBLEM
Jose Nazario posted the following white paper, a low-tech
approach to passive network analysis. Passive target
fingerprinting involves the use of network traffic between two
hosts by a third system to identify the types of systems being
used. Because no data is sent to either system by the monitoring
party, detection approaches the impossible. Methods which rely
solely on the IP options present in normal traffic are limited in
their accuracy about the targets, and further inspection is
needed to determine avenues of vulnerability. We describe a
method to rapidly identify target operating systems and versions,
as well as vectors of attack, based on data sent by client
applications. While simplistic, it is robust, and its accuracy
is quite high in most cases. Four methods of fingerprinting a
system are presented, with sample data provided.
Passive OS mapping has become a new area of research in both the
white hat and black hat arenas. For the white hat, it provides a
new method to map a network and monitor traffic for security; for
example, a new and possibly subversive host can be identified
quickly, often with great accuracy. For the black hat, it
provides a nearly undetectable method to map a network and find
vulnerable hosts.
To be sure, passive mapping can be a time consuming process.
Even with automated tools like Siphon, a sufficient quantity of
packets must arrive to build up a statistically significant
reading of the subjects' operating systems. Compare this to
active OS fingerprinting methods, using tools like nmap and
queso, which can usually operate in under a minute; only more
determined attackers, or curious types, will be attracted to this
method.
Siphon, nmap and queso are available from:
http://www.subterrain.net/projects/siphon/
http://www.insecure.org/nmap/
http://www.apostols.org/
Two major methods of operating system fingerprinting exist in
varying degrees of use: active and passive. Active scanning
involves sending IP packets to the host and monitoring the
replies to guess the operating system.
Passive scanning, in contrast, allows the scanning party to obtain
information in the absence of any packets sent from the listening
system to the targets. Each method has its advantages and its
limitations.
Active Scanning
===============
By now nearly everyone is familiar with active scanning methods.
The premier port scanning tool, nmap, has been equipped for some
time now with accurate active scanning measures. This code is
based off of an earlier tool, queso, from the group The Apostols.
Nmap's author, Fyodor, has written an excellent paper on this
topic in the e-zine Phrack (issue 54 article 9). Ofir Arkin has
been using ICMP bit handling to differentiate between certain
types of operating systems. Because ICMP usually slips below the
threshold of analysis, and most of the ICMP messages used are
legitimate, the detection of this scanning can be more difficult
than, say, queso or nmap fingerprinting.
The problems with active scanning are mainly twofold: first, we
can readily firewall the packets used to fingerprint our system,
obfuscating the information; secondly, we can detect it quite
easily. Because of this, it is less attractive for a truly
stealthy adversary.
Passive Scanning
================
In a message dated June 30, 1999, Photon posted to the
nmap-hackers list with some ideas of passive operating system
fingerprinting (this note is available from the MARC archives of
the nmap-hackers list). He set up a webpage with some of his
thoughts, which has since been taken down. In short, by using
default IP packet construction behavior, including default TTL
values and the presence of the DF bit, one can gain a confident
estimate of the system's OS.
These ideas were quickly picked up by others and several lines of
research have been active since then. Lance Spitzner's paper,
dated May 24 2000:
http://www.enteract.com/~lspitz/pubs.html
on passive fingerprinting provided much of the data needed to
build such a tool. In fact, two quickly appeared: one from Craig
Smith and another, called p0f, from Michal Zalewski:
http://www.enteract.com/~lspitz/passfing.tar.gz
http://lcamtuf.hack.pl/p0f.tgz
One very interesting tool that is under active development,
extending the earlier work, is Siphon. By utilizing not only IP
stack behavior, but also routing information and spanning tree
updates, a complete network map can be built over time. Passive
port scans also take place, adding to the data. This tool
promises to be truly useful for the white hat, and a patient black
hat.
One limitation of these methods, though, is that they only provide
a measure of the operating system. Vulnerabilities may or may not
exist, and further investigations must be undertaken to evaluate
if this is the case. While suitable for the white hat for most
purposes (like accounting), this is not suitable to a would-be
attacker. Simply put, more information is needed.
An Alternative Approach
=======================
An alternative to merely fingerprinting the operating system is
to perform an identification using client applications. Quite a
number of network clients send revealing information about their
host systems, either directly or indirectly. We use this
application level information to map back to the operating
system.
One very large advantage to the method described here is that in
some situations, much more accurate information can be gained
about the client. Because of stack similarities, most Windows
systems, including 95, 98 and NT 4.0, look too similar to
differentiate. The client application, however, is willing to
reveal this information.
This provides not only a measure of the target's likely operating
system, but also a likely vector for entrance. Most of these
client applications have numerous security holes at which one
can direct malicious data. In some cases, this can provide the key
information needed to begin infiltrating a network, and one can
proceed more rapidly. In most cases it provides a starting point
for the analysis of vulnerabilities of a network.
One major limitation of this method, however, comes when a system
is emulating another to provide access to client software. This
includes Solaris and SCO's support for Linux binaries. As such,
under these circumstances, the data should be taken with some
caution and evaluated in the presence of other information. This
limitation, however, is similar to the limitation that IP stack
tweaking can place on passive fingerprinting at the IP level, or
the effect on active scanning from these adjustments or
firewalling.
Four different types of network clients are discussed here which
provide suitable fingerprinting information: email clients, which
leave telltale information on their messages in most cases;
Usenet clients, which, like mail applications, litter their posts
with client system information; web browsers, which send client
information with each request; and even the ubiquitous telnet
client, which sends such information more quietly, but can just
as effectively fingerprint an operating system.
Knowing this, one now only needs to harvest the network for this
information and map it to source addresses. Various tools,
including sniffers, both generic and specialized, and even web
searches will yield this information, allowing a rapid analysis
of systems to be performed. This works quite well for white hat
and black hat alike.
This paper describes a low tech approach to fingerprinting
systems for both their operating system and a likely route to
gaining entry. By using application level data sent from them
over the network, we can quickly gather accurate data about a
system. In some cases, one doesn't even have to be on the same
network as the targets; the information can be gathered from
afar, compiled, and used at one's discretion at a later date.
Mail Clients
------------
One of the largest types of traffic the network sees is
electronic mail. Nearly everyone who uses the Internet on a
regular basis uses email, and they not only receive mail but also
send a good amount of it. Because it is ubiquitous, it makes an
especially attractive avenue for system fingerprinting and,
ultimately, penetration.
Within the headers of nearly every mail message is some form of
system identification, either through crafted message
identification tags, as used by Eudora and Pine, or through
explicit header information, such as the headers generated by
Outlook clients or CDE mail clients.
The scope of this method, both in terms of information gained and
the potential impact, should not be underestimated. If anything,
viruses that spread by email, including ones that are used to
steal passwords from systems, should illustrate the effectiveness
of this method.
Pine, for example, is one of the worst offenders of any
application, giving away a whole host of information useful to an
attacker in one fell swoop. To wit:
Message-ID: <Pine.LNX.4.10.9907191137080.14866-100000@somehost.example.ca>
It is clear it's Pine, we know the version (4.10), and we know
the system type. Too much information, in fact. This is a list
of the main ports of Pine as of 4.30:
a41 IBM RS/6000 running AIX 4.1 or 4.2
a32 IBM RS/6000 running AIX 3.2 or earlier
aix IBM S/370 AIX
aos AOS for IBM RT (untested)
mnt FreeMint
aux Macintosh A/UX
bsd BSD 4.3
bs3 BSDi BSD/386 Version 3 and Version 4
bs2 BSDi BSD/386 Version 2
bsi BSDi BSD/386 Version 1
dpx Bull DPX/2 B.O.S.
cvx Convex
d54 Data General DG/UX 5.4
d41 Data General DG/UX 4.11 or earlier
d-g Data General DG/UX (even earlier)
ult DECstation Ultrix 4.1 or 4.2
gul DECstation Ultrix using gcc compiler
vul VAX Ultrix
os4 Digital Unix v4.0
osf DEC OSF/1 v2.0 and Digital Unix (OSF/1) 3.n
sos DEC OSF/1 v2.0 with SecureWare
epx EP/IX System V
bsf FreeBSD
gen Generic port
hpx Hewlett Packard HP-UX 10.x
hxd Hewlett Packard HP-UX 10.x with DCE security
ghp Hewlett Packard HP-UX 10.x using gcc compiler
hpp Hewlett Packard HP-UX 8.x and 9.x
shp Hewlett Packard HP-UX 8.x and 9.x with Trusted Computer Base
gh9 Hewlett Packard HP-UX 8.x and 9.x using gcc compiler
isc Interactive Systems Unix
lnx Linux using crypt from the C library
lnp Linux using Pluggable Authentication Modules (PAM)
slx Linux using -lcrypt to get the crypt function
sl4 Linux using -lshadow to get the crypt() function
sl5 Linux using shadow passwords, no extra libraries
lyn Lynx Real-Time System (Lynxos)
mct Tenon MachTen (Mac)
osx Macintosh OS X
neb NetBSD
nxt NeXT 68030's and 68040's Mach 2.0
bso OpenBSD with shared-lib
sc5 SCO Open Server 5.x
sco SCO Unix
pt1 Sequent Dynix/ptx v1.4
ptx Sequent Dynix/ptx
dyn Sequent Dynix (not ptx)
sgi Silicon Graphics Irix
sg6 Silicon Graphics Irix >= 6.5
so5 Sun Solaris >= 2.5
gs5 Sun Solaris >= 2.5 using gcc compiler
so4 Sun Solaris <= 2.4
gs4 Sun Solaris <= 2.4 using gcc compiler
sun Sun SunOS 4.1
ssn Sun SunOS 4.1 with shadow password security
gsu SunOS 4.1 using gcc compiler
s40 Sun SunOS 4.0
sv4 System V Release 4
uw2 UnixWare 2.x and 7.x
wnt Windows NT 3.51
Pine system types used in Message-ID tags as of Pine 4.30. This
table was gathered from the supported systems listed in the Pine
source code documentation, in the file pine4.30/doc/pine-ports,
and was edited for brevity.
Hence, with the above message ID, one knows the target's
hostname, an account on that machine that reads mail using Pine,
and that it's Linux without shadowed passwords (the LNX host
type). Hang out on a mailing list, maybe something platform
agnostic, and collect targets. In this case, one could use a
well known exploit within a mail message, grab the system
password file, and send it back for analysis. This can easily
scale to as many clients as have been fingerprinted: one mass
mailing, and sit back and wait for the password files to come in.
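The decoding step above can be sketched in a few lines of Python.
This helper is ours, not part of the original paper; the port
table is a small subset of the one listed above.

```python
import re

# Subset of the Pine port codes listed above (pine4.30/doc/pine-ports).
PINE_PORTS = {
    "LNX": "Linux using crypt from the C library (no shadow passwords)",
    "ULT": "DECstation Ultrix 4.1 or 4.2",
    "OSF": "DEC OSF/1 v2.0 and Digital Unix (OSF/1) 3.n",
    "BSF": "FreeBSD",
    "SO5": "Sun Solaris >= 2.5",
}

# Message-ID shape: Pine.<port>.<version>.<timestamp>.<pid>-<n>@<host>
MSGID_RE = re.compile(r"Pine\.([A-Z0-9]{2,3})\.(\d+\.\d+)\.\d+\.[\d-]+@([\w.-]+)")

def fingerprint_pine(message_id):
    """Return (os_guess, pine_version, hostname), or None if not a Pine tag."""
    m = MSGID_RE.search(message_id)
    if m is None:
        return None
    port, version, host = m.groups()
    return PINE_PORTS.get(port, "unknown port " + port), version, host

result = fingerprint_pine(
    "<Pine.LNX.4.10.9907191137080.14866-100000@somehost.example.ca>"
)
```

Run against the sample Message-ID above, this recovers the OS
guess, Pine version and hostname in one pass.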
This is not to say that other mail clients are not vulnerable to
such information leaks. Most mail clients give out similar
information, either directly or indirectly. Direct information
would be an entry in the message headers, such as an X-Mailer tag.
Indirect information would be similar to that seen for Pine, a
distinctive message ID tag. When this information is coupled to
the information about the originating host, a fingerprint can
occur rapidly.
Some examples:
User-Agent: Mutt/1.2.4i
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
X-Mailer: dtmail 1.2.1 CDE Version 1.2.1 SunOS 5.6 sun4u sparc
X-Mailer: PMMail 2000 Professional (2.10.2010) For Windows 2000 (5.0.2195)
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Message-ID: <4.3.2.7.2.20001117142518.043ad100@mailserver3.somewhere.gov>
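Harvesting these direct and indirect identifiers can be
automated. A minimal sketch using the Python standard library's
email module follows; the header names are real, but the helper
and the sample message are our own illustration.

```python
import email

# A sample captured message (headers taken from the examples above).
RAW = """\
Message-ID: <4.3.2.7.2.20001117142518.043ad100@mailserver3.somewhere.gov>
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
From: user@somewhere.gov
Subject: hello

body text
"""

# Headers that commonly identify the client, directly or indirectly.
INTERESTING = ("X-Mailer", "User-Agent", "X-MimeOLE", "X-Newsreader",
               "Message-ID")

def client_headers(raw_message):
    """Return the identifying headers present in one raw message."""
    msg = email.message_from_string(raw_message)
    return {h: msg[h] for h in INTERESTING if msg[h] is not None}

info = client_headers(RAW)
```

Fed a stream of messages from a sniffer or an archive, the same
function builds the host-to-client database discussed below.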
While not all clients, such as Mutt or Outlook Express, give out
their host system or processor, even partial information can
contribute to a larger vulnerability assessment. For example, if
we know which version strings appear only on Windows, as opposed
to a MacOS system, we can infer the platform and hence the likely
processor type. The dtmail application is entirely too friendly
to someone determining vulnerabilities, giving up the processor
and OS revision. Given the problems that have appeared in the
CDE suite, and in older versions of Solaris, an attack would be
all too easy to construct.
There are two main avenues for finding this information for lots
of clients quickly. First, we can sniff the network for this
information. Using a tool like mailsnarf, ngrep or any sniffer
with some basic filtering, a modest collection of host to client
application data can be gathered. The speed of collection and
the ultimate size of this database depend chiefly on the amount
of traffic your network segment sees; this limited amount of
data is the main drawback of the method.
A much more efficient method, and one that can make use of the
above information, is offline (for the target, with respect to
the potential attacker) system fingerprinting, with an exploit
path included. How do we do this? We search the web, with its
replete mailing list archives, and we turn up some boxes.
Altavista: 2,033 pages found (for pine.ult)
Google results 1-10 of about 141,000 for pine.lnx
Altavista: 16,870 pages found (for pine.osf)
You get the idea. Tens of thousands of hits, thousands of
potentially exploitable boxes ready to be picked. Simply
evaluate the source host information, map it to the client data,
and a large database of vulnerable hosts is rapidly built.
The exploits are easy. Every week, new exploits are found in
client software, either mail applications like Pine, or methods
to deliver exploits using mail software. Examples of this
include the various buffer overflows that have appeared (and
persist) in Pine and Outlook, the delivery of malicious DLL
files using Eudora attachments, and so on. We know from viruses
like
ILOVEYOU and Melissa that more people than not will open almost
any mail message, and we know from spammers that it's trivial to
bulk send messages with forged headers, making traceback
difficult. These two items combine to make for a very readily
available exploit.
In a manner similar to electronic mail, Usenet clients leave
significant information in the headers of their posts which
reveal information about their host operating systems. One great
advantage to Usenet, as opposed to email or even web traffic, is
that posts are distributed. As such, we can be remote and collect
data on hosts without their knowledge or ever having to gain entry
into their network.
Among the various newsreaders commonly used, copious host
information is included in the headers. The popular UNIX
newsreader 'tin' is among the worst offenders at revealing host
information. Operating system versions, processors and
applications are all listed in the 'User-Agent' field, and when
coupled with the NNTP-Posting-Host information, a remote host
fingerprint has been performed:
User-Agent: tin/1.5.2-20000206 ("Black Planet") (UNIX) (SunOS/5.6(sun4u))
User-Agent: tin/pre-1.4-980226 (UNIX) (FreeBSD/2.2.7-RELEASE (i386))
User-Agent: tin/1.4.2-20000205 ("Possession") (UNIX) (Linux/2.2.13(i686))
NNTP-Posting-Host: host.university.edu
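Pairing the parenthesized OS fields of a tin-style User-Agent
with the NNTP-Posting-Host can be done mechanically. The header
names are real; the regular expression below is our own
assumption about tin's format, derived from the samples above.

```python
import re

# Matches the trailing "(OS/version(arch))" or "(OS/version (arch))" group.
TIN_OS_RE = re.compile(r"\((\w+)/([^()\s]+)\s*\((\w+)\)\)")

def usenet_fingerprint(user_agent, posting_host):
    """Return (host, os, version, arch) from one post's headers, or None."""
    m = TIN_OS_RE.search(user_agent)
    if m is None:
        return None
    os_name, os_version, arch = m.groups()
    return (posting_host, os_name, os_version, arch)

record = usenet_fingerprint(
    'tin/1.5.2-20000206 ("Black Planet") (UNIX) (SunOS/5.6(sun4u))',
    "host.university.edu",
)
```

One record like this, accumulated per posting host, is the remote
fingerprint database described in the text.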
The standard web browsers also leave copious information about
themselves and their host systems in Usenet posts, just as they
do with HTTP requests and mail. We will elaborate on web clients
in the next section, but they are a problem as Usenet clients,
too:
X-Http-User-Agent: Mozilla/4.75 (Windows NT 5.0; U)
X-Mailer: Mozilla 4.75 (X11; U; Linux 2.2.16-3smpi686)
And several other clients also leave verbose information about
their hosts to varying degrees. Again, when combined with the
NNTP-Posting-Host or other identifying header, one can begin to
amass information about hosts without too much work:
Message-ID: <Pine.LNX.4.21.0010261126210.32652-100000@host.example.co.nz>
User-Agent: MT-NewsWatcher/3.0 (PPC)
X-Operating-System: GNU/Linux 2.2.16
User-Agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Bryce Canyon)
X-Newsreader: Microsoft Outlook Express 5.50.4133.2400
X-Newsreader: Forte Free Agent 1.21/32.243
X-Newsreader: WinVN 0.99.9 (Released Version) (x86 32bit)
Either directly or indirectly, we can fingerprint the operating
system of the source host. Other programs are not so
forthcoming, but still leak information about a host that can be
used in a vulnerability analysis.
X-Newsreader: KNode 0.1.13
User-Agent: Pan/0.9.1 (Unix)
User-Agent: Xnews/03.02.04
X-Newsreader: trn 4.0-test74 (May 26, 2000)
X-Newsreader: knews 1.0b.0 (mrsam/980423)
User-Agent: slrn/0.9.5.7 (UNIX)
X-Newsreader: InterChange (Hydra) News v3.61.08
None of these header fields is required by the specifications
for NNTP, as noted in RFC 2980; they provide only additional
information about the host which was the source of the data.
Given that most of the transactions that concern the servers are
between servers, this data is entirely extraneous. It is, it
appears, absent from RFC 977, the original specification for
NNTP.
One interesting possibility for exploiting a user agent like
Mozilla is to examine the accepted languages. In the example
below, we see not only that English is supported, but that the
browser is linked to Acrobat. Given potential holes, and past
problems, with malicious PDF files, this could be another avenue
to gaining entry to a host.
X-Mailer: Mozilla 4.75 (Win98; U)
X-Accept-Language: en,pdf
While it may seem that we are limited to fingerprinting hosts,
or out of luck if they are using a proxy, this is not the case.
We can also retrieve proxy information from the headers:
X-Http-Proxy: 1.0 x72.deja.com:80 (Squid/1.1.22) for client 10.32.34.18
While in this case the proxy is disconnected from the client's
network, if this were a border proxy, we could use this to gain
information about a possible entry point to the network and, over
time and with enough sample data, information about the network
behind the protected border.
A remarkably simple and highly effective means of fingerprinting
a target is to follow the web browsing that gets done from it.
Nearly every system in use is a workstation, and nearly everyone
spends part of their day in a web browser. And just about every
browser sends too much information in its 'User-Agent' field.
RFC 1945 notes that the 'User-Agent' field is not required in an
HTTP 1.0 request, but can be used. The authors state, "user
agents should include this field with requests." They cite
statistics as well as on the fly tailoring of data to meet
features or limitations of browsers. The draft standard for HTTP
version 1.1 requests, RFC 2616, also notes similar usage of the
'User-Agent' field.
We can gather this information in two ways. First, we could run
a website and turn on logging of the User-Agent field from the
client (if it's not already on). Simply generate a lot of hits
and watch the data come in. Get on Slashdot, advertise some
pornographic material, or mirror some popular software (like
warez) and you're ready to go. Secondly, we can sniff web traffic
on our visible segment. While almost any sniffer will work, one
of the easiest for this type of work is urlsnarf from the dsniff
package from Dug Song. This package is available at
http://www.monkey.org/~dugsong/dsniff/
Examples of browsers that send not only their application
information, such as the browser and the version, but also the
operating system which the host runs include:
- Netscape (UNIX, MacOS, and Windows)
- Internet Explorer
One shining example of a browser that doesn't send extraneous
information is Lynx. On both 2.7 and 2.8 versions, only the
browser information is sent, no information about the host.
The User-Agent field can be important to the web server for
legitimate reasons: due to implementation differences, Netscape
and Explorer are not equivalent on many items, including how they
handle tables, scripting and style sheets. Host information,
however, is not needed and is sent gratuitously.
A typical request from a popular browser looks like this:
GET / HTTP/1.0
Connection: Keep-Alive
User-Agent: Mozilla/4.08 (X11; I; SunOS 5.7 sun4u)
Host: 10.10.32.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
The User-Agent field is littered with extra information that we
don't need to know: the operating system type, version and even
the hardware being used.
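Extracting those details from a request is trivial. The sketch
below is our own illustration; only the 'User-Agent' header name
comes from the protocol.

```python
# The request shown above, as it would appear on the wire.
REQUEST = (
    "GET / HTTP/1.0\r\n"
    "Connection: Keep-Alive\r\n"
    "User-Agent: Mozilla/4.08 (X11; I; SunOS 5.7 sun4u)\r\n"
    "Host: 10.10.32.1\r\n"
    "\r\n"
)

def user_agent_platform(request):
    """Return the ';'-separated platform tokens from the User-Agent comment."""
    for line in request.split("\r\n"):
        if line.lower().startswith("user-agent:"):
            ua = line.split(":", 1)[1].strip()
            # The parenthesized comment carries the platform details.
            start, end = ua.find("("), ua.find(")")
            if start != -1 and end > start:
                return [tok.strip() for tok in ua[start + 1:end].split(";")]
    return None

platform = user_agent_platform(REQUEST)
```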
Instantly we know everything there is to know about compromising
this host: the operating system, the host's architecture, and
even a route we could use to gain entry, for example via recent
problems in Netscape's JPEG handling.
Using urlsnarf to log these transactions is the easiest method to
sniff this information from the network. A typical line of output
is below:
10.10.1.232 - - "GET http://www.latino.com/
HTTP/1.0" - - "http://www.latino.com/" "Mozilla/4.07 (Win95; I ;Nav)"
We can also use the tool ngrep to listen to this information
on the wire. A simple filter to listen only to packets that
contain the information 'User-Agent' can be set up and used to
log information about hosts on the network. ngrep can be obtained
from the PacketFactory website:
http://www.packetfactory.net/Projects/Ngrep/
A simple regular expression filter can do the trick:
ngrep -qid ep1 'User-Agent' tcp port 80
This will print out all TCP packets which contain the case
insensitive string User-Agent in them. And, within this field,
for too many browsers, is too much information about the host.
With the above options to ngrep, typical output will look like
this:
T 10.10.11.43:1860 -> 130.14.22.107:80
GET /entrez/query/query.js HTTP/1.1..Accept: */*..Referer: http://www.
ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search DB=PubMed..Accept-Langua
ge: en-us..Accept-Encoding: gzip, deflate..If-Modified-Since: Thu, 29
Jun 2000 18:38:45 GMT; length=4558..User-Agent: Mozilla/4.0 (compatibl
e; MSIE 5.5; Windows 98)..Host: www.ncbi.nlm.nih.gov..Connection: Keep
-Alive..Cookie: WebEnv=FpEB]AfeA>>Hh^`Ba@<]^d]bCJfdADh@(j)@ =^a=T=EjIE=b<F
bg<....
Even more information is contained within the request than
urlsnarf showed us, including cookies.
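Since ngrep renders the CR/LF pairs (and any other non-printable
byte) as dots, a captured request can be split back into header
lines with a simple heuristic. This is our own sketch, not part
of ngrep.

```python
# Simplified ngrep output: header lines joined by ".." (the CRLF pairs).
captured = (
    "GET /entrez/query/query.js HTTP/1.1..Accept: */*.."
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98).."
    "Host: www.ncbi.nlm.nih.gov.."
)

# Split on the doubled dots; single dots inside URLs and hostnames survive.
headers = [h for h in captured.split("..") if h]
agent = next(h for h in headers if h.startswith("User-Agent:"))
```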
In much the same way as one can use the strings sent during
requests by the clients to determine what system type is in use,
one can follow the replies sent back by the server to determine
what type it is. Again we will use ngrep, this time matching the
expression 'server:' to gather the web server type:
T 192.168.0.5:80 -> 192.168.0.1:1033
HTTP/1.0 200 OK..Server: Netscape-FastTrack/2.01..Date: Mon, 30 Oct 20
00 00:15:31 GMT..Content-type: text/html....
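Pulling the banner out of such a reply is mechanical. The helper
below is our illustration; the 'Server' header name itself is
standard HTTP.

```python
RESPONSE = (
    "HTTP/1.0 200 OK\r\n"
    "Server: Netscape-FastTrack/2.01\r\n"
    "Date: Mon, 30 Oct 2000 00:15:31 GMT\r\n"
    "Content-type: text/html\r\n"
    "\r\n"
)

def server_banner(response):
    """Return the Server header's value, or None if absent."""
    for line in response.split("\r\n"):
        if line.lower().startswith("server:"):
            return line.split(":", 1)[1].strip()
    return None

banner = server_banner(RESPONSE)
```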
While specifics about the operating system are lost, this works
to passively gather vulnerability information about the target
server, which can be coupled with other information to decide
how best to proceed with an attack. Server fingerprinting will
not be covered further, as this paper is limited to client
applications and the systems being fingerprinted.
While telnet is no longer in widespread use, because all of its
data, including authentication data, is sent in plain text, it
is still used widely enough to be of use in fingerprinting
target systems. What is interesting is that it not
only gives us a mechanism to gather operating system data, it
gives us the particular application in use, which can be of value
in determining a mechanism of entry.
The specification for the telnet protocol describes a negotiation
between the client and the host for information such as line
speed, terminal type and echoing (for descriptive information on
these options and their negotiations, please see RFCs 857, 858,
859, 860, 1091, 1073, 1079, 1184, 1372, and 1408. Also, see TCP
Illustrated, Volume 1: The Protocols by W. Richard Stevens). What
is interesting to note is that each client behaves in a unique
way, even different client applications on the same host type.
Similarly, a host running a telnet daemon can be fingerprinted
by following its negotiations with the client. This
information can be viewed from the telnet command line application
on a UNIX host by issuing the 'toggle options' command at the
telnet> prompt.
This information can be gathered directly, using a wedge
application or a honeypot as demonstrated on the network at
Hope2k, or it can be sniffed off the network in a truly passive
fashion. We discuss below gathering data about both the client
system and the server being connected to; the same principles
apply to both host identification methods.
The negotiations described above, and in the references listed,
can be used to fingerprint the client based upon the options set
and the order in which they are negotiated. Table 1 describes
the behavior of several telnet clients in these respects. Their
differences are immediately obvious, even for different clients
on the same operating system, such as Tera Term Pro and Windows
Telnet on a Windows 95 host.
In this table, all server commands and negotiation options are
ignored and only data originating from the client is shown.
The table is omitted in this version; please see:
http://www.crimelabs.net/docs/passive.html
for the PDF and/or PostScript versions.
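As a sketch of what such a fingerprint looks like, the client's
opening negotiation can be decoded from the raw bytes (command
and option numbers per RFC 854 and the option RFCs cited above).
The sample byte sequence is illustrative, not taken from the
omitted table.

```python
# Telnet negotiation constants (RFC 854 and the option RFCs cited above).
IAC, WILL, WONT, DO, DONT = 255, 251, 252, 253, 254
VERBS = {WILL: "WILL", WONT: "WONT", DO: "DO", DONT: "DONT"}
OPTIONS = {1: "ECHO", 3: "SUPPRESS-GO-AHEAD", 24: "TERMINAL-TYPE",
           31: "WINDOW-SIZE", 32: "TERMINAL-SPEED"}

def decode_negotiation(data):
    """Return the ordered (verb, option) pairs found in a raw byte stream."""
    out, i = [], 0
    while i + 2 < len(data):
        if data[i] == IAC and data[i + 1] in VERBS:
            out.append((VERBS[data[i + 1]],
                        OPTIONS.get(data[i + 2], str(data[i + 2]))))
            i += 3
        else:
            i += 1
    return out

# An illustrative client opening; the order of these pairs is the fingerprint.
sample = bytes([IAC, WILL, 24, IAC, WILL, 32, IAC, WILL, 31])
pairs = decode_negotiation(sample)
```

Comparing the ordered list of pairs against a table of known
client behaviors yields the client identification.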
Obviously, the most direct method to fingerprint a server would
be to connect to it and examine the order of options and their
values as a telnet session was negotiated. However, as this
study is concerned with passive scanning of clients, we will
leave it to the reader to map this information and learn what to
do with it.
SOLUTION
This paper has illustrated the effectiveness of target system
identification using the information provided by network client
applications. This provides a very efficient and precise measure
of the client operating system, as well as identifying a vector
for attack. This information is sent gratuitously and is not
essential to the normal operation of many of these applications.
The main limitation of this information is found when a host is
performing emulation of another operating system to run the
client software. While this is rare, it could lead to a false
system identification. This mainly falls in the open software
world, however, and only for some operating systems.
For web browsers, which are ubiquitous and used by nearly everyone
on the Internet, the host operating system should not be sent.
Ideally, information about what protocols are spoken, what
standards are met and what languages are supported (i.e. English,
German, French) should suffice. Lynx behaves nearly ideally in
this regard, and both Netscape and Explorer should follow this
lead.
With respect to Usenet and electronic mail clients, again only
what features are supported should be provided. Pine is an
example of how bad it can get, providing too much information
about a host too quickly. There is no reason why any legitimate
client needs to announce what processor and OS are being run on
the sending host.
Telnet clients are far more difficult. It is tempting to say that
all telnet applications should support the same set of features,
but that is simply impossible.
Proxy hosts should be used, if possible, to strip off information
about the originating system, including the workstation address
and operating system information. This denies an outsider the
information needed to map a network from beyond the perimeter.
Coupled with strong measures to catch viruses and malicious code,
such as in a web page script, the risks should be greatly
reduced.
The best solution is for application authors to not send
gratuitous information in their headers or requests.
Furthermore, client applications should be scrutinized to the
same degree as daemons that run with administrative privileges.
The lessons of RFC 1123 most certainly apply at this level.
First: the admin/user must not be able to alter or remove the
ident strings that are sent out by the application. This is the
case for most Windows apps, and even where it is possible people
usually do not take any measures in this department. So we can
move on.
Second: the information displayed must actually be correct.
This is when the fun begins. To take a really good example, Pine
on most Linux systems *always* sends messages with a Message-ID
that contains "LNX", although we think most are using shadow
passwords.
Also, most mail agents are quite good at rewriting headers if we
ask them to, MTAs being other hidden champions of this. If you
ever happen to receive a mail from me with a From: header that
says root, do not even for a second think that it was actually
sent from that account.
Also, you do not seem to adequately account for the fact that
many people are not running mail servers on their own systems.
So if you see, e.g., Outlook Express, then fine, you know that
the sending machine was running this MUA. But it also makes it
more than likely that the email address you found is not
actually one on the sending machine but rather one on a big mail
server, which may be running anything. You have no knowledge of
how mail collection at that site works, so you cannot be sure
that your exploit will actually work. (E.g. a person can make an
email enquiry with their browser's email client after clicking
on a link, but use something else for "normal" mail, and may not
even be aware of the difference; and yes, we have seen a setup
like this.)
Also, emulation is (or rather can be) quite a big issue with
lesser known OSs for which not enough native applications exist.
E.g. there is no Netscape binary of the current release
available for any BSD operating system (BSDi support having been
dropped after 4.75), so if you want to use Netscape on any BSD,
you have to use emulation. But if you go after the presumably
old, 2.0.x kernel based Linux system it reports itself as, you
will be in for a surprise. The real kicker is using Wine (the
Windows emulation package for UNIX) and a Windows-based web
browser... (Yes, people have done things like this; sometimes
you are forced to, e.g. if there is not even a Linux port of the
software you need to run.)
For proxies: it is known that there exist proxies that hide your
real IP address and cannot be detected in any easy way (because
they do not insert an X-Forwarded-For field). The proxy may or
may not be local, so you do not necessarily have the entrance to
the network either.
Also, web search engines can be helpful for finding
vulnerabilities in servers, but compiling lists of target hosts
from mailing list archives is fragile; there may not be many
live hits from those. (Even for server fingerprinting, some
surprises are in the game: e.g. Walmart was suspected of forging
their server signature because on at least one occasion they
reported themselves as Microsoft-IIS/4.0 (Unix) mod_ssl/2.6.6
OpenSSL/0.9.5... outright funny.) So the point is: although the
information may be there, it may already be forged intentionally
or otherwise incorrect.
Also, the fact that you found, e.g., Mutt does not tell you a
lot unless you have a specific exploit, because many of these
programs run on many UNIX/UNIX-like systems plus on DOS/Windows.
So you do not know a lot.
And finally: with all this information you have to go out and do
some actual scanning to verify/gather more information and this is
where you can already get caught.
But, yes, even with the above points made, considering the
average Windows/Mac user and admin, information leakage can be
the cause of many an interesting occurrence... why, at this rate
we got the idea for a paper titled "Utilizing information gleaned
from Internet-accessible support pages of various big
organizations and institutions in network incidents". It is at
least as interesting a topic as this one.