    bsd1# tcpdump -l | tee outfile

Of course, additional arguments would probably be used.

Using multiple Telnet connections to a host or multiple windows in an X Window session allows you to record in one window while taking actions to generate traffic in another window. This approach can be very helpful in some circumstances.

An alternative is to use telnet to connect to the probe computer. The session could be logged with many of the versions of telnet that are available. Be aware, however, that the Telnet connection will generate considerable traffic that may become part of your log file unless you are using filtering. (Filtering, which is discussed later in this chapter, allows you to specify the type of traffic you want to examine.) The additional traffic may also overload the connection, resulting in lost packets.

Another alternative is to run tcpdump as a detached process by including an & at the end of the command line. Here is an example:
    bsd1# tcpdump -w outfile &
    [1] 70260
    bsd1# tcpdump: listening on xl0

The command starts tcpdump, prints a process number, and returns the user prompt along with a message that tcpdump has started. You can now enter commands to generate the traffic you are interested in. (You really have a prompt at this point; the message from tcpdump just obscures it.) Once you have generated the traffic of interest, you can terminate tcpdump by issuing a kill command using the process number reported when tcpdump was started. (You can use the ps command if you have forgotten the process number.)
    bsd1# kill 70260
    153 packets received by filter
    0 packets dropped by kernel
    [1]    Done                    tcpdump -w outfile

You can now analyze the capture file. (Running tcpdump as a detached process can also be useful when you are trying to capture traffic that might not show up for a while, e.g., RADIUS or DNS exchanges. You might want to use the nohup command to run it in the background.)

Yet another approach is to use the -w option to write the captured data directly to a file. This option has the advantage of collecting raw data in binary format. The data can then be replayed with tcpdump using the -r option. The binary format decreases the amount of storage needed, and different filters can be applied to the file without having to recapture the traffic. Using previously captured traffic is an excellent way of fine-tuning filters to be sure they work as you expect. Of course, you can selectively analyze data captured as text files in Unix by using the many tools Unix provides, but you can't use tcpdump filtering on text files. And you can always generate a text file from a tcpdump file for subsequent analysis with Unix tools by simply redirecting the output. To capture data you might type:
    bsd1# tcpdump -w rawfile

The data could be converted to a text file with:
    bsd1# tcpdump -r rawfile > textfile

This approach has several limitations. Because the data is being written directly to a file, you must know when to terminate recording without actually seeing the traffic. Also, if you limit what is captured with the original run, the data you exclude is lost. For these reasons, you will probably want to be very liberal in what you capture, offsetting some of the storage gains of the binary format.

Clearly, each approach has its combination of advantages and disadvantages. If you use tcpdump very much, you will probably need each from time to time.
    bsd1# tcpdump -c100

While limiting packet capture can be useful in some circumstances, it is generally difficult to predict accurately how many packets need to be collected.

If you are running tcpdump on a host with more than one network interface, you can specify which interface you want to use with the -i option. If you aren't sure, use the command ifconfig -a to discover what interfaces are available and what networks they correspond to. For example, suppose you are using a computer with two interfaces on class C networks, xl0 with an IP address of 205.153.63.238 and xl1 with an IP address of 205.153.61.178. Then, to capture traffic on the 205.153.61.0 network, you would use the command:
    bsd1# tcpdump -i xl1

Without an explicitly identified interface, tcpdump defaults to the lowest numbered interface.

The -p option says that the interface should not be put into promiscuous mode. This option would, in theory, limit capture to the normal traffic on the interface -- traffic to or from the host, multicast traffic, and broadcast traffic. In practice, the interface might be in promiscuous mode for some other reason. In this event, -p will not turn promiscuous mode off.

Finally, -s controls the amount of data captured. Normally, tcpdump defaults to some maximum byte count and will only capture up to that number of bytes from individual packets. The actual number of bytes depends on the pseudodevice driver used by the operating system. The default is selected to capture appropriate headers, but not to collect packet data unnecessarily. By limiting the number of bytes collected, privacy can be improved. Limiting the number of bytes collected also decreases processing and buffering requirements.

If you need to collect more data, the -s option can be used to specify the number of bytes to collect. If you are dropping packets and can get by with fewer bytes, -s can be used to decrease the number of bytes collected. The following command will collect the entire packet if its length is less than or equal to 200 bytes:
    bsd1# tcpdump -s200

Longer packets will be truncated to 200 bytes. If you are capturing files using the -w option, you should be aware that the number of bytes collected will be what is specified by the -s option at the time of capture. The -s option does not apply to files read back with the -r option. Whatever you captured is what you have. If it was too few bytes, then you will have to recapture the data.
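The truncation rule can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not tcpdump's implementation; the function name apply_snaplen and the packet sizes are made up for illustration.

```python
def apply_snaplen(packet: bytes, snaplen: int):
    """Return (captured_bytes, original_length) as a capture with -s would keep them."""
    # Packets no longer than snaplen are kept whole; longer ones are cut off.
    return packet[:snaplen], len(packet)


short_packet = bytes(150)    # hypothetical 150-byte packet, kept whole with -s200
long_packet = bytes(1500)    # hypothetical 1500-byte packet, truncated to 200 bytes

for pkt in (short_packet, long_packet):
    captured, original = apply_snaplen(pkt, 200)
    print(f"original {original} bytes, captured {len(captured)} bytes")
```

Once the extra bytes have been discarded, nothing done at read time can bring them back, which is why -s has no effect with -r.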
    bsd1# tcpdump -c1 host 192.31.7.130
    tcpdump: listening on xl0
    14:16:35.897342 sloan.lander.edu > cio-sys.cisco.com: icmp: echo request
    bsd1# tcpdump -c1 -a host 192.31.7.130
    tcpdump: listening on xl0
    14:16:14.567917 sloan.lander.edu > cio-sys.cisco.com: icmp: echo request
    bsd1# tcpdump -c1 -n host 192.31.7.130
    tcpdump: listening on xl0
    14:17:09.737597 205.153.63.30 > 192.31.7.130: icmp: echo request
    bsd1# tcpdump -c1 -N host 192.31.7.130
    tcpdump: listening on xl0
    14:17:28.891045 sloan > cio-sys: icmp: echo request
    bsd1# tcpdump -c1 -f host 192.31.7.130
    tcpdump: listening on xl0
    14:17:49.274907 sloan.lander.edu > 192.31.7.130: icmp: echo request

Clearly, the -a option is the default. Not using name resolution can eliminate the overhead and produce terser output. If the network is broken, you may not be able to reach your name server and will find yourself with long delays while name resolution times out. Finally, if you are running tcpdump interactively, name resolution will create more traffic that will have to be filtered out.

The -t and -tt options control the printing of timestamps. The -t option suppresses the display of the timestamp while -tt produces unformatted timestamps. The following shows the output for the same packet using tcpdump without an option, with the -t option, and with the -tt option, respectively:
    12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)
    sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)
    934303014.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)

The -t option produces a more terse output while the -tt output can simplify subsequent processing, particularly if you are writing scripts to process the data.
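As an illustration of the kind of script processing that -tt makes easy, the following Python sketch converts an unformatted timestamp back into clock time. The function name format_tt is hypothetical, and the UTC-4 offset is an assumption inferred from comparing the two timestamps in the sample output, not something tcpdump reports.

```python
from datetime import datetime, timedelta, timezone


def format_tt(stamp: str, utc_offset_hours: int = 0) -> str:
    """Convert a -tt timestamp (seconds.microseconds) to HH:MM:SS.ffffff."""
    seconds, micros = stamp.split(".")
    zone = timezone(timedelta(hours=utc_offset_hours))
    # Parse the two parts as integers to avoid floating-point rounding.
    moment = datetime.fromtimestamp(int(seconds), tz=zone)
    moment = moment.replace(microsecond=int(micros))
    return moment.strftime("%H:%M:%S.%f")


print(format_tt("934303014.772066"))        # UTC
print(format_tt("934303014.772066", -4))    # assumed local offset of UTC-4
```

Because the unformatted value is simply seconds since the Unix epoch, it can also be sorted or subtracted directly, which is harder to do with the formatted clock time.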
    12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: tcp 0 (DF)
    12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)
    12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF) (ttl 128, id 45836)
    12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF) (ttl 128, id 45836)

This additional information might be useful in a few limited contexts, while the quiet mode provides shorter output lines. In this instance, there was no difference between the results with -v and -vv, but this isn't always the case.

The -e option is used to display link-level header information. For the packet from the previous example, with the -e option, the output is:
    12:36:54.772066 0:10:5a:a1:e9:8 0:10:5a:e3:37:c ip 60: sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)

0:10:5a:a1:e9:8 is the Ethernet address of the 3Com card in sloan.lander.edu, while 0:10:5a:e3:37:c is the Ethernet address of the 3Com card in 205.153.63.238. (We can discover the types of adapters used by looking up the OUI portion of these addresses, as described in Chapter 2, "Host Configurations".)

For the masochist who wants to decode packets manually, the -x option provides a hexadecimal dump of packets, excluding link-level headers. A packet displayed with the -x and -vv options looks like this:
    13:57:12.719718 bsd1.lander.edu.1657 > 205.153.60.5.domain: 11587+ A? www.microsoft.com. (35) (ttl 64, id 41353)
            4500 003f a189 0000 4011 c43a cd99 3db2
            cd99 3c05 0679 0035 002b 06d9 2d43 0100
            0001 0000 0000 0000 0377 7777 096d 6963
            726f 736f 6674 0363 6f6d 0000 0100 01

Please note that the amount of information displayed will depend on how many bytes are collected, as determined by the -s option. Such hex listings are typical of what might be seen with many capture programs.

Describing how to do such an analysis in detail is beyond the scope of this book, as it requires a detailed understanding of the structure of packets for a variety of protocols. Interpreting this data is a matter of taking packets apart byte by byte or even bit by bit, realizing that the interpretation of the results at one step may determine how the next steps will be done. For header formats, you can look to the appropriate RFC or in any number of books. Table 5-1 summarizes the analysis for this particular packet, but every packet is different. This particular packet was a DNS lookup for www.microsoft.com. (For more information on decoding packets, see Eric A. Hall's Internet Core Protocols: The Definitive Guide.)
Raw data in hex | Interpretation |
---|---|
IP header | |
First 4 bits of 45 | IP version -- 4 |
Last 4 bits of 45 | Header length in 32-bit words -- 5 (5 x 4 = 20 bytes) |
00 | Type of service |
00 3f | Packet length in hex -- 63 bytes |
a1 89 | ID |
First 3 bits of 00 | 000 -- flags, none set |
Last 13 bits of 00 00 | Fragmentation offset |
40 | TTL -- 64 hops |
11 | Protocol number -- 0x11 (17), UDP |
c4 3a | Header checksum |
cd 99 3d b2 | Source IP -- 205.153.61.178 |
cd 99 3c 05 | Destination IP -- 205.153.60.5 |
UDP header | |
06 79 | Source port |
00 35 | Destination port -- DNS |
00 2b | UDP packet length -- 43 bytes |
06 d9 | UDP checksum |
DNS message | |
2d 43 | ID |
01 00 | Flags -- query with recursion desired |
00 01 | Number of queries |
00 00 | Number of answers |
00 00 | Number of authority RRs |
00 00 | Number of additional RRs |
Query | |
03 | Length -- 3 |
77 77 77 | String -- "www" |
09 | Length -- 9 |
6d 69 63 72 6f 73 6f 66 74 | String -- "microsoft" |
03 | Length -- 3 |
63 6f 6d | String -- "com" |
00 | Length -- 0 |
00 01 | Query type -- IP address |
00 01 | Query class -- Internet |
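The byte-by-byte analysis in the table can also be done mechanically. The following Python sketch unpacks the same hex dump with the standard struct module. It assumes, as the table does, a 20-byte IP header with no options carrying a UDP datagram with a single DNS question; the variable and function names are illustrative only, and a general-purpose decoder would need many more checks.

```python
import struct

HEX_DUMP = """
4500 003f a189 0000 4011 c43a cd99 3db2
cd99 3c05 0679 0035 002b 06d9 2d43 0100
0001 0000 0000 0000 0377 7777 096d 6963
726f 736f 6674 0363 6f6d 0000 0100 01
"""
packet = bytes.fromhex("".join(HEX_DUMP.split()))

# IP header: the first 20 bytes (the header length field is 5 words).
(ver_ihl, tos, total_len, ip_id, flags_frag,
 ttl, proto, ip_csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
version, ihl = ver_ihl >> 4, (ver_ihl & 0x0F) * 4
src_ip = ".".join(str(b) for b in src)
dst_ip = ".".join(str(b) for b in dst)


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit words with end-around carry; a valid IP header sums to 0xffff."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# UDP header: the next 8 bytes.
sport, dport, udp_len, udp_csum = struct.unpack("!HHHH", packet[ihl:ihl + 8])

# DNS header: 12 bytes, followed by the question section.
dns = packet[ihl + 8:]
dns_id, dns_flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", dns[:12])

labels, pos = [], 12
while dns[pos] != 0:                 # a zero length byte ends the name
    length = dns[pos]
    labels.append(dns[pos + 1:pos + 1 + length].decode("ascii"))
    pos += 1 + length
qtype, qclass = struct.unpack("!HH", dns[pos + 1:pos + 5])

print(f"IPv{version}, header {ihl} bytes, total {total_len} bytes, id {ip_id}, ttl {ttl}")
print(f"protocol {proto}, {src_ip}.{sport} > {dst_ip}.{dport}, UDP length {udp_len}")
print(f"header checksum valid: {ones_complement_sum(packet[:ihl]) == 0xFFFF}")
print(f"DNS id {dns_id}, flags {dns_flags:#06x}, "
      f"query {'.'.join(labels)} type {qtype} class {qclass}")
```

The decoded source port (1657), IP ID (41353), and DNS ID (11587) agree with the values tcpdump printed on the summary line above the hex dump, which is a useful cross-check when decoding by hand.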
This analysis was included here primarily to give a better idea of how packet analysis works. Several programs that analyze packet data from a tcpdump trace file are described later in this chapter. Unix utilities like strings, od, and hexdump can also make the process easier. For example, the ASCII column produced by hexdump makes it easier to pick out www.microsoft.com in the following data:
    bsd1# hexdump -C tracefile
    00000000  d4 c3 b2 a1 02 00 04 00  00 00 00 00 00 00 00 00  |................|
    00000010  c8 00 00 00 01 00 00 00  78 19 06 38 66 fb 0a 00  |........x..8f...|
    00000020  4d 00 00 00 4d 00 00 00  00 00 a2 c6 0e 43 00 60  |M...M........C.`|
    00000030  97 92 4a 7b 08 00 45 00  00 3f a1 89 00 00 40 11  |..J{..E..?....@.|
    00000040  c4 3a cd 99 3d b2 cd 99  3c 05 06 79 00 35 00 2b  |.:..=...<..y.5.+|
    00000050  06 d9 2d 43 01 00 00 01  00 00 00 00 00 00 03 77  |..-C...........w|
    00000060  77 77 09 6d 69 63 72 6f  73 6f 66 74 03 63 6f 6d  |ww.microsoft.com|
    00000070  00 00 01 00 01                                    |.....|
    00000075

The -vv option could also be used to get as much information as possible.

Hopefully, you will have little need for the -x option. But occasionally you may encounter a packet that is unknown to tcpdump, and you have no choice. For example, some of the switches on my local network use a proprietary implementation of a spanning tree protocol to implement virtual local area networks (VLANs). Most packet analyzers, including tcpdump, won't recognize these. Fortunately, once you have decoded one unusual packet, you can usually easily identify similar packets.
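The first 40 bytes of the hexdump output for tracefile are not packet data at all. Assuming the file uses the classic libpcap savefile layout (a 24-byte file header followed by a 16-byte header in front of each packet), they can be unpacked with the Python sketch below. The field names follow common usage for that layout and are not taken from this chapter.

```python
import struct

# First 40 bytes of tracefile, copied from the hexdump output.
header_hex = (
    "d4c3b2a1 0200 0400 00000000 00000000 c8000000 01000000 "   # file header
    "78190638 66fb0a00 4d000000 4d000000"                       # first record header
)
raw = bytes.fromhex(header_hex.replace(" ", ""))

# The magic number reveals the byte order used by the capturing host.
byte_order = "<" if raw[:4] == bytes.fromhex("d4c3b2a1") else ">"

magic, major, minor, thiszone, sigfigs, snaplen, linktype = struct.unpack(
    byte_order + "IHHiIII", raw[:24])
ts_sec, ts_usec, caplen, origlen = struct.unpack(byte_order + "IIII", raw[24:40])

print(f"magic {magic:#x}, format version {major}.{minor}")
print(f"snapshot length {snaplen} bytes, link type {linktype}")
print(f"first packet: {ts_sec}.{ts_usec:06d}, {caplen} of {origlen} bytes saved")
```

The snapshot length stored in the file header is what was in effect when the file was captured. The 77 bytes recorded for the first packet are the 63-byte IP packet from Table 5-1 plus a 14-byte Ethernet header, and the microseconds field (719718) matches the timestamp tcpdump displayed for that packet.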
    bsd1# tcpdump host 205.153.63.30

This command captures all traffic to and from the host with the IP address 205.153.63.30. The host may be specified by IP number or name. Since an IP address has been specified, you might incorrectly guess that the captured traffic will be limited to IP traffic. In fact, other traffic, such as ARP traffic, will also be collected by this filter. Restricting capture to a particular protocol requires a more complex filter. Nonintuitive behavior like this necessitates a thorough testing of all filters.

Addresses can be specified and restricted in several ways. Here is an example that uses the Ethernet address of a computer to select traffic:
    bsd1# tcpdump ether host 0:10:5a:e3:37:c

Capture can be further restricted to traffic flows for a single direction, either to a host or from a host, using src to specify the source of the traffic or dst to specify the destination. The next example shows a filter that collects traffic sent to the host at 205.153.63.30 but not from it:
    bsd1# tcpdump dst 205.153.63.30

Note that the keyword host was omitted in this example. Such omissions are OK in several instances, but it is always safer to include these keywords.

Multicast or broadcast traffic can be selected by using the keyword multicast or broadcast, respectively. Since multicast and broadcast traffic are specified differently at the link level and the network level, there are two forms for each of these filters. The filter ether multicast captures traffic with an Ethernet multicast address, while ip multicast captures traffic with an IP multicast address. Similar qualifiers are used with broadcast traffic. Be aware that multicast filters may capture broadcast traffic. As always, test your filters.

Traffic capture can be restricted to networks as well as hosts. For example, the following command restricts capture to packets coming from or going to the 205.153.60.0 network:
    bsd1# tcpdump net 205.153.60

The following command does the same thing:
    bsd1# tcpdump net 205.153.60.0 mask 255.255.255.0

Although you might guess otherwise, the following command does not work properly due to the final .0:
    bsd1# tcpdump net 205.153.60.0

Be sure to test your filters!
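One way to check your expectations before testing a network filter on live traffic is to work out which addresses fall inside the network. The Python sketch below uses the standard ipaddress module to model the intent of the net ... mask form shown above. The function name net_matches is hypothetical, and this models only the address test, not tcpdump's parser.

```python
import ipaddress

# The network from the text, written with an explicit mask.
network = ipaddress.ip_network("205.153.60.0/255.255.255.0")


def net_matches(src: str, dst: str) -> bool:
    """True if either endpoint lies in the network, as a 'net' filter intends."""
    return any(ipaddress.ip_address(addr) in network for addr in (src, dst))


print(network)                                          # same network in prefix form
print(net_matches("205.153.61.178", "205.153.60.5"))    # destination is on the network
print(net_matches("205.153.63.30", "192.31.7.130"))     # neither endpoint is
```

Comparing a list like this against what tcpdump actually captures is a quick way to spot a filter that is matching more, or less, than you intended.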
    bsd1# tcpdump ip

Of course, IP traffic will include TCP traffic, UDP traffic, and so on. To capture just TCP traffic, you would use:
    bsd1# tcpdump tcp

Recognized keywords include ip, igmp, tcp, udp, and icmp. There are many transport-level services that do not have recognized keywords. In this case, you can use the keywords proto or ip proto followed by either the name of the protocol found in the /etc/protocols file or the corresponding protocol number. For example, either of the following will look for OSPF packets:
    bsd1# tcpdump ip proto ospf
    bsd1# tcpdump ip proto 89

Of course, the first works only if there is an entry in /etc/protocols for OSPF.

Built-in keywords may cause problems. In these examples, the keyword tcp must either be escaped or the number must be used. For example, the following is fine:
    bsd1# tcpdump ip proto 6

On the other hand, you can't use tcp with proto.
    bsd1# tcpdump ip proto tcp

will generate an error.

For higher-level services, services built on top of the underlying protocols, you must use the keyword port. Either of the following will collect DNS traffic:
    bsd1# tcpdump port domain
    bsd1# tcpdump port 53

In the former case, the keyword domain is resolved by looking in /etc/services. When there may be ambiguity between transport-layer protocols, you may further restrict ports to a particular protocol. Consider the command:
    bsd1# tcpdump udp port domain

This will capture DNS name lookups using UDP but not DNS zone transfers using TCP. The two previous commands would capture both.
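The difference between the two forms can be modeled with a pair of Python predicates. This is only a sketch of the matching rule described above; the function names and the sample port 40000 are made up for illustration.

```python
DNS_PORT = 53  # the "domain" service in /etc/services


def port_domain(proto: str, sport: int, dport: int) -> bool:
    """Model of 'port domain': TCP or UDP with port 53 at either end."""
    return proto in ("tcp", "udp") and DNS_PORT in (sport, dport)


def udp_port_domain(proto: str, sport: int, dport: int) -> bool:
    """Model of 'udp port domain': UDP only, port 53 at either end."""
    return proto == "udp" and DNS_PORT in (sport, dport)


lookup = ("udp", 1657, 53)            # an ordinary name lookup
zone_transfer = ("tcp", 40000, 53)    # a zone transfer; port 40000 is hypothetical

for name, pkt in (("lookup", lookup), ("zone transfer", zone_transfer)):
    print(f"{name}: port domain={port_domain(*pkt)}, "
          f"udp port domain={udp_port_domain(*pkt)}")
```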
    bsd1# tcpdump greater 200

This command collects packets longer than 200 bytes.

Looking inside packets is a little more complicated in that you must understand the structure of the packet's header. But despite the complexity, or perhaps because of it, this technique gives you the greatest control over what is captured. (If you are charged with creating a firewall using a product that requires specifying offsets into headers, practicing with tcpdump could prove invaluable.)

The general syntax is proto [ expr : size ]. The field proto indicates which header to look into -- ip for the IP header, tcp for the TCP header, and so forth. The expr field gives an offset into the header indexed from 0. That is, the first byte in a header is number 0, the second byte is number 1, and so forth. Alternately, you can think of expr as the number of bytes in the header to skip over. The size field is optional. It specifies the number of bytes to use and can be 1, 2, or 4.
    bsd1# tcpdump "ip[9] = 6"

looks into the IP header at the tenth byte, the protocol field, for a value of 6. Notice that the expression must be quoted. Either single quotes or double quotes should work, but backquotes will not.
    bsd1# tcpdump tcp

is an equivalent command since 6 is the protocol number for TCP.

This technique is frequently used with a mask to select specific bits. Values should be in hex. A mask is applied with the & operator followed by the bit mask. The next example extracts the first byte from the Ethernet header (i.e., the first byte of the destination address), extracts the low-order bit, and makes sure the bit is not 0:[25]

    bsd1# tcpdump 'ether[0] & 1 != 0'

This will match multicast and broadcast packets. With both of these examples, there are better ways of matching the packets.

[25] The astute reader will notice that this test could be more concisely written as =1 rather than !=0. While it doesn't matter for this example, using the second form simplifies testing in some cases and is a common idiom. In the next command, the syntax is simpler since you are testing to see if multiple bits are set.

For a more realistic example, consider the command:
    bsd1# tcpdump "tcp[13] & 0x03 != 0"

This filter skips the first 13 bytes in the TCP header, extracting the flag byte. The mask 0x03 selects the two low-order bits, which are the FIN and SYN bits. A packet is captured if either bit is set. This will capture setup or teardown packets for a TCP connection.

It is tempting to try to mix in relational operators with these logical operators. Unfortunately, expressions like tcp src port > 23 don't work. The best way of thinking about it is that the expression tcp src port returns a value of true or false, not a numerical value, so it can't be compared to a number. If you want to look for all TCP traffic with a source port with a value greater than 23, you must extract the port field from the header using syntax such as "tcp[0:2] & 0xffff > 0x0017".
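To see what the proto [ expr : size ] expressions above evaluate to, the following Python sketch applies the same offsets and masks to sample header bytes. The helper name field is hypothetical. The IP header is the one from the DNS packet in Table 5-1, the Ethernet address is the one shown with the -e option, and the TCP header is a made-up SYN segment used only for illustration.

```python
def field(header: bytes, offset: int, size: int = 1) -> int:
    """Value of proto[offset:size]: 'size' bytes at 'offset', in network byte order."""
    return int.from_bytes(header[offset:offset + size], "big")


# IP header of the DNS packet analyzed in Table 5-1 (protocol 17, UDP).
ip_header = bytes.fromhex("4500003fa18900004011c43acd993db2cd993c05")

# Hypothetical 20-byte TCP header: source port 1174, destination port 23,
# flags byte (offset 13) set to 0x02, i.e., SYN only.
tcp_header = bytes.fromhex("0496001700000001000000005002200000000000")

# Ethernet destination addresses: one unicast, one broadcast.
unicast = bytes.fromhex("00105ae3370c")
broadcast = bytes.fromhex("ffffffffffff")

print("ip[9] = 6            :", field(ip_header, 9) == 6)
print("ether[0] & 1 != 0    :", (field(unicast, 0) & 1) != 0, "(unicast)")
print("ether[0] & 1 != 0    :", (field(broadcast, 0) & 1) != 0, "(broadcast)")
print("tcp[13] & 0x03 != 0  :", (field(tcp_header, 13) & 0x03) != 0)
print("tcp[0:2] > 0x0017    :", field(tcp_header, 0, 2) > 0x0017)
```

Working through a few known packets this way is also a convenient check on the offsets before committing them to a filter.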
    bsd1# tcpdump host 205.153.63.30

If you really only want IP traffic in this case, use the command:
    bsd1# tcpdump host 205.153.63.30 and ip

On the other hand, if you want all traffic to the host except IP traffic, you could use:
    bsd1# tcpdump host 205.153.63.30 and not ip

If you need to capture all traffic to and from the host and all non-IP traffic, replace the and with an or.

With complex expressions, you have to be careful of the precedence. Consider the two commands:
    bsd1# tcpdump host lnx1 and udp or arp
    bsd1# tcpdump "host lnx1 and (udp or arp)"

The first will capture all UDP traffic to or from lnx1 and all ARP traffic. What you probably want is the second, which captures all UDP or ARP traffic to or from lnx1. But beware, this will also capture ARP broadcast traffic. To beat a dead horse, be sure to test your filters.

I mentioned earlier that running tcpdump on a remote station using telnet was one way to collect data across your network, except that the Telnet traffic itself would be captured. It should be clear now that the appropriate filter can be used to avoid this problem. To eliminate a specific TCP connection, you need four pieces of information -- the source and destination IP addresses and the source and destination port numbers. In practice, the two IP addresses and the well-known port number are often enough.

For example, suppose you are interested in capturing traffic on the host lnx1, you are logged onto the host bsd1, and you are using telnet to connect from bsd1 to lnx1. To capture all the traffic at lnx1, excluding the Telnet traffic between bsd1 and lnx1, the following command will probably work adequately in most cases:
    lnx1# tcpdump -n "not (tcp port telnet and host lnx1 and host bsd1)"

We can't just exclude Telnet traffic since that would exclude all Telnet traffic between lnx1 and any host. We can't just exclude traffic to or from one of the hosts because that would exclude non-Telnet traffic as well. What we want to exclude is just traffic that is Telnet traffic, has lnx1 as a host, and has bsd1 as a host. So we take the negation of these three requirements to get everything else.

While this filter is usually adequate, it excludes all Telnet sessions between the two hosts, not just yours. If you really want to capture other Telnet traffic between lnx1 and bsd1, you would need to include a fourth term in the negation giving the ephemeral port assigned by telnet. You'll need to run tcpdump twice, first to discover the ephemeral port number for your current session since it will be different with every session, and then again with the full filter to capture the traffic you are interested in.

One other observation -- while we are not reporting the traffic, the traffic is still there. If you are investigating a bandwidth problem, you have just added to the traffic. You can, however, minimize this traffic during the capture if you write out your trace to a file on lnx1 using the -w option. This is true, however, only if you are using a local filesystem. Finally, note the use of the -n option. This is required to prevent name resolution. Otherwise, tcpdump would be creating additional network traffic in trying to resolve IP numbers into names as noted earlier.

Once you have mastered the basic syntax of tcpdump, you should run tcpdump on your own system without any filters. It is worthwhile to do this occasionally just to see what sorts of traffic you have on your network. There are likely to be a number of surprises. In particular, there may be router protocols, switch topology information exchange, or traffic from numerous PC-based protocols that you aren't expecting. It is very helpful to know that this is normal traffic so when you have problems you won't blame the problems on this strange traffic.

This has not been an exhaustive treatment of tcpdump, but I hope that it adequately covers the basics. The manpage for tcpdump contains a wealth of additional information, including several detailed examples with explanations. One issue I have avoided has been how to interpret tcpdump data. Unfortunately, this depends upon the protocol and is really beyond the scope of a book such as this. Ultimately, you must learn the details of the protocols. For TCP/IP, Richard W. Stevens' TCP/IP Illustrated, vol. 1, The Protocols has extensive examples using tcpdump. But the best way to learn is to use tcpdump to examine the behavior of working systems.
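As one way to test the logic of the Telnet exclusion filter discussed above before running it, the negation can be modeled as a Python predicate. This is a sketch under simplifying assumptions: packets are reduced to a protocol name, two host names, and two ports, and the function name keep, the host sloan, and the sample port numbers are illustrative only.

```python
TELNET = 23  # well-known Telnet port


def keep(proto: str, src: str, sport: int, dst: str, dport: int) -> bool:
    """Model of: not (tcp port telnet and host lnx1 and host bsd1)."""
    is_telnet = proto == "tcp" and TELNET in (sport, dport)
    return not (is_telnet and "lnx1" in (src, dst) and "bsd1" in (src, dst))


samples = [
    ("tcp", "bsd1", 1174, "lnx1", 23),     # your own Telnet session: excluded
    ("tcp", "sloan", 1200, "lnx1", 23),    # Telnet from another host: kept
    ("udp", "bsd1", 1657, "lnx1", 53),     # non-Telnet traffic from bsd1: kept
]
for pkt in samples:
    print(pkt, "->", "captured" if keep(*pkt) else "excluded")
```

Adding the fourth term mentioned above would amount to one more condition inside the parentheses that compares the non-Telnet port with the ephemeral port of your session.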
Copyright © 2002 O'Reilly & Associates. All rights reserved.