A fundamental assumption of this book is that troubleshooting should
be proactive. It is preferable to avoid a problem than to have to
correct it. Proper management practices can help. While some of this
section may, at first glance, seem unrelated to troubleshooting,
there are fundamental connections. Management practices will
determine what you can do and how you do it. This is true both for
avoiding problems and for dealing with problems that can't be
avoided. The remainder of this chapter reviews some of the more
important management issues.
1.3.2.2. Ego management
We would all like to think that we
are irreplaceable, and that no one else could do our jobs as well as
we do. This is human nature. Unfortunately, some people take steps to
make sure this is true. The most obvious way an administrator may do
this is to hide what he actually does and how his system works.
This can be done in many ways. Failing
to document the system is one approach -- leaving comments out of
code or configuration files is common. The goal of such an
administrator is to make sure he is the only one who truly
understands the system. He may try to limit others' access to a system
by restricting accounts or access to passwords. (This can be done to
hide other types of unprofessional activities as well. If an
administrator occasionally reads other users' email, he may not
want anyone else to have standard accounts on the email server. If he
is overspending on equipment to gain experience with new
technologies, he will not want any technically literate people
knowing what equipment he is buying.)
This behavior is usually well disguised, but it is extremely common.
For example, a technician may insist on doing tasks that users could
or should be doing. The problem is that this keeps users dependent on
the technician when it isn't necessary. This can seem very
helpful or friendly on the surface. But, if you repeatedly ask for
details and don't get them, there may be more to it than meets
the eye.
Common
justifications are security and privacy. Unless you are in a
management position, there is often little you can do other than
accept the explanations given. But if you are in a management
position, are technically competent, and still hear these excuses
from your employees, beware! You have a serious problem.
No one knows everything. Whenever information is suppressed, you lose
input from individuals who don't have the information. If an
employee can't control her ego, she should not be turned loose
on your network with the tools described in this book. She will not
share what she learns. She will only use it to further entrench
herself.
The problem is basically a personnel
problem and must be dealt with as such. Individuals in technical
areas seem particularly prone to these problems. It may stem from
enlarged egos or from insecurity. Many people are drawn to technical
areas as a way to seem special. Alternatively, an administrator may see
information as a source of power or even a weapon. He may feel that
if he shares the information, he will lose his leverage. Often
individuals may not even recognize the behavior in themselves. It is
just the way they have always done things and it is the way that
feels right.
If you are a manager, you should deal with this problem immediately.
If you can't correct the problem in short order, you should
probably replace the employee. An irreplaceable employee today will
be even more irreplaceable tomorrow. Sooner or later, everyone
leaves -- finds a better job, retires, or runs off to Poughkeepsie
with an exotic dancer. In the meantime, such a person only becomes
more entrenched, making the eventual departure more painful. It will
be better to deal with the problem now rather than later.
1.3.2.4. Economic considerations
Solutions to problems have economic
consequences, so you must understand the economic implications of
what you do. Knowing how to balance the cost of the time used to
repair a system against the cost of replacing a system is an obvious
example. Cost management is a more general issue that has important
implications when dealing with failures.
One particularly difficult task for many system administrators is to
come to terms with the economics of networking. As long as everything
is running smoothly, the next biggest issue to upper management will
be how cost-effectively you are doing your job. Unless you have
unlimited resources, when you overspend in one area, you take
resources from another area. One definition of an engineer that I
particularly like is that "an engineer is someone who can do
for a dime what a fool can do for a dollar." My best guess is
that overspending and buying needlessly complex systems is the single
most common engineering mistake made when novice network
administrators purchase network equipment.
One problem is that some traditional economic models do not apply in
networking. In most engineering projects, incremental costs are less
than the initial per-unit cost. For example, if a 10,000-square-foot
building costs $1 million, a 15,000-square-foot building will cost
somewhat less than $1.5 million. It may make sense to buy additional
footage even if you don't need it right away. This is justified
as "buying for the future."
This kind of reasoning, when applied to computers and networking,
leads to waste. Almost no one would go ahead and buy a computer now
if they won't need it until next year. You'll be able to
buy a better computer for less if you wait until you need it.
Unfortunately, this same reasoning isn't applied when buying
network equipment. People will often buy higher-bandwidth equipment
than they need, arguing that they are preparing for the future, when
it would be much more economical to buy only what is needed now and
buy again in the future as needed.
Moore's Law lies at the heart of the
matter. Around 1965, Gordon Moore, one of the founders of Intel, made
the empirical observation that the density of integrated circuits was
doubling about every 12 months, which he later revised to 24 months.
Since the cost of manufacturing integrated circuits is relatively
flat, this implies that, in two years, a circuit can be built with
twice the functionality with no increase in cost. And, because
distances are halved, the circuit runs at twice the speed -- a
fourfold improvement. Since the doubling applies to previous
doublings, we have exponential growth.
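The compounding can be made concrete with a short sketch. It assumes the
simplified model described above, a fourfold gain in price-performance
every two years, and nothing more. The function names and parameters are
illustrative only and do not come from any library.

```python
def improvement_factor(years, gain_per_period=4.0, period_years=2.0):
    """Price-performance multiplier after `years`, assuming a fixed gain
    (fourfold by default) every `period_years` years."""
    return gain_per_period ** (years / period_years)


def relative_cost(years, gain_per_period=4.0, period_years=2.0):
    """Fraction of today's price needed to buy today's capability later."""
    return 1.0 / improvement_factor(years, gain_per_period, period_years)


# Tabulate the simplified model for a few planning horizons.
for years in (0, 2, 4, 6):
    print(years, improvement_factor(years), relative_cost(years))
```

Under this model, capability that costs $100,000 today would cost about
$25,000 in two years. Real prices rarely track the curve this neatly, so
treat the output as a rough guide.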
It is generally estimated
that this exponential growth with chips will go on for another 15 to
20 years. In fact, this growth is nothing new. Raymond Kurzweil, in
The Age of Spiritual Machines: When Computers Exceed Human
Intelligence, collected information on computing speeds
and functionality from the beginning of the twentieth century to the
present. This covers mechanical, electromechanical (relay), vacuum
tube, discrete transistor, and integrated circuit technologies.
Kurzweil found that exponential growth has been the norm for the last
hundred years. He believes that new technologies will be developed
that will extend this rate of growth well beyond the next 20 years.
It is certainly true that we have seen even faster growth in disk
densities and fiber-optic capacity in recent years, neither of which
can be attributed to semiconductor technology.
What does this mean economically? Clearly, if you wait, you can buy
more for less. But usually, waiting isn't an option. The real
question is how far into the future you should invest. If the price
is coming down, should you repeatedly buy for the short term or
should you "invest" in the long term?
The general answer
is easy to see if we look at a few numbers. Suppose that $100,000
will provide you with network equipment that will meet your
anticipated bandwidth needs for the next four years. A simpleminded
application of Moore's Law would say that you could wait and
buy similar equipment for $25,000 in two years. Of course, such a
system would have a useful life of only two additional years, not the
original four. So, how much would it cost to buy just enough
equipment to make it through the next two years? Following the same
reasoning, about $25,000. If your growth is tracking the growth of
technology,[4] then two years ago it would have cost $100,000 to buy
four years' worth of technology. That will have fallen to about
$25,000 today. Your choice: $100,000 now or $25,000 now and $25,000
in two years. This is something of a no-brainer. It is summarized in
the first two lines of
Table 1-1.
Table 1-1. Cost estimates
| | Year 1 | Year 2 | Year 3 | Year 4 | Total |
|---|---|---|---|---|---|
| Four-year plan | $100,000 | $0 | $0 | $0 | $100,000 |
| Two-year plan | $25,000 | $0 | $25,000 | $0 | $50,000 |
| Four-year plan with maintenance | $112,000 | $12,000 | $12,000 | $12,000 | $148,000 |
| Two-year plan with maintenance | $28,000 | $3,000 | $28,000 | $3,000 | $62,000 |
| Four-year plan with maintenance and 20% MARR | $112,000 | $10,000 | $8,300 | $6,900 | $137,200 |
| Two-year plan with maintenance and 20% MARR | $28,000 | $2,500 | $19,500 | $1,700 | $51,700 |
If
this argument isn't compelling enough, there is the issue of
maintenance. As a general rule of thumb, service contracts on
equipment cost about 1% of the purchase price per month. For
$100,000, that is $12,000 a year. For $25,000, this is $3,000 per
year. Moore's Law doesn't apply to maintenance for
several reasons:
-
A major part of maintenance is labor costs and these, if anything,
will go up.
-
The replacement parts will be based on older technology and older
(and higher) prices.
-
The mechanical parts of older systems, e.g., fans, connectors, and so
on, are all more likely to fail.
-
There is more money to be made selling new equipment so there is no
incentive to lower maintenance prices.
Thus, maintenance on a $100,000 system will cost $12,000 a year for
all four years. The third and fourth lines of
Table 1-1 summarize these numbers.
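The undiscounted rows of Table 1-1 can be reproduced with a few lines of
arithmetic. The sketch below assumes that maintenance is 12% of the
purchase price per year (1% per month), that it is paid every year on
whatever equipment is currently in service, and that a new purchase
replaces the old equipment outright. The function and constant names are
hypothetical.

```python
MAINTENANCE_PERCENT_PER_YEAR = 12  # rule of thumb: 1% of purchase price per month


def yearly_costs(purchases, horizon=4, with_maintenance=True):
    """Return the outlay for each year of the planning horizon.

    `purchases` maps a 1-based year number to the amount spent on new
    equipment in that year.
    """
    costs = []
    in_service = 0  # purchase price of the equipment currently under contract
    for year in range(1, horizon + 1):
        bought = purchases.get(year, 0)
        if bought:
            in_service = bought  # assumption: new equipment replaces the old
        maintenance = 0
        if with_maintenance:
            maintenance = in_service * MAINTENANCE_PERCENT_PER_YEAR // 100
        costs.append(bought + maintenance)
    return costs


four_year = yearly_costs({1: 100_000})
two_year = yearly_costs({1: 25_000, 3: 25_000})
print(four_year, sum(four_year))
print(two_year, sum(two_year))
```

Passing `with_maintenance=False` gives the first two rows of the table
instead of the third and fourth.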
Yet another consideration
is the time value of money. If you don't need the $25,000 until
two years from now, you can invest a smaller amount now and expect to
have enough to cover the costs later. So the $25,000 needed in two
years is really somewhat less in terms of today's dollars. How
much less depends on the rate of return you can expect on
investments. Most organizations call this number the
minimum acceptable rate of return (MARR). The
last two lines of
Table 1-1 use a MARR of 20%.
This may seem high, but it is not an unusual number. As you can see,
buying for the future is more than two and a half times as expensive
as going for the quick fix.
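The discounting behind the last two rows of Table 1-1 can be sketched as
follows. The code assumes each year's costs are paid at the start of that
year, so Year 1 is not discounted and Year n is discounted by n - 1 years.
That convention is inferred from the table, not stated in it. The yearly
figures are copied from the "with maintenance" rows, and the function
names are illustrative.

```python
def present_value(amount, years_from_now, marr=0.20):
    """Value in today's dollars of `amount` paid `years_from_now` years out."""
    return amount / (1.0 + marr) ** years_from_now


def discounted_total(yearly_costs, marr=0.20):
    """Sum of yearly costs, each discounted back to the start of Year 1."""
    return sum(
        present_value(cost, offset, marr)
        for offset, cost in enumerate(yearly_costs)
    )


# Yearly outlays from the "with maintenance" rows of Table 1-1.
four_year = [112_000, 12_000, 12_000, 12_000]
two_year = [28_000, 3_000, 28_000, 3_000]

four_total = discounted_total(four_year)
two_total = discounted_total(two_year)
print(round(four_total), round(two_total), round(four_total / two_total, 2))
```

The totals differ slightly from the table because the table rounds each
yearly entry before adding. The ratio between the two plans still comes
out at a little more than two and a half.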
Of course, all this is a gross
simplification. There are a number of other important considerations
even if you believe these numbers. First and foremost, Moore's
Law doesn't always apply. The most important exception is
infrastructure. It is not going to get any cheaper to pull cable. You
should take the time to do infrastructure well; that's where
you really should invest in the future.
Most of the
other considerations seem to favor short-term investing. First, with
short-term purchasing, you are less likely to invest in dead-end
technology since you are buying later in the life cycle and will have
a clearer picture of where the industry is going. For example, think
about the difference two years might have made in choosing between
Fast Ethernet and ATM for some organizations. For the same reason,
the cost of training should be lower. You will be dealing with more
familiar technology, and there will be more resources available. You
will have to purchase and install equipment more often, but the
equipment you replace can be reused in your network's
periphery, providing additional savings.
On the downside, the equipment you buy won't have a lot of
excess capacity or a very long useful lifetime. It can be very
disconcerting to nontechnical management when you keep replacing
equipment. And, if you experience sudden unexpected growth, this is
exactly what you will need to do. Take the time to educate upper
management. If frequent changes to your equipment are particularly
disruptive or if you have funding now, you may need to consider
long-term purchases even if they are more expensive. Finally,
don't take the two-year time frame presented here too
literally. You'll discover the appropriate time frame for your
network only with experience.
Other problems arise when comparing plans.
You must consider the total economic picture. Don't look just
at the initial costs, but consider ongoing costs such as maintenance
and the cost of periodic replacement. As an example, consider the
following plans. Plan A has an estimated initial cost of $400,000,
all for equipment. Plan B requires $150,000 for equipment and
$450,000 for infrastructure upgrades. If you consider only initial
costs, Plan A seems to be $200,000 cheaper. But equipment needs to be
maintained and, periodically, replaced. At 1% per month, the
equipment for Plan A would cost $48,000 a year to maintain, compared
to $18,000 per year with Plan B. If you replace equipment a couple of
times in the next decade, that will be an additional $800,000 for
Plan A but only $300,000 for Plan B. As this quick,
back-of-the-envelope calculation shows, the 10-year cost is
$1.68 million for Plan A but only $1.08 million for Plan B. What appeared
to be $200,000 cheaper is really $600,000 more expensive. Of course,
this is a very crude example, but it should convey the idea.
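The same back-of-the-envelope comparison can be written out as a small
calculation. The sketch makes the assumptions of the example explicit:
only equipment is maintained and replaced, each replacement costs as much
as the original purchase, and no discounting is applied. The function name
and its parameters are hypothetical.

```python
def ten_year_cost(equipment, infrastructure=0, years=10, replacements=2,
                  maintenance_percent_per_year=12):
    """Crude total cost of ownership over `years`.

    Assumes infrastructure needs no maintenance contract and is never
    replaced, and that equipment is replaced `replacements` times at its
    original price.
    """
    initial = equipment + infrastructure
    maintenance = equipment * maintenance_percent_per_year // 100 * years
    replacement = equipment * replacements
    return initial + maintenance + replacement


plan_a = ten_year_cost(equipment=400_000)
plan_b = ten_year_cost(equipment=150_000, infrastructure=450_000)
print(plan_a, plan_b, plan_a - plan_b)
```

Changing `replacements` or the maintenance percentage shows how sensitive
the comparison is to these assumptions.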
You shouldn't take this example
too literally either. Every situation is different. In particular,
you may not be comfortable deciding what is adequate surplus capacity
in your network. In general, however, you are probably much better
off thinking in terms of scalability than raw capacity. If you want
to hedge your bets, you can make sure that high-speed interfaces are
available for the router you are considering without actually buying
those high-speed interfaces until needed.
How does
this relate to troubleshooting? First, don't buy overly complex
systems you don't really need. They will be much harder to
maintain, as you can expect the complexity of troubleshooting to grow
with the complexity of the systems you buy. Second, don't spend
all your money on the system and forget ongoing maintenance costs. If
you don't anticipate operational costs, you may not have the
funds you need.