
## 2.2 Safe use of infinitesimals

The idea of infinitesimally small numbers has always irked purists. One prominent critic of the calculus was Newton's contemporary George Berkeley, the Bishop of Cloyne. Although some of his complaints are plainly wrong (he denied the possibility of the second derivative), there was something to his criticism of infinitesimals. He wrote sarcastically, “They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them the ghosts of departed quantities?”

Infinitesimals seemed scary, because if you mishandled them, you could prove absurd things. For example, let *du* be
an infinitesimal. Then 2*du* is also infinitesimal. Therefore both 1/*du* and 1/(2*du*) equal infinity, so
1/*du* = 1/(2*du*). Multiplying by *du* on both sides, we have a proof that 1=1/2.

In the eighteenth century, the use of infinitesimals became like adultery: commonly practiced, but shameful to admit
to in polite circles. Those who used them learned certain rules of thumb for handling them correctly. For instance,
they would identify the flaw in my proof of 1=1/2 as my assumption that there was only one size of infinity,
when actually 1/*du* should be interpreted as an infinity twice as big as 1/(2*du*). The use of the symbol
∞ played into this trap, because the use of a single symbol for infinity implied that infinities only came
in one size. However, the practitioners of infinitesimals had trouble articulating a clear set of principles
for their proper use, and couldn't prove that a self-consistent system could be built around them.

By the twentieth century, when I learned calculus, a clear consensus had formed that infinite and infinitesimal
numbers weren't numbers at all. A notation like *dx*/*dt*, my calculus teacher told me, wasn't really
one number divided by another; it was merely a symbol for something called a limit,

$$\frac{dx}{dt} = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t},$$

where Δ*x* and Δ*t* represented finite changes. I'll give a formal definition (actually two different formal
definitions) of the term “limit” in section 3.2, but intuitively the concept is that we can get as good
an approximation to the derivative as we like, provided that we make Δ*t* small enough.
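As a rough numerical illustration of that intuition, here is a minimal Python sketch. It assumes the function *x* = *t*^{2} and the point *t* = 3, both chosen only for illustration, and the helper name `difference_quotient` is hypothetical. Exact fractions are used so that rounding error doesn't muddy the picture.

```python
from fractions import Fraction

def difference_quotient(x, t, delta_t):
    """Return (change in x)/(change in t) over the interval [t, t + delta_t]."""
    return (x(t + delta_t) - x(t)) / delta_t

def square(t):
    return t * t

t = Fraction(3)  # illustrative choice of t
# Shrink delta_t by a factor of ten at each step: 1/10, 1/100, 1/1000, 1/10000.
quotients = [difference_quotient(square, t, Fraction(1, 10**k)) for k in range(1, 5)]
for k, q in enumerate(quotients, start=1):
    print(f"delta_t = 1/10^{k}: quotient = {q} (about {float(q)})")
```

For this particular function each quotient works out to exactly 2*t* plus the finite change in *t*, so the values close in on 2*t* = 6 as the change shrinks, without ever equaling it.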

That satisfied me until we got to a certain topic
(implicit differentiation) in which we were encouraged to break the *dx* away from the *dt*, leaving them on
opposite sides of the equation. I buttonholed my teacher after class and asked why he was now doing what he'd
told me you couldn't really do, and his response was that *dx* and *dt* weren't really numbers,
but most of the time you could get away with treating them as if they were, and you would get the right
answer in the end. *Most of the time!?* That bothered me. How was I supposed to know when it *wasn't*
“most of the time?”

But unknown to me and my teacher, mathematician Abraham Robinson
had already shown in the 1960s that it
was possible to construct a self-consistent number system that included infinite and infinitesimal numbers.
He called it the hyperreal number system,
and it included the real numbers as a subset.^{3}

Moreover, the
rules for what you can and can't do with the hyperreals turn out to be extremely simple.
Take any true statement about the real numbers. Suppose it's possible to translate it into a statement about
the hyperreals in the most obvious way, simply by replacing the word “real” with the word “hyperreal.”
Then the translated statement is also true. This is known as the *transfer principle*.

Let's look back at my bogus proof of 1=1/2 in light of this simple principle. The final step of the proof, for example, is perfectly valid: multiplying both sides of the equation by the same thing. The following statement about the real numbers is true:

For any real numbers *a*, *b*, and *c*, if *a*=*b*, then *ac*=*bc*.

This can be translated in an obvious way into a statement about the hyperreals:

For any hyperreal numbers *a*, *b*, and *c*, if *a*=*b*, then *ac*=*bc*.

However, what about the statement that both 1/*du* and 1/(2*du*) equal infinity, so they're
equal to each other? This isn't the translation of a statement that's true about the reals, so there's
no reason to believe it's true when applied to the hyperreals --- and in fact it's false.
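One way to see that it is false is to start instead from a statement that does transfer. For any nonzero real number *x*, the difference 1/*x* − 1/(2*x*) equals 1/(2*x*), which is not zero. Translating this to the hyperreals and taking *x* = *du* gives the following (a worked check in LaTeX notation, not part of the original argument):

$$\frac{1}{du} - \frac{1}{2\,du} = \frac{1}{2\,du} \neq 0, \qquad\text{so}\qquad \frac{1}{du} = 2\cdot\frac{1}{2\,du} \neq \frac{1}{2\,du}.$$

The two infinite numbers differ by an amount that is itself infinite, which is why the bogus proof breaks at that step.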

What the transfer principle tells us is that the real numbers as we normally think of them are not unique in obeying the ordinary rules of algebra. There are completely different systems of numbers, such as the hyperreals, that also obey them.

How, then, are the hyperreals even different from the reals, if everything that's true of one is true of the other? But recall that the transfer principle doesn't guarantee that every statement about the reals is also true of the hyperreals. It only works if the statement about the reals can be translated into a statement about the hyperreals in the most simple, straightforward way imaginable, simply by replacing the word “real” with the word “hyperreal.” Here's an example of a true statement about the reals that can't be translated in this way:

For any real number *a*, there is an integer *n* that is greater than *a*.

This one can't be translated so simplemindedly, because it refers to a subset of the reals called
the integers. It might be possible to translate it somehow, but it would require some insight into
the correct way to translate that word “integer.” The transfer principle doesn't apply to this
statement, which indeed is false for the hyperreals, because the hyperreals contain infinite
numbers that are greater than all the integers. In fact, the negation of this statement can be
taken as a definition of what makes the hyperreals special, and different from the reals: we assume
that there is at least one hyperreal number, *H*, which is greater than all the integers.
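Written with quantifiers, the contrast looks like this. The symbol ${}^{*}\mathbb{R}$ for the set of hyperreals is a common notation but is an assumption here, since the text does not introduce one.

$$\forall a \in \mathbb{R}\;\; \exists n \in \mathbb{Z} :\; n > a \qquad \text{(true of the reals)}$$

$$\exists H \in {}^{*}\mathbb{R}\;\; \forall n \in \mathbb{Z} :\; H > n \qquad \text{(assumed of the hyperreals)}$$

The first statement mentions the integers as well as the reals, so swapping “real” for “hyperreal” is not enough to translate it.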

As an analogy from everyday life, consider the following statements about the student body of the high school I attended:

1. Every student at my high school had two eyes and a face.

2. Every student at my high school who was on the football team was a jerk.

Let's try to translate these into statements about the population of California in general.
The student body of my high school is like the set of real numbers, and the present-day population
of California is like the hyperreals. Statement 1 can be translated mindlessly into a statement
that every Californian has two eyes and a face; we simply substitute “every Californian” for
“every student at my high school.” But statement 2 isn't so easy, because it refers to the
subset of students who were on the football team, and it's not obvious what the corresponding
subset of Californians would be. Would it include everybody who played high school, college,
or pro football? Maybe it shouldn't include the pros, because they belong to an organization
covering a region bigger than California. Statement 2 is the kind of statement that the
transfer principle doesn't apply to.^{4}

### Example 6

As a nontrivial example of how to apply the transfer principle, let's consider how to handle expressions like the one that occurred when we wanted to differentiate *t*^{2} using infinitesimals:

$$\frac{(t+dt)^2 - t^2}{dt} = 2t + dt.$$

I argued earlier that 2*t*+*dt* is so close to 2*t* that for all practical purposes, the answer is really 2*t*. But is it really valid in general to say that 2*t*+*dt* is the same hyperreal number as 2*t*? No. We can apply the transfer principle to the following statement about the reals:

For any real numbers *a* and *b*, with *b* ≠ 0, *a*+*b* ≠ *a*.

This translates directly into the same statement about hyperreal numbers *a* and *b*. Since *dt* isn't zero, 2*t*+*dt* ≠ 2*t*.

More generally, example 6 leads us to visualize every number as being surrounded by
a “halo”
of numbers that don't equal it, but differ from it by only an infinitesimal amount.
Just as a magnifying glass would allow you to see the fleas on a dog, you would need an infinitely
strong microscope to see this halo. This is similar to the idea that every integer is surrounded by a bunch of fractions that
would round off to that integer. We can define the *standard part* of a finite hyperreal
number, which means the unique real number that differs from it infinitesimally. For instance, the
standard part of 2*t*+*dt*, notated st(2*t*+*dt*), equals 2*t*. The derivative of a function
should actually be defined as the standard part of *dx*/*dt*, but we often write *dx*/*dt*
to mean the derivative, and don't worry about the distinction.
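To make the standard part concrete, here is a toy Python sketch. It models only a tiny fragment of the hyperreals, namely finite sums of powers of a single infinitesimal `d`, and is not Robinson's construction. The class name `Hyper` and the methods `divide_by_d` and `st` are hypothetical names invented for this illustration.

```python
from fractions import Fraction

class Hyper:
    """Toy number: a finite sum of terms c * d**k, where d is a positive
    infinitesimal and k is any integer (k < 0 gives infinite terms).
    An illustrative fragment only, not the full hyperreal system."""

    def __init__(self, terms):
        # Store {exponent: coefficient}, dropping zero coefficients.
        self.terms = {k: Fraction(c) for k, c in terms.items() if c != 0}

    def __add__(self, other):
        total = dict(self.terms)
        for k, c in other.terms.items():
            total[k] = total.get(k, 0) + c
        return Hyper(total)

    def __sub__(self, other):
        return self + Hyper({k: -c for k, c in other.terms.items()})

    def __mul__(self, other):
        product = {}
        for k1, c1 in self.terms.items():
            for k2, c2 in other.terms.items():
                product[k1 + k2] = product.get(k1 + k2, 0) + c1 * c2
        return Hyper(product)

    def __eq__(self, other):
        return self.terms == other.terms

    def divide_by_d(self):
        """Divide by the infinitesimal d by lowering every exponent by one."""
        return Hyper({k - 1: c for k, c in self.terms.items()})

    def st(self):
        """Standard part: the real coefficient, defined only for finite numbers."""
        if any(k < 0 for k in self.terms):
            raise ValueError("standard part is undefined for an infinite number")
        return self.terms.get(0, Fraction(0))

t = Hyper({0: 3})    # the real number 3 (illustrative choice)
dt = Hyper({1: 1})   # the infinitesimal d itself
quotient = ((t + dt) * (t + dt) - t * t).divide_by_d()  # works out to 2t + dt
two_t = Hyper({0: 6})
print(quotient == two_t, quotient.st())
```

In this model the quotient 2*t*+*dt* and the real number 2*t* are different objects, yet taking the standard part of the quotient recovers 2*t*, which mirrors the distinction drawn above.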

One of the things Bishop Berkeley disliked about infinitesimals was the idea that
they existed in a kind of hierarchy, with *dt*^{2} being not just infinitesimally small, but infinitesimally
small compared to the infinitesimal *dt*.
If *dt* is the flea on a dog, then *dt*^{2} is a submicroscopic
flea that lives on the flea, as in Swift's doggerel: “Big fleas have little fleas/ On their backs to ride 'em,/
and little fleas have lesser fleas,/And so, ad infinitum.” Berkeley's criticism was off the mark here: there is
such a hierarchy. Our basic assumption about the hyperreals was that they contain at least one infinite number,
*H*, which is bigger than all the integers. If this is true, then 1/*H* must be less than 1/2, less than
1/100, less than 1/1,000,000, and indeed less than 1/*n* for any positive integer *n*. Therefore the hyperreals are guaranteed
to include infinitesimals as well, and so we have at least three levels to the hierarchy: infinities comparable
to *H*, finite numbers, and infinitesimals comparable to 1/*H*. If you can swallow that, then it's not too much
of a leap to add more rungs to the ladder, like extra-small infinitesimals that are comparable to 1/*H*^{2}.
If this seems a little crazy, it may comfort you to think of statements about the hyperreals as descriptions
of limiting processes involving real numbers. For instance, in the sequence of numbers 1.1^{2}=1.21,
1.01^{2}=1.0201, 1.001^{2}=1.002001, ..., it's clear that the number represented by the digit 1 in the final decimal place is getting
smaller faster than the contribution due to the digit 2 in the middle.
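The same sequence can be checked with a short Python sketch. It assumes we split (1+ε)^{2} into the three pieces 1, 2ε, and ε^{2}, where ε stands for the real numbers 1/10, 1/100, and so on; the function name `square_terms` is hypothetical.

```python
from fractions import Fraction

def square_terms(eps):
    """Split (1 + eps)**2 into its three contributions: 1, 2*eps, and eps**2."""
    return Fraction(1), 2 * eps, eps * eps

ratios = []
for k in range(1, 5):
    eps = Fraction(1, 10**k)
    one, middle, last = square_terms(eps)
    # Sanity check: the three pieces really do add up to the square.
    assert one + middle + last == (1 + eps) ** 2
    # Size of the final-digit term relative to the middle-digit term.
    ratios.append(last / middle)
    print(f"eps = 1/10^{k}: square = {one + middle + last}, last/middle = {ratios[-1]}")
```

The ratio of the last term to the middle term is ε/2, so it shrinks along with ε itself. That is the real-number shadow of the claim that *dt*^{2} is infinitesimal compared to *dt*.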

One subtle issue here, which I avoided mentioning in the differentiation of the sine function on page 28,
is whether the transfer principle is sufficient to let us define all the functions that
appear as familiar keys on a calculator: *x*^{2}, √x, sin *x*, cos *x*, *e*^{x}, and so on.
After all, these functions were originally defined as rules that would take a real number as an input
and give a real number as an output. It's not trivially obvious that their definitions can naturally be extended
to take a hyperreal number as an input and give back a hyperreal as an output. Essentially the answer is that
we can apply the transfer principle to them just as we would to statements about simple arithmetic, but I've discussed
this a little more on page 149.