Misleading Definitions and Unanswerable Problems in Spivak’s Calculus

definition, elementary-set-theory, functions

Spivak's Calculus 4th edition, Chapter 3 Functions, Page 47:

DEFINITION: A function is a collection of pairs of numbers with the following property: if (a,b) and (a,c) are both in the collection, then b = c; in other words, the collection must not contain two different pairs with the same first element.

He continues:

This is our first full-fledged definition, and illustrates the format we shall always use to define significant new concepts. These definitions are so important (at least as important as theorems) that it is essential to know when one is actually at hand, and to distinguish them from comments, motivating remarks, and casual explanations. They will be preceded by the word DEFINITION, contain the term being defined in boldface letters, and constitute a paragraph unto themselves.

OK, got it. Fast forward to the problem set, where I am up to Problem 25:

25) Find a function f(x) such that g(f(x)) = x for some g(x), but such that there is no function h(x) with f(h(x)) = x

I asked about this on MSE (see "Spivak's Calculus Chapter 3 Problem 25"), and after much discussion concluded that Problem 25 is impossible under the above definition of a function.

In summary, to find f(x) you need to choose f(x) to be a non-surjective function, and non-surjectivity requires the concept of a codomain. However, the above definition includes only domain and image, not codomain, hence the problem is impossible.

My question is: what was Spivak thinking (and I mean that literally)? How did he intend for us to solve this problem? Is it an error on his part? Or is there a way to infer from the above definition that a function must be given a codomain (codomain, not image) in order to be defined?

If I had the solutions book to Calculus 4th Edition, this would greatly help answer my question, but I can't find it anywhere for free.

Best Answer

Spivak's definition of a function is limited to a very special case. In fact, at the beginning of Chapter 3 Spivak gives the following "provisional definition":

A function is a rule which assigns, to each of certain real numbers, some other real number.

This describes the scope of his book. He does not deal with the general set-theoretic concept of a function; his naive concept requires only two explicit components:

  1. A subset $A \subset \mathbb R$ of "certain real numbers".

  2. A rule assigning to each $x \in A$ an element $f(x) \in \mathbb R$.

On p. 40 he explicitly defines the domain of a function as the set of numbers to which it does apply. In other words, $\operatorname{domain}(f) = A$. He does not introduce the modern concept of codomain, but he says that all functions take values in $\mathbb R$, which may be regarded as an antiquated way to express that they have codomain $\mathbb R$. In modern terms we would therefore write a function in the sense of Spivak as $f : A \to \mathbb R$ with $A \subset \mathbb R$. However, note that Spivak does not need an explicit concept of codomain to introduce functions. All he needs are components 1 and 2 above.

On p. 44 Spivak defines the composition $f \circ g$ of two functions $f$ and $g$ as follows: the domain of $f \circ g$ is the set $\{ x \in \operatorname{domain}(g) \mid g(x) \in \operatorname{domain}(f) \}$, and the assignment rule is given by the formula $(f \circ g)(x) = f(g(x))$.

Now let us analyze Spivak's formal definition of a function as a certain collection of pairs of numbers. Thus, more precisely, a function $f$ is a subset of $\mathbb R \times \mathbb R$ with the uniqueness property $(a,b), (a,c) \in f \implies b = c$. The next definition introduces the domain of $f$ as the set $\operatorname{domain}(f) = \{a \in \mathbb R \mid \exists b \in \mathbb R : (a,b) \in f \}$ and introduces the rule of assignment $a \mapsto f(a)$ occurring in the provisional definition by simply observing that for each $a \in \operatorname{domain}(f)$ there exists a unique $f(a) \in \mathbb R$ such that $(a,f(a)) \in f$.

Since functions are defined as subsets of $\mathbb R \times \mathbb R$, it is automatically clear what it means for two functions to be equal: it is simply equality of sets. If we want, we can write Spivak's functions in the usual form $f : A = \operatorname{domain}(f) \to \mathbb R$, but we can also avoid this notation.

For later use let us add the definition of the image of $f$ as $$\operatorname{image}(f) = \{f(x) \mid x \in A = \operatorname{domain}(f) \} \subset \mathbb R$$ which is also written as $f(A)$.
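As a rough illustration of the definitions above, here is a minimal Python sketch that models a function as a finite set of pairs. Finite sets of integers stand in for subsets of $\mathbb R$ (an assumption made only so the sets can be enumerated), and the helper names `is_function`, `domain`, `image`, and `value_at` are hypothetical, chosen for this example.

```python
# Finite stand-in for Spivak's definition: a "function" is a set of (a, b)
# pairs in which no two different pairs share the same first element.
# Assumption: small finite sets of integers replace subsets of the reals.

def is_function(pairs):
    """Return True if (a, b) and (a, c) in pairs always forces b == c."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def domain(f):
    """All first elements: the numbers a for which some (a, b) is in f."""
    return {a for a, _ in f}

def image(f):
    """All second elements: the values f(a) for a in domain(f)."""
    return {b for _, b in f}

def value_at(f, a):
    """The unique b with (a, b) in f; raises KeyError if a is outside the domain."""
    return dict(f)[a]

square = {(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)}
not_a_function = {(1, 2), (1, 3)}  # two different pairs with first element 1

print(is_function(square), is_function(not_a_function))
print(domain(square), image(square), value_at(square, -2))
```

Note that nothing in this representation records a codomain: only the domain and the image can be read off from the set of pairs.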

The composition $f \circ g$ of two functions $f$ and $g$ is now formally defined as follows (Spivak does not do that, probably because it is obvious in the light of the above informal definition): $$f \circ g = \{ (x,f(g(x))) \in \mathbb R \times \mathbb R \mid x \in \operatorname{domain}(g) \text{ such that } g(x) \in \operatorname{domain}(f) \} .$$

Note that we always have $$\operatorname{domain}(f \circ g) \subset \operatorname{domain}(g) \tag{1}$$ $$\operatorname{image}(f \circ g) \subset \operatorname{image}(f) \tag{2}$$
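The composition formula and the inclusions $(1)$ and $(2)$ can be checked on a small finite model. The following Python sketch is only an illustration under the assumption that finite sets of integers stand in for subsets of $\mathbb R$; the names `compose`, `domain`, and `image` are hypothetical helpers, not anything from Spivak.

```python
# Composition of functions represented as finite sets of pairs.
# Assumption: finite sets of integers replace subsets of the reals.

def domain(f):
    return {a for a, _ in f}

def image(f):
    return {b for _, b in f}

def compose(f, g):
    """f∘g = {(x, f(g(x))) : x in domain(g) and g(x) in domain(f)}."""
    f_map = dict(f)
    return {(x, f_map[y]) for x, y in g if y in f_map}

g = {(1, 10), (2, 20), (3, 30)}
f = {(10, 100), (20, 200), (40, 400)}

fg = compose(f, g)
print(fg)  # 3 is dropped because g(3) = 30 is not in domain(f)

# Inclusions (1) and (2); "<=" on Python sets means "is a subset of".
print(domain(fg) <= domain(g))
print(image(fg) <= image(f))
```

In this example both inclusions are strict: $3$ lies in $\operatorname{domain}(g)$ but not in $\operatorname{domain}(f \circ g)$, and $400$ lies in $\operatorname{image}(f)$ but not in $\operatorname{image}(f \circ g)$.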

Now Problem 25 makes perfect sense, but you did not state it properly in your question. It uses the identity function $I$ introduced on p. 43, which has domain $\mathbb R$. Formally $I = \{(x,x) \mid x \in \mathbb R \}$. Concerning the domain of $I$, recall the convention at the bottom of p. 41: "Unless the domain is restricted further, it is understood to consist of all numbers for which the definition makes sense at all."

The problem asks to find a function $f$ such that

  1. $f$ has a left inverse $g$ which means that $g \circ f = I$.

  2. $f$ does not have a right inverse $h$ which would mean that $f \circ h = I$.

Note that a necessary condition for $f$ having a left inverse is that $\operatorname{domain}(f) = \mathbb R$. In fact, if $g$ is a left inverse for $f$, then $\operatorname{domain}(g \circ f) = \operatorname{domain}(I) = \mathbb R$, hence $\operatorname{domain}(f) = \mathbb R$ by $(1)$.

Similarly a necessary condition for $f$ having a right inverse is that $\operatorname{image}(f) = \mathbb R$, i.e. that $f$ is surjective. In fact, if $h$ is a right inverse for $f$, then $\operatorname{image}(f \circ h) = \operatorname{image}(I) = \mathbb R$, hence $\operatorname{image}(f) = \mathbb R$ by $(2)$.

The functions $f$ satisfying the above conditions are precisely the injective but non-surjective functions $f : \mathbb R \to \mathbb R$, or if you want to avoid this notation, the injective functions $f$ such that $\operatorname{domain}(f) = \mathbb R$ and $\operatorname{image}(f) \ne \mathbb R$.
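One concrete example, offered here as an illustration and not taken from Spivak's text or solutions, is $$f(x) = \frac{x}{1 + |x|}, \qquad \operatorname{domain}(f) = \mathbb R, \qquad \operatorname{image}(f) = (-1,1) \ne \mathbb R .$$ A left inverse is the function $g$ with $\operatorname{domain}(g) = (-1,1)$ given by $$g(y) = \frac{y}{1 - |y|} .$$ Indeed, for $y = f(x)$ we have $1 - |y| = \dfrac{1}{1 + |x|}$, so $$g(f(x)) = \frac{x}{1 + |x|} \cdot (1 + |x|) = x \quad \text{for all } x \in \mathbb R ,$$ and since $f(x) \in (-1,1) = \operatorname{domain}(g)$ for every $x$, the domain of $g \circ f$ is all of $\mathbb R$. Hence $g \circ f = I$. On the other hand, no $h$ can satisfy $f \circ h = I$, because $(2)$ gives $\operatorname{image}(f \circ h) \subset (-1,1)$, whereas $\operatorname{image}(I) = \mathbb R$.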

However, I must admit that Spivak is not entirely precise concerning $I$. It seems that in one case he also uses the symbol $I$ for a restricted function $I_A = \{(x,x) \mid x \in A \}$ with a general $A \subset \mathbb R$. In Problem 24 (a) he claims that each injective function $g$ has a left inverse $f$ such that $f \circ g = I$. But this is only true if $\operatorname{domain}(g) = \mathbb R$. In case $\operatorname{domain}(g) = A \subsetneqq \mathbb R$ we can only achieve $f \circ g = I_A$. In all other parts of Problems 23-26, the standard $I$ fits.

Remark:

I do not claim that Spivak's approach is a particularly good one. It is, however, internally consistent. I would prefer to introduce a function as a triple $(X,Y,f)$ consisting of a set $X$ called the domain, a set $Y$ called the codomain, and a subset $f \subset X \times Y$ such that for each $x \in X$ there exists a unique $y \in Y$ such that $(x,y) \in f$. This is much more flexible and allows a real understanding of surjectivity. In fact, Spivak's approach does not allow us to consider functions $f : A \to B$ where $A, B \subset \mathbb R$ are arbitrary. He only covers the case $B = \mathbb R$. Moreover, his functions are always subsets of $\mathbb R \times \mathbb R$ and not of $A \times \mathbb R$ or $A \times B$.
