Formalizations of Functions in Logic and Type Theory

categorical-logic, formal-languages, formalization, lo.logic, type-theory

I'll state my questions upfront and attempt to motivate/explain them afterwards.

Q1: Is there a direct way of expressing the relation "$y$ is a function of $x$" inside set theory?
More precisely: Can you provide a formula of first order logic + $\in$, containing only two free variables $y$ and $x$, which directly captures the idea that "$y$ is a function of $x$"?

In case the answer to Q1 is negative, here's

Q2: Do any other foundations of mathematics (like univalent foundations) allow one to directly formalize the relation "$y$ is a function of $x$"?
Or have there been any attempts to formalize (parts of) mathematics with a language where the relation is taken as primitive/undefined?

From discussing Q1 with colleagues I've learned that it's hard to convey what my problem is, causing frustration on both sides. I suspect this is to a certain extent because we all only learned the modern definition of function (which is not the answer to Q1) and because neither the people I talk to nor myself are experts in logic/type theory/category theory. So please bear with me in (or forgive me for) this lengthy attempt at an

Explanation/Motivation:

The relation "$y$ is a function of $x$" between two things $y$ and $x$ was the original (and apparently only) way of using the word function in mathematics until roughly 1925. The things $y$ and $x$ were traditionally called variable quantities, and the same relation was sometimes worded differently as "$y$ depends on $x$", "$y$ is determined by $x$" or "$y$ changes with $x$". This was used as a genuine mathematical proposition: something that could be proved, refuted or assumed. People would say "let $y$ be a function of $x$" just like today we might say "let $U$ be a subgroup of $G$".

I could cite more than 40 well-known mathematicians from Bernoulli to Courant who gave definitions of this relation, but I'll limit myself to quoting eight at the end of my question. As far as I can tell, these definitions cannot be directly translated into first order logic + $\in$.

Although the word function assumed a different meaning with the rise of set theory and formal logic, the original relation is still used a lot among physicists, engineers or even mathematicians. Think of statements like "The pressure is a function of the volume", "The area of the circle is a function of its radius", "The number of computations is a function of the size of the matrix", "The fiber depends on the base point", etc. This even crops up in scientific communities where I wouldn't expect it. One finds it for instance in Pierce's Types and Programming Languages or Harper's Practical Foundations for Programming Languages.

So it seems that something being a function of something else (or something depending on something else) is a very natural notion for many people. In fact, I have the impression that for physicists, engineers and most scientists who apply mathematics, this relation is closer to the ideas they want to express than the modern notion of function.

Yet, I don’t see a direct way of formalizing the idea inside set theory. (The modern notion of a map $f\colon X \to Y$ is not what I'm looking for, since by itself it's not a predicate on two variables.)

I know how to capture the idea at the meta-level, by saying that a formula of first order logic is a function of $x$, iff its set of free variables contains at most $x$. But this is not a definition inside FOL. When a physicist says “The kinetic energy is a function of the velocity” he’s certainly making a physical claim and not a claim about the linguistic objects he uses to talk about physics. So this syntactic interpretation of “$y$ is a function of $x$” is not what I’m looking for.
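Although this syntactic reading is not what I'm after, it is at least easy to make concrete. Here is a toy sketch (illustrative code, not part of the mathematical claim) using Python expressions as stand-ins for FOL formulas:

```python
# Illustrative sketch: the syntactic (meta-level) reading of
# "y is a function of x", with Python expressions standing in for formulas.
import ast

def free_names(expr):
    # All names occurring in the expression -- a stand-in for the set of
    # free variables of a formula.
    return {n.id for n in ast.walk(ast.parse(expr, mode="eval"))
            if isinstance(n, ast.Name)}

def is_syntactic_function_of(expr, var):
    # "expr is a function of var" iff its free variables are contained in {var}.
    return free_names(expr) <= {var}

print(is_syntactic_function_of("x**2 + 1", "x"))  # True
print(is_syntactic_function_of("x + t", "x"))     # False: also depends on t
```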

I also know a way to encode the idea inside set theory. But I’m not completely happy with it.

First, here’s a naive and obviously wrong approach: Let $x\in X$ and $y\in Y$, and call $y$ a function of $x$ iff there exists a map $f:X\to Y$ such that $y=f(x)$. Since every such $y\in Y$ is a function of every $x$ (use the constant map $f=(u\mapsto y)$), this is not the right definition.

Here’s a better approach: Let $x$ and $y$ be maps with equal domain, say $x:A\to X$ and $y: A \to Y$. Before giving the definition let me switch terminology: Instead of calling $x$ a "map from $A$ to $X$" I'll call it a "variable quantity of type $X$ in context $A$". (This change of terminology is borrowed from categorical logic/type theory. In categorical logic people say “$x$ is a generalized element of $X$ with stage of definition $A$”. But don’t assume from this that I have a thorough understanding of categorical logic or type theory.)

Definition: Let $x$ and $y$ be variable quantities of type $X$ and $Y$ in the same context $A$. We call $y$ a function of $x$, iff there exists a map $f: X\to Y$ such that $y=f\circ x$.
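In a toy model where the context $A$ is a finite set and variable quantities are finite maps, this definition becomes checkable: $y = f\circ x$ for some $f$ iff $x(a)=x(b)$ always implies $y(a)=y(b)$. A sketch, with all names illustrative:

```python
# Toy model: the context A is a finite set; the variable quantities
# x: A -> X and y: A -> Y are dicts. "y is a function of x" means there is
# an f: X -> Y with y = f∘x, i.e. x(a) == x(b) implies y(a) == y(b).

def is_function_of(y, x):
    # Try to build a witness f: X -> Y with y = f∘x; return it, or None.
    f = {}
    for a in x:
        if x[a] in f and f[x[a]] != y[a]:
            return None   # same value of x, two different values of y
        f[x[a]] = y[a]
    return f

A = ["a1", "a2", "a3"]
x = {"a1": 0, "a2": 1, "a3": 0}
y = {"a1": 5, "a2": 7, "a3": 5}   # agrees wherever x agrees
z = {"a1": 5, "a2": 7, "a3": 6}   # disagrees at a3, where x repeats a value

print(is_function_of(y, x))  # {0: 5, 1: 7}, a witness f with y = f∘x
print(is_function_of(z, x))  # None: z is not a function of x
```

Note that the constant-map loophole of the naive approach disappears: here $f$ must mediate between the two quantities uniformly across the whole context $A$.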

(It would be suggestive to switch notation from $f{\circ} x$ to $f(x)$, so we could write $y=f(x)$ when $y$ is a function of $x$. I'll refrain from doing this, since $f(x)$ has an established meaning in set theory.)

Since my post is getting long, I won’t explain why this definition captures the original idea quite well. Let me only say why I don't consider it a direct way of capturing it: It seems backwards from a historical perspective. Mathematicians first had the notion of something being a function of something else, and only from there did they arrive at maps and sets. With this approach we first need to make sense of maps and sets, in order to arrive at the original idea. This might not be a strong counterargument, but if I wanted to use the original idea when teaching calculus, I would need a lot of preparation and overhead with this approach. What I’d like to have instead, is a formalization of mathematics where the relation can be used "out of the box".

The other thing I don’t like about this (maybe due to my lack of knowledge of categorical logic) has to do with the context $A$ and what Anders Kock calls an “important abuse of notation”. To illustrate: suppose I have two variable quantities $x,y$ of type $\mathbb{R}$ in some context $A$. If I now assume something additional about these variables, like the equation $y=x^2$, this assumption should change the context from $A$ to a new context $B$. This $B$ is the domain of the equalizer $e:B\to A$ of the two maps $x^2,y\colon A\to \mathbb{R}$. The abuse of notation consists in denoting the "new" variable quantities $x\circ e, y\circ e$ in context $B$ with the same letters $x,y$. I suspect this abuse is considered important since in everyday mathematics it's natural to keep the names of mathematical objects, even when additional assumptions are added to the context. In fact, if I’m not mistaken, in a type theory with identity types there is no abuse of notation involved when changing the context from $A\vdash (x,y) \colon \mathbb{R}^2$ to $A, e\colon y=x^2 \vdash (x,y) \colon \mathbb{R}^2$. So maybe type theorists already know a language in which one can talk of "functions of something" in a way that's closer to the way mathematicians did until the 1920s?
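To make the context change concrete, here is a finite sketch (illustrative names, not categorical machinery) of the equalizer construction and the abuse of notation just described:

```python
# Toy sketch: the context A as a finite set, the variable quantities
# x, y: A -> R as dicts, and the assumption y = x^2 cutting A down to the
# equalizer B of the two maps a |-> x(a)**2 and a |-> y(a).

A = ["p", "q", "r", "s"]
x = {"p": 1, "q": 2, "r": 3, "s": -2}
y = {"p": 1, "q": 4, "r": 10, "s": 4}

# The equalizer: the largest sub-context on which y = x^2 actually holds.
B = [a for a in A if y[a] == x[a] ** 2]
print(B)  # ['p', 'q', 's'] -- the assumption fails at 'r'

# Kock's "abuse of notation": keep the names x and y for the restricted
# quantities x∘e and y∘e, where e: B -> A is the inclusion.
x = {a: x[a] for a in B}
y = {a: y[a] for a in B}
```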

Some historical definitions of "$y$ is a function of $x$":

Johann Bernoulli 1718,
in Remarques sur ce qu’on a donné jusqu’ici de solutions des Problêmes sur les isopérimètres,
p. 241:

Definition. We call a function of a variable quantity, a quantity composed in any way whatsoever of the variable quantity and constants.

(I'd call this the first definition. Leibniz, who initiated the use of the word function in mathematics around 1673, gave a definition even earlier. But his is not directly compatible with Bernoulli's, even though he later approved of Bernoulli's definition.)

Euler, 1755 in Institutiones calculi differentialis, Preface p.VI:

Thus when some quantities so depend on other quantities, that if the latter are changed the former undergo change, then the former quantities are called functions of the latter; this definition applies rather widely, and all ways, in which one quantity could be determined by others, are contained in it. If therefore $x$ denotes a variable quantity, then all quantities, which depend upon $x$ in any way, or are determined by it, are called functions of it.

Cauchy, 1821 in Cours d'analyse, p. 19:

When variable quantities are related to each other such that the
values of some of them being given one can find all of the others, we
consider these various quantities to be expressed by means of several
among them, which therefore take the name independent variables. The
other quantities expressed by means of the independent variables are
called functions of those same variables.

Bolzano, ca. 1830 in Erste Begriffe der allgemeinen Grössenlehre,

The variable quantity $W$ is a function of one or more variable quantities $X,Y,Z$, if there exist certain propositions of the form "the quantity $W$ has the properties $w,w_{1},w_{2}$", which can be deduced from certain propositions of the form "the quantity $X$ has the properties $\xi,\xi',\xi''$; the quantity $Y$ has the properties $\eta,\eta',\eta''$; the quantity $Z$ has the properties $\zeta,\zeta',\zeta''$; etc."

Dirichlet, 1837 in Über die Darstellung ganz willkürlicher Functionen durch Sinus- und Cosinusreihen:

Imagine $a$ and $b$ two fixed values and $x$ a variable quantity, which continuously assumes all values between $a$ and $b$. If now a unique finite $y$ corresponds to each $x$, in such a way that when $x$ ranges continuously over the interval from $a$ to $b$, ${y=f(x)}$ also varies continuously, then $y$ is called a continuous function of $x$ for this interval.

(Many historians call this the first modern definition of function. I disagree, since Dirichlet calls $y$ the function, not $f$.)

Riemann, 1851 in Grundlagen für eine allgemeine Theorie der Functionen einer veränderlichen complexen Grösse

If one thinks of $z$ as a variable quantity, which may gradually assume all possible real values, then, if to any of its values corresponds a unique value of the indeterminate quantity $w$, we call $w$ a function of $z$.

Peano, 1884 in Calcolo differenziale e principii di calcolo integrale p.3:

Among the variables there are those to which we can assign arbitrarily and successively different values, called independent variables, and others whose values depend on the values given to the first ones. These are called dependent variables or functions of the first ones.

Courant, 1934 in Differential and Integral Calculus Vol. 1, p.14:

In order to give a general definition of the mathematical concept of function, we fix upon a definite interval of our number scale, say the interval between the numbers $a$ and $b$, and consider the totality of numbers $x$ which belong to this interval, that is, which satisfy the relation $$ a\leq x \leq b. $$ If we consider the symbol $x$ as denoting at will any of the numbers in this interval, we call it a (continuous) variable in the interval.

If now to each value of $x$ in this interval there corresponds a single definite value $y$, where $x$ and $y$ are connected by any law whatsoever, we say that $y$ is a function of $x$.

(It's funny how after Cauchy many mathematicians talk of values of variables, which is not something we're allowed to do in set theory. (What's the value of a set, or of an element of a set?) Yet, if one looks at modern type theory literature, it's full of talk of "values" again.)

Best Answer

First of all, it seems to me as though the real question here is "what is a variable quantity?" Most of the definitions you quote from pre-20th century mathematicians assume that the notion of "variable quantity" is already understood. But this is already not a standard part of modern formalizations of mathematics; so it's unsurprising that definitions of a subsidiary notion, like when one variable quantity is a function of another one, are hard to make sense of.

So what is a variable quantity? If we want to define the notion of variable quantity "analytically" inside some standard foundational system, then I think we cannot do better than your second suggestion: given a fixed "state space" $A$, an $R$-valued quantity varying over $A$ is a map $A \to R$. Far from worrying that this is historically backward, I think we should be proud that modern mathematics supplies a precise way to make sense of a previously vague concept, and we should not be surprised that in stumbling towards precision people took historically a more roundabout route than the eventual geodesic route that we now know. I think if you pressed any modern mathematician using the phrase "is a function of" to say what they mean by it, this is what they would say (for some suitable $A$, e.g. in "the area of a circle is a function of its radius" the space $A$ is the space of circles, while in "the number of computations is a function of the size of the matrix" the space $A$ is the space of matrices).

However, you seem to be looking for something somewhat different, such as a formalism in which the notion of "variable quantity" is basic rather than defined in terms of other things — a "synthetic theory of variable quantities" if you will. Here I think topos theory as well as type theory does indeed help. Given a fixed state space $A$, consider the category ${\rm Sh}(A)$ of sheaves on $A$; this is a topos and hence has an internal logic that is a type theory in which we can do arbitrary (constructive) mathematics. If inside this "universe of $A$-varying mathematics" we construct the (Dedekind) real numbers $R_A$, what we obtain externally is the sheaf of continuous real-valued functions on $A$. Thus, internally "a real number", i.e. a section of this sheaf, is externally a continuous map $A\to \mathbb{R}$, i.e. a real-valued quantity varying over $A$ in the analytic sense. So here we have a formalism in which all quantities are variable. (This point of view, that objects of an arbitrary topos should be regarded as "variable sets", has been promulgated particularly by Lawvere.)

This isn't sufficient to define "function of", however, because as you point out, internally in this type theory, for any "variable quantities" $x,y:R$ there exists a map $f:R\to R$ such that $f(x)=y$, namely the constant map at $y$. If we rephrase this externally, it says that given $x:A\to \mathbb R$ and $y:A\to \mathbb R$, there always exists $f:A\times \mathbb R\to \mathbb R$ such that $f(a,x(a)) = y(a)$ for all $a$, namely $f(a,r) \equiv y(a)$. So the problem is that although $x$ and $y$ are variable quantities, we don't want the function $f$ to be a "variable function"!

Thus we need a formalism in which not only are "variable quantities" basic, there is also a contrasting basic notion of "constant quantity". Categorically, a natural way to talk about this is to think about not just the category ${\rm Sh}(A)$, but the geometric morphism $\Gamma:{\rm Sh}(A)\leftrightarrows \rm Set: \Delta$, which compares the "variable world" ${\rm Sh}(A)$ with the "constant world" $\rm Set$. Just as a single topos has an internal logic that is a type theory, a geometric morphism has an internal logic that is a modal type theory, in which there are two "modes" of types (here the "variable" and "constant" ones) and operators that shift back and forth between them (here the "global sections" $\Gamma$ and the "constant/discrete" $\Delta$).

Now inside this modal type theory, we can construct the object $R^v$ of "variable real numbers" and also the object $R^c$ of "constant real numbers", by copying the usual Dedekind construction in the variable and constant worlds, and there is a map $\Delta R^c \to R^v$ saying that every constant real number can be regarded as a "trivially" variable one. This gives us a way to say in modal type theory that $y:R^v$ is a function of $x:R^v$, namely that there exists a non-variable function $f:R^c\to R^c$ such that $\Delta f : \Delta R^c \to \Delta R^c$ extends to a function $\bar{f}:R^v\to R^v$ such that $\bar{f}(x)=y$. Or, equivalently, that there is a function $g:R^v\to R^v$ such that $g(x)=y$ and $g$ "preserves constant real numbers", i.e. restricts to a map $\Delta R^c \to \Delta R^c$.
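In a naive finite model one can at least see the contrast between "variable" and "constant" functions that this modal setup is tracking. A toy sketch with illustrative names (not the sheaf semantics itself):

```python
# Toy contrast between a "variable" and a "constant" function over a
# finite state space A. A variable function may consult the state a;
# a constant function sees only the value x(a).

A = ["a1", "a2"]
x = {"a1": 0, "a2": 0}   # x takes the same value in every state...
y = {"a1": 1, "a2": 2}   # ...but y still varies over A.

# A variable function f(a, r) with f(a, x(a)) == y(a) always exists:
f_var = lambda a, r: y[a]
assert all(f_var(a, x[a]) == y[a] for a in A)

# A constant function f(r) with f(x(a)) == y(a) would need f(0) to equal
# both 1 and 2, so in this stricter sense y is NOT a function of x.
needed = {}
for a in A:
    needed.setdefault(x[a], set()).add(y[a])
constant_f_exists = all(len(vals) == 1 for vals in needed.values())
print(constant_f_exists)  # False
```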

I'm not sure exactly what you hope to achieve with the issue involving assumptions like $y=x^2$ (maybe you can elaborate), but it seems to me that this setup also handles that problem just fine, in roughly the way you sketch: if variable quantities are just elements of $R^v$, then assuming some property of them, like $y= x^2$, doesn't change those elements themselves, internally.