There are a lot of questions here, but I'll try to answer them all.
Should every mathematical theory take place in an ∞-category? Or is 'real' mathematics basically evil?
I would say that all mathematics should take place in its natural context. Sometimes you have things that are sets where equality makes sense, like an ordinary presheaf, and then you work in a 1-category. Sometimes you have things where only isomorphism makes sense, like a presheaf of categories, and then you work in a 2-category. Etc.
It is true that any n-category for finite n can be considered a special case of an ∞-category with only identity cells above n, so in this degenerate sense all n-categories are ∞-categories, and thus one might say that "all mathematics takes place in an ∞-category" — at least if one believes that all mathematics takes place in an n-category for some n! But even that is not clear, e.g. some mathematics naturally takes place in other categorical structures, such as a double category or a proarrow equipment. Some mathematics uses no category theory at all (at least as far as anyone has noticed so far), and so it would be a stretch to say that it takes place in any sort of category.
Anyway, may we think of it as a usual functor, without running into trouble? Or is it important, in practice, to have this higher category theoretic point of view? Or is it possible to turn this functor into an honest functor, by choosing the tensor products $M\otimes_A B$ carefully?
I would say qualified yes, yes, and yes, respectively. You can think of it as a usual functor as long as doing so doesn't cause you to think that it behaves in any way that a pseudofunctor doesn't! Which is sort of a vacuous statement, but the point is that pseudofunctors really shouldn't be a very scary concept (as opposed to a technical definition, which might be a bit complicated, though cf. Harry's comment) — they really are just like ordinary functors, except that you're dealing with things (e.g. categories) for which it doesn't really make sense to ask morphisms to be equal, only isomorphic.
On the other hand, the "higher category theoretic" fact that pseudofunctors are not all strict functors is very important. I believe that Bénabou, the inventor of bicategories, once said that the important thing about bicategories is not that they themselves are "weak," but that the morphisms between them are weak. In particular, although every bicategory is equivalent to a strict 2-category, not every pseudofunctor between bicategories is equivalent to a strict functor.
But on the third hand, it is true that any pseudofunctor with values in the 2-category Cat is equivalent to a strict functor. In the language of fibrations, this says that any fibration is equivalent to a split one. Tyler mentioned one construction of an equivalent strict functor in the case of modules and tensor products. There is also a general construction which, applied to the case of modules, will replace $Mod_A$ by a category whose objects are pairs (M,φ) where M is an R-module and φ:R→A is a ring homomorphism. We regard such a pair as a formal representative of $M\otimes_R A$ and define morphisms between them accordingly, to get a category equivalent to $Mod_A$. Now the extension-of-scalars functor $\psi_!:Mod_A \to Mod_B$ is represented by the functor taking a pair (M,φ) to (M,ψφ), which is strictly functorial since composition of ring homomorphisms is so.
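On objects, the strictness can be read off directly. As a sketch, take a further ring homomorphism $\chi : B \to C$ (introduced here only for illustration) and compare the two ways of extending scalars from $A$ to $C$ on a formal pair:

$$\chi_!\,\psi_!\,(M,\varphi) \;=\; \chi_!\,(M,\psi\varphi) \;=\; (M,\chi(\psi\varphi)) \;=\; (M,(\chi\psi)\varphi) \;=\; (\chi\psi)_!\,(M,\varphi).$$

With honest tensor products the corresponding statement is only a canonical isomorphism $(M\otimes_A B)\otimes_B C \cong M\otimes_A C$, which is exactly the coherence data of a pseudofunctor. The display says nothing about morphisms, whose treatment depends on how the hom-sets of the replacement category are defined.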
There is indeed a very close connection between Lawvere's fixed-point theorem and the Recursion theorem, but one has to look at it the right way. Namely, it all becomes clear once we do it in the effective topos.
Let us start by recalling Lawvere's theorem. (I use $X \to Y$ and $Y^X$ as synonyms for the set of all functions from $X$ to $Y$.)
Theorem (Lawvere): If $e : A \to B^A$ is onto then every $f : B \to B$ has a fixed point.
Proof. There is $x \in A$ such that $e(x)(y) = f(e(y)(y))$ for all $y \in A$, because $e$ is onto and $y \mapsto f(e(y)(y))$ is a map from $A$ to $B$. Then $e(x)(x) = f(e(x)(x))$, so $e(x)(x)$ is a fixed point of $f$. QED.
Now here is the Recursion theorem written so that it is most similar to Lawvere's theorem. I explain below why this is the Recursion theorem.
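For finite sets the argument can be checked by brute force. The following is a minimal sketch, with function names of my own choosing: $A$ and $B$ are taken to be `range(n)`, the map $e$ is a tuple of rows (each row a tuple of elements of $B$ indexed by $A$), and $f$ is a tuple indexed by $B$. It builds the diagonal map $y \mapsto f(e(y)(y))$ and looks for an $x$ whose row $e(x)$ equals it.

```python
from itertools import product

def diagonal(e, f):
    """Return the map y -> f(e(y)(y)) as a tuple indexed by A = range(len(e))."""
    return tuple(f[e[y][y]] for y in range(len(e)))

def lawvere_fixed_point(e, f):
    """If the diagonal map equals some row e[x], return the fixed point e[x][x] of f.

    Return None when the diagonal map is not in the image of e.
    """
    d = diagonal(e, f)
    for x, row in enumerate(e):
        if row == d:
            return row[x]
    return None

def all_tables(n_a, n_b):
    """Enumerate every e : A -> B^A for A = range(n_a), B = range(n_b)."""
    rows = list(product(range(n_b), repeat=n_a))
    return product(rows, repeat=n_a)
```

Whenever the search succeeds, the returned value is a fixed point of `f`, exactly as in the proof. Read contrapositively, if `f` has no fixed point (for instance the swap `(1, 0)` on a two-element $B$), then the diagonal map is never in the image of `e`, which is Cantor's argument.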
Recursion theorem: Suppose countable choice holds and $e : \mathbb{N} \to B^{\mathbb{N}}$ is onto. Then every total relation $R \subseteq B \times B$ has a fixed point.
We say that $x \in B$ is a fixed point of $R$ if $R(x,x)$. Note that total relations can also be viewed as multivalued maps, so this is a fixed point theorem which generalizes the instance of Lawvere's fixed point theorem in which $A = \mathbb{N}$.
Proof.
Because $R$ is total, for every $n \in \mathbb{N}$ there is $y \in B$ such that $R(e(n)(n), y)$. Therefore, by countable choice, there is a map $c : \mathbb{N} \to B$ such that $R(e(n)(n), c(n))$ for all $n \in \mathbb{N}$. As $e$ is onto there exists $k \in \mathbb{N}$ such that $e(k) = c$. But then $e(k)(k)$ is a fixed point of $R$: we have $R(e(k)(k), c(k))$ and $c(k) = e(k)(k)$, hence $R(e(k)(k), e(k)(k))$. QED.
Of course, you are asking yourself what the theorem has to do with the Recursion theorem from computability theory. Note that the proof is intuitionistic and uses countable choice, therefore it is valid in the effective topos. To get the connection with the classical recursion theorem, we need to understand what the object of partial computable maps looks like in the effective topos. In fact, it is just the function space $\mathbb{N} \to \mathbb{N}_\bot$. I do not really want to get into the internal definition of $\mathbb{N}_\bot$, so let me describe it as a numbered set instead: the underlying set of $\mathbb{N}_\bot$ is $\mathbb{N} \cup \lbrace \bot \rbrace$. A number $r$ realizes $\bot \in \mathbb{N}_\bot$ if the $r$-th Turing machine diverges on input $0$, and it realizes $n \in \mathbb{N}_\bot$ if the $r$-th Turing machine halts and outputs $n$ on input $0$.
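To get a feel for these realizers, here is a toy sketch in which Python generators stand in for Turing machines run on input $0$ (an assumption made purely for illustration; all names are hypothetical). Each `yield` counts as one computation step. The point is that a realizer of $n$ can be confirmed after finitely many steps, whereas a realizer of $\bot$ never can be: a bounded observation only ever says "no verdict yet".

```python
def halts_with(n):
    """A toy 'machine' that takes n steps and then outputs n."""
    def machine():
        for _ in range(n):
            yield          # one computation step
        return n
    return machine

def diverges():
    """A toy 'machine' that never halts."""
    while True:
        yield

def observe(machine, fuel):
    """Run `machine` for at most `fuel` steps.

    Return its output if it halted within the budget, otherwise None,
    which means 'no verdict yet' rather than 'diverges'.
    """
    steps = machine()
    try:
        for _ in range(fuel):
            next(steps)
    except StopIteration as stop:
        return stop.value
    return None
```

Increasing `fuel` can turn a `None` into a number, but never the other way round, and no finite budget distinguishes `diverges` from a machine that merely halts late.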
Another way to explain the object of partial computable maps $\mathbb{N} \to \mathbb{N} _\bot$ in the effective topos is that this is the object of those partial maps whose domain is a countable subset of $\mathbb{N}$ (which of course is just the internal version of the classic theorem that partial computable maps have c.e. sets as their domains).
Anyhow, $\mathbb{N} \to \mathbb{N}_\bot$ is countable in the effective topos. This can be proved from the axioms of synthetic computability, but a shortcut is just to observe that there is an effective enumeration of partial computable maps, which realizes an enumeration $\varphi : \mathbb{N} \to (\mathbb{N} \to \mathbb{N} _\bot)$ in the effective topos. But then, since by $\lambda$-calculus $$(\mathbb{N} \to (\mathbb{N} \to \mathbb{N} _\bot)) \cong (\mathbb{N} \times \mathbb{N} \to \mathbb{N} _\bot) \cong \mathbb{N} \to \mathbb{N} _\bot$$
we see that we may apply the Recursion theorem to $\mathbb{N} \to \mathbb{N} _\bot$. So, given any $f : \mathbb{N} \to \mathbb{N}$, consider the total relation $R$ defined on $\mathbb{N} \to \mathbb{N} _\bot$ by
$$R(u,v) \iff \exists k \in \mathbb{N} . u = \varphi_k \land v = \varphi_{f(k)}.$$
There is a fixed point $u$ and so by definition of $R$ there is $k$ such that $u = \varphi_k$ and $u = \varphi_{f(k)}$. And we have the usual recursion theorem as a consequence.
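The chain of isomorphisms $(\mathbb{N} \to (\mathbb{N} \to \mathbb{N}_\bot)) \cong (\mathbb{N} \times \mathbb{N} \to \mathbb{N}_\bot) \cong (\mathbb{N} \to \mathbb{N}_\bot)$ used above is just currying followed by a computable pairing of $\mathbb{N} \times \mathbb{N}$ with $\mathbb{N}$. A minimal sketch, assuming the Cantor pairing function as the bijection (any computable pairing would do) and using `None` as a crude stand-in for $\bot$:

```python
from math import isqrt

def pair(m, n):
    """Cantor pairing N x N -> N, a bijection."""
    return (m + n) * (m + n + 1) // 2 + n

def unpair(k):
    """Inverse of `pair`."""
    w = (isqrt(8 * k + 1) - 1) // 2   # largest w with w(w+1)/2 <= k
    n = k - w * (w + 1) // 2
    return w - n, n

def uncurry(g):
    """(N -> (N -> N_bot))  ->  (N x N -> N_bot)."""
    return lambda m, n: g(m)(n)

def curry(h):
    """(N x N -> N_bot)  ->  (N -> (N -> N_bot))."""
    return lambda m: lambda n: h(m, n)

def flatten(h):
    """(N x N -> N_bot)  ->  (N -> N_bot), by decoding the argument."""
    return lambda k: h(*unpair(k))

def unflatten(u):
    """(N -> N_bot)  ->  (N x N -> N_bot), by encoding the arguments."""
    return lambda m, n: u(pair(m, n))
```

Composing `flatten` with `uncurry` turns an enumeration of maps into a single map, and `curry` after `unflatten` undoes it. Real divergence cannot be represented as a Python value, so the `None` convention is only a bookkeeping device here.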
Let's do another one, just to convince you this is the recursion theorem. There is an enumeration $W$ of all countable subsets of $\mathbb{N}$ (yes, there are countably many countable subsets of $\mathbb{N}$ in the effective topos, and that is a way cool axiom if you like to smoke weird stuff). A typical exercise on the recursion theorem asks for $n$ such that $W_n = \lbrace n \rbrace$. Because the countable subsets of $\mathbb{N}$ satisfy the condition of the Recursion theorem, we get such a set simply by considering the total relation $R$ defined by
$$R(S,T) \iff \exists m \in \mathbb{N} . S = W_m \land T = \lbrace m\rbrace.$$
Indeed, a fixed point is a countable set $S$ such that for some $m$ we have $S = W_m$ and $S = \lbrace m \rbrace$.
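For comparison, the classical statement can be watched in action with the usual self-reproducing-program trick. The sketch below is the classical Kleene-style construction, not the synthetic proof above, and everything in it is an assumption made for illustration: program texts play the role of indices, `run(p, x, env)` plays the role of $\varphi_p(x)$, and `f` is any map from program text to program text.

```python
def run(src, x, env):
    """phi_src(x): execute the program text `src` and call its `main` on x."""
    scope = dict(env)
    exec(src, scope)
    return scope["main"](x)

# A program that rebuilds its own text `me`, then behaves like the program f(me).
TEMPLATE = (
    "def main(x):\n"
    "    t = %r\n"
    "    me = t %% t\n"
    "    return run(f(me), x, {'run': run, 'f': f})\n"
)

def fixed_point(f):
    """Return (p, env) with phi_p = phi_{f(p)} for a text-to-text map f."""
    p = TEMPLATE % TEMPLATE
    return p, {"run": run, "f": f}

def knows_own_length(src):
    """A sample f: map any program to the program that outputs len(src)."""
    return "def main(x):\n    return %d\n" % len(src)
```

Calling `fixed_point(knows_own_length)` yields a program `p` whose output is the length of its own text, in the same spirit as an index $n$ with $W_n = \lbrace n \rbrace$: the program's behaviour refers to its own code.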
I could go on, but I am in fact preparing a paper about this which should appear on arXiv in a couple of days. See also my materials on synthetic computability (older material has suboptimal proofs of recursion theorem).
The notion of a "category of Being" that Lawvere discusses there is the notion that more recently he has been calling a category of cohesion. I'll try to illuminate a bit what's going on.
I'll restrict to the case that the category is a topos and say cohesive topos for short. This is a topos that satisfies a small collection of simple but powerful axioms that are supposed to ensure that its objects may consistently be thought of as geometric spaces built out of points that are equipped with "cohesive" structure (for instance topological structure, or smooth structure, etc.). So the idea is to axiomatize big toposes in which geometry may take place.
Further details and references can be found here:
http://nlab.mathforge.org/nlab/show/cohesive+topos .
Let's walk through the article:
One axiom on a cohesive topos $\mathcal{E}$ is that the global section geometric morphism $\Gamma : \mathcal{E} \to \mathcal{S}$ to the given base topos $\mathcal{S}$ has a further left adjoint $\Pi_0 := \Gamma_! : \mathcal{E} \to \mathcal{S}$ to its inverse image $\Gamma^{\ast}$, which I'll write $\mathrm{Disc} := \Gamma^{\ast}$, for reasons discussed below. This extra left adjoint has the interpretation that it sends any object $X$ to the set $\Pi_0(X)$ "of connected components". What Lawvere calls a connected object in the article (p. 4) is hence one that is sent by $\Pi_0$ to the terminal object.
Another axiom is that $\Pi_0$ preserves finite products. This implies by the above that the collection of connected objects is closed under finite products. This appears on page 6. What he mentions there with reference to Hurewicz is that given a topos with such $\Pi_0$, it becomes canonically enriched over the base topos in a second way, a geometric way.
I believe that this, like various other aspects of cohesive toposes, lives up to its full relevance as we make the evident step to cohesive $\infty$-toposes. More details on this are here:
http://nlab.mathforge.org/nlab/show/cohesive+(infinity,1)-topos
(But notice that this, while inspired by Lawvere, is not due to him.)
In this more encompassing context the extra left adjoint $\Pi_0$ becomes $\Pi_\infty$ which I just write $\Pi$: it sends, one can show, any object to its geometric fundamental $\infty$-groupoid, for a notion of geometric paths intrinsic to the $\infty$-topos. The fact that this preserves finite products then says that there is a notion of concordance of principal $\infty$-bundles in the $\infty$-topos.
The next axiom on a cohesive topos says that there is also a further right adjoint $\mathrm{coDisc} := \Gamma^! : \mathcal{S} \to \mathcal{E}$ to the global section functor. This makes in total an adjoint quadruple
$$ (\Pi_0 \dashv \mathrm{Disc} \dashv \Gamma \dashv \mathrm{coDisc}) := (\Gamma_! \dashv \Gamma^* \dashv \Gamma_* \dashv \Gamma^!) : \mathcal{E} \to \mathcal{S} $$
and another axiom requires that both $\mathrm{Disc}$ as well as $\mathrm{coDisc}$ are full and faithful.
This is what Lawvere is talking about from the bottom of p. 12 on. The downward functor that he mentions is $\Gamma : \mathcal{E} \to \mathcal{S}$. This has the interpretation of sending a cohesive space to its underlying set of points, as seen by the base topos $\mathcal{S}$. The left and right adjoint inclusions to this are $\mathrm{Disc}$ and $\mathrm{coDisc}$. These have the interpretation of sending a set of points to the corresponding space equipped with either discrete cohesion or codiscrete (indiscrete) cohesion. For instance in the case that cohesive structure is topological structure, this will be the discrete topology and the indiscrete topology, respectively, on a given set. Being full and faithful, $\mathrm{Disc}$ and $\mathrm{coDisc}$ hence make $\mathcal{S}$ a subcategory of $\mathcal{E}$ in two ways (p. 7), though only the image of $\mathrm{coDisc}$ will also be a subtopos, as he mentions on page 7.
(This has, by the way, an important implication that Lawvere does not seem to mention: it implies that we are entitled to the corresponding quasi-topos of separated bipresheaves, induced by the second topology that is induced by the sub-topos. That, one can show, may be identified with the collection of concrete sheaves, hence concrete cohesive spaces (those whose cohesion is indeed supported on their points). In the case of the cohesive topos for differential geometry, the concrete objects in this sense are precisely the diffeological spaces.)
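A small toy model may help fix the roles of the four functors. Reflexive directed graphs are a standard simple example of this kind of structure: points are vertices, pieces are connected components. The sketch below makes simplifying assumptions of my own: a graph is a pair `(vertices, edges)` with at most one non-identity edge per ordered pair, identity loops are left implicit, and only the action on objects is modeled.

```python
def gamma(graph):
    """Underlying set of points: the vertices."""
    vertices, _ = graph
    return frozenset(vertices)

def disc(points):
    """Discrete cohesion: no edges besides the implicit identity loops."""
    return (frozenset(points), frozenset())

def codisc(points):
    """Codiscrete cohesion: exactly one edge between every ordered pair."""
    pts = frozenset(points)
    return (pts, frozenset((a, b) for a in pts for b in pts if a != b))

def pi0(graph):
    """Set of connected components, each given as a frozenset of vertices."""
    vertices, edges = graph
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative, halving paths on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    classes = {}
    for v in vertices:
        classes.setdefault(find(v), set()).add(v)
    return frozenset(frozenset(c) for c in classes.values())
```

In this model `gamma(disc(S))` and `gamma(codisc(S))` both return `S`, the components of `disc(S)` are the singletons of `S`, and a nonempty `codisc(S)` has a single component, matching the description of discrete and codiscrete cohesion above.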
He calls the subtopos given by the image of $\mathrm{coDisc} : \mathcal{S} \to \mathcal{E}$ that of "pure Becoming" further down on p. 7, whereas the subcategory of discrete objects he calls that of "non Becoming". The way I understand this terminology (which may not be quite what he means) is this:
whereas any old $\infty$-topos is a collection of spaces with structure, a cohesive $\infty$-topos comes with the extra adjoint $\Pi$, which I said has the interpretation of sending any space to its path $\infty$-groupoid. Therefore there is an intrinsic notion of geometric paths in any cohesive $\infty$-topos. This notably allows one to define parallel transport along paths and higher paths, hence a kind of dynamics. In fact there is differential cohomology in every cohesive $\infty$-topos.
Now, in a discrete object there are no non-trivial paths (formally because $\Pi \; \mathrm{Disc} \simeq \mathrm{Id}$ by the fact that $\mathrm{Disc}$ is full and faithful), so there is "no dynamics" in a discrete object hence "no becoming", if you wish. Conversely in a codiscrete object every sequence of points whatsoever counts as a path, hence the distinction between the space and its "dynamics" disappears and so we have "pure becoming", if you wish.
Onwards. Notice next that every adjoint triple induces an adjoint pair of a comonad and a monad. In the present situation we get
$$ (\mathrm{Disc} \;\Gamma \dashv \mathrm{coDisc}\; \Gamma) : \mathcal{E} \to \mathcal{E} $$
This is what Lawvere calls the skeleton and the coskeleton on p. 7. In the $\infty$-topos context the left adjoint $\mathbf{\flat} := \mathrm{Disc} \; \Gamma$ has the interpretation of sending any object $A$ to the coefficient for cohomology of local systems with coefficients in $A$.
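That the two composites are adjoint to each other follows by chaining the adjunctions $\mathrm{Disc} \dashv \Gamma$ and $\Gamma \dashv \mathrm{coDisc}$. Written out for objects $X, Y \in \mathcal{E}$ (this is the standard argument, spelled out here for convenience):

$$ \mathrm{Hom}_{\mathcal{E}}(\mathrm{Disc}\,\Gamma X,\; Y) \;\cong\; \mathrm{Hom}_{\mathcal{S}}(\Gamma X,\; \Gamma Y) \;\cong\; \mathrm{Hom}_{\mathcal{E}}(X,\; \mathrm{coDisc}\,\Gamma Y). $$

The first isomorphism is $\mathrm{Disc} \dashv \Gamma$ applied to the set $\Gamma X$, the second is $\Gamma \dashv \mathrm{coDisc}$ applied to the set $\Gamma Y$, and both are natural in $X$ and $Y$.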
The paragraph wrapping from page 7 to 8 comments on the possibility that the base topos $\mathcal{S}$ is not just that of sets, but something richer. An example of this that I am kind of fond of is that of super cohesion (in the sense of superalgebra and supergeometry): the topos of smooth super-geometry is cohesive over the base topos of bare super-sets.
What follows on page 9 are thoughts that, as far as I am aware, Lawvere has not formalized further since. But then at the bottom of p. 9 he gets to the axiomatic identification of infinitesimal or formal spaces in the cohesive topos. In his most recent article on this what he says here on p. 9 is formalized as follows: he says an object $X \in \mathcal{E}$ is infinitesimal if the canonical morphism $\Gamma X \to \Pi_0 X$ is an isomorphism. To see what this means, suppose that $\Pi_0 X = *$, hence that $X$ is connected. Then the isomorphism condition means that $X$ has exactly one global point. But $X$ may be bigger: it may be a formal neighbourhood of that point, for instance it may be $\mathrm{Spec} \;k[x]/(x^2)$. A general $X$ for which $\Gamma X \to \Pi_0 X$ is an iso is hence a disjoint union of formal neighbourhoods of points.
Again, the meaning of this becomes more pronounced in the context of cohesive $\infty$-toposes: there objects $X$ for which $\Gamma X \simeq * \simeq \Pi X$ have the interpretation of being formal $\infty$-groupoids, for instance formally exponentiated $L_\infty$-algebras. And so there is $\infty$-Lie theory canonically in every cohesive $\infty$-topos.
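To see the "one point, but bigger" phenomenon concretely for $\mathrm{Spec}\; k[x]/(x^2)$, here is a short check under the assumption that $k$ is a field and that global points are read as $k$-points, i.e. $k$-algebra homomorphisms into $k$:

$$ \mathrm{Hom}_{k\text{-alg}}\big(k[x]/(x^2),\, k\big) \;\cong\; \lbrace a \in k \;:\; a^2 = 0 \rbrace \;=\; \lbrace 0 \rbrace, \qquad \dim_k\, k[x]/(x^2) \;=\; 2. $$

So there is exactly one point, $x \mapsto 0$, yet the algebra of functions $\lbrace a + b x \rbrace$ is strictly larger than the algebra $k$ of functions on a single point: the extra direction $b$ records the first-order infinitesimal thickening.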
I'll stop here. I have more discussion of all this at:
http://nlab.mathforge.org/schreiber/show/differential+cohomology+in+a+cohesive+topos