Whether the halting of a particular Turing machine is provable (or disprovable) depends, of course, on which axioms you are willing to accept as the starting point of a proof.
However, Gödel's Incompleteness Theorem shows how to construct, for every "reasonable" theory $T$ (that is, a consistent, effectively axiomatized set of axioms strong enough to encode arithmetic), a particular logical formula $G$ such that $T$ cannot prove or disprove the statement "$G$ has a proof from $T$".
Thus, once you've chosen your theory $T$ (for example, Peano Arithmetic or ZFC set theory), construct a Turing machine $M$ which iterates through all possible strings of symbols and checks whether each of them is a valid proof of $G$. If it finds a valid proof, it halts; otherwise it keeps searching.
Since $T$ cannot tell us whether $G$ has a proof, it cannot tell us whether $M$ will halt either.
The Gödel sentence itself is rather long and complicated if we want to write it out completely -- but fortunately we don't need to do that here: we can just program $M$ to start by computing $G$ according to Gödel's instructions, and then begin looking for proofs. Writing an algorithm that outputs $G$ is much more tractable than writing down $G$ itself. So if you want to see $M$ explicitly, a competent CS graduate student could probably write it down in at most a few thousand lines of Scheme or Haskell.
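To make the proof-search loop concrete, here is a minimal Python sketch. The proof checker `is_proof_of` is only a stub standing in for a real (decidable) checker of derivations from $T$, and the `max_length` bound is an addition so the demo terminates; the actual machine $M$ would search without bound.

```python
from itertools import count, product

ALPHABET = ["0", "1"]  # symbols of some fixed encoding of proofs

def is_proof_of(candidate, formula):
    """Stub: a real implementation would check that `candidate` encodes a
    valid derivation of `formula` from the axioms of T.  Proof checking
    (unlike proof search) is decidable, so this is computable in principle."""
    return False

def search_for_proof(formula, max_length=None):
    """Enumerate all finite strings in order of length and halt as soon
    as one of them is a valid proof of `formula`.  Without `max_length`
    this loops forever exactly when no proof exists."""
    for n in count(1):
        if max_length is not None and n > max_length:
            return None  # bound reached (demo only)
        for chars in product(ALPHABET, repeat=n):
            candidate = "".join(chars)
            if is_proof_of(candidate, formula):
                return candidate  # M halts here
```

With the stub checker the search never succeeds, so $M$'s halting hinges entirely on whether a genuine proof of $G$ exists.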
Shorter solution: If we strip away the parts of Gödel's construction that are not immediately relevant here, we get this machine:
1. Let M be my own source code (using any standard quine technique)
2. Search for a proof in ZFC of "M does not halt".
3. If one is found, stop.
If ZFC proves that the machine does not halt, then the machine will eventually find that proof and halt, so ZFC has proved a false statement. On the other hand, if ZFC proves that the machine halts, then (assuming ZFC is sound) it really does halt, which can only happen because it finds a proof that it does not halt; so ZFC proves both that the machine halts and that it doesn't, and is therefore inconsistent.
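The self-reference in step 1 can be realized with the standard quine trick: a template string substituted into itself. The sketch below is an assumption-laden toy, not a working proof searcher -- `zfc_proves` is a stub, and the `max_steps` bound exists only so the demo can be run.

```python
# Step 1: obtain this function's own text via the standard quine trick:
# a template string that is substituted into itself.
def my_source():
    template = 'def my_source():\n    template = %r\n    return template %% template'
    return template % template

def zfc_proves(statement, candidate):
    """Stub: a real checker would verify that `candidate` encodes a valid
    ZFC derivation of `statement`.  Checking a given proof is decidable."""
    return False

def machine(max_steps=None):
    """Steps 2-3: search for a ZFC proof of 'M does not halt'; halt if found.
    The optional bound is for demonstration only -- the real M is unbounded."""
    M = my_source()
    statement = "the machine with source %r does not halt" % M
    step = 0
    while max_steps is None or step < max_steps:
        if zfc_proves(statement, str(step)):
            return True   # a proof was found: the machine halts
        step += 1
    return False          # demo bound reached; the real M would keep searching
```

Here `my_source()` returns exactly the text of its own definition, which is all the construction needs from the quine technique.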
There is no single mathematical definition of an "algorithm". There is, however, a well-accepted definition of a computable function: the class of computable functions can be defined in terms of Turing machines, register machines, and many other models of computation.
However, this does not settle the problem of defining an "algorithm", because every computable function has many different programs that compute it, and it is not clear how to tell whether two programs implement the same algorithm or different algorithms.
This is also why "program" or "Turing machine" cannot be used as a definition of an algorithm. A key aspect of the term "algorithm" is that the same algorithm can be turned into many different programs.
So the definition of "computable function" is too coarse to capture the meaning of "algorithm", while the definition of "program" is too fine.
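A small illustrative example (my own, not from the answer): the two Python programs below compute the same function, yet most people would say they embody different algorithms -- and the second algorithm could in turn be written as many different programs in many languages.

```python
def fib_recursive(n):
    """Naive recursion: exponential time."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    """Iteration with two accumulators: linear time.
    Same computable function, arguably a different algorithm."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

"Computable function" identifies these two; "program" distinguishes even trivial rewrites of either one; "algorithm" sits somewhere in between, and that is exactly the notion we cannot pin down.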
In computability theory and computer science, "algorithm" is used only as an informal term, or to refer to a specific set of instructions. In other words, we know particular algorithms, such as Dijkstra's algorithm, but we have no definition of "algorithm" in general.
For a partial list of attempts to define "algorithm", see algorithm characterizations on Wikipedia. At least 15 different attempts are described there.
At the end of the following article, I give a type-0 grammar of propositional logic:
Thoughts on generative grammars and their use in automated theorem proving based on neural networks
The grammar contains 60 non-terminal symbols and 264 production rules. I think it is just routine to construct a "real" grammar of some first-order theory (say, group theory) in the same way. Meanwhile, this is actually important: as I show in the article, such a grammar gives us a good arrangement of our data for proving theorems automatically by machine learning.