In Section 7.7 "The Finite 2-transitive Groups" of the book Permutation groups by John D. Dixon and Brian Mortimer, the authors describe the complete list of finite 2-transitive groups without proofs but with references.
They list eight infinite families: the alternating, symmetric, affine and projective groups in their natural actions, as well as the less known groups of Lie type: the symplectic groups, the Suzuki groups, the unitary groups and the Ree groups. The symplectic groups have two distinct 2-transitive actions, the last three classes are 2-transitive on the sets of points in their action on appropriate Steiner systems. Additional there are 10 sporadic examples of 2-transitive groups.
No. This is not true.
For example, set
$$A=\frac{1}{2}\begin{bmatrix}
1&0&1&0\\
0&0&1&1\\
1&1&0&0\\
0&1&0&1
\end{bmatrix}.$$
Then $\text{Per}(A)=1/8$ and $\text{Per}(A^2)=9/64.$
Since the collection of all matrices $B$ where $\text{per}(B^2)>\text{per}(B)$ is open, there is some $\epsilon$ where if $\|B-A\|<\epsilon$, then $\text{per}(B^2)>\text{per}(B)$. In this case, one can select a doubly stochastic $B$ with strictly positive entries with $\|B-A\|<\epsilon$ and therefore $\text{per}(B^2)>\text{per}(B)$.
It looks like the class of arithmetic means of two permutation matrices is the best source of examples of doubly stochastic matrices $A$ with $\text{Per}(A^2)>\text{Per}(A).$
More generally, if $P$ is a permutation matrix, and $A=\frac{P+P^{*}}{2}$, then in the computer experiments that I have ran, we usually have $\text{Per}(A^2)>\text{Per}(A).$
Testing a conjecture on convex sets
We observe that the permutation matrices are the extreme points among all doubly stochastic matrices. Recall that the Krein-Milman theorem states that every compact convex subset of a Hausdorff locally convex topological vector space is in the convex closure of its extreme points. Furthermore, recall that Caratheodory's theorem states that if $C\subseteq\mathbb{R}^{n}$ is the convex closure of a set $A$, then every point in $C$ is a convex combination of at most $n+1$ many points of $A$. By combining these two theorems together, we conclude that for every compact convex subset $C$ of $\mathbb{R}^n$, every element in $C$ is in the convex closure of at most $n+1$ many extreme points in $C$.
In general, if $U$ is a totally bounded convex subset of a real affine space $V$ with $\dim(V)=n$, and one is trying to find an element in $U$ that satisfies a property using a computer search, one should test elements near the extreme points of $\overline{U}$. If this does not work, then whenever $1\leq d\leq n+1$, one should test points near convex combinations of $d+1$ many extreme points.
For this kind of problem, it also appears that it is a good idea to run all these tests to see if there is a symmetric counterexample. If this still did not work, then I would use a gradient descent to minimize $\text{per}(B)-\text{per}(B^2).$
Best Answer
See the Wikipedia page for the Birkhoff polytope. It says that equivalent results were obtained by Steinitz in 1894 and by Kőnig in 1916.