The solution requires special functions, namely the regularized incomplete beta function $I_{\sin^2 \Phi}\left(\frac{n-1}{2}, \frac{1}{2}\right)$ and the Gamma function. The result and its derivation can be found, for example, in the very nice article by S. Li here. Let $\Phi$ be the angle of the cap, so $a = R\sin \Phi$.
Then, for $\Phi \le \pi/2$, the surface area of the cap is
$$
A(\Phi) = R^{n-1} \frac{\pi^{n/2}}{\Gamma(n/2)} I_{\sin^2\Phi}\left(\frac{n-1}{2}, \frac{1}{2}\right)
$$
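As a quick numerical sanity check of this formula, here is a minimal Python sketch. It assumes $0\le\Phi\le\pi/2$ and uses only the standard library: rather than calling a special-function library, it evaluates the regularized incomplete beta function through the substitution $t=\sin^2\theta$, which gives $I_{\sin^2\Phi}\left(\frac{n-1}{2},\frac12\right)=\frac{2}{B\left(\frac{n-1}{2},\frac12\right)}\int_0^\Phi\sin^{n-2}\theta\,d\theta$, and integrates with Simpson's rule. The function names and the step count are my own choices for illustration, not something taken from the article.

```python
import math

def reg_inc_beta_half(a, phi, steps=2000):
    """Approximate I_{sin^2(phi)}(a, 1/2) for 0 <= phi <= pi/2 (hypothetical helper)."""
    # Complete beta function B(a, 1/2), written with Gamma functions.
    beta = math.gamma(a) * math.gamma(0.5) / math.gamma(a + 0.5)
    # Integrand after substituting t = sin^2(theta): sin(theta)^(2a - 1).
    f = lambda theta: math.sin(theta) ** (2 * a - 1)
    # Composite Simpson's rule on [0, phi]; `steps` must be even.
    h = phi / steps
    total = f(0.0) + f(phi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3
    return 2 * integral / beta

def cap_area(n, R, phi):
    """Cap area on a sphere of radius R in n-dimensional space, per the formula above."""
    a = (n - 1) / 2
    prefactor = R ** (n - 1) * math.pi ** (n / 2) / math.gamma(n / 2)
    return prefactor * reg_inc_beta_half(a, phi)

# For n = 3 the result should agree with the familiar 2*pi*R^2*(1 - cos(phi)).
phi = math.pi / 3
print(cap_area(3, 1.0, phi), 2 * math.pi * (1 - math.cos(phi)))
```

For $n=2$ the same function returns the arc length $2R\Phi$, which is another easy case to check by hand.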
This is actually an interesting question, because it involves how to define "area" on a curved surface. The examples you have provided are developable surfaces: after a few cuts they can be flattened onto a plane, and you can then compute the flattened area. You can never do this to a sphere, because no matter how small a patch of a sphere is, it cannot be flattened onto a plane without distortion. The idea instead is to break the sphere into small patches, each flat enough that you can compute its area as if it were flat, and then add up the areas of the patches.
Mathematically, suppose $S$ is a sphere. The above procedure is stated as:
1. Break up $S$ into patches $P_1,\dots,P_n$, where each $P_i$ is a patch that is flat enough, and $n$ is the number of patches you have.
2. Compute $\operatorname{Area}(P_i)$ as if each $P_i$ were flat. As suggested by levap, one way to do it is to project each patch onto one of its tangent planes. Note that this is not the only way to approximate a patch, and a method that seems correct at first glance need not actually be correct; see Update 2 for an example. There is also discussion about this in the comments.
3. Use $\operatorname{Area}(P_1)+\dots+\operatorname{Area}(P_n)$ as an approximation of the area of $S$.
If the patches are small enough, then the approximation should be a good one. But if you want better precision, use smaller patches and do the above again.
To make the math precise (I can't guarantee that a third-grade student can understand this): as you take smaller and smaller patches, the value of the approximation above should tend to a fixed number, and that number is the mathematical definition of the area.
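To see the limiting process numerically, here is a small Python sketch. It is only one possible choice of patches, and the choice is my own assumption: the sphere is cut along lines of latitude and longitude, and each patch is treated as a flat rectangle whose sides are the arc lengths through its midpoint. The function name and grid sizes are hypothetical.

```python
import math

def sphere_area_by_patches(R, n_lat, n_lon):
    """Sum the areas of latitude-longitude patches, each treated as a flat rectangle."""
    d_theta = math.pi / n_lat      # angular height of one patch (pole to pole)
    d_phi = 2 * math.pi / n_lon    # angular width of one patch (around the axis)
    total = 0.0
    for i in range(n_lat):
        theta_mid = (i + 0.5) * d_theta
        # Side along the meridian: R * d_theta.
        # Side along the parallel through the patch midpoint: R * sin(theta) * d_phi.
        patch = (R * d_theta) * (R * math.sin(theta_mid) * d_phi)
        # All n_lon patches in the same band are congruent.
        total += n_lon * patch
    return total

exact = 4 * math.pi
for k in (10, 100, 1000):
    approx = sphere_area_by_patches(1.0, k, k)
    print(k, approx, abs(approx - exact))
```

With finer grids the printed error shrinks, which is the "tends to a fixed number" behaviour described above, here with $4\pi R^2$ as the limit.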
P.S. For a visualization of this approximation, you can search online for sphere parametrization, or simply think of a football (soccer ball).
Update 1: Thanks to Leander, we have a visualization:
One might notice that this visualization is slightly different from cutting up a sphere: it takes sample points on the sphere and attaches triangles to these sample points. I want to remark that there is no essential difference between this and my method. The idea is the same: approximation.
Update 2: A comment (by Tanner Swett) mentions that the method of using a polygon mesh may be flawed. Indeed, the example of the Schwarz lantern shows that a pathological choice of polygon mesh may produce a limit different from the surface area. The following explanation should be helpful:
As I mentioned in step 2 above, if we are not careful with how we approximate the areas of the patches, the approximation may not work. The Schwarz lantern is an example where a particular choice of the approximating triangles leads to the following: if $T$ is a triangle used to approximate a patch $P$, it is possible that ${\rm Area}(T)/{\rm Area}(P)$ tends to a limit other than $1$. To illustrate this, consider a single triangle on the Schwarz lantern:
We assume the cylinder has total height $1$ and radius $1$. We take $n+1$ axial slices, and on each slice $m$ points. The area enclosed by the red curves is a patch on the cylinder, and the triangle enclosed by the blue dashed lines is the one used to approximate the patch. Let $P$ and $T$ denote the patch and the triangle respectively. We see that the ratio of the bottom edges of $P$ and $T$ tends to $1$ as $m\to\infty$. What really makes a difference is the ratio of their heights. Suppose along the vertical direction the height of $P$ is
$$h=1/n$$
Then the height of the triangle is
$$h_T=\sqrt{1/n^2+a^2}$$
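Here $a$ is the horizontal distance from the midpoint of the triangle's base (a chord of the slice) to the cylinder surface, and a simple computation gives $a=1-\cos(\pi/m)\approx\pi^2/(2m^2)$. Therefore,
$$h_T/h=\sqrt{1+n^2a^2}\approx\sqrt{1+\frac{\pi^4n^2}{4m^4}}$$
If $n$ grows at least as fast as $m^2$, then this ratio does not tend to $1$ (it tends to a limit bigger than $1$, or to infinity), and consequently ${\rm Area}(T)/{\rm Area}(P)\not\to1$.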
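The effect is easy to reproduce numerically. The sketch below is my own illustration, under the assumptions stated above (radius $1$, height $1$, so the true lateral area is $2\pi$) plus the standard count of $2mn$ congruent triangles in the lantern, each with base $2\sin(\pi/m)$ and height $h_T$. The function name is hypothetical.

```python
import math

def lantern_area(m, n):
    """Total area of a Schwarz lantern inscribed in a cylinder of radius 1 and height 1."""
    base = 2 * math.sin(math.pi / m)          # chord between adjacent points on a slice
    a = 1 - math.cos(math.pi / m)             # horizontal offset of the apex from the chord
    h_T = math.sqrt((1 / n) ** 2 + a ** 2)    # slanted height of one triangle
    # 2*m*n congruent triangles, each of area base * h_T / 2.
    return 2 * m * n * 0.5 * base * h_T

cylinder = 2 * math.pi
for m in (10, 100, 1000):
    same_order = lantern_area(m, m) / cylinder        # n = m
    quadratic = lantern_area(m, m * m) / cylinder     # n = m^2
    cubic = lantern_area(m, m ** 3) / cylinder        # n = m^3
    print(m, same_order, quadratic, cubic)
```

In the printed ratios, the column with $n=m$ approaches $1$, the column with $n=m^2$ settles at a value above $1$, and the column with $n=m^3$ keeps growing.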
This problem is unlikely to occur in practice: if you actually cut the cylinder into patches, you would use $h$ rather than $h_T$ to estimate the area. But again, it is hard to make precise which approximations are acceptable without using the language of calculus.
Best Answer
The following is due to Archimedes of Syracuse (287 BC - 212 BC).
First, remember that the lateral surface area of a right conical frustum is $$\pi(r_1+r_2)S=2\pi mS\tag 1$$ where $r_1$ is the radius of the base, $r_2$ is the radius of the top, $m=\frac{r_1+r_2}2$ is the "mid-radius", and $S$ is the lateral (slant) height.
Now, consider the following figure.
Here a horizontal slice of the black sphere has been replaced by a red frustum. Also there is a blue cylinder containing the sphere. Consider the blue slice of the cylinder matching the frustum in its height and matching the sphere in its radius. We want to show that the lateral surface area of the blue cylindrical slice equals the lateral surface area of the red frustum.
The lateral surface area of the blue cylindrical slice is $$2\pi rd.$$
Why should $2\pi rd$ agree with $2\pi mS$? The red filled triangle is similar to the triangle $MNC$, so $$\frac{m}{r'}=\frac d S,$$ and hence $$d=\frac{mS}{r'}.$$ Substituting this into the formula for the lateral cylindrical surface, we get $$2\pi \frac{r}{r'}mS.$$ Compare this to $(1)$.
Then think: If the spherical slice is very thin then $r'$ is very close to $r$ and the lateral surface area of the frustum is very close to the lateral surface area of the spherical slice... This is all independent from the actual position of the slice we chose.
Then slice the whole spherical cap. The lateral surface area of the slices will equal the corresponding lateral surface area of the cylinder.
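For readers who want to see the slicing argument in numbers, here is a minimal Python sketch. It is my own illustration rather than part of Archimedes' argument: a cap of height $H$ on a sphere of radius $R$ is cut into $k$ horizontal slices of equal thickness, each slice is replaced by the inscribed frustum with lateral area $\pi(r_1+r_2)S$ as in $(1)$, and the sum is compared with the matching cylinder area $2\pi RH$. The function name and the values of $R$, $H$, $k$ are hypothetical.

```python
import math

def cap_area_by_frustums(R, H, k):
    """Sum the lateral areas of k inscribed frustums approximating a cap of height H."""
    z0 = R - H                     # height of the cap's base plane above the center
    total = 0.0
    for i in range(k):
        z_lo = z0 + H * i / k
        z_hi = z0 + H * (i + 1) / k
        # Radii of the sphere's cross-sections at the two cutting planes
        # (max() guards against tiny negative values from rounding at the pole).
        r1 = math.sqrt(max(0.0, R * R - z_lo * z_lo))
        r2 = math.sqrt(max(0.0, R * R - z_hi * z_hi))
        # Slant height of the frustum between the two circles.
        S = math.hypot(z_hi - z_lo, r1 - r2)
        total += math.pi * (r1 + r2) * S
    return total

R, H = 1.0, 0.5
for k in (10, 100, 1000):
    print(k, cap_area_by_frustums(R, H, k), 2 * math.pi * R * H)
```

As $k$ grows, the frustum sum approaches the cylinder value $2\pi RH$, in line with the claim that thin slices of the sphere and of the cylinder have matching lateral areas.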
Calculus is needed only if the argument above seems shaky. But the genius did not need calculus.