From the formula for the MLE, I understand that you are dealing with the variant of the Geometric distribution in which the random variable can take the value $0$. In this case we have
$$E(X_1) = \frac {1-p}{p},\,\,\, \text {Var}(X_1) = \frac {1-p}{p^2}$$
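As a quick numerical sanity check of these moments, here is a small sketch using NumPy (note that NumPy's `geometric` sampler counts trials, with support $\{1,2,\dots\}$, so we subtract $1$ to get the variant that can take the value $0$):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3

# NumPy's geometric counts trials until the first success (support 1, 2, ...);
# subtracting 1 gives the "number of failures" variant with support 0, 1, 2, ...
x = rng.geometric(p, size=1_000_000) - 1

print(x.mean(), (1 - p) / p)        # both close to 2.333
print(x.var(), (1 - p) / p**2)      # both close to 7.78
```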
The Fisher Information of a single observation can be derived by applying its definition:
$$I_1(p) = \operatorname{E} \left[\left. \left(\frac{\partial}{\partial p} \ln f(X_1;p)\right)^2\right|p\right] = \operatorname{E} \left[\left. \left(\frac{\partial}{\partial p} \ln\left[(1-p)^{X_1}p\right] \right)^2\right|p \right]$$
$$=\operatorname{E} \left(-\frac {X_1}{1-p}+\frac 1p \right)^2 = \operatorname{E} \left(\frac {X_1^2}{(1-p)^2}+\frac 1{p^2}-2\frac {X_1}{(1-p)p}\right)$$
$$=\frac 1{p^2} - \frac {2}{(1-p)p} E(X_1)+ \frac {1}{(1-p)^2}\left(\text {Var}(X_1) + (E[X_1])^2\right)$$
$$=\frac 1{p^2}- \frac {2}{(1-p)p}\cdot \frac {1-p}{p} + \frac {1}{(1-p)^2}\left( \frac {1-p}{p^2} + \frac {(1-p)^2}{p^2}\right)$$
$$=\frac 1{p^2}- \frac {2}{p^2}+\frac {1}{(1-p)p^2}+\frac 1{p^2} = \frac {1}{(1-p)p^2}$$
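This closed form can be checked by Monte Carlo, reusing the score function derived above (a minimal sketch; the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.3
x = rng.geometric(p, size=1_000_000) - 1   # 0-based geometric samples

score = -x / (1 - p) + 1 / p               # d/dp log[(1-p)^x * p]
print(np.mean(score**2))                   # Monte Carlo estimate of I_1(p)
print(1 / ((1 - p) * p**2))                # closed form, about 15.87
```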
We also have
$$\frac {d\theta}{dp} = \frac {1}{(1-p)^2}$$
So
$$I_1(\theta) = I_1(p)\cdot \left(\frac {d\theta}{dp} \right)^{-2} = \frac {1}{(1-p)p^2}\cdot (1-p)^4 = \frac {(1-p)^3}{p^2}$$
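The same reparametrization step can be verified symbolically; a short SymPy sketch using only the quantities already derived above:

```python
import sympy as sp

p = sp.symbols('p', positive=True)

I_p = 1 / ((1 - p) * p**2)     # Fisher information of a single observation, in terms of p
dtheta_dp = 1 / (1 - p)**2     # d(theta)/dp as stated above

I_theta = sp.simplify(I_p / dtheta_dp**2)
print(I_theta)                 # equivalent to (1 - p)**3 / p**2
```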
There is no MLE of $\theta$.
$$L(\theta) = \prod_{i=1}^n p^{x_i} (1-p)^{1-x_i}$$
This expression of the likelihood function is incorrect. The left-hand side presents the likelihood as a function of $\theta$, but the right-hand side is a function of $p$.
You cannot fix this either, because $\theta$ is a non-injective function of $p$ and cannot be inverted. Given $\theta$ we have two possible values of $p$,
$$p=\frac{1}{2}\pm\sqrt{\frac{1}{4}-\theta}$$
so you cannot compute the probability of the data given $\theta$ unless you have a second parameter.
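To make the non-identifiability concrete, here is a tiny sketch (assuming $\theta = p(1-p)$, the relationship implied by the quadratic above): both roots reproduce the same $\theta$, so the likelihood of the data cannot tell them apart.

```python
import math

theta = 0.21                                   # any value in (0, 1/4)
root = math.sqrt(0.25 - theta)
p_lo, p_hi = 0.5 - root, 0.5 + root

print(p_lo, p_hi)                              # 0.3 and 0.7
print(p_lo * (1 - p_lo), p_hi * (1 - p_hi))    # both equal theta = 0.21
```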
"How do I compute the information matrix?"
If you had a valid MLE, then you could start with the information matrix of some parameter and rescale it by the square of the derivative of the transformation.
However, note that the transformed estimator can be biased, and the Fisher Information alone is then not an indication of the asymptotic variance. See this example: Why the variance of Maximum Likelihood Estimator(MLE) will be less than Cramer-Rao Lower Bound(CRLB)?
The special case of the variance of the statistic $\hat\theta = \hat{p}(1-\hat{p}) = \hat{p}-\hat{p}^2$ can be handled more directly by computing $$\begin{array}{rcl}
Var(\hat\theta) &=& E[(\hat{p}-\hat{p}^2)^2] - E[\hat{p}-\hat{p}^2]^2 \\
&=& E[\hat{p}^4-2\hat{p}^3+\hat{p}^2] - E[\hat{p}-\hat{p}^2]^2\\
&=& E[\hat{p}^4]-2E[\hat{p}^3]+E[\hat{p}^2] - \left(E[\hat{p}]-E[\hat{p}^2]\right)^2
\end{array}$$
which can be expressed in terms of the raw moments of $\hat{p}$ (a binomially distributed count scaled by $1/n$). Using the normal approximation $\hat{p} \approx \mathcal{N}\left(p,\, p(1-p)/n\right)$, these raw moments are
$$\begin{array}{rcl}
E[\hat{p}] &=& p \\
E[\hat{p}^2] &=& p^2 + \frac{p(1-p)}{n} \\
E[\hat{p}^3] &=& p^3 + 3 \frac{p^2(1-p)}{n} \\
E[\hat{p}^4] &=& p^4 + 6 \frac{p^3(1-p)}{n} + 3 \frac{p^2(1-p)^2}{n^2} \\
\end{array}$$
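These expressions can be compared against the exact raw moments of the binomial $\hat{p}$; a short SciPy sketch (the variable names are mine) shows they agree up to small higher-order corrections in $1/n$:

```python
import numpy as np
from scipy.stats import binom

n, p = 100, 0.3
k = np.arange(n + 1)
pmf = binom.pmf(k, n, p)
p_hat = k / n

# exact raw moments of p_hat = X/n with X ~ Binomial(n, p)
exact = [np.sum(pmf * p_hat**m) for m in (1, 2, 3, 4)]

# the approximate expressions listed above
approx = [p,
          p**2 + p*(1 - p)/n,
          p**3 + 3*p**2*(1 - p)/n,
          p**4 + 6*p**3*(1 - p)/n + 3*p**2*(1 - p)**2/n**2]

for m, (e, a) in enumerate(zip(exact, approx), start=1):
    print(m, e, a)   # differences are of order 1/n^2 and smaller
```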
The variance can then be written as
$$Var(\hat\theta)
= \frac{2p^4-4p^3+2p^2}{n^2} + \frac{-4p^4+8p^3-5p^2+p}{n}$$
where the second term becomes dominant for large $n$ and is the same as the result obtained from the Fisher Information matrix
$$p(1-p)/n \cdot \left(\frac{\text{d}\theta}{\text{d}p}\right)^2 = \frac{p(1-p)(1-2p)^2}{n}$$
In the case of $p=0.5$ this would lead to zero variance (or an infinite value in the information matrix). In that case you can still use the delta method with a second-order derivative, as demonstrated in this question: Implicit hypothesis testing: mean greater than variance and Delta Method
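A minimal Monte Carlo sketch (the helper names are mine, and the settings are arbitrary) comparing the simulated variance of $\hat\theta = \hat{p}(1-\hat{p})$ with the expression above and with the first-order delta-method term:

```python
import numpy as np

def var_theta_mc(p, n, reps=200_000, seed=2):
    """Simulated variance of p_hat * (1 - p_hat), with p_hat = X/n and X ~ Binomial(n, p)."""
    rng = np.random.default_rng(seed)
    p_hat = rng.binomial(n, p, size=reps) / n
    return np.var(p_hat * (1 - p_hat))

def var_theta_formula(p, n):
    """The approximate variance derived above."""
    return ((2*p**4 - 4*p**3 + 2*p**2) / n**2
            + (-4*p**4 + 8*p**3 - 5*p**2 + p) / n)

n = 50
for p in (0.3, 0.5):
    delta = p * (1 - p) * (1 - 2*p)**2 / n   # first-order delta-method term
    print(p, var_theta_mc(p, n), var_theta_formula(p, n), delta)

# The three values agree to leading order in 1/n. At p = 0.5 the delta-method
# term vanishes, while the simulated variance stays close to the 1/n^2 term.
```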
Best Answer
The MLE of $a$ is indeed the first order statistic $X_{(1)}=\min\limits_{1\le i\le n}X_i$, because the likelihood is non-decreasing in $a$ subject to the restriction $a<X_{(1)}$. Because the population distribution is Pareto, you can verify that $X_{(1)}$ also has a Pareto distribution, from which you can get its exact variance.
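A quick simulation sketch of this claim (assuming the parametrisation $f(x)=\theta a^\theta/x^{\theta+1}$ for $x\ge a$, which corresponds to SciPy's `pareto` with `scale=a`): the minimum of $n$ draws behaves like a Pareto with shape $n\theta$ and the same scale $a$.

```python
import numpy as np
from scipy.stats import pareto

a, theta, n = 2.0, 3.0, 5   # assumed scale, shape and sample size for illustration

samples = pareto.rvs(theta, scale=a, size=(100_000, n), random_state=3)
x_min = samples.min(axis=1)          # the MLE of a in each simulated sample

# X_(1) should again be Pareto, with shape n*theta and the same scale a
print(x_min.mean(), pareto.mean(n * theta, scale=a))
print(x_min.var(),  pareto.var(n * theta, scale=a))
```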
The UMVUE of $a$, however, depends on whether $\theta$ is known or not. In either case, it is found using the Lehmann-Scheffé theorem.
If $\theta$ is known, then $X_{(1)}$ is a complete sufficient statistic, and the UMVUE of $a$ is of the form $c(\theta) X_{(1)}$ for some function $c$.
If $\theta$ is not known, then $\left(\prod\limits_{i=1}^n X_i,X_{(1)}\right)$ or equivalently $\left(U, X_{(1)}\right)$ is a complete sufficient statistic where $U=\sum\limits_{i=1}^n (\ln X_i-\ln X_{(1)})$. Here $U$ has a certain Gamma distribution, and $U$ and $X_{(1)}$ can be shown to be independent. The resulting UMVUE of $a$ would be of the form $g(U)X_{(1)}$ for some function $g$.
Using the points above, you can find the exact variance of both the UMVUE and the MLE of $a$. The asymptotic relative efficiency of the MLE with respect to the UMVUE is then the limit of the ratio $\operatorname{Var}(\hat a)/\operatorname{Var}(X_{(1)})$ as $n\to \infty$, where $\hat a$ is the UMVUE of $a$. Note that Fisher information is not usually defined for non-regular distributions like this, where the support of the distribution depends on the parameter of interest.