MaxEnt is a method for designing models, whereas SoftMax is a model in itself.
MaxEnt is a method for describing an observer's state of knowledge about some system and its variables. For instance, suppose I'm interested in studying a situation that depends on a single non-negative real parameter $x$, and I know (from experimental data or from my theoretical model) that the only relevant characteristic of the distribution of this parameter is its mean. I can then impose the constraint:
$$ \mathbb{E}_{p(x)}[x] = \int_{0}^\infty \mathrm{d}x\ x\, p(x) \equiv \lambda $$
where $\lambda$ is determined experimentally. Then, using the MaxEnt approach, the "most reasonable" probability distribution (the one that assumes the least about $p(x)$ beyond this constraint and normalization) is the exponential distribution with mean $\lambda$. The restriction to $x \geq 0$ matters: on the whole real line, a mean constraint alone does not give a normalizable maximum-entropy distribution.
$$ p(x|\lambda) = \frac{1}{\lambda} e^{-x/\lambda}, \qquad x \geq 0 $$
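As a rough numerical sanity check (a sketch, not a proof), one can compare the differential entropy of a few densities on $x \geq 0$ that share the same mean. The value `mean = 2.0`, the helper `differential_entropy`, and the choice of competitors (uniform and half-normal) are my own assumptions for illustration; the integral is approximated with a simple midpoint rule.

```python
import math

def differential_entropy(pdf, upper, n=200_000):
    """Approximate -integral of p(x) ln p(x) over [0, upper] with a midpoint rule."""
    dx = upper / n
    total = 0.0
    for k in range(n):
        p = pdf((k + 0.5) * dx)
        if p > 0.0:  # treat 0 * ln(0) as 0
            total -= p * math.log(p) * dx
    return total

mean = 2.0  # hypothetical "experimentally measured" mean

def exponential(x):
    # Exponential density with the prescribed mean.
    return math.exp(-x / mean) / mean

def uniform(x):
    # Uniform density on [0, 2 * mean], which also has the prescribed mean.
    return 1.0 / (2.0 * mean) if x <= 2.0 * mean else 0.0

# Half-normal density whose scale is chosen so that its mean equals `mean`.
sigma = mean * math.sqrt(math.pi / 2.0)

def half_normal(x):
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-x * x / (2.0 * sigma * sigma))

candidates = {"exponential": exponential, "uniform": uniform, "half_normal": half_normal}
entropies = {name: differential_entropy(pdf, upper=100.0 * mean)
             for name, pdf in candidates.items()}

for name, h in sorted(entropies.items(), key=lambda item: -item[1]):
    print(f"{name:12s} entropy = {h:.4f} nats")
```

Among these three candidates the exponential density should come out on top, with an entropy close to the closed-form value $1 + \ln(\text{mean})$. This only compares a handful of hand-picked densities, so it illustrates the MaxEnt result rather than establishing it.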
This method is extremely useful and has many applications in statistical physics, information theory, statistics, machine learning, et cetera. More information can be found on Wikipedia and in many other sources.
More generally, one can use Discrete MaxEnt with the constraints $\mathbb{E}_p[f_i(y_j)] = \sum_{j=1}^C f_i(y_j)p(y_j) \equiv F_i$ for $i = 1, \dots, K$ to obtain the probability distribution:
$$ p_j = p(y_j) = \frac1Z \exp\left( \sum_{i=1}^K \lambda_i f_i(y_j) \right) $$
which can be developed to become a softmax function (I haven't done it myself, but I suspect it must be something along the lines of this paper).
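To make the connection a bit more concrete, here is a minimal sketch showing that the discrete MaxEnt form is already a softmax applied to the linear scores $z_j = \sum_i \lambda_i f_i(y_j)$. The multipliers `lambdas` and the feature table `features` are made-up numbers for illustration; the sketch does not solve for the $\lambda_i$ from the constraints $F_i$, it only checks the functional form.

```python
import math

def maxent_distribution(lambdas, features):
    """p_j = exp(sum_i lambda_i * f_i(y_j)) / Z, with features[i][j] = f_i(y_j)."""
    n_outcomes = len(features[0])
    weights = [
        math.exp(sum(lam * f[j] for lam, f in zip(lambdas, features)))
        for j in range(n_outcomes)
    ]
    z = sum(weights)  # partition function Z
    return [w / z for w in weights]

def softmax(scores):
    """Standard softmax, shifted by the maximum score for numerical stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: K = 2 features, C = 3 outcomes.
lambdas = [0.5, -1.0]
features = [
    [1.0, 0.0, 2.0],  # f_1(y_1), f_1(y_2), f_1(y_3)
    [0.0, 1.0, 1.0],  # f_2(y_1), f_2(y_2), f_2(y_3)
]

# Linear scores z_j = sum_i lambda_i * f_i(y_j)
scores = [sum(lam * f[j] for lam, f in zip(lambdas, features)) for j in range(3)]

p_maxent = maxent_distribution(lambdas, features)
p_softmax = softmax(scores)
print(p_maxent)
print(p_softmax)
```

The two printed lists should agree up to floating-point error. In a classifier, the features would additionally depend on the input, and the $\lambda_i$ would play the role of learned weights.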
tl;dr MaxEnt is a method for developing probabilistic models, so it can also give us classification models that are not SoftMax. It all depends on the (informational) assumptions of your model.
Best Answer
A neural network can be considered as a networked set of logistic regression units.
While a single logistic regression unit can act as a classifier on its own, it is not well suited to problems where the input dimension is very high and the data are not linearly separable.
By using multiple such units, a neural network attempts to approximate any given function. The more important aspect of using a neural network, however, is knowing how to train each of these units.
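As a small illustration of the "networked logistic units" idea, here is a sketch of three logistic units wired together to compute XOR, a standard example of a problem that is not linearly separable and so cannot be solved by one logistic unit alone. The weights are hand-picked by me for illustration, not trained, and the names `logistic_unit` and `tiny_network` are hypothetical.

```python
import math

def logistic_unit(inputs, weights, bias):
    """One logistic regression unit: sigmoid of a weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def tiny_network(x1, x2):
    """Two hidden logistic units feeding one output logistic unit."""
    h_or = logistic_unit([x1, x2], [20.0, 20.0], -10.0)     # roughly OR(x1, x2)
    h_nand = logistic_unit([x1, x2], [-20.0, -20.0], 30.0)  # roughly NAND(x1, x2)
    # The output unit acts roughly like AND(h_or, h_nand), which is XOR(x1, x2).
    return logistic_unit([h_or, h_nand], [20.0, 20.0], -30.0)

# Round the network output to a hard 0/1 label for each binary input pair.
xor_outputs = {(a, b): round(tiny_network(a, b)) for a in (0, 1) for b in (0, 1)}
print(xor_outputs)
```

Each unit on its own draws a single linear decision boundary; it is the composition of units that produces the non-linear one. In practice the weights would be found by training (for example with backpropagation) rather than set by hand.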
Here's another question similar to yours: Difference between logistic regression and neural networks