You seem to be muddling different things together.
$h$ is NOT the cost function. $h$ is the object you're trying to fit (estimate parameter values for); it is $h$ that's the linear regression function. You fit it by choosing some cost function $J$ that measures, for a given $(\theta_0, \theta_1)$, how well or badly $h_\theta$ fits the data relative to other parameter values ($J$ is big when the fit is bad and small when it's good). Minimizing the cost means you have the 'least costly' fit (the best fit to the data by your cost criterion). You minimize $J$ to get $h$ 'close' to the data.
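To make the distinction concrete, here is a minimal sketch (Python with NumPy, made-up data purely for illustration) in which $h$ and $J$ are separate objects: $h$ maps an input to a prediction, while $J$ maps a parameter pair to a single badness-of-fit number.

```python
import numpy as np

# Toy data, made up purely for illustration
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

def h(theta0, theta1, x):
    # The hypothesis: a line. This is the object being fitted.
    return theta0 + theta1 * x

def J(theta0, theta1):
    # The cost: a single number measuring how badly h fits the data.
    return np.sum((h(theta0, theta1, x) - y) ** 2)

print(J(0.0, 2.0))  # a line close to the data -> small cost (0.07)
print(J(0.0, 1.0))  # a worse line -> much bigger cost (31.87)
```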
Now $\theta_0$ and $\theta_1$ don't have "minimum values"; they have values that minimize $J$ -- it's $J$ that's at a minimum, not the $\theta$'s. The parameter estimates are at the argmin.
So you choose some $J$ that measures the overall 'badness of fit' - some measure of how far the data is from the given $h$.
The $J$ you have is the sum of squares of differences between $h$ (the line) and $y$ (the data). As you can see, it gets bigger when the fit is worse. It turns out to be a particularly convenient choice, as well as often satisfying people's notions of how a cost function should look.
The expression is just one way (though not the usual way for most of us; most statisticians would use a different notation) to write that sum of squares. Since $J$ is the sum of squares of residuals, choosing $h$ to minimize $J$ makes the fitted $h$ the least squares regression line.
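To see that equivalence in action, here is a small sketch (Python with NumPy/SciPy, made-up data): minimizing this $J$ as a generic function recovers the same line that a canned least squares routine produces.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

def J(theta):
    # Sum of squared residuals for the line theta[0] + theta[1] * x
    return np.sum((theta[0] + theta[1] * x - y) ** 2)

print(minimize(J, x0=np.zeros(2)).x)  # argmin of J, found numerically
print(np.polyfit(x, y, 1)[::-1])      # direct least squares: [intercept, slope]
```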
Other choices of $J$ are definitely possible; see, for example, $L_1$ (least absolute values) regression, or regression based on M-estimators, for some alternatives.
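For instance, here is a sketch contrasting the squared error cost with an $L_1$ cost on toy data containing one outlier (Nelder-Mead is just a convenient derivative-free minimizer, since the $L_1$ cost is not differentiable everywhere):

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 20.0])  # deliberate outlier at x = 4

def J_squares(theta):
    # Least squares cost: pulled strongly toward the outlier
    return np.sum((theta[0] + theta[1] * x - y) ** 2)

def J_absolute(theta):
    # Least absolute values (L1) cost: less sensitive to the outlier
    return np.sum(np.abs(theta[0] + theta[1] * x - y))

print(minimize(J_squares, np.zeros(2), method="Nelder-Mead").x)
print(minimize(J_absolute, np.zeros(2), method="Nelder-Mead").x)
```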
Best Answer
Linear regression minimizes squared error
$$ J(\theta) = \sum_{i=1}^n (y_i - \theta^T x_i)^2 $$
You might want to put $1/n$ in front, so that its units don't depend on the sample size $n$. The $1/2$ comes from the derivative
$$ \frac{d}{dx} \big(x^2\big) = 2x $$
whereas
$$ \frac{d}{dx} \Big( \frac{1}{2} x^2 \Big) = x $$
so putting $1/2$ in front makes the derivatives simpler to write, because you don't have to carry a factor of $2$ around. Using $1/(2n)$ does both.
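Concretely, with the $1/(2n)$ scaling the partial derivatives of the cost above come out clean:

$$ \frac{\partial}{\partial \theta_j} \, \frac{1}{2n} \sum_{i=1}^n (y_i - \theta^T x_i)^2 = -\frac{1}{n} \sum_{i=1}^n (y_i - \theta^T x_i)\, x_{ij}, $$

with no stray factor of $2$ to carry through, say, gradient descent updates.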
No matter which form you choose (plain sum of squared errors, or with $1/n$, $1/2$, or $1/(2n)$ in front), the minimizing $\theta$ is the same, because multiplying a function by a positive constant does not change where it attains its minimum. All of these forms are therefore equivalent.
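A quick sketch (Python, made-up data) makes that equivalence tangible: all four scalings return the same fitted parameters.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])
n = len(x)

def sse(theta):
    # Plain sum of squared errors
    return np.sum((theta[0] + theta[1] * x - y) ** 2)

# The same cost under four scalings: 1, 1/n, 1/2, 1/(2n)
for scale in (1.0, 1.0 / n, 0.5, 1.0 / (2 * n)):
    theta_hat = minimize(lambda t: scale * sse(t), np.zeros(2)).x
    print(scale, theta_hat)  # the argmin is identical (up to numerical error)
```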