Censored vs. inflated vs. hurdle
Censored, hurdle, and inflated models work by adding a point mass on top of an existing probability density. The difference lies in where the mass is added, and how. For now, just consider adding a point mass at 0, but the concept generalizes easily to other cases.
All of them imply a two-step data generating process for some variable $Y$:
- Draw to determine whether $Y = 0$ or $Y > 0$.
- If $Y > 0$, draw to determine the value of $Y$.
Inflated and hurdle models
Both inflated (usually zero-inflated) and hurdle models work by explicitly and separately specifying a probability $1 - \pi$ that $Y$ is forced to zero, so that the DGP becomes:
- Draw once from $Z \sim \operatorname{Bernoulli}(\pi)$ to obtain realization $z$.
- If $z = 0$, set $y = z = 0$.
- If $z = 1$, draw once from $Y^* \sim D^*(\theta^*)$ and set $y = y^*$.
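The two-step process above is easy to simulate. Here is a minimal sketch (parameter values are illustrative) contrasting a zero-inflated Poisson, where $D^*$ itself puts mass at zero, with a hurdle model built on a zero-truncated Poisson, where every zero comes from the Bernoulli gate:

```python
import numpy as np

rng = np.random.default_rng(0)
n, pi_, lam = 100_000, 0.7, 2.0  # pi_ = Pr(Z = 1); illustrative values

z = rng.binomial(1, pi_, size=n)  # the Bernoulli "gate"

# Zero-inflated Poisson: D* is a plain Poisson, so Pr(Y* = 0) > 0.
y_star = rng.poisson(lam, size=n)
y_zip = np.where(z == 1, y_star, 0)

# Hurdle (zero-truncated Poisson): D* puts no mass at 0, so resample
# any zero draws of Y* until they are positive.
y_pos = rng.poisson(lam, size=n)
while (y_pos == 0).any():
    idx = y_pos == 0
    y_pos[idx] = rng.poisson(lam, size=idx.sum())
y_hurdle = np.where(z == 1, y_pos, 0)

# The inflated model has more zeros: the gate's zeros plus D*'s own.
print((y_zip == 0).mean(), (y_hurdle == 0).mean())
```

The zero frequencies should come out near $(1 - \pi) + \pi e^{-\lambda}$ for the inflated model and $1 - \pi$ for the hurdle model.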
In an inflated model, $\operatorname{Pr}(Y^* = 0) > 0$. In a hurdle model, $\operatorname{Pr}(Y^* = 0) = 0$. That's the only difference.
Both of these models lead to a density with the following form:
$$
f_D(y) = \mathbb{I}(y = 0) \cdot \left(1 - \pi\right) + \mathbb{I}(y \geq 0) \cdot \pi \cdot f_{D^*}(y)
$$
where $\mathbb{I}$ is an indicator function. That is, a point mass is added at zero, and in this case that mass is $\operatorname{Pr}(Z = 0) = 1 - \pi$. You are free to estimate $\pi$ directly, or to set $g(\pi) = X\beta$ for some invertible $g$ like the logit function. $D^*$ can also depend on $X\beta$. In that case, the model works by "layering" a logistic regression for $Z$ under another regression model for $Y^*$.
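This "layered" density can be written down directly. The sketch below evaluates it for a zero-inflated Poisson regression in which both the gate probability and the Poisson mean depend on $x\beta$; the function and parameter names (`zip_pmf`, `beta_z`, `beta_y`) are illustrative, not from any library:

```python
import numpy as np
from scipy.special import expit
from scipy.stats import poisson

def zip_pmf(y, x, beta_z, beta_y):
    """Density of Y: a point mass 1 - pi at 0, layered over pi * f_D*(y)."""
    pi = expit(x * beta_z)          # Pr(Z = 1), logit link g^{-1}
    lam = np.exp(x * beta_y)        # Poisson mean for D*, log link
    pmf = pi * poisson.pmf(y, lam)  # the D* layer, weighted by pi
    return np.where(y == 0, (1 - pi) + pmf, pmf)

# Sanity check: the density sums to 1 over the support.
x = 0.5
total = sum(zip_pmf(y, x, beta_z=1.2, beta_y=0.3) for y in range(200))
print(total)
```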
Censored models
Censored models also add mass at a boundary. They accomplish this by "cutting off" a probability distribution, and then "bunching up" the excess at that boundary. The easiest way to conceptualize these models is in terms of a latent variable $Y^* \sim D^*$ with CDF $F_{D^*}$. Then $\operatorname{Pr}(Y^* \leq y^*) = F_{D^*}(y^*)$. This is a very general model; regression is the special case in which $F_{D^*}$ depends on $X\beta$.
The observed $Y$ is then assumed to be related to $Y^*$ by:
$$
Y = \begin{cases}
0 & Y^* \leq 0 \\
Y^* & Y^* > 0
\end{cases}
$$
This implies a density of the form
$$
f_D(y) = \mathbb{I}(y = 0) \cdot F_{D^*}(0) + \mathbb{I}(y \geq 0) \cdot \left(1 - F_{D^*}(0)\right) \cdot f_{D^*}(y \mid Y^* > 0)
$$
where $f_{D^*}(y \mid Y^* > 0) = f_{D^*}(y) / \left(1 - F_{D^*}(0)\right)$ is the density of $Y^*$ conditional on clearing the boundary, so the last two factors multiply out to the unconditional $f_{D^*}(y)$. The construction extends easily to boundaries other than zero.
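The "bunched up" mass at the boundary is exactly $F_{D^*}(0)$. A quick simulation checks this, assuming a Gaussian latent variable (the Tobit setup); the parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu, sigma, n = 0.8, 1.0, 200_000  # illustrative latent-Gaussian parameters

y_star = rng.normal(mu, sigma, size=n)  # latent Y* ~ D*
y = np.maximum(y_star, 0.0)             # censor: cut off at 0, clump the excess

# The share of exact zeros should match the cut-off mass F_{D*}(0).
print((y == 0).mean(), norm.cdf(0, loc=mu, scale=sigma))
```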
Putting it together
Look at the densities:
$$\begin{align}
f_D(y) &= \mathbb{I}(y = 0) \cdot \left(1 - \pi\right) &+ &\mathbb{I}(y \geq 0) \cdot \pi &\cdot &f_{D^*}(y) \\
f_D(y) &= \mathbb{I}(y = 0) \cdot F_{D^*}(0) &+ &\mathbb{I}(y \geq 0) \cdot \left(1 - F_{D^*}(0)\right) &\cdot &f_{D^*}(y \mid Y^* > 0)
\end{align}$$
and notice that they both have the same form:
$$
\mathbb{I}(y = 0) \cdot \delta + \mathbb{I}(y \geq 0) \cdot \left(1 - \delta\right) \cdot f_{D^*}(y)
$$
because they accomplish the same goal: building the density for $Y$ by adding a point mass $\delta$ to the density for some $Y^*$. The inflated/hurdle model sets $\delta = 1 - \pi$ by way of an external Bernoulli process. The censored model determines $\delta = F_{D^*}(0)$ by "cutting off" $Y^*$ at a boundary and then "clumping" the leftover mass at that boundary.
In fact, you can always postulate a hurdle model that "looks like" a censored model. Consider a hurdle model where $D^*$ is parameterized by $\mu = X\beta$ and $Z$ is parameterized by $g(\pi) = X\beta$. Then you can just set $g = F_{D^*}^{-1}$. An inverse CDF is always a valid link function in logistic regression, and indeed one reason logistic regression is called "logistic" is that the standard logit link is actually the inverse CDF of the standard logistic distribution.
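The claim about the logit link is easy to verify numerically: `scipy.special.logit` agrees exactly with the inverse CDF (`ppf`) of the standard logistic distribution, just as the probit link is the inverse CDF of the standard normal:

```python
import numpy as np
from scipy.special import logit
from scipy.stats import logistic, norm

p = np.array([0.1, 0.5, 0.9])

print(logit(p))         # the logit link
print(logistic.ppf(p))  # inverse CDF of the standard logistic: identical
print(norm.ppf(p))      # probit link: inverse CDF of N(0, 1), for comparison
```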
You can come full circle on this idea, as well: Bernoulli regression models with any inverse CDF link (like the logit or probit) can be conceptualized as latent variable models with a threshold for observing 1 or 0. Censored regression is a special case of hurdle regression where the implied latent variable $Z^*$ is the same as $Y^*$.
Which one should you use?
If you have a compelling "censoring story," use a censored model. One classic use of the Tobit model -- the econometric name for censored Gaussian linear regression -- is modeling survey responses that are "top-coded." Wages are often reported this way: all wages above some cutoff, say 100,000, are simply recorded as 100,000. This is not the same thing as truncation, in which individuals with wages above 100,000 are not observed at all, as might happen in a survey administered only to individuals with wages under 100,000.
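The distinction between top-coding and truncation shows up directly in the data. In this sketch (wage distribution and cutoff are illustrative), censoring keeps every observation but piles them up at the cap, while truncation drops the high earners entirely:

```python
import numpy as np

rng = np.random.default_rng(2)
wages = rng.lognormal(mean=11.0, sigma=0.5, size=10_000)  # illustrative wages
cap = 100_000

top_coded = np.minimum(wages, cap)   # censoring: everyone stays in the sample
truncated = wages[wages < cap]       # truncation: high earners vanish entirely

print(len(top_coded), (top_coded == cap).sum())  # n unchanged, pile-up at cap
print(len(truncated))                            # n shrinks, no pile-up
```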
Another use for censoring, as described by whuber in the comments, is when you are taking measurements with an instrument that has limited precision. Suppose your distance-measuring device could not tell the difference between 0 and $\epsilon$. Then you could censor your distribution at $\epsilon$.
Otherwise, a hurdle or inflated model is a safe choice. It usually isn't wrong to hypothesize a general two-step data generating process, and it can offer some insight into your data that you might not have had otherwise.
On the other hand, you can use a censored model without a censoring story to create the same effect as a hurdle model without having to specify a separate "on/off" process. This is the approach of Sigrist and Stahel (2010), who censor a shifted gamma distribution just as a way to model data in $[0, 1]$. That paper is particularly interesting because it demonstrates how modular these models are: you can actually zero-inflate a censored model (section 3.3), or you can extend the "latent variable story" to several overlapping latent variables (section 3.1).
Truncation
Edit: removed, because this solution was incorrect
Best Answer
Wikipedia describes the Tobit model as follows:
$$y_i = \begin{cases} y_i^* &\text{if} \quad y_i^* > 0 \\ 0 &\text{if} \quad y_i^* \le 0 \end{cases}$$
$$y_i^* = \beta x_i + u_i$$
$$u_i \sim N(0,\sigma^2)$$
I will adapt the above model to your context and offer a plain-English interpretation of the equations, which may be helpful.
$$y_i = \begin{cases} y_i^* &\text{if} \quad y_i^* \le 30 \\ 30 &\text{if} \quad y_i^* > 30 \end{cases}$$
$$y_i^* = \beta x_i + u_i$$
$$u_i \sim N(0,\sigma^2)$$
In the above set of equations, $y_i^*$ represents a subject's ability. Thus, the first equation states the following:
Our measurement of ability is cut off on the higher side at 30 (i.e., we capture the ceiling effect). In other words, if a person's ability is greater than 30, then our measurement instrument fails to record the actual value and instead records 30 for that person. Note that the model states $y_i = 30 \quad \text{if} \quad y_i^* > 30$.
If, on the other hand, a person's ability is at most 30, then our measurement instrument is capable of recording the actual value. Note that the model states $y_i = y_i^* \quad \text{if} \quad y_i^* \le 30$.
We model the ability, $y_i^*$, as a linear function of our covariates $x_i$ and an associated error term to capture noise.
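The whole ceiling-effect setup can be simulated in a few lines. Everything here is illustrative (the covariate, $\beta$, and $\sigma$ are made up); the point is that every subject whose latent ability exceeds 30 gets recorded as exactly 30:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
x = rng.uniform(0, 10, size=n)   # illustrative covariate
beta, sigma = 3.0, 4.0           # illustrative coefficients

ability = beta * x + rng.normal(0, sigma, size=n)  # latent y_i^*
recorded = np.minimum(ability, 30)                 # instrument tops out at 30

# Share of subjects hitting the ceiling:
print((recorded == 30).mean())
```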
I hope that is helpful. If some aspect is not clear feel free to ask in the comments.