Dampening can be thought of as a special case of shrinkage methods; these methods as a whole tend to reduce uncertainty in estimates (yet another circumstance of trading bias for variance, an ever-recurring theme in statistics, though in some cases, such as many involving variable-selection, shrinkage can reduce both bias and variance).
Forecasts are produced by many different methods with a wide variety of characteristics, so dampening looks different from one method to the next.
Consider, for example, a case where we just observe independent random values with a constant mean. Then dampening would involve moving our forecasts a little toward zero from the sample mean.
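As a minimal sketch of that constant-mean case (the factor name `shrink`, its value of 0.9 and the data are all made up for illustration), the dampened forecast is just the sample mean multiplied by a number between 0 and 1:

```python
from statistics import mean


def shrunken_mean_forecast(values, shrink=0.9):
    """Forecast future values as the sample mean pulled toward zero.

    `shrink` is a hypothetical factor in [0, 1]: 1 leaves the sample
    mean unchanged, 0 forecasts zero.
    """
    if not 0.0 <= shrink <= 1.0:
        raise ValueError("shrink must lie in [0, 1]")
    return shrink * mean(values)


observations = [4.0, 6.0, 5.0, 7.0, 3.0]  # made-up data with sample mean 5.0
forecast = shrunken_mean_forecast(observations, shrink=0.9)
print(forecast)
```

How much to shrink is a separate question; the value 0.9 here is arbitrary.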
In methods that estimate a local-mean element (some form of adaptive level, say), dampening could move the forecast of that recent level toward 0 (or, for slightly different models, toward the overall mean), so that forecasts tend to 'fall back' from the most recent excursion over longer forecast horizons.
In methods that have an element of recent local linear trend (some form of adaptive trend, say), dampening could move the forecast of that recent trend toward 0, so that forecast trends tend to 'flatten out' over longer forecast horizons.
Now consider a case where, for example, there are seasonal effects around some other trend - dampening of the seasonal component would tend to shrink the forecast strength of the seasonality, "smoothing out" the wiggle over longer forecast horizons.
The text by Hyndman and Athanasopoulos, Forecasting: Principles and Practice (freely available on-line as well as in dead-tree form), has a section on dampening, but you need some of the preceding sections for context (the models in sec 7.4 are damped versions of the models in sec 7.2, Holt linear trend models). I highly recommend investigating this book. [You may find the older book on forecasting by Makridakis, Wheelwright and Hyndman in libraries or on forecasters' bookshelves. It's also very handy.]
That section of text on dampening includes something like the kind of thing you're asking for - a tunable smoothing parameter, $\phi$, between $0$ and $1$, to produce a dampening effect ... but as you'll see if you look at that section, it's of a different form for different models - the additive trend and multiplicative trend models use different formulas!
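For the additive case, a minimal sketch may help. It assumes the usual damped additive-trend forecast function, $\hat{y}_{T+h} = \ell_T + (\phi + \phi^2 + \dots + \phi^h)\, b_T$, and takes the final level and trend as given; the names `level` and `trend` and the numbers are illustrative, not estimated from data:

```python
def damped_trend_forecasts(level, trend, phi, horizon):
    """Point forecasts for horizons 1..horizon from a damped additive trend.

    `level` and `trend` stand for the last estimated level and trend;
    here they are supplied by hand rather than fitted.
    """
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in (0, 1]")
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


path = damped_trend_forecasts(level=100.0, trend=2.0, phi=0.5, horizon=3)
print(path)  # [101.0, 101.5, 101.75]
```

With $\phi<1$ the increments shrink geometrically, so the forecasts level off at `level + trend * phi / (1 - phi)` (102 with these numbers) rather than growing without bound.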
So how might we "dampen" a linear least squares fit*?
*(Note that I don't regard this as a suitable approach for very many forecasting problems, and the Hyndman and Athanasopoulos text is a better place to investigate the merits of various approaches. Nevertheless, let us proceed, since the discussion touches on many of the issues one must consider when trying to dampen models more generally.)
A linear regression doesn't have any 'local' component to it at all; it's a global model. Even so, as I mentioned at the start, dampening can be thought of as shrinkage, and we can still shrink the estimate of the linear component toward 0.
But we then have to decide which point to 'tilt' the line about. The obvious approach for an actually global model would be to do it about the mean:
The line in center-slope form would be $y_t-\bar{y} = \beta (x_t-\bar{x})+\varepsilon_t$.
The fitted values are $\hat{y}_t = \bar{y} + b (x_t-\bar{x})$, where $b=\hat{\beta}$, the slope estimate.
If $T$ is the last observed time, then the point forecasts are
$\hat{y}_{T+k} = \bar{y} + b (x_{T+k}-\bar{x})$.
One simple and commonly used form of dampening is to multiply the slope at each step by a constant $\phi$, where $0<\phi<1$, which lets the slope shrink with time:
$\hat{y}_{T+k} = \bar{y} + b_{T+k} (x_{T+k}-\bar{x})$,
where $b_{T+k} = \phi b_{T+k-1} = \phi^k b_{T} = \phi^k b$.
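A minimal sketch of this dampened least squares forecast, assuming equally spaced times (so that $x_{T+k}$ can be extrapolated from the last two observed times) and using made-up, exactly linear data; the function names are hypothetical:

```python
def fit_line(xs, ys):
    """Least squares fit in center-slope form: returns (x_bar, y_bar, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return x_bar, y_bar, sxy / sxx


def damped_line_forecasts(xs, ys, phi, horizon):
    """Forecast y_bar + phi**k * b * (x_{T+k} - x_bar) for k = 1..horizon.

    Assumes equally spaced times, so x_{T+k} = x_T + k * step.
    """
    x_bar, y_bar, b = fit_line(xs, ys)
    step = xs[-1] - xs[-2]
    forecasts = []
    for k in range(1, horizon + 1):
        x_future = xs[-1] + k * step
        forecasts.append(y_bar + (phi ** k) * b * (x_future - x_bar))
    return forecasts


times = [1.0, 2.0, 3.0, 4.0, 5.0]
values = [2.0, 4.0, 6.0, 8.0, 10.0]  # made-up data, slope exactly 2

# phi = 1 is outside the stated range; it is used only as the undamped baseline.
undamped = damped_line_forecasts(times, values, phi=1.0, horizon=2)
damped = damped_line_forecasts(times, values, phi=0.5, horizon=2)
print(undamped)  # [12.0, 14.0]
print(damped)    # [9.0, 8.0]
```

Because the whole deviation from $\bar{y}$ is multiplied by $\phi^k$, the dampened forecasts here drop below the last observation (10) and head back toward $\bar{y}=6$ rather than merely flattening, which shows how much the choice of pivot matters.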
However, as I said, a global model doesn't necessarily make as much sense as some other options if it's a situation in which we'd want to apply such dampening; we might consider instead pivoting the slope around a point somewhere near the end (nearer to $t=T$), to make it more local... but if we think the model should be local in that sense, we should probably be looking at locally-linear models to start with, and then dampen those trends.
There are some distinct similarities between the general approach I took here and the one used for the additive trend model, though that's a local model, not a global one - and thereby more suited to forecasting trends that tend to be linear for a while, but where the linear trend isn't constant in the long term.
That approach of progressive multiplication of a model-component by $\phi$ could be used for any number of models if applied to the parts it makes sense to shrink to be smaller - it could be applied to seasonal components, or AR parameters, or any number of other things. Different components of forecasting models may even be shrunk at different rates.
[That progressive multiplication by a constant (geometric shrinkage) isn't the only way to shrink components of a model, but it's the most common.]
Best Answer
One approach I've used for this problem is to define the MAPE as
(A-F)/(average of A and F)
instead of
(A-F)/A.
This measure (which I think I borrowed from Mosteller and Tukey's book, but I don't have it at hand right now) is symmetric and bounded by -200% and +200%. I know you wanted it to be 0 through 100, but I got you partway there with a measure I may be able to find a reference for.
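A small sketch of the two calculations side by side, with hypothetical function names and made-up numbers. It assumes A and F are both non-negative and not both zero; the ±200% bounds rely on that:

```python
def percentage_error(actual, forecast):
    """Standard signed percentage error, 100 * (A - F) / A."""
    return 100.0 * (actual - forecast) / actual


def adjusted_percentage_error(actual, forecast):
    """Signed error relative to the average of A and F, in percent.

    Assumes actual and forecast are non-negative and not both zero.
    """
    denom = (actual + forecast) / 2.0
    if denom == 0:
        raise ValueError("actual and forecast cannot both be zero")
    return 100.0 * (actual - forecast) / denom


for a, f in [(100.0, 50.0), (1.0, 51.0), (100.0, 0.0)]:
    print(a, f, percentage_error(a, f), adjusted_percentage_error(a, f))
```

For A = 1 and F = 51 the standard error is -5000% while the adjusted one is about -192%, so the capping described above is visible directly.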
I have used this where (a) I wanted a symmetric measure, and (b) I wanted to cap the errors [whether they were horrible (200%) or atrocious (5000%) didn't matter]. The image below compares a standard MAPE with this calculation (AdjMAPE). Later edit: because the errors are signed, they should be a form of MPE, not MAPE. See also comments below by me and whuber.