One problem is that you've written
$$Y=\alpha+\beta X$$
That is a simple deterministic (i.e., non-random) model. In that case you could back-transform the coefficients to the original scale, since it is just a matter of simple algebra. But in ordinary regression you only have $E(Y\mid X)=\alpha+\beta X$; you've left the error term out of your model. If the transformation from $Y$ back to $Y_{orig}$ is non-linear, you may have a problem, since in general $E\big(f(X)\big)\ne f\big(E(X)\big)$. I suspect that explains the discrepancy you're seeing.
Edit: Note that if the transformation is linear, you can back transform to get estimates of the coefficients on the original scale, since expectation is linear.
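To see the retransformation problem concretely, here is a minimal simulation for the familiar log transform (the lognormal distribution and its parameters are arbitrary choices for illustration): exponentiating the mean of $\log Y$ systematically understates the mean of $Y$, because $\exp$ is convex.

```python
import numpy as np

rng = np.random.default_rng(0)

# Z ~ Normal(mu, sigma^2) on the transformed (log) scale; Y = exp(Z).
mu, sigma = 1.0, 0.8
z = rng.normal(mu, sigma, size=1_000_000)

naive = np.exp(z.mean())            # f(E(Z)): back-transform the mean
actual = np.exp(z).mean()           # E(f(Z)): the mean on the original scale
exact = np.exp(mu + sigma**2 / 2)   # known lognormal mean for comparison

print(naive, actual, exact)         # naive is noticeably below the other two
```

The gap between `naive` and `actual` is exactly the $E\big(f(X)\big)\ne f\big(E(X)\big)$ issue: for a convex back-transform, Jensen's inequality guarantees the naive estimate is too small.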
The best solution is, at the outset, to choose a re-expression that has a meaning in the field of study.
(For instance, when regressing body weights against independent factors, it's likely that either a cube root ($1/3$ power) or square root ($1/2$ power) will be indicated. Noting that weight is a good proxy for volume, the cube root is a length representing a characteristic linear size. This endows it with an intuitive, potentially interpretable meaning. Although the square root itself has no such clear interpretation, it is close to the $2/3$ power, which has dimensions of surface area: it might correspond to total skin area.)
The fourth root is sufficiently close to the logarithm that you ought to consider using the log instead, whose meaning is well understood. But sometimes we really do find that a cube root or square root or some such fractional power does a great job and it has no obvious interpretation. Then, we must do a little arithmetic.
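The closeness of the fourth root to the logarithm can be checked numerically; this sketch (the range of $y$ values is an arbitrary illustration) shows the two transforms are almost perfectly linearly related over a couple of orders of magnitude:

```python
import numpy as np

# Illustrative positive response values spanning two orders of magnitude.
y = np.linspace(40, 4000, 500)

fourth_root = y ** 0.25
log_y = np.log(y)

# Over such a range the fourth root is nearly an affine function of the
# log, which is why the two transformations often fit similarly well.
r = np.corrcoef(fourth_root, log_y)[0, 1]
print(r)
```

When the correlation is this close to 1, either transform will straighten the data about equally well, so the choice can rest on interpretability.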
The regression model shown in the question involves a dependent variable $Y$ ("Collections") and two independent variables $X_1$ ("Fees") and $X_2$ ("DIR"). It posits that
$$Y^{1/4} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 +\varepsilon.$$
The code estimates $\beta_0$ as $b_0=2.094573355$, $\beta_1$ as $b_1=0.000075223$, and $\beta_2$ as $b_2=0.000022279$. It also presumes $\varepsilon$ are iid normal with zero mean and it estimates their common variance (not shown). With these estimates, the fitted value of $Y^{1/4}$ is
$$\widehat{Y^{1/4}} = b_0 + b_1 X_1 + b_2 X_2.$$
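A minimal sketch of computing fitted values with these estimates and naively back-transforming them (the fee and DIR values are hypothetical, chosen only for illustration):

```python
# Coefficient estimates reported in the question.
b0, b1, b2 = 2.094573355, 0.000075223, 0.000022279

def fitted_fourth_root(x1, x2):
    """Fitted value of Y^(1/4) on the transformed scale."""
    return b0 + b1 * x1 + b2 * x2

def fitted_y(x1, x2):
    """Naive back-transform to the original scale: raise to the 4th power.
    (This ignores the retransformation bias from E(f(X)) != f(E(X)).)"""
    return fitted_fourth_root(x1, x2) ** 4

# Hypothetical fee (X1) and DIR (X2) values, for illustration only.
print(fitted_fourth_root(10_000, 5_000))   # about 2.9582
print(fitted_y(10_000, 5_000))             # about 76.58
```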
"Interpreting" regression coefficients normally means determining what change in the dependent variable is suggested by a given change in each independent variable. These changes are the derivatives $dY/dX_i$. Because the model makes $Y$ the fourth power of the linear expression, the Chain Rule tells us these derivatives equal $4\beta_i Y^{3/4}$. We would plug in the estimates, then, and say something like
The regression estimates that a unit change in $X_i$ will be associated with a change in $Y$ of $4b_i\widehat{Y}^{3/4}$ = $4b_i\left(b_0+b_1X_1+b_2X_2\right)^3$.
The dependence of the interpretation on $X_1$ and $X_2$ is not simply expressed in words, unlike the situations with no transformation of $Y$ (one unit change in $X_i$ is associated with a change of $b_i$ in $Y$) or with the logarithm (one percent change in $X_i$ is associated with $b_i$ percent change in $Y$). However, by keeping the first form of the interpretation, and computing $4b_1$ = $4\times 0.000075223$ = $0.000301$, we might state something like
A unit change in fees is associated with a change in collections of $0.000301$ times the three-fourths power of the current collections; for instance, if the current collections are $10^4 = 10000$, then a unit increase in fees is associated with an increase of $0.301$ in collections, and if the current collections are $20^4 = 160000$, the same unit increase in fees is associated with an increase of $2.41$ in collections.
When taking roots other than the fourth (say, when using $Y^p$ as the response, with $p$ nonzero), simply replace all appearances of "$4$" in this analysis by "$1/p$" and of "$3$" by "$1/p - 1$".
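The arithmetic above, and the general-$p$ rule, can be checked numerically (the predictor values below are hypothetical, chosen only for illustration):

```python
# Coefficient estimates from the question.
b0, b1, b2 = 2.094573355, 0.000075223, 0.000022279

def marginal_effect(b_i, y):
    """dY/dX_i = 4 * b_i * Y^(3/4), from the chain rule."""
    return 4 * b_i * y ** 0.75

print(round(4 * b1, 6))                        # 0.000301
print(round(marginal_effect(b1, 10_000), 3))   # 0.301  (Y^(1/4) = 10)
print(round(marginal_effect(b1, 160_000), 2))  # 2.41   (Y^(1/4) = 20)

# Check the general-p rule by finite differences: with Y^p as the response,
# dY/dX_i = (1/p) * b_i * L^(1/p - 1), where L is the linear predictor.
p = 0.25
x1, x2, h = 10_000.0, 5_000.0, 1e-3
L = b0 + b1 * x1 + b2 * x2
analytic = (1 / p) * b1 * L ** (1 / p - 1)
numeric = (((b0 + b1 * (x1 + h) + b2 * x2) ** (1 / p)
            - (b0 + b1 * (x1 - h) + b2 * x2) ** (1 / p)) / (2 * h))
print(analytic, numeric)   # the two agree to high precision
```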
The coefficients and their associated results apply on the scale on which they are estimated. The only transformation that makes sense is to transform predictions of the response back to the original scale by squaring and subtracting. That said, transforming just because you have a skewed response is a dubious thing to do, and a transformation that looks very ad hoc is even more dubious.
Comments reveal that the range of the response is from 40 to 146. Although the function won't seem challenging to those who remember their high school mathematics, it is always a good idea to plot the function to see exactly what it does.
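In lieu of the plot, a quick numerical look makes the same points; here I assume the transform was $z = \sqrt{147 - y}$, which is consistent with the squaring-and-subtracting back-transform and the bound of 147 mentioned below, but is an inference, not something stated in the question:

```python
import numpy as np

# Assumed transform: z = sqrt(147 - y), over the observed range 40..146.
y = np.linspace(40.0, 146.0, 500)
z = np.sqrt(147.0 - y)

def slope(yv):
    """dz/dy = -1 / (2 * sqrt(147 - y)): negative, so the scale reverses."""
    return -1.0 / (2.0 * np.sqrt(147.0 - yv))

# The slope is nearly constant over most of the range but steepens
# sharply as y approaches the upper bound.
print(slope(40.0), slope(130.0), slope(146.0))

# Back-transform predictions by squaring and subtracting: y = 147 - z^2.
y_back = 147.0 - z ** 2
print(np.allclose(y_back, y))
```

The negative slope is the scale reversal, the slowly varying slope over most of the range is the near-linearity, and the steepening near 146 is where the transform really acts.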
The transformation is problematic in several ways, listed here in increasing order of concern:
It reverses the scale from high to low. This in itself is not a major issue; the main consequence is that the coefficients of predictors accordingly have reversed signs. But in general it is best avoided unless essential on other grounds.
It is approximately linear over most of its range, so does little to change anything over that range.
But conversely it necessarily behaves differently over the range from about 130 to 146. Does that correspond to any physical (biological, economic, whatever) knowledge of how the response will behave over that range?
It is not easily transferable to other similar datasets. Unless it is known that values above 147 are always impossible, results using this transformation cannot easily be compared with those for other data whose upper values might differ.
So, contradictory though it may seem, I doubt that this transform is strong enough to change how your data behave very much, except that it is likely to act on the highest values in ways that may not make substantive sense.
Also, ad hoc transformations are a bad idea. Almost always, we should want to analyse data in ways that would always make perfect sense for other similar bodies of data.
Further comment would depend on learning more. The precise consequences in terms of increased or decreased skewness or linearity will depend on the exact distribution between the extremes and on the values of the predictors.