I have a dataset. Assume that y is the dependent variable and x is the independent variable. My analysis centers on the following two hypotheses:
- Expecting x=0 to imply y=0
- Expecting a significant relationship between x and y
To test these, I am looking for the transformation of x and y that gives the best-fitting linear model in R. The final model I settled on regresses $\sqrt{y}$ on $\ln(x)$. Fitting it in R gives the following coefficients:
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.319615   0.028743   11.12 2.93e-10 ***
    x           0.150139   0.009959   15.08 9.76e-13 ***
    ---
Questions:
- I am trying to interpret the intercept term. Since its p-value is far below the 5% significance level, can I say that the intercept is significantly different from 0? However, this model is undefined at x=0, so I'm not sure this interpretation is valid. Would it be OK to refit the linear model for smaller values of x? < Solved >
- Related to the question above: the problem with this model is that I can't test hypothesis 1 (x=0 implies y=0). I would be very thankful if anyone could provide some help.
Best Answer
The intercept does not correspond to x=0, because your predictor is actually ln(x). Instead, the intercept is the fitted value at ln(x)=0, which occurs when the original x=1. At that point, the fitted response on the transformed scale, $\widehat{\sqrt{y}}$, differs significantly from 0.
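To make this concrete, here is the arithmetic with the coefficients reported in the question, assuming the row labelled `x` in the output is the coefficient on $\ln(x)$:

$$
\widehat{\sqrt{y}} = 0.3196 + 0.1501\,\ln(x), \qquad
\ln(1) = 0 \;\Rightarrow\; \widehat{\sqrt{y}}\,\Big|_{x=1} = 0.3196, \qquad
\hat{y}\,\Big|_{x=1} \approx 0.3196^2 \approx 0.102 .
$$

So the intercept test says that the fitted $\sqrt{y}$ at $x=1$ is distinguishable from 0. It says nothing about $x=0$: since $\ln(x) \to -\infty$ as $x \to 0^{+}$, the point $x=0$ lies outside the domain of this model.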
It may help you to read this excellent CV thread: Interpretation of log transformed predictor.
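The interpretation of the intercept can also be checked numerically. The following is only a minimal sketch in R: the data are simulated for illustration (the original dataset is not shown), and the names `x`, `y`, `fit` and `at_one` are placeholders rather than anything from the question.

```r
# Simulated data for illustration only; not the data from the question.
set.seed(1)
x <- seq(0.5, 20, length.out = 22)
y <- (0.32 + 0.15 * log(x) + rnorm(length(x), sd = 0.03))^2

# Regress sqrt(y) on ln(x). The slope row is labelled log(x) here.
fit <- lm(sqrt(y) ~ log(x))
summary(fit)$coefficients

# The intercept equals the fitted sqrt(y) at log(x) = 0, i.e. at x = 1.
at_one <- predict(fit, newdata = data.frame(x = 1))
all.equal(unname(at_one), unname(coef(fit)[1]))

# Back-transform to the original scale of y at x = 1.
unname(at_one)^2
```

Writing the transformations inside the formula, instead of creating transformed columns first, lets `predict()` accept values on the original scale of x, which makes it harder to confuse x=0 with ln(x)=0.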