I would like help with R code to estimate theta for the Inverse Hyperbolic Sine Transformation. This transformation is useful to transform skewed data that contain negative values or zeros.
There are a couple of related posts that discuss the IHS and suggest there is a maximum likelihood approach to estimating theta but I can't work out how to apply it. These related posts are:
Please see example code below. I've made skewed data by using the inverse of the IHS. Now, given the skewed data, and no prior knowledge of theta, how can I work out what theta should be? I would be most grateful for R code to undertake this analysis.
# Define the IHS transformation and its inverse
IHS <- function(x, theta){ # IHS transformation
(1/theta)*asinh(theta * x)
}
Inv.IHS <- function(x, theta){ # inverse of the IHS transformation
(1/theta)*sinh(theta * x)
}
set.seed(1)
# generate some normal data
x <- rnorm(1000)
hist(x, breaks='FD')
# skew it by applying the Inverse of the IHS transformation
xt <- Inv.IHS(x, theta=2)
hist(xt, breaks='FD') # yep this is skewed. How could we estimate theta?
Best Answer
After a couple more days of thinking about the problem, I have two tentative answers.
1. Select theta so that the transformed data are as close to normal as possible, as measured by a goodness-of-fit test. For example, choose theta to maximize the p-value of the Shapiro-Wilk test.
2. Estimate theta by maximum likelihood.
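The goodness-of-fit option can be sketched roughly as below. This is only an illustration under some assumptions: the helper name `sw_stat` and the search interval `c(0.001, 50)` are my own choices, and the sketch maximizes the Shapiro-Wilk W statistic instead of the p-value. For a fixed sample size the two should rank candidate values of theta the same way, but W does not collapse towards zero for poor values of theta, which makes the one-dimensional search better behaved.

```r
# IHS transformation and its inverse, as defined in the question
IHS     <- function(x, theta) (1 / theta) * asinh(theta * x)
Inv.IHS <- function(x, theta) (1 / theta) * sinh(theta * x)

# Recreate the example data from the question
set.seed(1)
x  <- rnorm(1000)
xt <- Inv.IHS(x, theta = 2)

# Shapiro-Wilk W statistic of the data after applying the IHS with a
# candidate theta (hypothetical helper, not part of any package)
sw_stat <- function(theta, x) {
  unname(shapiro.test(IHS(x, theta))$statistic)
}

# One-dimensional search for the theta that makes the transformed data
# look most normal; the interval is an assumption and may need widening
fit_sw   <- optimize(sw_stat, interval = c(0.001, 50), x = xt, maximum = TRUE)
theta_sw <- fit_sw$maximum
theta_sw
```

Note that `shapiro.test` only accepts between 3 and 5000 observations, so a larger data set would need to be subsampled or scored with a different normality measure.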
Looking at the paper by Burbidge et al. I think the likelihood function for a single variable can be expressed as follows.
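A minimal sketch of such a likelihood, assuming the transformed values IHS(x, theta) are independent draws from a normal distribution with unknown mean and variance. It is written from the usual change-of-variables argument, not copied from the paper's notation: the derivative of the IHS with respect to x is 1 / sqrt(1 + theta^2 * x^2), so the Jacobian contributes -0.5 * sum(log(1 + theta^2 * x^2)) to the log-likelihood. The mean and variance are replaced by their maximum likelihood estimates, leaving a function of theta alone. The name `ihs_loglik` and the search interval are my own choices.

```r
# IHS transformation and its inverse, as defined in the question
IHS     <- function(x, theta) (1 / theta) * asinh(theta * x)
Inv.IHS <- function(x, theta) (1 / theta) * sinh(theta * x)

# Recreate the example data from the question
set.seed(1)
x  <- rnorm(1000)
xt <- Inv.IHS(x, theta = 2)

# Concentrated (profile) log-likelihood of theta for a single variable,
# assuming IHS(x, theta) ~ N(mu, sigma^2) with mu and sigma^2 profiled out
ihs_loglik <- function(theta, x) {
  n      <- length(x)
  z      <- IHS(x, theta)
  sigma2 <- mean((z - mean(z))^2)            # ML estimate of the variance
  -0.5 * n * (log(2 * pi * sigma2) + 1) -    # normal part at the ML estimates
    0.5 * sum(log1p(theta^2 * x^2))          # log-Jacobian of the transformation
}

# Maximize over theta; the interval is an assumption and may need adjusting
fit_ml   <- optimize(ihs_loglik, interval = c(0.001, 50), x = xt, maximum = TRUE)
theta_ml <- fit_ml$maximum
theta_ml
```

This version has no location parameter inside the transformation, so it implicitly assumes the transformed data are centred wherever the raw data happen to sit. If real data are far from zero, that assumption may be one reason the estimate looks unreasonable.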
In both cases we get close to the expected theta, which is encouraging. But when I try the maximum likelihood approach on my real data it doesn't seem to give reasonable answers.