Solved – Which constant to add when applying ‘Box-Cox transformation’ to negative values

data transformation, normal distribution, normalization

Questions

  1. How large a constant should we add to negative values when applying the 'Box-Cox transformation'? The data I am handling are daily stock returns.
  2. If we add a constant to negative values so that the Box-Cox transformation can be applied, shouldn't we subtract some amount after the transformation?

Background Information

  • Since daily stock returns do not follow a normal distribution, I tried to apply the Box-Cox transformation. However, some of the daily returns are negative, so I could not transform them.
  • According to the article about the Box-Cox transformation, I can add a constant to make those negative numbers non-negative. The excerpt from the article is as follows:

Box-Cox transformations are designed for non-negative responses, but can be applied to data that have occasional zero or negative values by adding a constant α to the response before applying the power transformation. Although α could be estimated, in practice one often uses a small value such as a half or one (depending, obviously, on the scale of the response).

  • My data is the daily return of the company Apple (ticker: AAPL) from 2000.01.01 to 2011.12.31. The minimum value was -0.518692 (yes, it's true: the price of Apple stock dropped 52% on one day in Sep. 2000). The summary and histogram of my data before the Box-Cox transformation are as follows:

[Image: summary and histogram of the daily returns before the Box-Cox transformation]

  • To make my data non-negative, I added 0.52 to all of the daily returns. The value 0.5 also appears in the article linked above, so I thought it would be an appropriate constant. The problem is that the center of the histogram moved far from zero, a long way from the center of the original histogram. The summary and histogram of the data after the Box-Cox transformation are as follows:

[Image: summary and histogram of the daily returns after the Box-Cox transformation]

  • If I added the constant 1 instead of roughly 0.5, the center of the histogram would be around 0, not far from the center of the original histogram (the data before the Box-Cox transformation). So my question is: how large should the constant be?

  • Also, shouldn't we subtract a constant after the transformation? Adding a constant so that the data can be Box-Cox transformed makes every value larger than the original. After the transformation, don't we need some operation to bring the values back to their original scale? Yet after the Box-Cox transformation the histogram had already moved to the left, which means the transformed data were already smaller than the original data…

  • So I am curious to know whether changing the data by adding a constant requires any other operation (e.g. subtraction) and if it does not require anything, why not?

Best Answer

Determine the smallest number in your time series, say -10 for example. The constant you would need is then 10.0000000001 or larger, so that all the adjusted values are positive. It doesn't make any difference, as the reverse transformation needed to obtain forecasts will use the same adjustment factor.
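The round trip described above can be sketched in Python. This is only an illustration under stated assumptions: it uses SciPy's `scipy.stats.boxcox` and `scipy.special.inv_boxcox`, the helper names `shifted_boxcox` and `inverse_shifted_boxcox` are made up for this example, the sample values are invented apart from the minimum quoted in the question, and the power parameter is fixed at 0.5 purely for demonstration.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox


def shifted_boxcox(x, shift, lmbda=None):
    """Add `shift`, then apply Box-Cox. Hypothetical helper for illustration.

    If `lmbda` is None, SciPy estimates it by maximum likelihood.
    """
    shifted = np.asarray(x, dtype=float) + shift
    if np.any(shifted <= 0):
        raise ValueError("shift is too small: all shifted values must be positive")
    if lmbda is None:
        transformed, lmbda = stats.boxcox(shifted)
    else:
        transformed = stats.boxcox(shifted, lmbda=lmbda)
    return transformed, lmbda


def inverse_shifted_boxcox(y, shift, lmbda):
    """Undo the power transform, then subtract the same shift."""
    return inv_boxcox(y, lmbda) - shift


# Made-up returns; only the minimum (-0.518692) comes from the question.
returns = np.array([-0.518692, -0.02, 0.0, 0.01, 0.13])

# Any shift that makes every value positive works for the round trip.
for shift in (0.52, 1.0):
    y, lam = shifted_boxcox(returns, shift, lmbda=0.5)
    back = inverse_shifted_boxcox(y, shift, lam)
    print(shift, np.max(np.abs(back - returns)))
```

The subtraction happens only inside the reverse transformation, on the original scale. The transformed values themselves are not shifted back, which is why their histogram need not be centered where the original one was.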

Please see https://www.ime.usp.br/~abe/lista/pdfm9cJKUmFZp.pdf, where Box & Cox suggested an alternative for dealing with negative numbers without adding a constant.

EDITED after OP's question re logs:

From the above reference, perhaps this helps: logs are used.

[Image: excerpt from the reference above]
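For context on where logs enter: in the Box-Cox family the logarithm is the case λ = 0. A sketch of the shifted form in its standard textbook notation (not copied from the reference) is given below, where x is an original value, α is the added constant from the article excerpt, and λ is the power parameter.

```latex
y =
\begin{cases}
  \dfrac{(x + \alpha)^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1.5ex]
  \log(x + \alpha), & \lambda = 0,
\end{cases}
\qquad x + \alpha > 0,

x =
\begin{cases}
  (\lambda y + 1)^{1/\lambda} - \alpha, & \lambda \neq 0, \\[1ex]
  e^{y} - \alpha, & \lambda = 0.
\end{cases}
```

The second expression is the reverse transformation: the constant α is subtracted only there, when going back to the original scale.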