MATLAB: How does MATLAB determine skewness

skewness

Hi all, I tabulated my data residuals and made a boxplot of them. The median line is closer to the 75th percentile, so by rights it should be positively skewed, and the MATLAB function skewness(data) returns a positive value, which seems to validate the boxplot as positively skewed. However, if it is positively skewed, there should be more observations on the positive side, yet of all the residuals I have (124 of them) only 30-odd are positive. Is it still considered positively skewed, or is there something I missed? (And yes, my mean is smaller than my median.)

Best Answer

Please post a screenshot, and attach your data file and m-code to read it in and plot it, if you can.

Be aware that skewness is determined not only by how many data points lie to the right and left of the mean but also by how far away they are. Because the deviations are cubed, many points on the left that are close to the mean may not outweigh a few points on the right that are much farther away, giving an overall positive skewness even though more points are on the left. Here is the formula for an image:

% Get the skew (stored in skew so the variable does not shadow the skewness function).
skew = sum((GLs - meanGL) .^ 3 .* pixelCounts) / ((numberOfPixels - 1) * stdDev^3);
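
The point about distance outweighing count can be checked on a small made-up sample. The sketch below is only an illustration, not the poster's residuals: the vector x and all variable names are hypothetical. It computes the moment-based skewness by hand, dividing by n, which I believe matches the default (bias-uncorrected) behavior of skewness in the Statistics and Machine Learning Toolbox, so it runs in base MATLAB.

```matlab
% Hypothetical sample: nine points just left of the mean, one far to the right.
x = [-1 -1 -1 -1 -1 -1 -1 -1 -1 9];

n  = numel(x);
mu = mean(x);                      % 0 for this sample
d  = x - mu;                       % deviations from the mean

m2 = sum(d.^2) / n;                % second central moment (divide by n)
m3 = sum(d.^3) / n;                % third central moment (divide by n)
skew = m3 / m2^1.5;                % moment-based skewness, 72/27 (about 2.67) here

numBelowMean = sum(x < mu);        % 9 points below the mean
numAboveMean = sum(x > mu);        % 1 point above the mean
fprintf('skewness = %.4f, below mean = %d, above mean = %d\n', ...
    skew, numBelowMean, numAboveMean);
```

Nine of the ten points sit below the mean, yet the skewness is positive: the single deviation of 9 contributes 9^3 = 729 to the cubed sum, against a total of -9 from the other nine points.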

Related Solutions

MATLAB: PCA function outputting scores different to that expected, am i missing something

Hi Matteo,

I don't believe there is really a problem here. Let Vt denote the transpose of the data matrix Values, so that Vt is 4x17 like you want. With

[coef score latent] = pca(Vt)

pca computes** the eigenvalue decomposition of the covariance matrix of Vt. The resulting eigenvectors are the columns of coef. Then score is computed with

score = Vt0*coef

where Vt0 is shown in the code below. Each eigenvector is real and normalized to 1, but is still arbitrary to within an overall factor of ±1. Since all the columns of coef are orthogonal to each other, there is no foolproof way to assign those signs uniquely. It looks like the example you are using disagrees with MATLAB on the overall sign of the first column of coef, so the score matrix comes up with different signs as well. Nothing wrong with that.

What matters is that you can still relate the scores to the data. No matter what the overall signs of the columns of coef are, after you calculate scores that way it should still be true that

Vt0 = score*coef'

You can change the overall signs of coef columns and make your own coef, as in the example below. The resulting scores agree with the example.

Forget about salad. After looking at the png file, this all makes me want to fly to Belfast and eat fish and chips.

load('Example.mat')
Vt = Values';
[coef score lat] = pca(Vt);
% create new coef matrix and a new score matrix
coefnew = coef;
coefnew(:,1) = -coefnew(:,1);
Vt0 = Values' - mean(Values');   % covariance matrix calculation does this anyway
scorenew = Vt0*coefnew;
figure(1);scatter(score(:,1), score(:,2))
figure(2);scatter(scorenew(:,1), scorenew(:,2))   % same as example
Vt0_check = score*coef'
Vt0_check_new = scorenew*coefnew'
max(max(abs(Vt0-Vt0_check)))
max(max(abs(Vt0-Vt0_check_new)))

** pca accomplishes this more accurately by using svd instead of eig, but with the same intent.
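
For anyone without Example.mat, the sign argument can be reproduced on made-up data. This is a rough sketch under stated assumptions: X is a hypothetical 20-by-3 random matrix, and the principal directions are taken from svd of the centered data in base MATLAB rather than from pca itself, so the names coef and score only mimic pca's outputs.

```matlab
rng(0);                                  % reproducible hypothetical data
X  = randn(20, 3);                       % 20 observations, 3 variables (made up)
X0 = X - mean(X, 1);                     % center each column (implicit expansion, R2016b+)

[~, ~, V] = svd(X0, 'econ');             % columns of V are unit-norm principal directions
coef  = V;
score = X0 * coef;

% Flip the sign of the first direction and recompute the scores.
coefnew = coef;
coefnew(:,1) = -coefnew(:,1);
scorenew = X0 * coefnew;

% Both sign choices reconstruct the centered data to rounding error.
err    = max(max(abs(X0 - score * coef')));
errnew = max(max(abs(X0 - scorenew * coefnew')));
fprintf('reconstruction error: %.2e (original), %.2e (flipped)\n', err, errnew);
```

The first column of scorenew is the negative of the first column of score, and both reconstruction errors should be at rounding level, which is the sense in which either sign choice is equally valid.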

MATLAB: Median vs Mean averages using skewness

It depends on what you want to do. In a symmetric distribution, the mean and median will be close if not equal. The mean is affected by extreme values, while the median is not. If you have any doubts as to the ‘best’ parameter, I would simply choose the median.

To illustrate:

x1 = [1 2 3 10];
x2 = [1 2 3 99];
x1_stats = [mean(x1) median(x1)]
x2_stats = [mean(x2) median(x2)]
x1_stats =
            4          2.5
x2_stats =
        26.25          2.5
