MATLAB: Why do I receive a different mean of the inverse CDF for a generalized Pareto distribution using GPINV than I would theoretically expect in Statistics Toolbox 7.1 (R2009a)?

generalized, inverse, mean, pareto, Statistics and Machine Learning Toolbox

I am using random data to evaluate the inverse CDF of a generalized Pareto distribution with tail index (shape) parameter k, scale parameter sigma, and threshold (location) parameter theta, as follows:

p = rand(10000,1);
sigma =  0.2531;
k = 0.9982;
theta = 0;
pp = gpinv(p,k,sigma,theta);

When I compare the theoretical mean "theta + sigma/(1-k)" with the sample mean, the two differ considerably.

ppmean   = mean(pp)
expmean = theta + sigma/(1-k)

ppmean is about 1.98 and expmean is about 140.

Best Answer

This is to be expected. Try the same experiment with p of length 10^6.

p = rand(10^6,1);
sigma =  0.2531;
k = 0.9982;
theta = 0;
pp = gpinv(p,k,sigma,theta);

You'll find that mean(pp) creeps up a bit. If you try 10^7, it would creep up a little more. Now try

hist(pp,100)

You'll see that the distribution is VERY skewed. It is so badly skewed that the sampling distribution of the sample mean, even for fairly large samples, is itself badly skewed, with a long right tail. So almost all of the time, you're seeing a sample mean that is much smaller than the theoretical mean. Getting an observed pp value of, say (making a number up), 10^12 is rare, but when it happens it gives a very large sample mean, which compensates on average for all those sample means that are small.
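As a rough illustration of the point above (a sketch added for clarity, not part of the original answer), the share of the theoretical mean that comes from the extreme tail can be computed in closed form. The code assumes the standard generalized Pareto inverse CDF for k ~= 0, which should agree with gpinv, and uses a heuristic: a sample of size n rarely reaches beyond the (1 - 1/n)-quantile. The names Q, partialMean and typicalMean are hypothetical helpers introduced here.

```matlab
% Parameters from the question
k     = 0.9982;
sigma = 0.2531;
theta = 0;

% Closed-form inverse CDF of the generalized Pareto distribution (k ~= 0)
Q = @(p) theta + sigma/k .* ((1 - p).^(-k) - 1);

% partialMean (hypothetical helper): integral of Q from 0 to p, i.e. the part
% of the theoretical mean contributed by outcomes below the p-quantile
partialMean = @(p) theta.*p + sigma/k .* ((1 - (1 - p).^(1 - k))./(1 - k) - p);

fullMean = theta + sigma/(1 - k);   % theoretical mean
medianX  = Q(0.5);                  % median, far below the mean

% Heuristic assumption: a sample of size n rarely reaches beyond the
% (1 - 1/n)-quantile, so partialMean(1 - 1/n) approximates a typical sample mean
n = [1e4 1e6 1e7];
typicalMean = partialMean(1 - 1./n);

fprintf('median = %.4f, theoretical mean = %.2f\n', medianX, fullMean);
fprintf('n = %8.0e   typical sample mean ~ %6.3f\n', [n; typicalMean]);
```

Under this heuristic the typical sample mean should come out at roughly 2 for n = 10^4 and grow only slowly with n, which is consistent with the observed ppmean of about 1.98 and with the "creeping up" described above. Nearly all of the theoretical mean of about 140 sits in the part of the tail that such samples almost never reach.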

This is a case where the central limit theorem requires an impractically large sample size to have any relevance. The fact that the variance of the distribution

[m,v] = gpstat(0.9982,0.2531,0)

resulting in

 m =
        140.61
 v =
    Inf

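is infinite (Inf here is the true value, not a numerical overflow: the variance of a generalized Pareto distribution is infinite whenever k >= 1/2) is another indication that you're in trouble. So either choosing different parameter values, in particular a smaller k, or greatly increasing the number of values in p is the best approach here; both bring the sample mean closer to the theoretical mean.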
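To see how the sample mean behaves across repeated experiments, one could repeat the original experiment many times and look at the spread of the resulting means. The following is a minimal sketch under some assumptions: it uses the closed-form inverse CDF for k ~= 0 instead of calling gpinv (so it runs without the toolbox), the repetition count nRep is an arbitrary choice, and rng requires R2011a or later (drop that line on older releases).

```matlab
rng(0);                              % reproducible draws (assumes R2011a or later)

% Parameters from the question
k     = 0.9982;
sigma = 0.2531;
theta = 0;

n    = 1e4;                          % sample size used in the question
nRep = 200;                          % number of repeated experiments (arbitrary choice)
expmean = theta + sigma/(1 - k);     % theoretical mean

means = zeros(nRep, 1);
for r = 1:nRep
    p = rand(n, 1);
    % Closed-form generalized Pareto inverse CDF for k ~= 0 (stand-in for gpinv)
    x = theta + sigma/k .* ((1 - p).^(-k) - 1);
    means(r) = mean(x);
end

medianOfMeans = median(means);           % typical sample mean
fracBelow     = mean(means < expmean);   % share of experiments below the theoretical mean

fprintf('theoretical mean      : %.2f\n', expmean);
fprintf('median of sample means: %.2f\n', medianOfMeans);
fprintf('fraction below theory : %.3f\n', fracBelow);
```

With these settings, nearly every repetition should fall well below the theoretical mean, and the median of the sample means should be a small single-digit number.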

Related Solutions

MATLAB: How to go about finding the standard normal probability based on the z-score

doc normcdf
doc normpdf

When you know what you want but are not sure of the function name, try something like

>> lookfor normal
realmin                        - Smallest positive normalized floating point number.
randn                          - Normally distributed pseudorandom numbers.
sprandn                        - Sparse normally distributed random matrix.
surfnorm                       - Surface normals.
isonormals                     - Isosurface normals.
cde                            - cd elliptic function with normalized complex argument.
sne                            - sn elliptic function with normalized complex argument.
addfreqcsmenu                  - Add a cs menu to switch between linear and normalized frequency
convertfrequnits               - converts between Normalized, Hz, kHz, etc
histfit                        - Histogram with superimposed fitted normal density.
jbtest                         - Jarque-Bera hypothesis test of composite normality.
lhsnorm                        - Generate a latin hypercube sample with a normal distribution
logncdf                        - Lognormal cumulative distribution function (cdf).
lognfit                        - Parameter estimates and confidence intervals for lognormal data.
logninv                        - Inverse of the lognormal cumulative distribution function (cdf).
lognlike                       - Negative log-likelihood for the lognormal distribution.
lognpdf                        - Lognormal probability density function (pdf).
lognrnd                        - Random arrays from the lognormal distribution.
lognstat                       - Mean and variance for the lognormal distribution.
mvncdf                         - Multivariate normal cumulative distribution function (cdf).
mvnpdf                         - Multivariate normal probability density function (pdf).
mvnrnd                         - Random vectors from the multivariate normal distribution.
normcdf                        - Normal cumulative distribution function (cdf).
normfit                        - Parameter estimates and confidence intervals for normal data.
norminv                        - Inverse of the normal cumulative distribution function (cdf).
normlike                       - Negative log-likelihood for the normal distribution.
normpdf                        - Normal probability density function (pdf).
normplot                       - Displays a normal probability plot.
normrnd                        - Random arrays from the normal distribution.
normspec                       - Plots normal density between specification limits.
normstat                       - Mean and variance for the normal distribution.
logn3fit                       - Fit a 3-param lognormal dist'n using cumulative probabilities.
wgtnormfit                     - Fitting example for a weighted normal distribution.
wgtnormfit2                    - Fitting example for a weighted normal distribution (log(sigma) parameterization).
>>

Judicious search terms help, but seeing the list of everything related to "normal" lets you find the two functions of interest, plus many more that may be useful depending on which toolboxes are installed.
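For a concrete case, here is a minimal sketch of turning a z-score into a standard normal probability. The z value is made up for illustration. The sketch uses the base-MATLAB erfc function so that it runs without the toolbox; with the Statistics Toolbox installed, normcdf(z) and normpdf(z) should return the same quantities.

```matlab
z = 1.96;                                  % example z-score (made-up value)

pLower  = 0.5 * erfc(-z / sqrt(2));        % P(Z <= z), what normcdf(z) computes
pUpper  = 0.5 * erfc( z / sqrt(2));        % P(Z >  z), upper-tail probability
density = exp(-z^2 / 2) / sqrt(2*pi);      % standard normal pdf at z, what normpdf(z) computes

fprintf('P(Z <= %.2f) = %.4f\n', z, pLower);
fprintf('P(Z >  %.2f) = %.4f\n', z, pUpper);
fprintf('pdf at %.2f  = %.4f\n', z, density);
```

Computing the upper tail directly with erfc, rather than as 1 - pLower, avoids losing precision for large z.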

MATLAB: Marking a single point on a graph

This first finds the approximate indices for the zero-crossings, then interpolates to find the exact values:

sigma_x = 8;
sigma_y = -12;
tau_xy = -6;
theta = [-90:.05:90];
%Normal Stress
sigma_x1 = ((sigma_x + sigma_y)/2)+(((sigma_x - sigma_y)/2)*(cosd(2*theta)))+(tau_xy*(sind(2*theta)));
sigma_y1 = ((sigma_x + sigma_y)/2)-(((sigma_x - sigma_y)/2)*(cosd(2*theta)))-(tau_xy*(sind(2*theta)));
%Shear Stress
tau_x1y1 = -(((sigma_x - sigma_y)/2)*(sind(2*theta)))+(tau_xy*(cosd(2*theta)));
zci = @(v) find(v(:).*circshift(v(:), [-1 0]) <= 0);                    % Returns Approximate Zero-Crossing Indices Of Argument Vector
tauzix = zci(tau_x1y1);                                                 % Approximate Zero-Crossing Indices
xval = zeros(1, numel(tauzix));                                         % Preallocate Interpolated Zero-Crossing Locations
for k = 1:numel(tauzix)
    xval(k) = interp1(tau_x1y1(tauzix(k)-1:tauzix(k)+1), theta(tauzix(k)-1:tauzix(k)+1), 0);    % 'Theta' Values At Zero-Crossings
end
%plot
plot(theta, tau_x1y1)
hold on
plot(xval, zeros(size(xval)), 'pg')
hold off
xlabel('Degree of Rotation'), ylabel('Shear Stress'),title('Shear Stress Based on Rotation')
grid on, axis equal
