MATLAB: CONFIDENCE INTERVAL FOR A GUMBEL DISTRIBUTION

confidence intervalconfidenceintervalevfitextreme valueextremevaluegumbel

I have a vector of waves' height values sorted descendly and I've used a Gumbel distribution (extreme value distribution) in order to fit them. I need to determinate the equtions of the two lines that delitmitate the confidence interval but i don't know how to estimate te CI for each value of my distribution. I've tried in this way but i'm sure it's wrong.
Tr_Gumbel=[1.0001:0.1:2 2.5:1:500];%return time vector
P_Gumbel=1-1./Tr_Gumbel; %CDF for Gumbel
H_Gumbel=epsilon+teta*(-log(-log(P_Gumbel))); %Equation of the model that describes waves' height
figure
semilogx(Tremp,campioneHm0,'k.','markersize',10);xlabel('T_r [anni]','FontWeight','bold');
ylabel('Hm_{0} [m]','FontWeight','bold');title('ANALISI DEGLI ESTREMI');grid on;
hold on
semilogx(Tr_Gumbel,H_Gumbel,'r-');
xticks([0 2 5 10 20 30 50 100 250 500]);
legend('Hm_{0} osservata',['Gumbel \theta=' num2str(teta) ' \epsilon= ' num2str(epsilon)],'FontWeight','bold','Color','w','Location','southeast')
c1=0.64;c2=9;c3=0.93;c4=0;c5=1.33;
v=1;%quanto vale questo parametro ???
s_z=sqrt((1+c1*(H_Gumbel-c4+c5*log(v)^2*exp(c2/N^(4/3)+c3*sqrt((-log(v)))))/(N)));
s_tr=varianza^(0.5)*s_z;
curva_up=H_Gumbel+1.96*s_tr;
curva_down=H_Gumbel-1.96*s_tr;
figure
semilogx(Tremp,campioneHm0,'k.','markersize',10);xlabel('T_r [anni]','FontWeight','bold');
ylabel('Hm_{0} [m]','FontWeight','bold');title('STIMA DELLA CONFIDENZA AL 90%');grid on;
hold on
semilogx(Tr_Gumbel,H_Gumbel,'r-');
semilogx(Tr_Gumbel,curva_up,'b--');
semilogx(Tr_Gumbel,curva_down,'b--');

Best Answer

I checked out the plot using the data you provided.
Let me clarify something: the confidence intervals (CI) you're computing are the confidence intervals of the parameter estimates, not the CI of your data. By producing two additional curves using the upper and lower bounds of the CIs, you're showing the range of parameter values.
The two parameters are the 1) the location parameter (mu) and 2) the scale parameter (sigma). The plot below shows the 95% CI range for mu values while using the sigma vlaues from your fit. It doesn't make sense to arbitrarily combine the mu and sigma CI values.
[parmhat,parmci] = evfit(-campioneHm0,0.05)
teta1=parmhat(2);
epsilon1=-parmhat(1)
H_Gumbel1=(epsilon1+teta1*(-log(-log(P_Gumbel))));
teta2=parmhat(2); %teta2=parmci(1,2); % <------
epsilon2=-parmci(1,1);
H_Gumbel2=(epsilon2+teta2*(-log(-log(P_Gumbel))));
teta3=parmhat(2); %teta3=parmci(2,2); % <------
epsilon3=-parmci(2,1);
H_Gumbel3=(epsilon3+teta3*(-log(-log(P_Gumbel))));
figure
semilogx(Tremp,campioneHm0,'k.','markersize',10);xlabel('T_r [anni]','FontWeight','bold');
ylabel('Hm_{0} [m]','FontWeight','bold');
title('STIMA DELLA CONFIDENZA AL 90%');
grid on;
hold on
% semilogx(Tr_Gumbel,H_Gumbel,'r-');
semilogx(Tr_Gumbel,H_Gumbel1,'k--');
semilogx(Tr_Gumbel,H_Gumbel2,'g--');
semilogx(Tr_Gumbel,H_Gumbel3,'b--');
If you want to show the full range of curves using both the mu and sigma CI ranges, run the function on several combinations of parameters and plot the max and min values. These lines are not the upper and lower bounds of the underlying population of data. They are the upper and lower bounds of the 95% confidence intervals of your parameter estimates. In other words, the true gumbel function line for the given data is somewhere between the blue and cyan lines give the CIs.
[mu, sigma] = meshgrid(linspace(-parmci(1,1),-parmci(2,1), 10), linspace(parmci(1,2),parmci(2,2), 10));
allValues = nan(numel(mu), numel(P_Gumbel));
for i = 1:numel(mu)
allValues(i,:)=(mu(i)+sigma(i)*(-log(-log(P_Gumbel))));
end
upperBound = max(allValues,[],1);
lowerBound = min(allValues,[],1);
figure
semilogx(Tremp,campioneHm0,'k.','markersize',10);xlabel('T_r [anni]','FontWeight','bold');
hold on
plot(Tr_Gumbel, lowerBound, 'b-')
plot(Tr_Gumbel, upperBound, 'c-')