MATLAB: How to create an N*1 matrix for n individual fixed effects under unbalanced panel data

unbalanced, vectorization

I have a panel data set for individuals i=1,2,…,n. The panel is unbalanced, so individual i shows up in the data Ti times, for a total of N=T1+T2+…+Tn observations. I also have an n*1 matrix of individual fixed effects, i.e. A=[theta1, theta2, …, theta_n]'. I want to create an N*1 matrix B of individual fixed effects that lines up with the original panel.
For example, if T1=3, T2=1, T3=2,…, my goal is to create B=[theta1, theta1, theta1, theta2, theta3, theta3, …]'. I can create B using a loop, but n is too large for a loop to be practical. Is there an efficient vectorized way to do this?
Thank you very much for your help.

Best Answer

I whipped up this code. I hope it does what you were thinking:
tic
n = 10000000; % Number of individuals (persons).
% Generate a random number of observations for each of the n persons.
% Each person may have up to 5 observations (measurements).
Ti = randi(5, [1 n]);
% Total number of observations, so we know how big B needs to be.
% (Remove the semicolon to print it.)
T = sum(Ti);
% Generate the A matrix.
% Let's just have it be 10 through 10*n in steps of 10
A = 10 : 10 : 10*n;
% Make up the B matrix where each element of A
% is replicated Ti times.
% Preallocate B
B = zeros(1, T);
% Brute force loop
index = 1; % Let's start at the beginning.
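for k = 1 : n
    B(index : index + Ti(k) - 1) = A(k); % Fill person k's block with A(k)
    index = index + Ti(k);               % Move to the start of the next block
end
B; % Remove the semicolon to display B (only practical for small n)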
toc
Here's what it does for an n of 10:
Ti =
3 1 1 2 5 4 1 1 1 4
T =
23
B =
Columns 1 through 14
10 10 10 20 30 40 40 50 50 50 50 50 60 60
Columns 15 through 23
60 60 70 80 90 100 100 100 100
Elapsed time is 0.001806 seconds.
When n was 10 million, it took 10.6 seconds.
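If the loop is too slow, here is a minimal vectorized sketch of the same replication. It assumes MATLAB R2015a or later for repelem, and it assumes every Ti is at least 1 (which holds for a panel, since each individual appears at least once). The values in A are made-up stand-ins for theta1..theta3, and the names starts, marks, and idx are just illustrative. The second option is a cumsum-based fallback for releases without repelem. I use column vectors here so that B comes out N*1, as in the question. I have not timed either option against the loop above.

```matlab
% Small example matching the question: T1 = 3, T2 = 1, T3 = 2.
Ti = [3; 1; 2];      % observations per individual (assumed all >= 1)
A  = [10; 20; 30];   % made-up stand-ins for theta1..theta3

% Option 1: repelem repeats A(k) exactly Ti(k) times (R2015a or later).
B1 = repelem(A, Ti);

% Option 2: cumsum-based indexing, for releases without repelem.
% Mark the first row of each individual's block, then accumulate the
% marks to get the individual index of every observation.
N      = sum(Ti);
starts = cumsum([1; Ti(1:end-1)]);   % first row of each block: [1 4 5]'
marks  = zeros(N, 1);
marks(starts) = 1;
idx    = cumsum(marks);              % [1 1 1 2 3 3]'
B2     = A(idx);

% Both options give B = [10 10 10 20 30 30]'
disp(isequal(B1, B2))
```

With the real data, replace Ti and A with the actual counts and fixed effects. The vector idx from the second option is also handy on its own, since it tells you which individual each row of the panel belongs to.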