@gung is right. You want the variables to be correlated. The idea of PCA is to reduce the dimension of the input variables. You have 10; if the first 2 PC's take up most of the variation, you can reduce your input variables to 2. But this is only possible when the variables are correlated.
And that being the case, why not throw the original waveform into the PCA? I assume that you have values for your waveform. Then see if the PCA pulls out some features of interest - such as amplitude or slope - rather than compute them directly.
A good reference is Ramsay's Functional Data Analysis with R and Matlab
With 500 patients, you are well positioned to look at your waveforms through that sort of analysis.
The other methods that you mention could be helpful - also growth curve analysis - but I would need to know what your research question is to advise further.
An important question is to evaluate how consistent the principal axes are across individual teachers. If they are mostly consistent, you can just use PCA on the whole dataset (ignoring teacher segregation) and be happy. If it happens that different teachers have fundamentally different axes of variance, then you likely have to reconsider the entire approach.
So the question I will try to answer is how to test if PCA are consistent across individuals. This is a solution I used some time ago, I have no guarantees that it is in any way optimal.
Step 1: Decide on the relative amount of explained variance $\alpha$ (e.g. 90%) that your PCA should cover
Step 2: For one teacher $i$, find the first $n$ PCA's that explain $\alpha$ variance. This will give you the eigenvalues $e_i$ and eigenvectors $V_i$. Importantly, the eigenvectors form the mapping that projects original data $x_i$ onto PCA space $\psi_{ii}$
$$\psi_{ii} = V_i^T x_i$$
Step 3: What we want to do is to calculate for another teacher $j$, how much of its variance would be explained by using the PCA space of teacher $i$. The prediction for original data given its PCA is
$$\hat{x}_{ii} = V_i\psi_{ii} = V_iV_i^Tx_i$$
and the relative explained variance is then
$$l^2_{ii} = \frac{||\hat{x}_{ii} - x_i||}{||x_i||} $$
where $||\cdot||$ is a 2-norm (aka euclidean distance). The same principle can be used to project the data of teacher $j$ onto basis of $i$
$$\hat{x}_{ij} = V_i \psi_{ij} = V_i V_i^T x_j$$
and the relative explained variance
$$l^2_{ij} = \frac{||\hat{x}_{ij} - x_j||}{||x_j||} $$
Step 4: Finally, we can repeat this procedure to obtain the matrix $l^2_{ij}$ for every pair of teachers $ij$. This matrix can be used to judge the extent to which a common PCA would capture individual variance, and find out detailed information on which clusters of teachers use similar variance axis, and which are outliers.
Best Answer
user11852: your suggestion of doing a phylogenetic PCA worked beautifully thank you!
Interesting that the REvell paper and corresponding R package discuss this technique only in terms of correcting for relatedness when comparing traits among species. Seems to be much more widely applicable than this - ie works for nested data in general.
Edit: giving an example of using the approach, as asked for in @nan's comment below:
First, an explanation of the data. My dataframe is called 'Rat', and it consists of rat capture rates and vegetation variables measured in two forest patches, 'B86Grazed' and 'LittleTutu'. Variables were measured at multiple sites within each patch, and my individual sites were named 'LittleTutu-1', 'LittleTutu-2' etc. My aim was to create PCA axes from the vegetation variables that were uncorrelated once the nesting of sites within patches was taken into account. For example, modelling axis1 as a function of axis2 while specifying patch identity as a random factor would find no significant effect of axis2. [Note also that I wouldnt be fitting a random-effect with only two patches; I've simplified the dataset for the example].
This is the code I used:
This approach produced axes that were correlated with my response variable (rat capture rates) but which were uncorrelated with each other when nesting of sites within patches was taken into account. Note that I haven't been able to make this approach work when the data contains 'singleton' patches (i.e. patches with only one site). Note also that the 'read.tree' command removes whitespace from the tree - so best not to have spaces in patch or site names.
Lastly, when I did this phylogenetic PCA it was for a specific application which required uncorrelated axes after nesting was accounted for. Whether this approach should be done any time a PCA is done on nested data is another question (which I'd be interested to hear people's thoughts on).