MATLAB: Were the ProbeSetNumber and ProbePairNumber columns removed from structs returned by AFFYREAD in MATLAB 7.5 (R2007b)?

affyread, Bioinformatics Toolbox, structure, structures

In structs returned by the Bioinformatics Toolbox V3.0 function AFFYREAD in MATLAB 7.5 (R2007b), the columns named ProbeSetNumber and ProbePairNumber (of the ProbeSets field) were replaced by columns named GroupNumber and Direction. This change applies only when using AFFYREAD to read CDF files.
I want to recover the ProbeSetNumber and ProbePairNumber values from the new struct format.

Best Answer

The structure returned by AFFYREAD changed in Bioinformatics Toolbox V3.0, the toolbox version released with MATLAB 7.5 (R2007b).
The new GroupNumber column gives the same information as the ProbeSetNumber column did, except that GroupNumber is 1-based while ProbeSetNumber was 0-based (in accordance with Affymetrix standards), so the old value is GroupNumber minus 1.
ProbePairNumber, however, was removed because the information it provided was considered redundant: the probe pair numbers are simply the row indices of the ProbePairs matrices contained in the ProbeSets field of the returned struct (each row holds the readings of one probe pair).
The probe pair numbers can therefore be computed from the ProbePairs matrices and, if desired, inserted back into the returned struct. The following code does this:
for n = 1:numel(affy_struct.ProbeSets)
    % Extract the ProbePairs matrix of the current probe set
    CurPairs = affy_struct.ProbeSets(n).ProbePairs;
    % Generate the probe pair numbers (row indices) as a column vector.
    % size(CurPairs,1) counts rows; length(CurPairs) would return the number
    % of columns for a probe set with fewer rows than columns.
    CurPairNumbers = (1:size(CurPairs,1))';
    % Insert the probe pair numbers as the second column
    CurPairs = [CurPairs(:,1) CurPairNumbers CurPairs(:,2:end)];
    % Reassign to ProbePairs
    affy_struct.ProbeSets(n).ProbePairs = CurPairs;
end
where affy_struct is the struct obtained by reading a CDF file with AFFYREAD.
To keep the column names consistent with the new column, you can then also insert 'ProbePairNumber' into affy_struct.ProbeSetColumnNames by executing the following code:
CurNames = affy_struct.ProbeSetColumnNames;
CurNames = {CurNames{1} 'ProbePairNumber' CurNames{2:end}};
affy_struct.ProbeSetColumnNames = CurNames;
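
The old 0-based ProbeSetNumber values can be recovered in a similar way. The following is only a minimal sketch: it assumes the relationship described above (ProbeSetNumber equals GroupNumber minus 1) and builds a small mock struct in place of real AFFYREAD output. The numeric values are made up, and the column names other than GroupNumber and Direction are assumptions for illustration. ProbeSetNumbers is a hypothetical variable name.

```matlab
% Mock stand-in for the struct returned by AFFYREAD on a CDF file.
% All numeric values are made up; column names other than GroupNumber
% and Direction are assumed for illustration only.
affy_struct.ProbeSetColumnNames = {'GroupNumber', 'Direction', ...
    'PMPosX', 'PMPosY', 'MMPosX', 'MMPosY'};
affy_struct.ProbeSets(1).ProbePairs = [1 1 10 20 10 21; ...
                                       1 1 11 30 11 31];
affy_struct.ProbeSets(2).ProbePairs = [1 1  5  6  5  7; ...
                                       1 1  8  9  8 10; ...
                                       2 1 12 13 12 14];

% Locate the GroupNumber column by name rather than by position.
groupCol = find(strcmp(affy_struct.ProbeSetColumnNames, 'GroupNumber'), 1);

% Recover the 0-based ProbeSetNumber values for each probe set
% (hypothetical output variable, one cell per probe set).
ProbeSetNumbers = cell(1, numel(affy_struct.ProbeSets));
for n = 1:numel(affy_struct.ProbeSets)
    CurPairs = affy_struct.ProbeSets(n).ProbePairs;
    ProbeSetNumbers{n} = CurPairs(:, groupCol) - 1;
end
```

Looking the column up by name means the sketch keeps working whether or not a ProbePairNumber column has been inserted, provided ProbeSetColumnNames has been kept in step with the ProbePairs columns.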