MATLAB: Matlab Binary Parsing and Type Casting Speedup

I am looking for some help on speeding up binary processing I am doing and/or general guidelines. I also would like to know if MEX and a working knowledge of C/C++ would net me anything.
I have read in the entire file as uint8. The data files can be hundreds of MB of assorted data types. Some locations remain uint8; others need to be interpreted as int16, uint16, int32, uint32, etc. The majority of the data are 16-bit signed audio samples, which I pick out, cast to int16, and add to an array.
Here is my file read:
a = uint8(fread(fid,(512*length_blocks),'uint8=>uint8'));
Here is my loop pulling out audio samples and casting; k is the byte index I'm using. I was using concatenation but changed to a preallocated audio_segment_block_r with an index, hoping for a speedup. Below I show the code and the profiler results. This is for the left samples. I do this all again for the right samples in another loop, and the profiler shows the same results for that channel's extraction. In the profiler run below I process 24 MB of data. It takes about 5 minutes, which is too long for me.
for m = begin_sample:double(AUDIO_WORD_BYTES*NUM_ACTIVE_MICS_ACTIVE):end_sample
    segment = a(m:m+AUDIO_WORD_BYTES-1);
    segment = typecast(segment,'int16');
    audio_segment_block_r(block_r_i) = segment;
    block_r_i = block_r_i + 1;
end
5769019 calls. 14% run time on
segment = typecast(segment,'int16');
5769019 calls. 13% run time on
audio_segment_block_r(block_r_i) = segment;
5769019 calls. 9.4% run time on
segment = a(m:m+AUDIO_WORD_BYTES-1);

Best Answer

One issue is that the lines you highlight are basically data copying: you copy the data to pick out the segment, copy it again to typecast it, and copy it a third time to put it into your result. You could boil all of this down to a single data copy if you coded it in a mex routine. For large variables this could amount to significant time and resource savings. As an example, the MATLAB typecast function does a data copy, but this mex version does a shared data copy (much faster):
Why do you have a uint8 conversion on your fread result? Doesn't the 'uint8=>uint8' already give the result as uint8?
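Short of writing a mex routine, the number of copies can also be cut down in plain MATLAB by indexing and typecasting a whole channel at once instead of one sample per loop iteration. The following is only a rough sketch of that idea, not tested against your files: the buffer a is a small synthetic stand-in for the fread result, NUM_ACTIVE_MICS and the concrete values are assumptions for illustration, and it relies on the samples being stored in the machine's native byte order (the same assumption typecast makes in your loop).

```matlab
% Hypothetical stand-in values; the real ones come from your file format.
AUDIO_WORD_BYTES = 2;                 % bytes per 16-bit sample
NUM_ACTIVE_MICS  = 2;                 % interleaved channels (assumed)
stride = AUDIO_WORD_BYTES * NUM_ACTIVE_MICS;

% Synthetic buffer in place of the fread result: 4 frames of
% [ch1 ch2] int16 samples, viewed as raw bytes.
samples = int16([1 -1 2 -2 3 -3 300 -300]);
a = typecast(samples, 'uint8');
a = a(:);                             % column vector of uint8

begin_sample = 1;                     % first byte of the first ch1 sample
end_sample   = numel(a) - stride + 1; % first byte of the last ch1 sample

% Index of the first byte of every sample belonging to this channel.
starts = begin_sample:stride:end_sample;

% AUDIO_WORD_BYTES-by-N matrix of byte indices, one column per sample.
idx = bsxfun(@plus, (0:AUDIO_WORD_BYTES-1).', starts);

% One indexed copy and one typecast for the whole channel.
% reshape makes the bytes a column vector, which typecast expects.
audio_segment_block = typecast(reshape(a(idx), [], 1), 'int16');
% audio_segment_block is now int16([1; 2; 3; 300])
```

Under the same assumptions, the other channel would use starts + AUDIO_WORD_BYTES as its starting indices. This still copies the data (once for the indexing, once inside typecast), so a mex routine can do better, but it removes the millions of per-sample function calls.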