MATLAB: Storing words

cell array

I have made a few changes, however my program were not able to to store the processed words into a cell array, which leads to an error for execution of the next line of codes:
??? Subscript indices must either be real positive integers or logicals.
Error in ==> WordSearchtryout2>loadWordBank at 170
if strcmp(wordbank{iword}(end), 's')
Error in ==> WordSearchtryout2 at 59
loadWordBank();
I understand that this code might be super long, and you might not have any time to check any of it, but if you do, I would totally appreciate it.
function loadWordBank
clc;
[FILENAME, pathname] = uigetfile('*.wsb','Read Matlab Code File');
if isequal(FILENAME,0) || isequal(pathname,0)
fprintf('User pressed cancel');
else
fprintf('User selected %s \n', FILENAME);
end
fid = fopen(FILENAME,'rt');
if fid<0
%error could not find the file
return,
end
total_no_words=0;
lineNUM=1;
wordbank=cell(250000,1);
no_plurals=0;
while ~feof(fid)
tline = fgetl(fid);
if(isempty(tline))
%line is empty, skip it
continue;
end
if(~ischar(tline))
%if line does not contain character
fclose(fid);
break;
end
%we finally have a string
tline=strtrim(tline);
if(sum(isspace(tline)))==length(tline)
continue;
end
is_letter=isletter(tline);
is_space=isspace(tline);
checkcount=sum(is_letter+is_space);
if checkcount~=length(tline)
% invalid line in puzzle
return;
end
%tline only contain spaces and letters
if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
total_no_words=total_no_words+1;
wordbank(total_no_words,1)={tline};
end
lineNUM=lineNUM+1;
end
wordbank2=wordbank;
for iword=1:length(wordbank)
if strcmp(wordbank{iword}(end), 's')
no_plurals=no_plurals+1;
wordbank2{iword} = wordbank{iword}(1:(end-1));
end
end
inv_words=line_NUM-length(wordbank);
% this include empty lines, lower upper case letters, as required
no_dup=length(wordbank2)-length(unique(wordbank2));
fprintf('LOAD WORD BANK \n');
fprintf('Loading word bank: none....started\n');
fprintf('Loading word bank: %s\b\b\b\n',FILENAME);
fprintf('Successfully loaded %d words from the word bank file\n',total_no_words);
fprintf('Removing invalid words..%d words were successfully removed. \n',inv_words); %not complete
fprintf('Removing duplicate words and sorting...done\n');
fprintf('Removed %d duplicate words\n',no_dup);
fprintf('Searching for and removing any plural forms of words ending in S:%%\n');
fprintf('Removed %d plural word \n',no_plurals);
fprintf('Building word indices and calculating beginning letter counts...done\n');
fprintf('Calculating word length counts...done\n');
fprintf('Final word count: %d \n');
end

Best Answer

Your line
for iword=1:length(wordbank)
should be
for iword=1:total_no_words
as you do not want to be trying to de-pluralize words that were never stored. length(wordbank) is going to be set by you defining it as a cell array with 25000 entries.
As an optimization, you can replace your checkcount lines that currently say
checkcount=sum(is_letter+is_space);
if checkcount~=length(tline)
with
if ~all(is_letter | is_space)
And after that your line
if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
could be optimized to
if strcmp(tline,lower(tline)) || strcmp(tline,upper(tline))
However! This line checks that time is all in upper-case or all in lower-case and will not store any line that is in mixed-case. Your all-upper-case lines will be stored in upper-case. I suspect you do not intend either of these behaviours, and I cannot tell what you are wanting to check with that "if" statement. I suspect you should be using
tline = lower(tline);
and then storing that unconditionally.