Hi,
I am trying to use readtable to read in a .dat file. The file looks like this, where there can be anywhere from one to very many data rows per block (the rows that start with a "1" here).
# NetMHCIIpan version 4.0
# Input is in PEPTIDE format
# Prediction Mode: EL+BA
# Threshold for Strong binding peptides (%Rank) 2%
# Threshold for Weak binding peptides (%Rank) 10%
# Allele: HLA-DPA10103-DPB10101
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10101 AAAAAAAAAAAAAAA 3 AAAAAAAAA 0.380 Sequence 0.020745 81.44 NA 0.366182 951.24 32.45
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
# Allele: HLA-DPA10103-DPB10201
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10201 BBBBBBBBBBBBBBBB 2 BBBBBBBBB 0.960 Sequence 0.491911 1.02 NA 0.712020 22.55 0.27 <=SB
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
# Allele: HLA-DPA10103-DPB10202
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1[.......]
These rows would then start with 2, 3, 4, […]. I successfully use
opts = detectImportOptions('filename.dat');
opts.DataLines = [16 Inf];
opts.VariableNamesLine = 14;
readtable(fullfile('path','filename.dat'), opts, 'ReadVariableNames', true);
for files with a large number of rows between the dashed lines, e.g.
# Allele: HLA-DPA10103-DPB10101
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10101 AAAAAAAAAAAAAAA 3 AAAAAAAAA 0.380 Sequence 0.020745 81.44 NA 0.366182 951.24 32.45
2 HLA-....
3 ....
....
....
50 HLA....
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
However, this does not work for short "fillings" and my code very much depends on being robust in either scenario.
I tried playing with the opts but did not get it to work. I would be very grateful for any advice! Maybe a method other than readtable (readtext?) is needed and then a conversion to a table? In the end I will need a table like this:
Thank you very much for your advice! I have spent a long time developing the code around this, and this is the final part that keeps breaking…
Best Answer
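One possible approach, given as a rough sketch under stated assumptions rather than a definitive solution, is to avoid fixed line numbers altogether: read the file as text, keep only the lines that look like data rows, and build the table from those. The sketch below assumes that every data row starts with an integer position followed by an allele name beginning with "HLA-", that fields are separated by whitespace, and that BindLevel may be missing at the end of a row. The variable names in `names` are hypothetical, valid-identifier versions of the header fields (for example `Rank_EL` for `%Rank_EL`), and the small sample file is written to a temporary location only to make the example runnable.

```matlab
% Write a small sample file (rows taken from the question) so the sketch is runnable.
dash   = repmat('-', 1, 140);
header = 'Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel';
sample = {
    '# Allele: HLA-DPA10103-DPB10101'
    dash
    header
    dash
    '1 HLA-DPA10103-DPB10101 AAAAAAAAAAAAAAA 3 AAAAAAAAA 0.380 Sequence 0.020745 81.44 NA 0.366182 951.24 32.45'
    dash
    'Number of strong binders: 2 Number of weak binders: 0'
    dash
    '# Allele: HLA-DPA10103-DPB10201'
    dash
    header
    dash
    '1 HLA-DPA10103-DPB10201 BBBBBBBBBBBBBBBB 2 BBBBBBBBB 0.960 Sequence 0.491911 1.02 NA 0.712020 22.55 0.27 <=SB'
    dash
    };
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Read the whole file as text and split it into trimmed lines.
lines = strtrim(splitlines(string(fileread(filename))));

% Assumption: a data row is "<integer> <whitespace> HLA-...".
isData = ~cellfun(@isempty, regexp(cellstr(lines), '^\d+\s+HLA-', 'once'));
rows = lines(isData);

% Hypothetical column names (valid identifiers derived from the header line).
names = ["Pos","MHC","Peptide","Of","Core","Core_Rel","Identity","Score_EL", ...
         "Rank_EL","Exp_Bind","Score_BA","Affinity_nM","Rank_BA","BindLevel"];

% Split each row on whitespace; missing trailing fields stay as empty strings.
C = strings(numel(rows), numel(names));
for k = 1:numel(rows)
    tok = string(regexp(char(rows(k)), '\s+', 'split'));
    n = min(numel(tok), numel(names));
    C(k, 1:n) = tok(1:n);
end
T = array2table(C, 'VariableNames', names);

% Convert the columns that hold numbers; the rest remain strings.
numericVars = ["Pos","Of","Core_Rel","Score_EL","Rank_EL","Score_BA","Affinity_nM","Rank_BA"];
for v = numericVars
    T.(v) = str2double(T.(v));
end

delete(filename);
```

Because rows are selected by pattern instead of by position, the same code handles blocks with a single row and blocks with many rows. All alleles end up in one table, and the MHC column identifies which block each row came from.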