It is VERY rare to need more than one hidden layer. MATLAB's implementation of the Elman network (elmannet) uses a poorer approximation of the time-series network gradients, so adding more layers just makes things worse. From the help documentation:
>> help elmannet
elmannet Elman neural network.
Elman networks are provided for historical interest. For much better
results use narxnet, timedelaynet, or distdelaynet.
Elman networks with two (or more) layers can learn any dynamic
input-output relationship arbitrarily well given enough hidden
neurons and enough input and layer delays. However, Elman networks
use static derivative calculations instead of full dynamic calculations.
This results in a trade off of reduced training calculations, but the risk of poorer accuracy.
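For what it's worth, here is a minimal sketch of what the help text suggests: a single hidden layer with narxnet, next to an elmannet of the same size. It assumes the Neural Network (now Deep Learning) Toolbox is installed. The data set (simpleseries_dataset), the delays 1:2, and the 10 hidden neurons are illustrative assumptions of mine, not tuned values and not taken from the original question.

```matlab
% Illustrative sketch only: data set, delays, and layer size are assumptions.
[X, T] = simpleseries_dataset;      % example time series shipped with the toolbox

% NARX network with ONE hidden layer, as the help text recommends
net = narxnet(1:2, 1:2, 10);        % input delays, feedback delays, hidden neurons
[Xs, Xi, Ai, Ts] = preparets(net, X, {}, T);   % shift data to fill the delay states
net = train(net, Xs, Ts, Xi, Ai);
perfNarx = perform(net, Ts, net(Xs, Xi, Ai));  % MSE with the default settings

% Elman network with the same single hidden layer, for comparison
enet = elmannet(1:2, 10);           % layer delays, hidden neurons
[Es, Ei, Eai, Ets] = preparets(enet, X, T);
enet = train(enet, Es, Ets, Ei, Eai);
perfElman = perform(enet, Ets, enet(Es, Ei, Eai));
```

Keep in mind that the default (open-loop) narxnet is fed past target values as inputs, so the two performance numbers are not a strictly like-for-like comparison. Results also vary from run to run because of the random weight initialization and data division.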
Hope this helps.
Thank you for formally accepting my answer
Greg