Nw = (I+1)*H+(H+1)*O = (3+1)*6+(6+1)*1 = 31
whereas with N = 8 data points and O = 1 output node,
the number of training equations is no larger than
Ntrneq = Ntrn*1 <= N*1 = 8
2. Using a validation subset to mitigate overtraining
an overfit net would result in BOTH Nval and Ntrn
being inadequately small.
3. Other well known remedies (which can be combined)
a. Regularization: Add a constant times the sum of
weight magnitudes to the performance function
b. Add simulated noisy data about each original
data point within a sphere of radius of 0.5.
4. However, before getting all fancy, just try to
reduce the number of unknown weights by reducing the
number of hidden nodes.
Hope this helps,
*Thank you for formally accepting my answer*
Greg
Best Answer