Solved – Function approximation using multilayer perceptron (neural network)

approximation, machine learning, neural networks, perceptron

I've been asked to solve a problem for a project. I'm working in Python or R. I need to approximate a function with a multilayer perceptron (neural network).

The function is: $y = 2\cos(x) + 4$ on the interval $[0,10]$.

After reading a lot about perceptrons and neural networks for function approximation, I found a piece of code that helped me a lot, and my program is based on it.
Here is the code (with comments) I use to approximate my function, followed by my questions and examples:

# Create a sorted vector of 50 random values between 0 and 10
x <- sort(10*runif(50))

# Generate data from the function we want to approximate
y <- 2*cos(x)+4 

# Generate a grid of 101 values from 0 to 10, used to draw both the true function and its neural-network (multilayer perceptron) approximation
x1 <- seq(0, 10, by=0.1)

# True function values (to draw the curve of the function we want to approximate)
f <- 2*cos(x1)+4

# Load the nnet library (to fit the neural network)
library(nnet)

# Fit a neural network to the generated data (random points of the function we want to approximate),
# using one hidden layer with 6 neurons and 40 iterations
nn <- nnet(x, y, size=6, maxit=40, linout=TRUE)

# Plot the 50 randomly generated points of our function
plot(x, y)

# Curve of the true function (2*cos(x)+4)
lines(x1, f)

# Approximated curve, from the neural network's predictions
lines(x1, predict(nn, data.frame(x=x1)), col="green") 

A few questions

1) In the example I found, we generate 50 points (the vector x), but why 50? If I use 20 or 1000, the results vary.

2) How do I choose the number of neurons? I just tried a lot of different numbers, but I don't have a rule.

3) How do I choose the number of iterations the neural network uses (maxit=40 here)? Should I use other parameters of nnet? I'm a bit lost.

4) I use one hidden layer, since most function-approximation examples I've seen use only one, but I don't really know why.

5) When I create the vector x1, I don't know how to choose the number of elements (here 101, but what if I choose 20 or 1000)?

6) When I run my code, the results change: sometimes the approximation is quite precise, and sometimes it just goes wrong from a certain point of the curve, and I don't really know why.

I'm also interested in doing this in Python, but it seems more complicated to implement.
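For the Python side, the same experiment can be sketched with scikit-learn's MLPRegressor (this is my assumption for a Python analogue, not part of the original code; the choices below mirror the R call `nnet(x, y, size=6, maxit=40, linout=TRUE)`, except for a larger iteration budget and the lbfgs solver, which behaves well on tiny datasets):

```python
# Python sketch of the R code above, using scikit-learn's MLPRegressor.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)     # fixed seed so the run is reproducible
x = np.sort(10 * rng.random(50))   # 50 random points in [0, 10], like sort(10*runif(50))
y = 2 * np.cos(x) + 4              # training targets from the true function

# One hidden layer with 6 neurons, like nnet(size=6); tanh is a
# sigmoid-style activation, and lbfgs suits small datasets.
nn = MLPRegressor(hidden_layer_sizes=(6,), activation="tanh",
                  solver="lbfgs", max_iter=2000, random_state=0)
nn.fit(x.reshape(-1, 1), y)        # scikit-learn expects a 2-D input array

x1 = np.linspace(0, 10, 101)       # plotting grid, like seq(0, 10, by=0.1)
pred = nn.predict(x1.reshape(-1, 1))
```

Plotting then works as in R, e.g. with matplotlib: `plt.plot(x1, 2*np.cos(x1)+4)` for the true curve and `plt.plot(x1, pred, color="green")` for the approximation.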

Example 1: a good approximation

green line: the approximated function (multilayer perceptron)

dark line: the true function (2*cos(x)+4)

dots: the points generated from the function (y) that we use to build the model

Example 2 : a bad approximation (with the same code)

Best Answer

First timer here. I feel a little jumpy about it. Let's go:

Setting network hyperparameters (number of neurons, learning rate, etc.) is an open field of research in ML, so there is no formula. Instead, follow these tips:

1) There is a trade-off here: fewer neurons means less approximation capacity for your network, while more training samples means more training, which makes it harder to see the network architecture's features and flaws.

2) You need roughly as many neurons as the function has curves, because each neuron contributes one sigmoid to the plot. You will get better results with more neurons, but the improvement shrinks as you add them, so don't use more network parameters (more neurons implies more weights and biases) than you have training samples. That gives you a range of about 2 to 7 neurons.
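To see this tip in practice, you can sweep the hidden-layer size and watch the training error flatten out. A sketch in Python with scikit-learn's MLPRegressor (my assumption for illustration; the same sweep works in R by looping over nnet's size argument):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
x = np.sort(10 * rng.random(50)).reshape(-1, 1)  # 50 samples in [0, 10]
y = 2 * np.cos(x.ravel()) + 4

# Train the same network with 1..8 hidden neurons and record the training
# RMSE: it usually drops sharply up to about 4-6 neurons, then flattens.
results = {}
for size in range(1, 9):
    nn = MLPRegressor(hidden_layer_sizes=(size,), activation="tanh",
                      solver="lbfgs", max_iter=2000, random_state=0)
    nn.fit(x, y)
    results[size] = np.sqrt(np.mean((nn.predict(x) - y) ** 2))
    print(f"{size} neurons: RMSE = {results[size]:.3f}")
```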

3) You need enough training to get a good approximation most of the time, without spending too much time on training. Try increasing the number of epochs.
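Point 3 can be checked the same way by varying only the iteration budget (the analogue of nnet's maxit). Again a Python/scikit-learn sketch, under the same assumptions as above:

```python
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

warnings.simplefilter("ignore", ConvergenceWarning)  # tiny budgets won't converge

rng = np.random.default_rng(2)
x = np.sort(10 * rng.random(50)).reshape(-1, 1)
y = 2 * np.cos(x.ravel()) + 4

# Same 6-neuron architecture with increasing iteration budgets: the fit
# improves until the optimizer converges, then stops changing.
results = {}
for budget in (5, 20, 100, 1000):
    nn = MLPRegressor(hidden_layer_sizes=(6,), activation="tanh",
                      solver="lbfgs", max_iter=budget, random_state=0)
    nn.fit(x, y)
    results[budget] = np.sqrt(np.mean((nn.predict(x) - y) ** 2))
    print(f"max_iter={budget:4d}: RMSE = {results[budget]:.3f}")
```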

4) Each layer draws one curve (called a decision boundary) on the graph, and you have just one curve to approximate. More layers are used for other kinds of problems, such as classification.

5) These are your training samples: as in 1), more training samples increase the network's performance, improving the approximation. There is a relationship between the number of samples and the number of epochs: you can decrease the training epochs as you increase the training samples.

6) You are using random values (both the sampled x and the initial weights), which affects the network's behavior. To mitigate this, try changing the network parameters as in 1) and 3). With the right parameters, the function will be approximated.
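For point 6, the usual remedies are fixing the seed (set.seed(...) before nnet(...) in R) or training several restarts and keeping the best one. A sketch of the restart approach in Python with scikit-learn (my assumption, mirroring the thread's R setup):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
x = np.sort(10 * rng.random(50)).reshape(-1, 1)
y = 2 * np.cos(x.ravel()) + 4

# Each restart draws different initial weights; keep the fit with the
# lowest training RMSE instead of trusting a single random run.
best_rmse, best_nn = np.inf, None
for seed in range(5):
    nn = MLPRegressor(hidden_layer_sizes=(6,), activation="tanh",
                      solver="lbfgs", max_iter=2000, random_state=seed)
    nn.fit(x, y)
    rmse = np.sqrt(np.mean((nn.predict(x) - y) ** 2))
    if rmse < best_rmse:
        best_rmse, best_nn = rmse, nn
print(f"best of 5 restarts: RMSE = {best_rmse:.3f}")
```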

That's all. Cheers!
