Solved – the role of basis functions in reinforcement learning

machine learning, reinforcement learning

In the very simple examples of reinforcement learning (gridworld, mountain car), we can represent the value function exactly, as a table of real numbers with one entry per state, or with some elementary function.

When state spaces become larger and larger, and eventually continuous, enumerating the states becomes computationally prohibitive. This is where the idea of function approximation comes in: instead of one value per state, we use features of the state to define an approximate value function.

I have always thought of 'features' (from the word itself) as qualities I can measure. For example: how far the agent is from a certain obstacle, or how far the agent is from the goal position. But I have never seen this in examples or sample code. In Sutton and Barto's book, radial basis functions are discussed as features.
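In code, the kind of features I mean might look like the following sketch. The gridworld, the goal and obstacle positions, and the weights are all made up here, just to illustrate the idea:

```python
import numpy as np

# Invented positions for illustration only.
GOAL = np.array([4.0, 4.0])
OBSTACLE = np.array([2.0, 2.0])

def features(pos):
    # Each feature is a measurable quality of the state.
    return np.array([
        np.linalg.norm(GOAL - pos),      # distance to the goal
        np.linalg.norm(OBSTACLE - pos),  # distance to an obstacle
        1.0,                             # bias term
    ])

# A linear value-function approximation: v(s) = w . features(s).
w = np.array([-0.5, 0.1, 2.0])           # weights a learner would tune
print(features(np.array([0.0, 0.0])) @ w)
```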

Here are my questions:

  1. What is the role of radial basis functions in function approximation?
  2. Can my idea of 'features' (as described above) work?
  3. Do you have examples, on GitHub or elsewhere, that show this implementation?

Best Answer

I'm offering an answer to this old post because it ranks highly in Google.

I like to think of kernels, or any feature transformation, as a way to increase the dimensionality of the input. You then pass this transformed input to a model to do whatever it is you want to do.
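As a minimal sketch of that idea, here is an RBF feature map in numpy. The centers and width are arbitrary choices of mine, not something prescribed by the technique:

```python
import numpy as np

def rbf_features(x, centers, width=0.5):
    """Return one Gaussian-bump feature per center: high when x is near it."""
    return np.exp(-np.sum((x - centers) ** 2, axis=1) / (2 * width ** 2))

centers = np.linspace(0.0, 1.0, 5).reshape(-1, 1)  # 5 centers on [0, 1]
phi = rbf_features(np.array([0.3]), centers)       # a 1-D input -> 5 features
print(phi.shape)  # -> (5,)
```

The single scalar input becomes a 5-dimensional feature vector, and a linear model then operates on those 5 numbers instead of the raw input.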

For example, in a classification problem, you can inspect the decision boundary in the original domain of the data (imagine a 2D domain). The transformed data is in a much higher-dimensional space than the original. The linear model operates in this high-dimensional space. When you map that decision boundary back down to the original 2D space, it appears as if the boundary has significant complexity, like curves and circles and what-not.
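To make that concrete, here is a small numpy sketch of my own construction: a plain least-squares linear classifier fails on two concentric rings in the raw 2-D space, but succeeds once the inputs are expanded through a grid of RBF features. The dataset, grid, and width are all invented for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two concentric rings: no straight line separates them in the raw 2-D space.
n = 200
radius = np.concatenate([rng.normal(1.0, 0.05, n), rng.normal(2.0, 0.05, n)])
angle = rng.uniform(0.0, 2.0 * np.pi, 2 * n)
X = np.c_[radius * np.cos(angle), radius * np.sin(angle)]
y = np.concatenate([np.zeros(n), np.ones(n)])

def rbf_expand(X, centers, width=0.5):
    # One feature per center: Gaussian in the distance to that center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * width ** 2))

grid = np.linspace(-2.5, 2.5, 8)
centers = np.array([(a, b) for a in grid for b in grid])  # 64 RBF centers

def linear_accuracy(F, y):
    # Least-squares linear fit, thresholded at 0.5 -- a plain linear model.
    F1 = np.c_[F, np.ones(len(F))]
    w, *_ = np.linalg.lstsq(F1, y, rcond=None)
    return ((F1 @ w > 0.5) == y).mean()

print(linear_accuracy(X, y))                       # near chance on raw 2-D
print(linear_accuracy(rbf_expand(X, centers), y))  # near perfect on features
```

The model is linear in both cases; only the space it operates in changed. Mapped back to 2-D, the second model's boundary looks like a circle.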

The same is true for regression, and is therefore also true in RL. Feature transformations allow you to arbitrarily increase model complexity whilst retaining all the benefits of using a linear model (robustness, convergence guarantees, etc.).
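In RL terms, this means learning a value function that is linear in RBF features of a continuous state. Below is a sketch of semi-gradient TD(0) on a toy task I invented for illustration (a 1-D state drifting upward; crossing 1.0 ends the episode with reward 1); the centers, width, and learning rate are likewise my own choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# RBF features over a continuous 1-D state space [0, 1].
centers = np.linspace(0.0, 1.0, 10)
width = 0.1

def phi(s):
    return np.exp(-((s - centers) ** 2) / (2 * width ** 2))

# Semi-gradient TD(0) with a linear value function v(s) = phi(s) @ w.
w = np.zeros_like(centers)
alpha, gamma = 0.1, 0.9
for _ in range(1000):
    s = 0.5
    done = False
    while not done:
        s_next = s + rng.uniform(-0.05, 0.10)  # noisy upward drift
        if s_next >= 1.0:
            target = 1.0                       # terminal: no bootstrap term
            done = True
        else:
            target = 0.0 + gamma * (phi(s_next) @ w)
        w += alpha * (target - phi(s) @ w) * phi(s)
        s = s_next

# States near the goal should now have higher estimated value.
print(phi(0.9) @ w, phi(0.5) @ w)
```

The update rule is the standard linear TD(0) rule; the RBF expansion is what lets this linear learner represent a value function that rises smoothly toward the goal.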

Regarding examples, I find it easier to reason about this in terms of classification/regression problems. I recommend working through the classification examples in sklearn's excellent documentation, for example the Gaussian process or support vector machine examples. Then you should be able to apply that knowledge to the world of RL.
