[Math] a good technique to decide step size in sub-gradient method for dual decomposition

convex-optimization, optimization

I am looking at the following paper to implement dual decomposition for my algorithm:
http://www.csd.uoc.gr/~komod/publications/docs/DualDecomposition_PAMI.pdf

On p. 29 they suggest setting the step size for the sub-gradient method by taking the difference between the best primal value found so far and the current dual value, divided by the L2-norm of the sub-gradient at the current iteration.

My question is the following: do I compute a sub-gradient for each slave problem and use a different step size for each one? Or is there some way to compute the sub-gradient of the combined dual problem?

Best Answer

Step sizes are the crucial and difficult point when using subgradient methods. Basically, you need the step sizes to tend to zero, but not too fast. If one uses a-priori step sizes (e.g. of the form $1/k$), then the method provably converges, but in practice it slows down so much that you will not observe convergence.
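To illustrate the a-priori rule, here is a minimal sketch (not from the paper) of subgradient descent with $1/k$ step sizes on the nonsmooth convex function $f(x) = \|x - b\|_1$; the vector $b$ and the problem are just an invented toy example:

```python
import numpy as np

# Toy problem (assumed for illustration): minimize f(x) = ||x - b||_1.
# A subgradient of f at x is sign(x - b).
b = np.array([1.0, -2.0, 3.0])
f = lambda x: np.abs(x - b).sum()

x = np.zeros(3)
best_f = f(x)
for k in range(1, 5001):
    g = np.sign(x - b)           # a subgradient of f at x
    x = x - (1.0 / k) * g        # a-priori diminishing, non-summable steps
    best_f = min(best_f, f(x))   # subgradient methods are not descent
                                 # methods, so track the best value seen

print(best_f)  # near the optimum f(b) = 0, but convergence is slow
```

Even on this tiny problem the iterates oscillate around the optimum with slowly shrinking amplitude, which is the slowdown described above.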

The dynamic rule they suggest in the paper (in equation (40)) looks like the so-called Polyak step size, with the true optimal value replaced by an estimate (obtained here from values of the dual problem). One can prove convergence with these step sizes under special conditions. I do not know a good reference off the top of my head, but many books on (nonsmooth) convex optimization should treat this.
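For comparison, here is a hedged sketch of the Polyak-type rule on the same kind of toy problem, $f(x) = \|x - b\|_1$: the step is $\alpha_k = (f(x_k) - f^\ast)/\|g_k\|^2$. In this idealized example the optimal value $f^\ast = 0$ is known exactly; in the dual-decomposition setting one would substitute an estimate such as a dual bound:

```python
import numpy as np

# Toy problem (assumed for illustration): minimize f(x) = ||x - b||_1,
# whose optimal value f_star = 0 is known here.  In practice f_star
# would be an estimate, e.g. from the dual problem.
b = np.array([1.0, -2.0, 3.0])
f = lambda x: np.abs(x - b).sum()
f_star = 0.0

x = np.zeros(3)
best_f = f(x)
for k in range(200):
    g = np.sign(x - b)                   # a subgradient of f at x
    if f(x) - f_star <= 1e-12 or not g.any():
        break                            # (numerically) optimal
    alpha = (f(x) - f_star) / (g @ g)    # Polyak step size
    x = x - alpha * g
    best_f = min(best_f, f(x))

print(best_f)
```

With the exact optimal value, the Polyak rule adapts the step to the remaining gap and reaches the optimum in a handful of iterations on this example; with an estimated $f^\ast$ (as in the paper) the behavior degrades gracefully but convergence guarantees require the conditions mentioned above.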
