What makes distributed function computation difficult

combinatoricscomputer scienceinformation theoryopen-problemprobability theory

Encoder $k$ informs a decoder of a source $X_k$ at rate $R_k$; $k=1,\dots, K$.
The decoder seeks to recover $X_0=f(X_1,\dots, X_K)$ with high probability. Do schemes exist with failure rate arbitrarily close to 0?

There is no known generally tight bound on the entropy rates of the collection of reliable encoders for this system. The problem has been known and open since at least the 70s.

The set of letter-typical source outcomes for large enough blocklength is a concrete and well understood object.
It contains all of the problem’s structure.
What complexity in the problem makes it difficult to find the solution through studying a typical set?

I am trying to understand what obstacles exist that cause this problem to go on unsolved.

Some results and surveys are referenced in Wolf et. al. "On the Binary Lossless Many-help-one Problem with Independently Degraded Helpers" (2018) but I cannot find an exposition that addresses this more basic question.

Best Answer

The shape of the decoder-function-of-interest $f$ may make it beneficial for the encoders to transform their inputs using non-trivial structure across time indices. Usual random binning techniques do not provide control over the needed structure.

We don't have a general description of useful structure to introduce at the encoders. We don't know how to design structures useful for a given ($f$, source distribution) tuple.

A classic example shows that for certain ($f$, sources), a certain structure is useful:

  • Korner and Marton, “How to encode the modulo-two sum of binary sources (corresp.)"

This approach can be generalized:

  • Krithivasan and Pradhan, “Distributed source coding using Abelian group codes: A new achievable rate-distortion region”

The following papers examine classes of $f$ that indicate whether such structure is needed:

  • Han and Kobayashi, "A dichotomy of functions $F(X,Y)$ of correlated sources $(X,Y)$ from the viewpoint of the achievable rate region"
  • Watanabe, "A Classification of Functions in Multiterminal Distributed Computing"
Related Question