[Math] Blotto game variation

game theorylinear programmingnash-equilibriumoptimization

My smart friend ZWX challenged me to solve the "brainteaser" below, but to my surprise, the problem seems highly nontrivial as I took a closer look. Anyway, the question goes:

In a game, both you and your opponent have $100$ units of resources to allocate to $10$ districts. For example, a valid allocation could be $(70, 20, 10, 0, 0, 0, 0, 0, 0, 0)$.

After deciding on a plan, you and your opponent simultaneously reveal your allocations to each other. For each of the $10$ districts, whoever has allocated more resources wins it. If there is a tie, nobody wins. The winner of the $n^{\text{th}}$ district is then rewarded with $n$ votes. The ultimate winner is the one who gets more votes in the $10$ districts overall.

For example, if you choose $(70, 20, 10, 0, 0, 0, 0, 0, 0, 0)$, and your opponent does $(0, 0, 0, 0, 0, 0, 0, 0, 0, 100)$. Then you would get $1 + 2 + 3 = 6$ votes, while your opponent has $10$ votes, and therefore is declared the winner.

The question is: What is the optimal strategy for playing this game? And what is the probability of winning under this optimal strategy?

Regarding game theory, I have a basic understanding of zero-sum game, the payoff matrix, saddle point theorem, equilibrium mixing strategy and the linear programming approach.

As I approach the problem, I think of the payoff matrix as being a matrix of allocations. Assume, without loss of generality, that the ultimate winner is rewarded with $+1$ and the loser $-1$ (and in case of a tie, both get $0$). Then the payoff matrix will be a large symmetric matrix with all zeros on its diagonal. Off-diagonal elements may take $-1$, $+1$ or $0$.

According to Von Neumann Minimax Theorem and the symmetry of the game, I believe the optimal value for each player is simply zero. However, in analyzing the optimal mixed strategy, it seems to me that in this case the payoff matrix is too large ($\frac{109!}{10! \times 99!} \times \frac{109!}{10! \times 99!}$) to apply some of the traditional methods I mentioned above, so I believe there should be some clever tricks either to organize the payoff matrix, to eliminate dominated cases, or just to think about the problem.

Thanks!

Updates:

I played around with some simpler cases, and here are my findings:

  1. When there are $2$ districts, there is a single pure optimal strategy: put all resources in the $2^{\text{nd}}$ district.
  2. When there are $3$ districts and $2$ units of resources, there is still only one pure strategy: put all resources in the $3^{\text{th}}$ districts, namely, $(0, 0, 2)$.
  3. When there are $4$ districts and $2$ units of resources, there is one optimal mixed strategy: allocate between $(0, 0, 0, 2)$, $(0, 0, 1, 1)$, $(0, 1, 1, 0)$ with equal probabilities of $\frac{1}{3}$.
  4. When there are $10$ districts and $2$ units of resources, there is one optimal mixed strategy: allocate between $(0, …, 0, 0, 2)$, $(0, …, 0, 1, 1)$, $(0, …, 1, 1, 0)$ with equal probabilities of $\frac{1}{3}$.

I even wrote a short Python script to do the simulation, which you might find helpful.

import numpy as np
from sets import Set


def allocate(district, unit):
    """ generates all possible allocations
    :param district: number of districts
    :param unit: number of units of resources
    :return 2D numpy array of all possible allocations
    """
    # one district: base case
    if district == 1:
        return np.array([[unit]])

    # allocate all units to districts 1-n
    temp = allocate(district - 1, unit)
    result = np.array([np.append(t, 0) for t in temp])

    # allocate i units to district n
    for i in range(1, unit + 1):
        temp = allocate(district - 1, unit - i)
        new_result = np.array([np.append(t, i) for t in temp])
        result = np.concatenate((new_result, result), axis = 0)

    # return
    return result


def payoff(district, unit, allocation):
    """ generates the payoff matrix
    :param district: number of districts
    :param unit: number of units of resources
    :param allocation: 2D numpy array of allocations
    :return 2D payoff matrix
    """
    vote = np.array([i for i in range(1, district + 1)])

    # size of payoff matrix
    s = allocation.shape[0]
    payoff_matrix = np.zeros((s, s))

    # fill in the payoff matrix 
    for i in range(0, s):
        for j in range(i, s):

            # absolute win?
            win = sum(vote * (allocation[i] > allocation[j])) > sum(vote * (allocation[j] > allocation[i]))

            # tie?
            equal = sum(vote * (allocation[i] > allocation[j])) == sum(vote * (allocation[j] > allocation[i]))

            # if absolute win
            if win:
                payoff_matrix[i, j] = 1
                payoff_matrix[j, i] = -1

            # if lose
            elif not equal:
                payoff_matrix[i, j] = -1
                payoff_matrix[j, i] = 1                         
    # return 
    return payoff_matrix


def analyse(payoff):
    """ return min-max and max-min of the payoff matrix
    :param payoff: the payoff matrix
    :return min-max & max-min;
            if v1 == v2, then unique optimal pure strategy exists
    """
    v1 = max(np.min(payoff, axis = 1))
    v2 = min(np.max(payoff, axis = 0))
    return v1, v2


def dominate(payoff, j, i):
    """ check if strategy j dominates i
    :param payoff: the payoff matrix
    :param j: index of strat
    :param i: index of strat
    :return boolean that represents if strat j dominates strat i
    """
    return (sum(payoff[j] >= payoff[i]) == payoff.shape[0])


def eliminate(payoff, run):
    """ calculate the indices of strategies that can be eliminated
    :param payoff: the payoff matrix
    :param run: the # of iterations you want to run
    :return a set of indices of dominated strategies
    """
    to_delete = Set([])
    counter = 0
    while counter < run:
        counter += 1
        for i in range(0, payoff.shape[0]):
            for j in [t for t in range(0, payoff.shape[0]) if t != i]:
                if dominate(payoff, j, i):
                    to_delete.add(i)
                    payoff[i, :] = -1 * np.ones(payoff.shape[0])
                    payoff[:, i] = -1 * np.ones(payoff.shape[0])
    return to_delete


def keep(payoff, run):
    """ calculate the indices of strategies that might be kept
    :param payoff: the payoff matrix
    :param run: the # of iterations you want to run
    :return a set of indices of non-dominated strategies
    """
    return Set(range(0, payoff.shape[0])) - eliminate(payoff, run)


if __name__ == "__main__":

    # params
    district = 10; unit = 2

    # get possible allocations
    allocation = allocate(district, unit)

    # payoff matrix
    payoff = payoff(district, unit, allocation)

    # check if there is an equilibrium
    v1, v2 = analyse(payoff)

    print '--- elimination starts here! ---'

    # keep indices
    keep = keep(payoff, 5)

    # sub payoff matrix of non-dominated strategies
    submatrix = payoff[list(keep)][:, list(keep)]

    '''
    # size of the sub payoff matrix
    s = submatrix.shape[0]

    # first (n - 1) equality 
    a_partial = submatrix

    # sum to one constraint
    sum_to_one = np.array([np.ones(s)])

    # complete (a, b)
    a = np.concatenate((a_partial, sum_to_one), axis = 0)
    b = np.append(np.zeros(s), 1.0)

    # just playing around...
    sol1 = np.linalg.solve(a_partial, np.zeros(s))

    # but linalg.solve can't solve non-square ax = b wtf?
    # sol2 = np.linalg.solve(a, b)
    '''

    print 'the end!'

Best Answer

A recent paper [1] provides an effective (i.e.: polynomial-sized) linear programming solution to the classic version of the Colonel Blotto game. I haven't gone through all the details, but I believe it should also be applicable to this weighted version.

  1. Behnezhad, S., Dehghani, S., Derakhshan, M., HajiAghayi, M., & Seddighin, S. (2017, February). Faster and Simpler Algorithm for Optimal Strategies of Blotto Game. In AAAI (pp. 369-375).