[Math] For a M/M/1/K System, why is utilization = $\rho=\lambda T_s$

queueing-theory

Before I ask my question these are the symbols I used:

$w$ = Customers waiting in queue
$\mu$ = Service rate
$T_s = 1/\mu$ = Average service time
$\lambda$ = Average arrival time
$\rho$ = Server utilization

How come $\rho=\lambda T_s$ for a M/M/1/K System (which is the same for as M/M/1)

This is my intuitive reasoning, please tell me where I am going wrong.

Picture a M/M/1/3 queue system.

  • Assume at one point the queue is full. i.e w=3. At this point, a task (n) is born.
    This will cause this new task to be rejected.
  • Now assume a few moments later, the server has processed all the 3 tasks (w=0) and a new
    task hasn't arrived yet. So the server will be idle until a new task (n+1) arrives

Now in the scenario above, if it was a M/M/1 queue, the task (n) wouldn't have been rejected and so
the server wouldn't be idle as shown in the second bullet point above.
In other words, a M/M/1 system would serve task (n) and (n+1)
where as a M/M/1/3 system would serve only task (n+1)

So clearly, in a M/M/1/K system, the server has more chances to have idle periods, thus lowering
its utilization when compared to a M/M/1 system.

So again, $\rho=\lambda T_s$ for a M/M/1/K system doesn't make sense to me. What am I missing?

Best Answer

In an $M/M/1/K$ process, where $P_n$ is the proportion of time $n$ people are queuing or being served and $0 \le n \le K$, you can reach balance with

$$P_n=\left(\dfrac{\lambda}{\mu}\right)^n P_0 $$

and, if you write $\rho = \frac{\lambda}{\mu}$ as the ratio of the arrival and service rates, then this gives $P_n=\rho^n P_0$. Since $\sum_{n=0}^K P_n=1$, this results in $$P_n=\rho^n \dfrac{1-\rho}{1-\rho^{K+1}}$$

which with $K \to +\infty$ and $0 \le \rho \lt 1$ would give the $M/M/1$ result of $P_n \to \rho^n (1-\rho)$

With finite K, you no longer have $1-P_0=\rho= \frac{\lambda}{\mu}= {\lambda}T_s$ as the expected proportion of time the server is working, which seems to be the point of your question. Instead you have $$1-P_0=\rho\dfrac{1-\rho^{K}}{1-\rho^{K+1}}$$ (with $1-P_0=\frac{K}{K+1}$ when $\rho=1$), which is less than $\rho$, meaning that the server expects to work less when the queue has limited capacity, much as you might intuitively expect

Related Question