We have a set of jobs $J_i = (a_i, b_i)$ where all the $a_i,b_i \ge 0$. We want to find the permutation $\pi$ which minimises $$\max_j \left\{ b_{\pi(j)} + \sum_{k \le j}a_{\pi(k)} \right\}$$
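For concreteness, the quantity being minimised can be computed with a short function (a minimal sketch; the function name and the representation of jobs as `(a, b)` pairs in schedule order are my own):

```python
def max_finish_time(jobs):
    """jobs: list of (a, b) pairs in the order they are scheduled.
    Returns max over positions j of b_{pi(j)} + sum_{k <= j} a_{pi(k)}."""
    prefix = 0  # running sum of the a-values scheduled so far
    worst = 0   # largest finish time seen so far
    for a, b in jobs:
        prefix += a
        worst = max(worst, prefix + b)
    return worst

# For the order [(1, 2), (3, 1)] the finish times are 1+2=3 and 4+1=5.
assert max_finish_time([(1, 2), (3, 1)]) == 5
```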
Suppose we have proven that, for all job sets $X$ of size no greater than $n$, the permutation which sorts the jobs by descending $b_i$, breaking ties by descending $a_i$, is an optimal solution (bearing in mind that there may be more than one optimal solution). Call this ordering doubly-reverse-lexicographical, because it sorts the pairs $(a_i, b_i)$ in descending order, comparing from the rightmost component first.
Consider a job set $Y$ of size $n+1$ whose "worst" job is $J_{n+1}$ (the job that comes first under the ordering, i.e. with maximal $b$, ties broken by maximal $a$), and suppose for contradiction that there is no optimal solution in which this job comes first. Among the optimal solutions, select one which puts $J_{n+1}$ as early as possible, in position $j^* = \pi^{-1}(n+1)$, which by assumption is not the first position. Now, which $j$ maximises the expression above (i.e. which job finishes last)?
Given that $b_{\pi(j^*)} = b_{n+1}$ is maximal by construction and the prefix sum is non-decreasing, every $j < j^*$ satisfies $b_{\pi(j)} + \sum_{k \le j} a_{\pi(k)} \le b_{n+1} + \sum_{k \le j^*} a_{\pi(k)}$, so the maximising $j$ must be $\ge j^*$. Moreover, if the maximising $j$ were $> j^*$, we could move $J_{n+1}$ to the start without affecting optimality: each displaced job's new finish time is at most $b_{n+1} + \sum_{k \le j^*} a_{\pi(k)}$, and the value at the maximising $j$ is unchanged. That would place $J_{n+1}$ earlier than $j^*$, which is impossible by our choice of solution. We're forced to conclude that $J_{n+1}$ is the job which finishes last.
Now consider what happens if we swap it with the job before it; call that job $J_p$ (for "previous"). $J_p$'s finish time increases from $b_p + \sum_{k < j^*}a_{\pi(k)}$ to $b_p + \sum_{k \le j^*}a_{\pi(k)} \le b_{n+1} + \sum_{k \le j^*}a_{\pi(k)}$, while $J_{n+1}$'s finish time decreases by $a_p \ge 0$, from $b_{n+1} + \sum_{k \le j^*}a_{\pi(k)}$ to $b_{n+1} + \sum_{k \le j^*,\, k \ne j^*-1}a_{\pi(k)}$; every other job's finish time is unchanged. The new maximum is therefore no larger than the old one, so the swapped schedule is still optimal and places $J_{n+1}$ one position earlier, contradicting our choice of an optimal solution which puts $J_{n+1}$ as early as possible.
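The swap can be checked numerically on a concrete instance (a sketch; the instance and names are mine, with the middle job playing the role of $J_{n+1}$ and the first that of $J_p$):

```python
def max_finish_time(jobs):
    # max over positions of (prefix sum of a-values) + b
    prefix, worst = 0, 0
    for a, b in jobs:
        prefix += a
        worst = max(worst, prefix + b)
    return worst

before = [(2, 1), (1, 9), (1, 1)]  # worst job (1, 9) in second place
after  = [(1, 9), (2, 1), (1, 1)]  # swapped with its predecessor

# The swap never increases the objective; here it strictly improves it
# (12 before the swap, 10 after).
assert max_finish_time(after) <= max_finish_time(before)
```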
We've now proven that there is an optimal solution for $Y$ in which $J_{n+1}$ comes first. With $J_{n+1}$ fixed in first place, the remaining $n$ jobs face the same problem with every prefix sum shifted by the constant $a_{n+1}$, so ordering them optimally can only improve the overall objective. Either the worst job is so bad that the ordering of the rest doesn't matter, in which case putting them in an optimal ordering still gives an optimal ordering, or the rest must be optimally ordered to get an optimal ordering. In either case we can apply our strong induction hypothesis to order the remaining jobs doubly-reverse-lexicographically, and since $J_{n+1}$ precedes them all under that ordering, there is an optimal ordering of $Y$ which is doubly-reverse-lexicographical.
The base case of only one job is trivial.
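The claimed ordering can also be sanity-checked against brute force on small random instances (a sketch; function names are mine, and I assume the tie-break on descending $a_i$ exactly as stated in the claim):

```python
import random
from itertools import permutations

def max_finish_time(jobs):
    prefix, worst = 0, 0
    for a, b in jobs:
        prefix += a
        worst = max(worst, prefix + b)
    return worst

def doubly_reverse_lex(jobs):
    # sort by descending b, breaking ties by descending a
    return sorted(jobs, key=lambda job: (job[1], job[0]), reverse=True)

random.seed(0)
for _ in range(200):
    jobs = [(random.randint(0, 5), random.randint(0, 5)) for _ in range(5)]
    best = min(max_finish_time(list(p)) for p in permutations(jobs))
    assert max_finish_time(doubly_reverse_lex(jobs)) == best
```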
As Brian Tung writes in a comment, "greedy algorithm" is not really a crisp technical term, but more of a fuzzy concept that can be helpful for remembering and categorizing algorithms in your mind, and for getting ideas for how to possibly approach a new problem. There are essentially no cases where it makes any technical difference whether you call this-or-that algorithm greedy or not.
That being said, I wouldn't call algorithm (b) a "greedy algorithm" myself, but it's not completely crazy to do so either.
In general, if I were to describe what a greedy algorithm means, my first attempt would be something along the lines of
An algorithm that constructs a solution in little steps, where each step is chosen among the possible steps at each point, based on a simple rule that identifies which possibility makes the most immediate progress towards the eventual goal.
That certainly describes algorithm (a) -- the rule is simple, and it selects, among the ways to increase the number of skis delivered, the one that leads to the smallest value so far for the sum we want to minimize. That's a reasonable way to measure immediate progress.
On the other hand, if I want to shoehorn (b) into the "greedy paradigm" I would need to weaken my description into
An algorithm that constructs a solution in little steps, where each step is chosen among the possible steps at each point, based on a simple rule.
The rule in algorithm (b) is surely simple enough, but it's hard to justify the rule as a short-sighted way to find the most immediate progress -- unless we cheat and already know that the rule happens to construct an optimal solution anyway.
On the other hand, algorithms that match the weakened description still have some relevant features in common with actual greedy algorithms -- among them, that the tactics useful for proving greedy strategies optimal (when they are indeed optimal), such as exchange arguments, can also be useful for reasoning about this wider group of algorithms.
So in this sense it is "not completely crazy" to describe (b) as a greedy algorithm, because it can still point towards useful ways to think and reason about the algorithm.
Best Answer
Without loss of generality, let us assume that $s_1 \leq s_2 \leq s_3 \leq \cdots \leq s_n$, where $s_i$ is the size of program $P_i$. Let the greedy algorithm choose the first $m$ programs $P_1,P_2,\ldots,P_m$. The space occupied by the greedy algorithm is $$S_{\text{greedy}} = s_1 + s_2 + \cdots + s_m$$ Since the greedy algorithm stopped after $P_m$, the next program did not fit: $$s_1 + s_2 + \cdots + s_m + s_{m+1} > D \tag{$\star$}$$ Now suppose, for contradiction, that another algorithm chooses $l$ programs, where $l \geq m+1$. Let these programs be $P_{a(1)},P_{a(2)}, \ldots, P_{a(l)}$, where the $a(k)$ are distinct and form an increasing sequence. The space occupied by these programs is $$S_{\text{algorithm}} = s_{a(1)} + s_{a(2)} + \cdots + s_{a(l)}$$
Now note that since $a(k)$ is an increasing sequence of distinct indices, we have $$a(k) \geq k \implies s_{a(k)} \geq s_k\tag{$\perp$}$$ Hence, we get that $$S_{\text{algorithm}} \geq s_1 + s_2 + \cdots + s_l = \underbrace{s_1 + \cdots + s_m + s_{m+1}}_{>D \text{ by } (\star)} + \sum_{k=m+2}^l s_k > D \,\,\,\,\,\,\,\,\,(\because l \geq m+1)$$ so the chosen programs cannot all fit on the disk. Contradiction. Hence, $l \leq m$. $(\star)$ and $(\perp)$ are the crucial ingredients.
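The greedy algorithm being analysed can be sketched as follows (a minimal sketch; the function name is mine, and I assume the sizes and the capacity $D$ are plain non-negative numbers):

```python
def greedy_store(sizes, capacity):
    """Take programs in non-decreasing size order while they fit;
    return how many programs were stored."""
    used = count = 0
    for s in sorted(sizes):
        if used + s > capacity:
            break  # this is P_{m+1} in the proof: it no longer fits
        used += s
        count += 1
    return count

# Sorted sizes 1, 2, 3, 5 on a disk of capacity 6: 1+2+3 fit, 5 does not.
assert greedy_store([2, 5, 3, 1], 6) == 3
```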