Maybe this part of the answer helps. It was sufficient for many tasks, but fails at some reasonable problems.
A permutation action is a multi-set of conjugacy classes of subgroups of a group. The degree of the action is the total of the indices of representatives from each class (with multiplicity). The kernel of the action is the intersection of all the classes, or equivalently, the intersection of the normal cores of the representatives from each class.
If you have a collection of subgroups, organize them into conjugacy classes, sort them by their normal core (first by size, then by actual subgroup). For each normal core (especially starting with the small ones), choose the largest subgroup (smallest index) with that core. These largest subgroups are your ingredients.
Now roughly speaking try all combinations: compute the index and the kernel, and keep the best one, save any improvements to disk if you plan on letting this run for a while.
If you don't have a collection of subgroups handy, then you need to use group-specific ideas to get yourself some. If the Fitting subgroup is small, then cores are unlikely to be a real problem, so you just want big subgroups that are cheap. For small (≤107 or so) groups, you can compute local subgroups pretty cheaply.
If the Fitting subgroup is large or weird, then cores will be plentiful and weird or at least hard to avoid (a particularly awful situation is a unique minimal normal subgroup of order 2). In this case, one computes a full subgroup lattice. You can use recent versions of magma to get a fast answer, but be sure to read the changelogs to make sure you weren't affected by a missing subgroup.
At any rate, in practice this method failed to handle some of the perfect groups in the perfect group library. Perfect groups with large Fitting can require very large permutation representations, but the theoretical lower bounds were often quite a bit lower than what I was able to achieve in practice.
If your groups are finitely presented and you have no good starting permutation rep, then you may find that coset enumeration is faster for finitely presented groups than for millions-of-points permutation groups. In other words, typically speaking you start with some permutation representation, because it is going to be faster than any finite presentation. However, for really bad permutation representations (close to regular), you may find coset enumeration is much faster. In particular, finding the index or core of a subgroup might be faster to use ACE than to use permutation group code.
If your groups are small and solvable with low sectional rank, just compute the subgroup lattice and sort.
Best Answer
It is difficult to find this number for arbitrary finite groups, but many families have been solved. A somewhat early paper that has motivated a lot of work in this area is:
Johnson, D. L. "Minimal permutation representations of finite groups." Amer. J. Math. 93 (1971), 857-866. MR 316540 DOI: 10.2307/2373739.
This paper classifies those groups for which the regular permutation representation is the minimal faithful permutation representation (cyclic of prime power order, K4, or generalized quaternion) and some results on nilpotent groups (improved in later papers).
The minimal permutation degrees of the finite simple groups are known from:
Cooperstein, Bruce N. "Minimal degree for a permutation representation of a classical group." Israel J. Math. 30 (1978), no. 3, 213-235. MR 506701 DOI: 10.1007/BF02761072.
This paper only finds the degrees. A fuller description of the permutation representations are given in:
Grechkoseeva, M. A. "On minimal permutation representations of classical simple groups." Siberian Math. J. 44 (2003), no. 3, 443-462 MR 1984704 DOI: 10.1023/A:1023860730624.
There are a great deal of topics associated with minimal permutation degrees. I'll just briefly sketch them below, let me know if any interest you and I can give citations or longer descriptions:
The minimal degree of a subgroup is never larger than the minimal degree of the parent group, but the minimal degree of a quotient group may be much, much larger than that of the original. This poses problems in computational group theory, since quotient groups may be difficult to represent. Some quotients are easy to represent, and this has had a significant impact on CGT in the last 10 years or so.
Finding minimal permutation representations of covering groups can be difficult, and here I think the results are much less complete. Basically what you want are large subgroups not containing normal subgroups. In a covering group Z(G) is contained in really quite a few of the "good choices". This is because it is contained in Φ(G), the Frattini subgroup, the intersection of the maximal subgroups. One has to give up using maximal subgroups (at least if Z(G) is cyclic of prime power order), and so the minimal degree can increase dramatically.
Minimal degrees of primitive permutation groups is a topic with a different flavor (rather than specific families of groups, it is more of the interactions between group properties), but a great deal is known. Similar techniques are used to describe asymptotic behavior of minimal degrees of arbitrary families of finite groups, and quite powerful results are available there.