I've seen feature hashing and embedding mentioned in the comments. Apart from that, you can try clustering players by ID if you have some additional data.
Another approach, suitable for categorical data with many levels, is mean encoding.
Mean encoding (also sometimes called target encoding) consists of encoding each category with the mean of the target over the examples in that category (for example, with a binary target taking values 0 and 1, a category is encoded by the mean response among its examples, i.e. the proportion of 1s, and so on). There are answers on this site that cover this in more detail. I also encourage you to watch this video if you want a better sense of how it works and how you can implement it (there are several ways to do mean encoding, and each has its pros and cons).
In Python you can do mean encoding yourself (some approaches are shown in the video series I linked), or you can try Category Encoders from scikit-learn-contrib.
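For illustration, here's a minimal sketch of naive mean encoding in plain pandas; the player_id and target column names are hypothetical stand-ins for your data:

```python
import pandas as pd

# Toy data: a high-cardinality categorical column and a binary target
# (both column names are hypothetical).
df = pd.DataFrame({
    "player_id": ["a", "a", "b", "b", "b", "c"],
    "target":    [1,   0,   1,   1,   0,   0],
})

# Naive mean encoding: replace each category with the mean of the
# target over the rows sharing that category.
means = df.groupby("player_id")["target"].mean()
df["player_id_enc"] = df["player_id"].map(means)
print(df)
```

In practice you should compute the means on the training data only, and typically add smoothing or cross-fold estimation to limit target leakage; category_encoders.TargetEncoder handles this with a scikit-learn-style fit/transform interface.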
A note on terminology:
As far as I am aware (unfortunately, there are a lot of blogs written by people who overlook the subtle differences, and thus misinformation spreads):
One-hot encoding is exactly what you described: generating a map from each unique value in a string column to an integer.
Dummying is making K new columns (where K is the number of unique values), of which exactly one column per row must be 1.
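A minimal sketch of the two schemes as defined above (for reference, pandas' get_dummies and scikit-learn's OneHotEncoder produce the K-column form, while LabelEncoder/OrdinalEncoder produce the integer map):

```python
import pandas as pd

animals = pd.DataFrame({"animal": ["dog", "cat", "horse", "cat"]})

# Integer map (what this answer calls one-hot encoding):
codes = {"dog": 0, "cat": 1, "horse": 2}
animals["animal_int"] = animals["animal"].map(codes)

# K binary columns, exactly one of which is 1 per row
# (what this answer calls dummying):
dummies = pd.get_dummies(animals["animal"], prefix="animal")
print(pd.concat([animals, dummies], axis=1))
```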
In the "dog, cat, horse" example, when using a decision tree, consider the following example. Perhaps your target variable is "has it ever meowed?". Clearly what you want your decision tree to do is be able to ask the question "is it a cat? (yes/no)".
If you one-hot encode, such that dog -> 0, cat-> 1, horse->2, the tree can't isolate all of the cats using one question, because decision trees always split using "is feature x greater than or less than X?"
If you're using logistic regression, it also can't assign higher probabilities of meowing to cats.
If you dummy, the tree can explicitly ask the question "is the column which signifies cat greater than 0.5?", thus splitting your data into cats and not-cats.
If you use logistic regression, your optimiser can learn that the coefficient related to this column should be positive.
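To make this concrete, here's a small sketch with scikit-learn on toy "has it ever meowed?" data. Note that cat gets the middle code (1), which is exactly why a single threshold can't isolate it:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy "has it ever meowed?" data.
animals = np.array(["dog", "cat", "horse", "cat", "dog", "horse"])
meowed = np.array([0, 1, 0, 1, 0, 0])

# Integer encoding: dog -> 0, cat -> 1, horse -> 2. Because cat sits
# between dog and horse, no single threshold separates the cats.
codes = {"dog": 0, "cat": 1, "horse": 2}
X_int = np.array([codes[a] for a in animals]).reshape(-1, 1)

# Dummy encoding: one binary column per animal (cat, dog, horse).
X_dum = np.stack([(animals == a).astype(int)
                  for a in ("cat", "dog", "horse")], axis=1)

# A depth-1 tree on the dummied data separates cats perfectly by
# splitting on the "cat" column; the same tree on X_int cannot.
tree_dum = DecisionTreeClassifier(max_depth=1).fit(X_dum, meowed)
tree_int = DecisionTreeClassifier(max_depth=1).fit(X_int, meowed)
print(tree_dum.score(X_dum, meowed))  # 1.0
print(tree_int.score(X_int, meowed))  # about 0.67
```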
Thus in my opinion, whenever you have categorical data which has no implicit ordinality, always dummy, never one-hot encode.
In the case where your data has high cardinality, this could cause problems, especially if the number of examples of each type is tiny. But that is a problem you can't really solve: you simply have information that is too detailed for the size of your training data, and using it would lead to over-fitting.
Nonetheless, one way to mitigate this is to do some manual clustering (or actual clustering), in which you make a synthetic column that can take fewer values, with many of the unique values of the original column mapping to the same value in the new column (e.g. dog, cat, horse -> mammal; pigeon, parrot, chicken -> bird). This makes it easier for the algorithm to learn, and if there's enough data, it can split further within each cluster.
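A minimal sketch of such a manual clustering, assuming a hypothetical animal column:

```python
import pandas as pd

df = pd.DataFrame({
    "animal": ["dog", "cat", "horse", "pigeon", "parrot", "chicken"],
})

# Manually chosen clusters: many original values map to one new value.
cluster = {
    "dog": "mammal", "cat": "mammal", "horse": "mammal",
    "pigeon": "bird", "parrot": "bird", "chicken": "bird",
}
df["animal_group"] = df["animal"].map(cluster)
print(df)  # the new column has 2 levels instead of 6
```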
Best Answer
As far as I see it, categorical data is only directly useful for isolation forests as long as it is ordinal. In this case you can use an OrdinalEncoder to encode the categorical data (and retain the ordering). The algorithm then works the same way as for numerical data, since the minimum and maximum values can still be set accordingly.
If the data is not ordinal, however, it might be more reasonable to use one-hot encoding, since with ordinal encoding the isolation forest would assume an ordering that does not exist. With one-hot encoding, the algorithm can still gain information by considering the individual values of a feature.
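As a rough sketch of both cases with scikit-learn (the feature and its category order here are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# A single hypothetical ordinal feature.
X = np.array([["low"], ["medium"], ["high"], ["medium"], ["low"]])

# Ordinal case: pass the category order explicitly so the encoding
# preserves it rather than sorting alphabetically.
ord_enc = OrdinalEncoder(categories=[["low", "medium", "high"]])
X_ord = ord_enc.fit_transform(X)
print(IsolationForest(random_state=0).fit_predict(X_ord))  # 1 = inlier, -1 = outlier

# Non-ordinal case: one binary column per category instead.
# (Use sparse=False instead on scikit-learn versions before 1.2.)
oh_enc = OneHotEncoder(sparse_output=False)
X_oh = oh_enc.fit_transform(X)
print(IsolationForest(random_state=0).fit_predict(X_oh))
```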