Since you don't include any portion of your data set, it's hard to be sure precisely what is going on, but here's what I'm guessing:
If you include all the possible categories as dummy variables plus an intercept (which R adds by default), then you have a perfectly multicollinear system. A unique set of coefficients can't be identified in this case, so R excludes one of the dummy variables from your regression. The excluded category becomes the reference group, which is now represented by the intercept, and all other coefficients are measured relative to it. The dummy variable that R decides to exclude depends upon the order; that's why you get different results based upon the ordering. If you add and subtract the right combinations of coefficients, you can move from one regression to another and see that you get exactly the same results (see here, for example).
No, you don't need more data and you certainly don't need more dummy variables; in fact, you need fewer. Just exclude one of the categories from each dummy variable group and you'll be fine. But do note that the reported coefficients will depend upon which group you exclude (though, again, when you add the pieces correctly, you get exactly the same results).
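To make the "same results, different coefficients" point concrete, here is a minimal sketch in Python rather than R. It assumes the simplest case of a single categorical predictor, where the least-squares fit is known in closed form (the intercept is the reference group's mean and each dummy coefficient is a difference of group means). The data and the helper names `fit_with_reference` and `predict` are made up for illustration.

```python
from statistics import mean

# Hypothetical data: one categorical predictor with three levels.
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [15.0, 17.0],
    "C": [20.0, 19.0, 21.0, 20.0],
}

def fit_with_reference(groups, reference):
    """Closed-form OLS for an intercept plus treatment-coded dummies.

    With one categorical predictor, the intercept equals the mean of the
    reference group, and each dummy coefficient equals that group's mean
    minus the reference mean.
    """
    intercept = mean(groups[reference])
    coefs = {g: mean(v) - intercept for g, v in groups.items() if g != reference}
    return intercept, coefs

def predict(intercept, coefs, group):
    # The reference group has no dummy, so its coefficient is implicitly 0.
    return intercept + coefs.get(group, 0.0)

fit_a = fit_with_reference(data, "A")  # "A" dropped, so "A" is the reference
fit_c = fit_with_reference(data, "C")  # "C" dropped, so "C" is the reference

print("reference A:", fit_a)
print("reference C:", fit_c)
for g in data:
    # The coefficients differ between the two fits, the fitted values do not.
    print(g, predict(*fit_a, g), predict(*fit_c, g))
```

The two fits report different intercepts and coefficients, but adding the intercept to the relevant coefficient gives the same fitted value for every group whichever category is dropped.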
How I would look at this (with the caveat up front that I am not the best at maths but have an interest in sports) is that the margin of victory in the English Premiership is possibly normally distributed around the prediction of a rating system, with a spread given by the standard deviation of the prediction error. I am basing this on the principles of Hal Stern, "On the Probability of Winning a Football Game" (1991, The American Statistician, vol. 45, no. 3, pp. 179-183), and on the fact that I do this for other sports for fun. You can get a better idea of whether this holds from something like a QQ plot (and other methods; someone answered one of my questions on this in quite a detailed fashion). The reason I say “possibly” is that, given how low scoring football is, I wouldn’t have as much certainty as I would for, say, basketball, American football or Australian Rules football. As I understand it, the Poisson distribution can be reasonably approximated by the normal distribution (given that your question relates to the Poisson), and the principle of scoring a goal could also be governed by a Poisson variable.
The fundamentals: Manchester City are 1-0 down at home to Wigan after 30 minutes (I am assuming that it is a standard Premier League game and not a friendly or a cup game, where the result carries a different level of significance). Home advantage in the Premiership can be calculated by rating systems (as a component). It exists in the English Premiership, as it is generally assumed (anecdotally) to exist in every sports league, be it positive or negative. In the case of the English Premiership a handy paper on this is “Home advantages in balanced competitions: English soccer 1991-1996”, Stephen R Clarke (Proceedings of the 3rd Australian Conference on Mathematics and Computers in Sport, Coolangatta, Queensland, 1996). You can calculate it individually for each team (assuming the trend holds for enough years to be significant), but in this case we will take the average home advantage.
Then you need team ratings. Assuming you have past game data, I would go with something like the ratings system put forward by Kenneth Massey in his 1997 thesis (http://masseyratings.com/theory/massey97.pdf). There are plenty of other ranking systems you can pick from (“Who’s No.1 – The Science of Rating and Ranking”, Carl Meyer and Amy N. Langville, Princeton 2012, covers the Massey thesis method amongst others). I personally use some of the methods suggested by Wayne Winston in Mathletics (Princeton 2009), which is where the principle for the following process comes from. I don’t have a great maths background, but I understand the processes and can work in Excel.
I suspect you are expected to work backward from the odds given to generate team ratings. However, from Steven Levitt, “Why are Gambling Markets Organised So Differently from Financial Markets?” (The Economic Journal, Volume 114, 2004), we understand that in a lot of cases bookmakers don’t set the odds based on their expected outcome of an event; they often set them to exploit client biases, based on how they expect their clients to bet. As a result you could justify disregarding the odds given.
My fundamentals for the game based on my current spreadsheet for the English Premiership for this season:
Standard Deviation of Error of a Prediction from a Rating System for a game: 1.45553586306297 (stats to 2 dp would be fine but I pulled these straight from Excel)
Rating for Manchester City: 0.895454396395622
Rating for Wigan: -0.602272751
Average Premiership Home Edge (In this case for Manchester City to benefit from): 0.352036612
Predicted Margin of Victory for Manchester City based on the above: 1.84976376
Game time: 90 minutes (I’m not assuming any injury time in either half)
Game time already elapsed: 30 minutes
Standard Deviation of the remaining game time: 1.45553586306297 divided by the square root of 90/60 (i.e. scaled to the fraction of game time remaining) = 1.188440056
For Manchester City to win they need to outscore Wigan by two or more goals in the remaining time, which on the continuous normal scale means a margin above 1.5. So for the normal distribution x = 1.5, the mean is 1.84976376 (as you are assuming your rating of both teams still holds), and the standard deviation is 1.188440056 (based on the 60-minute time segment). I set Excel to return the cumulative distribution function and then subtracted the result from 1 (so you get the probability beyond this point). Based on the above I have the chance of Manchester City winning as 61.55%.
For the draw, it is the same calculation, but instead of subtracting the cumulative value at x = 1.5 from 1, you subtract from it the cumulative value at x = 0.5 (so City score one goal more than Wigan to cancel the deficit, but not two goals more to win). I have the chance of this as 25.64%.
To find the chance of Wigan winning I would just then subtract the first two results from one (as it is the only outcome not covered) and this returns 12.81%.
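For anyone who would rather check this outside Excel, the three steps can be sketched in Python with the standard library's `statistics.NormalDist`. The inputs are the spreadsheet values listed earlier; the variable names are mine, and this is only a re-expression of the calculation above, not a different model.

```python
from math import sqrt
from statistics import NormalDist

# Inputs taken from the spreadsheet values listed above.
SD_FULL_GAME = 1.45553586306297    # sd of the rating system's prediction error
RATING_CITY = 0.895454396395622
RATING_WIGAN = -0.602272751
HOME_EDGE = 0.352036612
GAME_MINUTES = 90
ELAPSED_MINUTES = 30

# Predicted margin of victory for the home side.
margin = RATING_CITY - RATING_WIGAN + HOME_EDGE

# Scale the standard deviation to the 60 minutes still to play.
remaining = GAME_MINUTES - ELAPSED_MINUTES
sd_remaining = SD_FULL_GAME / sqrt(GAME_MINUTES / remaining)

# Margin over the rest of the game, keeping the full pre-game mean.
dist = NormalDist(mu=margin, sigma=sd_remaining)

p_city = 1 - dist.cdf(1.5)              # City outscore Wigan by 2 or more
p_draw = dist.cdf(1.5) - dist.cdf(0.5)  # City outscore Wigan by exactly 1
p_wigan = dist.cdf(0.5)                 # anything less and Wigan stay ahead

print(f"City {p_city:.2%}, draw {p_draw:.2%}, Wigan {p_wigan:.2%}")
```

The printed values should land very close to the percentages quoted above; `p_wigan` is the same quantity as one minus the other two probabilities.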
There are a couple of things that I don’t like about this:
1) There are a lot of assumptions to be made based on the question and what is available.
2) I am not 100% convinced yet that soccer scores can be treated as normally distributed on the principles of Stern (in an ideal world you would derive ratings from the abilities of the players on the pitch, e.g. by using regression to put linear weights on goals scored etc.)
3) In the example I have worked through, you are assuming that the ratings derived before the game still hold at the 30-minute mark (if you used the linear weights approach on player variables such as shots to derive a rating, this probably wouldn’t be the case). The alternative would be to diminish the predicted mean (the margin of victory value) as the game goes on. In the case above, if you multiply the 1.84976376 by 0.66 (roughly two thirds, since a third of the game is gone) and feed it into the process above, you get the revised values:
Manchester City winning: 41.10%
Draw: 32.03%
Wigan winning: 26.87%
Which feels a bit more likely to me than the original set of figures obtained (note: I edited the figures this afternoon as I had made a mistake in adding things up and in what I was subtracting from what; the figures just looked wrong).
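The diminishing-mean variant generalises to any point in the game. Below is a rough sketch under that assumption: both the expected margin and the variance shrink in proportion to the time left. The function name `outcome_probs` and its `deficit` parameter are hypothetical, and it uses the exact remaining fraction (60/90 at the 30-minute mark) rather than a rounded 0.66.

```python
from math import sqrt
from statistics import NormalDist

def outcome_probs(full_margin, full_sd, elapsed, deficit, total=90.0):
    """Home win / draw / away win probabilities from the current state.

    full_margin: predicted home margin of victory over a whole game
    full_sd:     sd of the prediction error over a whole game
    elapsed:     minutes played so far (must be less than total)
    deficit:     goals the home side trails by (negative if ahead)
    """
    frac = (total - elapsed) / total
    # Assumption: the mean scales with the time left and the variance does
    # too, so the sd scales with the square root of the time left.
    dist = NormalDist(mu=full_margin * frac, sigma=full_sd * sqrt(frac))
    # Half-goal boundaries around the margin needed to level the score.
    home_win = 1 - dist.cdf(deficit + 0.5)
    draw = dist.cdf(deficit + 0.5) - dist.cdf(deficit - 0.5)
    away_win = dist.cdf(deficit - 0.5)
    return home_win, draw, away_win

# Manchester City 1-0 down at home to Wigan after 30 minutes.
win, draw, loss = outcome_probs(1.84976376, 1.45553586306297,
                                elapsed=30, deficit=1)
print(f"City {win:.2%}, draw {draw:.2%}, Wigan {loss:.2%}")
```

Calling it with other values of `elapsed` or `deficit` shows how quickly the favourite's chances fall away as the clock runs down while they are behind.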
I hope this helps or gives you some other ideas and good luck.
Best Answer
You can use bivariate Poisson distribution with probability mass function
$$ f(x,y) = \exp\{-(\lambda_1+\lambda_2+\lambda_3)\} \frac{\lambda_1^x}{x!} \frac{\lambda_2^y}{y!} \sum^{\min(x,y)}_{k=0} {x \choose k} {y \choose k} k!\left(\frac{\lambda_3}{\lambda_1\lambda_2}\right)^k $$
where $E(X) = \lambda_1+\lambda_3$ and $E(Y) = \lambda_2+\lambda_3$ and $\mathrm{cov}(X,Y) = \lambda_3$, so you can treat $\lambda_3$ as a measure of dependence between the two marginal Poisson distributions. The pmf and random generation for this distribution are implemented in the extraDistr package if you are using R.
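If you are not using R, the pmf is short enough to write by hand. Here is a minimal Python sketch of the formula above, with a numerical check of the stated moments on a truncated grid. The parameter values are arbitrary illustrations, the function name `bvpois_pmf` is my own, and the formula as written needs $\lambda_1, \lambda_2 > 0$.

```python
from math import comb, exp, factorial

def bvpois_pmf(x, y, lam1, lam2, lam3):
    """Bivariate Poisson pmf f(x, y), a direct transcription of the formula.

    Requires lam1 > 0 and lam2 > 0 because of the lam3 / (lam1 * lam2) term.
    """
    base = (exp(-(lam1 + lam2 + lam3))
            * lam1 ** x / factorial(x)
            * lam2 ** y / factorial(y))
    ratio = lam3 / (lam1 * lam2)
    series = sum(comb(x, k) * comb(y, k) * factorial(k) * ratio ** k
                 for k in range(min(x, y) + 1))
    return base * series

# Arbitrary illustrative parameters (not estimated from any data).
lam1, lam2, lam3 = 1.2, 0.8, 0.3

# Truncate the support; the tail mass beyond 40 is negligible for these rates.
grid = range(40)
probs = {(x, y): bvpois_pmf(x, y, lam1, lam2, lam3) for x in grid for y in grid}

total_mass = sum(probs.values())
mean_x = sum(x * p for (x, y), p in probs.items())
mean_y = sum(y * p for (x, y), p in probs.items())
cov_xy = sum(x * y * p for (x, y), p in probs.items()) - mean_x * mean_y

print(total_mass, mean_x, mean_y, cov_xy)
```

Up to floating-point error, the total mass should come out at 1, the means at $\lambda_1+\lambda_3$ and $\lambda_2+\lambda_3$, and the covariance at $\lambda_3$.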
In fact, this distribution was described in the context of analyzing sports data by Karlis and Ntzoufras (2003), so you can check their paper for further details. In an earlier paper the same authors also discussed the univariate Poisson model, concluding that the independence assumption provides a fair approximation, since the difference between the two teams' scores does not depend on the correlation parameter of the bivariate Poisson (Karlis and Ntzoufras, 2000).
Kawamura (1984) described estimating the parameters of the bivariate Poisson distribution by maximum likelihood using direct search. As for regression models, you can use the EM algorithm for maximum likelihood estimation, as in Karlis and Ntzoufras (2003), or a Bayesian model estimated using MCMC. The EM algorithm for bivariate Poisson regression is implemented in the bivpois package (Karlis and Ntzoufras, 2005), which is unfortunately not on CRAN at the moment.
Karlis, D., & Ntzoufras, I. (2003). Analysis of sports data by using bivariate Poisson models. Journal of the Royal Statistical Society: Series D (The Statistician), 52(3), 381-393.
Karlis, D., & Ntzoufras, I. (2000). On modelling soccer data. Student, 3, 229-244.
Kawamura, K. (1984). Direct calculation of maximum likelihood estimator for the bivariate Poisson distribution. Kodai Mathematical Journal, 7(2), 211-221.
Karlis, D., & Ntzoufras, I. (2005). Bivariate Poisson and diagonal inflated bivariate Poisson regression models in R. Journal of Statistical Software, 14(10), 1-36.