There are a number of issues here; I'll address several points, in no particular order.
I agree with your professor, you should use type I SS. To say that type I SS shouldn't be used when groups are unbalanced doesn't make sense; if the factors are orthogonal, type I, II, and III SS are identical--this is just arguing that type I SS should never be used. I have discussed the meaning of and argument for using type I SS in the answers to other questions on CV (mostly here, but also here). In brief, when cells are unbalanced, the factors are correlated, and there are sums of squares that could be attributed to more than one factor. Using type II or III SS discards those overlapping sums of squares entirely, whereas type I SS allocates them according to a substantively chosen order.
Type I SS are used to conduct hypothesis tests. The analyst decides what order to enter the terms into the model. Deciding on an order amounts to deciding which factors will get which overlapping sums of squares. Algorithmically, the sum of squares for each factor equals the reduction in the error term when that factor is added to the model. For instance, if a 'reduced' model with $k$ factors is fit, and then an 'augmented' model with $k+1$ factors, then:
$$SS_{k+1}=SSE_{k}-SSE_{k+1}$$
To form the proper F ratio to test these 'extra sums of squares', first divide each by its corresponding df to get the mean square for that factor, then divide that mean square by the MSE from the full model (i.e., with all the factors entered). Done in this manner, the sums of squares for the individual factors add up to the total SS. This same procedure also affords 'simultaneous' tests, in which several factors are entered together and the F test checks whether all of them are null. (For example, adding all the 2-way interactions together would allow you to simultaneously test whether all the corresponding $\beta$'s are 0.)
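To make the sequential procedure concrete, here is a minimal sketch in Python (using NumPy). The data, the unbalanced cell counts, and the entry order A, then B, then A:B are all made up for illustration; each term's SS is computed as the drop in SSE when that term is added, and each F ratio uses the full-model MSE:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up unbalanced 2x2 design: cell counts 5, 3, 4, 6
a = np.repeat([0, 0, 1, 1], [5, 3, 4, 6]).astype(float)   # factor A (dummy coded)
b = np.repeat([0, 1, 0, 1], [5, 3, 4, 6]).astype(float)   # factor B (dummy coded)
y = 2 + 1.5*a + 0.8*b + 0.5*a*b + rng.normal(0, 1, a.size)

def sse(X, y):
    """Residual (error) sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

one = np.ones_like(y)
# Nested sequence of models: intercept -> +A -> +B -> +A:B
steps = [
    ("A",   np.column_stack([one]),       np.column_stack([one, a])),
    ("B",   np.column_stack([one, a]),    np.column_stack([one, a, b])),
    ("A:B", np.column_stack([one, a, b]), np.column_stack([one, a, b, a*b])),
]
full = np.column_stack([one, a, b, a*b])
df_error = y.size - full.shape[1]
mse = sse(full, y) / df_error       # MSE from the full model

for term, reduced, augmented in steps:
    ss = sse(reduced, y) - sse(augmented, y)   # 'extra' SS for this term
    df = augmented.shape[1] - reduced.shape[1]
    F = (ss / df) / mse                        # F ratio against full-model MSE
    print(f"{term}: SS = {ss:.3f}, F = {F:.3f}")

# The sequential SS plus the full-model SSE recover the total SS:
total_ss = ((y - y.mean())**2).sum()
seq_sum = sum(sse(r, y) - sse(m, y) for _, r, m in steps) + sse(full, y)
assert np.isclose(total_ss, seq_sum)
```

Note that because the cells are unbalanced, entering B before A would generally give different SS for each factor--that order dependence is exactly the analyst's decision described above.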
The parameter estimates (i.e., $\beta$'s) to report are those from the full model. Parameter estimates from reduced models are likely to be biased since your factors are correlated.
If all of this seems hard to follow, don't feel bad. It is confusing. Some time ago, the US Food and Drug Administration required that all research it funded use type III SS. The reasoning was that it was better for people to use a method that was wrong but that they could understand, than a method that was right but that they couldn't. While one could take issue with this line of reasoning, the point is that doing this correctly is not terribly simple.
The meaning of your parameter estimates depends on how the factors were coded. Perhaps the most typical scheme is 'reference cell coding' (commonly called 'dummy coding'). In this case, one cell (usually a control group) is designated as the reference cell, and the other levels of that factor are encoded in $l-1$ new variables (i.e., columns), where $l$ is the number of levels. For instance, if you have 3 groups (a control and 2 treatments), you add 2 new columns to your data set. In the first column (treatment1), each observation gets a 1 if it is in the first treatment group, and a 0 otherwise. In the second column (treatment2), you would put a 1 for those in the second treatment group, or else a 0. If you use this approach (of course, you don't do this by hand; the software does it behind the scenes), then the intercept is equal to the mean of your control group, and the parameter estimate for each level is the difference between the mean of that level and the mean of the control group. There are many, many different coding schemes, and the meanings of the parameter estimates will differ for each; check out the link above for more information.
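As an illustration (with made-up numbers), the following sketch builds the two dummy columns by hand for a control group and two treatments, fits the model by least squares, and confirms the interpretation of the intercept and slopes:

```python
import numpy as np

# Made-up data: 3 observations each for control, treatment 1, treatment 2
y = np.array([10., 11., 9.,      # control
              14., 15., 13.,     # treatment 1
              18., 17., 19.])    # treatment 2
group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Two new dummy columns; the control group is all zeros in both
t1 = (group == 1).astype(float)   # 'treatment1' column
t2 = (group == 2).astype(float)   # 'treatment2' column
X = np.column_stack([np.ones_like(y), t1, t2])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = beta

# Intercept equals the control-group mean; each slope equals the
# difference between that treatment's mean and the control mean.
assert np.isclose(intercept, y[group == 0].mean())
assert np.isclose(b1, y[group == 1].mean() - y[group == 0].mean())
assert np.isclose(b2, y[group == 2].mean() - y[group == 0].mean())
```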
If you believe that an interaction is real, you don't interpret the main effects of the factors that go into that interaction. Instead, you interpret 'simple effects'. That is, you look at the effect of one factor on the dependent variable several times (specifically, at each level of the other factor). Which factor you choose to hold constant, and which factor you examine directly, is entirely up to you; pick whichever seems most intuitive. Often the best way to understand your data when you have interactions is to make graphs. For instance, if you had a 2x2 design with a 'significant' interaction, you could make a barplot of groups A1 & A2 at the first level of factor B, and another barplot of A1 & A2 at the second level of factor B. In this manner, you are looking at the effect of factor A at each level of the other factor.
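The simple-effects idea can be sketched in a few lines; the 2x2 data below are made up so that the effect of A differs across the levels of B:

```python
import numpy as np

# Made-up 2x2 data, two observations per cell
a = np.array([0, 0, 1, 1, 0, 0, 1, 1])   # factor A
b = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # factor B
y = np.array([5., 6., 7., 8.,            # at B = 0
              5., 6., 11., 12.])         # at B = 1

# Simple effect of A = (mean at A=1) - (mean at A=0), within each level of B
for level in (0, 1):
    mask = b == level
    effect = y[mask & (a == 1)].mean() - y[mask & (a == 0)].mean()
    print(f"Simple effect of A at B={level}: {effect:.1f}")
# Unequal simple effects (2.0 vs 6.0 here) are what an interaction means;
# barplots of these four cell means would show the same thing graphically.
```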
Best Answer
I'm a bit confused by your question, but as you are planning on reporting both within- and between-subjects effects, I assume you are actually conducting a mixed-design ANOVA.
Regardless, the tables of descriptives are what you probably want to report; the estimated marginal means are the means controlling for covariates (i.e., the predicted mean of the dependent variable when the covariate is held constant at its mean value). How you report these depends on how many descriptives you have to report; I would report the means and standard deviations in text if there are only a few of them, and in a table if there are many.
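To illustrate what an estimated marginal mean is, here is a minimal sketch with made-up numbers: a two-group model with one covariate, where each group's EMM is the model's prediction at the covariate's grand mean (note how the EMMs differ from the raw group means because the groups differ on the covariate):

```python
import numpy as np

# Made-up data: two groups that differ on a covariate
grp = np.array([0., 0., 0., 1., 1., 1.])   # group indicator
cov = np.array([1., 2., 3., 4., 5., 6.])   # covariate
y   = np.array([3., 4., 5., 9., 10., 11.])

# Fit y ~ intercept + group + covariate by least squares
X = np.column_stack([np.ones_like(y), grp, cov])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_grp, b_cov = beta

# EMM for each group: the prediction with the covariate at its grand mean
emm = {g: b0 + b_grp*g + b_cov*cov.mean() for g in (0.0, 1.0)}
raw = {g: y[grp == g].mean() for g in (0.0, 1.0)}
print("raw means:", raw)   # differ partly because of the covariate
print("EMMs:     ", emm)   # group difference after adjusting for it
```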