Balance in the Training Set
For logistic regression models, unbalanced training data affects only the estimate of the model's intercept (although this of course skews all the predicted probabilities, which in turn compromises your predictions). Fortunately the intercept correction is straightforward: provided you know, or can guess, the true population proportions of 0s and 1s and know the proportions in the training set, you can apply a rare-events correction to the intercept. Details are in King and Zeng (2001) [PDF].
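A minimal sketch of that prior correction (the simpler of the adjustments King and Zeng describe); the function name and example numbers are mine:

```python
import math

def prior_corrected_intercept(b0_hat, tau, ybar):
    """Prior correction for a logit intercept (King & Zeng, 2001):
    b0_hat was estimated on a sample where the 1s make up proportion
    `ybar`; `tau` is the true proportion of 1s in the population."""
    return b0_hat - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))

# Example: model trained on a 50/50 balanced sample (ybar = 0.5),
# but the event occurs in only 2% of the population (tau = 0.02).
b0 = prior_corrected_intercept(0.0, tau=0.02, ybar=0.5)
# A baseline case (all predictors zero) now gets probability
# 1 / (1 + exp(-b0)) = 0.02, matching the population rate.
```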
These 'rare events corrections' were designed for case-control research designs, mostly used in epidemiology, which select cases by choosing a fixed, usually balanced, number of 0 cases and 1 cases, and then need to correct for the resulting sample-selection bias. Indeed, you might train your classifier the same way: pick a nicely balanced sample, then correct the intercept to account for the fact that you have selected on the dependent variable to learn more about the rarer class than a random sample would tell you.
Making Predictions
On a related but distinct topic: don't forget that you should be thresholding intelligently to make predictions. It is not always best to predict 1 when the model probability is greater than 0.5; another threshold may do better. To that end you should look at the Receiver Operating Characteristic (ROC) curve of your classifier, not just its predictive success at the default probability threshold.
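As a sketch of what that means in code, here is a bare-bones ROC sweep that picks the threshold maximizing Youden's J (TPR − FPR); the data are made up for illustration, and in practice a library routine such as scikit-learn's `roc_curve` would do this for you:

```python
import numpy as np

def roc_points(y_true, scores):
    """(FPR, TPR) at every candidate threshold, highest threshold first."""
    thresholds = np.unique(scores)[::-1]
    P, N = (y_true == 1).sum(), (y_true == 0).sum()
    fpr, tpr = [], []
    for thr in thresholds:
        pred = scores >= thr
        tpr.append((pred & (y_true == 1)).sum() / P)
        fpr.append((pred & (y_true == 0)).sum() / N)
    return np.array(fpr), np.array(tpr), thresholds

def best_threshold(y_true, scores):
    """Threshold maximizing Youden's J = TPR - FPR."""
    fpr, tpr, thresholds = roc_points(y_true, scores)
    return thresholds[np.argmax(tpr - fpr)]

# Toy example: the best cut is below the default 0.5.
y = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1])
p = np.array([0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.6, 0.7, 0.8, 0.9])
t = best_threshold(y, p)  # 0.4 here, not 0.5
```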
There are several issues here.
Typically, we want to determine a minimum sample size so as to achieve a minimally acceptable level of statistical power. The required sample size is a function of several factors, primarily the magnitude of the effect you want to be able to differentiate from 0 (or from whatever null you are using, but 0 is most common) and the minimum probability of catching that effect that you are willing to accept. Working from this perspective, sample size is determined by a power analysis.
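As a rough illustration of such a power analysis (a normal-approximation formula for comparing two proportions, not specific to any particular software; the function name is mine), using only the standard library:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate sample size per group to detect a difference between
    proportions p1 and p2 with a two-sided z-test at level `alpha`
    and the desired power (normal approximation)."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_b = z.inv_cdf(power)           # quantile for the target power
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * var / (p1 - p2) ** 2)

# e.g. distinguishing a 10% event rate from 15% at 80% power
n = n_per_group(0.10, 0.15)  # several hundred per group
```

Note how quickly the required n grows as the effect (p1 − p2) shrinks: it scales with the inverse square of the difference.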
Another consideration is the stability of your model (as @cbeleites notes). Basically, as the ratio of parameters estimated to the number of data points gets close to 1, your model becomes saturated and will necessarily be overfit (unless there is, in fact, no randomness in the system). The 1 to 10 rule of thumb comes from this perspective. Note that having adequate power will generally cover this concern for you, but not vice versa.
The 1 to 10 rule comes from the linear regression world, however, and it's important to recognize that logistic regression has additional complexities. One issue is that logistic regression works best when the percentages of 1's and 0's are approximately 50% / 50% (as @andrea and @psj discuss in the comments above). Another issue to be concerned with is separation. That is, you don't want all of your 1's gathered at one extreme of an independent variable (or some combination of them) and all of the 0's at the other extreme. Although this would seem like a good situation, because it would make perfect prediction easy, it actually makes the parameter estimation process blow up. (@Scortchi has an excellent discussion of how to deal with separation in logistic regression here: How to deal with perfect separation in logistic regression?) With more IV's this becomes more likely, even if the true magnitudes of the effects are held constant, and especially if your responses are unbalanced. Thus, you can easily need more than 10 data points per IV.
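You can see the blow-up directly with a toy fit. Below, unpenalized logistic regression is fit by plain gradient ascent (my own minimal implementation, for illustration) on perfectly separated data; because the maximum likelihood estimate does not exist, the slope just keeps growing the longer you run the optimizer:

```python
import numpy as np

def fit_logit_1d(x, y, iters, lr=0.5):
    """Unpenalized logistic regression on one predictor, gradient ascent."""
    w, b = 0.0, 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(w * x + b)))
        w += lr * np.mean((y - p) * x)
        b += lr * np.mean(y - p)
    return w

# Perfectly separated data: every 0 is below x = 0, every 1 above it.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 0, 1, 1, 1])

w_short = fit_logit_1d(x, y, iters=100)
w_long = fit_logit_1d(x, y, iters=10_000)
# w_long > w_short: the slope diverges instead of converging.
```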
One last issue with that rule of thumb is that it assumes your IV's are orthogonal. This is reasonable for designed experiments, but with observational studies such as yours, your IV's will rarely be even roughly orthogonal. There are strategies for dealing with this situation (e.g., combining or dropping IV's, conducting a principal components analysis first, etc.), but if it isn't addressed (which is common), you will need more data.
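For the principal-components strategy, the idea is simply that the component scores are uncorrelated by construction, so you can regress on them instead of on the raw, correlated IV's. A small NumPy sketch on simulated data (the simulated correlation of 0.9 is my choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two strongly correlated IVs: x2 is mostly a noisy copy of x1.
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)
X = np.column_stack([x1, x2])

# PCA on the standardized IVs: eigendecompose the covariance matrix.
Xc = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
scores = Xc @ eigvecs  # component scores, usable as regressors

corr_raw = np.corrcoef(X, rowvar=False)[0, 1]       # near 1: collinear
corr_pc = np.corrcoef(scores, rowvar=False)[0, 1]   # ~0: orthogonal
```

The price, of course, is interpretability: coefficients on components are harder to explain than coefficients on the original variables.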
A reasonable question, then, is what your minimum N should be, and/or whether your sample size is sufficient. To address this, I suggest you use the methods @cbeleites discusses; relying on the 1 to 10 rule alone will be insufficient.
Best Answer
This is not so much a problem with logistic regression per se as it is a problem with classification accuracy as a performance measure. Note that balancing the data set is not necessarily the only valid approach: if one class really is much more common in the population (and not merely in your sample), a naive model that classifies everything as the most common category is a genuinely good guess. And if the error costs are not symmetric, balancing the data set might lead you to err in the wrong direction (the more costly one).
The problem also often comes up the other way around: training and evaluating on an artificially balanced data set before using the resulting model in a strongly unbalanced situation (think detecting fraud or diagnosing a rare disease), where the model's usefulness is not nearly as high as its raw accuracy would suggest. It all depends on your objectives and your cost structure.
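One way to make the cost structure concrete: under the standard decision-theoretic argument (assuming zero cost for correct classifications), the expected-cost-minimizing rule is to predict 1 whenever the model probability exceeds c_fp / (c_fp + c_fn). A sketch, with the fraud numbers invented for illustration:

```python
def cost_threshold(c_fp, c_fn):
    """Expected-cost-minimizing probability threshold: predicting 1 costs
    (1 - p) * c_fp in expectation, predicting 0 costs p * c_fn, so
    predict 1 whenever p >= c_fp / (c_fp + c_fn)."""
    return c_fp / (c_fp + c_fn)

# Symmetric costs recover the default 0.5 cut-off.
t_default = cost_threshold(c_fp=1.0, c_fn=1.0)

# If missing a fraud case is 50x worse than a false alarm, the threshold
# drops to ~0.02: flag even fairly unlikely cases for review.
t = cost_threshold(c_fp=1.0, c_fn=50.0)
```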