From Modern Epidemiology 3rd Edition by Rothman, Greenland and Lash:
There are at least three forms of overmatching. The first refers to matching that harms statistical efficiency, such as case-control matching on a variable associated with exposure but not disease. The second refers to matching that harms validity, such as matching on an intermediate between exposure and disease. The third refers to matching that harms cost-efficiency.
The answer from AndyW is about the second form of overmatching. Briefly, here's how they all work:
1: One of the criteria for a confounder is that the covariate be associated with both the outcome and the exposure. If it's only associated with one of them, it's not a confounder, and all you've succeeded in doing is widening your confidence interval.
To explore this type of overmatching further, consider a matched case-control study of a binary exposure, with one control matched to each case on one or more confounders. Each stratum in the analysis will consist of one case and one control unless some strata can be combined. If the case and its matched control are either both exposed or both unexposed, one margin of the 2 x 2 table will be 0 ... such a pair of subjects will not contribute any information to the analysis. If one stratifies on correlates of exposure, one will increase the chance that such tables will occur and thus tend to increase the information lost in stratified analysis.
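A minimal sketch of the point in the quoted passage, using made-up pair counts (the numbers and the helper name `matched_pair_summary` are hypothetical, not from the book). It computes the usual matched-pair (conditional) odds ratio, which is the ratio of the two kinds of discordant pairs, and shows that adding concordant pairs changes nothing about the estimate:

```python
from collections import Counter


def matched_pair_summary(pairs):
    """Summarise 1:1 matched case-control pairs.

    Each pair is (case_exposed, control_exposed), both booleans.
    """
    counts = Counter(pairs)
    # Concordant pairs: case and control share the same exposure status.
    concordant = counts[(True, True)] + counts[(False, False)]
    case_only = counts[(True, False)]     # case exposed, control unexposed
    control_only = counts[(False, True)]  # control exposed, case unexposed
    # The conditional odds ratio uses discordant pairs only.
    odds_ratio = case_only / control_only if control_only else float("nan")
    return {
        "concordant": concordant,
        "case_only": case_only,
        "control_only": control_only,
        "odds_ratio": odds_ratio,
    }


# Hypothetical study: 100 matched pairs.
pairs = (
    [(True, False)] * 30
    + [(False, True)] * 15
    + [(True, True)] * 40
    + [(False, False)] * 15
)
summary = matched_pair_summary(pairs)

# Same discordant pairs, plus 500 extra concordant pairs, as one might get
# after matching on a strong correlate of exposure.
summary_more = matched_pair_summary(pairs + [(True, True)] * 500)

print(summary)
print(summary_more)
```

Both summaries give the same odds ratio, because only the 45 discordant pairs carry information. Every pair that matching turns concordant is a recruited case and control that adds cost but no precision.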
2: This is partially discussed by AndyW. Matching on an intermediate factor will bias your estimate, as will matching on something affected by both the exposure and outcome. The latter is essentially conditioning on a collider, and any technique that does so will bias your estimate.
If, however, the potential matching factor is affected by exposure and the factor in turn affects disease (i.e., is an intermediate variable), or is affected by both exposure and disease, then matching on the factor will bias both the crude and adjusted effect estimates. In these situations, case-control matching is nothing more than an irreparable form of selection bias.
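The collider case can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are mine, not the book's: exposure and disease are continuous stand-ins drawn independently (so the true association is zero), the third factor is their sum plus noise, and "matching" is approximated by restricting to one stratum of that factor.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

exposure = [rng.gauss(0, 1) for _ in range(n)]
disease = [rng.gauss(0, 1) for _ in range(n)]  # independent of exposure
# The collider is caused by both exposure and disease.
collider = [e + d + rng.gauss(0, 1) for e, d in zip(exposure, disease)]

# Whole population: no association, as constructed.
corr_all = pearson(exposure, disease)

# Condition on the collider by keeping only subjects with a high value.
selected = [(e, d) for e, d, c in zip(exposure, disease, collider) if c > 1.0]
corr_selected = pearson([e for e, _ in selected], [d for _, d in selected])

print(f"all subjects:      r = {corr_all:.3f}")
print(f"collider > 1 only: r = {corr_selected:.3f}")
```

In the full population the correlation is close to zero, while within the selected stratum it is clearly negative, even though exposure has no effect on disease. No later adjustment can undo this, because the distortion comes from who was selected.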
3: This is more of a study design problem. Extensively matching on variables that you needn't match on for reasons 1 & 2 can cause you to reject easily obtained controls (friends, family, nearby social network, etc.) in favor of far harder to obtain controls that can be matched on the unnecessary set of covariates. That costs money - money that could have been spent on more subjects, better exposure or disease ascertainment, etc. - for no appreciable improvement in bias or precision, and possibly having harmed both.
I don't have a complete answer but can provide some thoughts:
1) Adjustment does remove the confounding effect, but only if the underlying causal pathways are correctly specified. There are occasions where adjustment introduces bias rather than reducing it. For more information on this issue, search for collider bias and directed acyclic graphs.
2) Adjustment does remove the confounding effect, but only if the operationalization is correct. In other words, you need to have chosen the correct variable to represent the construct. There are multiple reasons why age may not be a good indicator of aging (the actual construct that is related to mortality). For example, fatal heart disease can be lifestyle-related. It can also be related to immunological response and how the body mediates inflammation. All these factors can differ substantially among people of the same age. On the reporting side, under-reporting of one's age tends to increase with age, introducing error that is itself correlated with age. If you control for age and assume you have thereby controlled for all age-related factors, that assumption is probably over-ambitious. It's always important to know what the control variables really mean.
3) Other dynamics can also make adjustment alone insufficient. For example, an interaction between age and other variable(s) in the model can bias the estimate for age. A non-linear relationship between age and mortality can also make simple adjustment for age an imperfect method.
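Point 2 can be made concrete with a small simulation. This is a rough sketch under assumptions of my own choosing: a latent "aging" variable drives both an exposure and mortality, the exposure has no effect on mortality at all, and recorded age is aging plus independent noise. All variable names and noise levels are hypothetical.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n


def crude_slope(x, y):
    """OLS slope of y on x alone."""
    return cov(x, y) / cov(x, x)


def adjusted_slope(x, y, z):
    """OLS coefficient of x when y is regressed on x and z together."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxy * szz - sxz * szy) / (sxx * szz - sxz ** 2)


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20000

aging = [rng.gauss(0, 1) for _ in range(n)]             # true construct
exposure = [a + rng.gauss(0, 1) for a in aging]         # caused by aging
mortality = [a + rng.gauss(0, 1) for a in aging]        # caused by aging only
recorded_age = [a + rng.gauss(0, 1) for a in aging]     # noisy proxy for aging

crude = crude_slope(exposure, mortality)
adj_true = adjusted_slope(exposure, mortality, aging)
adj_proxy = adjusted_slope(exposure, mortality, recorded_age)

print(f"crude:                     {crude:.3f}")
print(f"adjusted for true aging:   {adj_true:.3f}")
print(f"adjusted for recorded age: {adj_proxy:.3f}")
```

With these variances, theory puts the crude slope near 0.5, the slope adjusted for the true construct near 0, and the slope adjusted for the noisy proxy near 1/3. Adjusting for the proxy removes only part of the confounding, which is the sense in which controlling for "age" may not control for aging.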
My guess is that in epidemiology, it's better to say "no" whenever someone asks if something can completely remove whatever... perhaps except "can a randomized controlled trial completely remove biases?" Then "theoretically yes."
Best Answer
Your question is actually a very hard one to answer. It is, however, good that you are asking before the study is conducted, preferably well before. So this answer comes in a few parts:
Planning for potential confounding and effect modification is a long process that relies fairly heavily on subject matter expertise. Make sure you have a good team. If you don't feel like you do, or could use more, see if someone in your department or organization can help you out - there are lots of HIV/AIDS epidemiologists out there. I can think of some variables I would think are important (number of sexual partners, access to testing facilities, etc.) but you'd be better served by understanding the process rather than just having a list.