1) Using the defaults for MatchIt, nearest neighbor matching matches on the propensity score as estimated by a logistic regression of treatment on the covariates included in your formula. For each treated unit, it finds the one not-yet-matched control unit with the closest propensity score, and then discards the control units that were never matched. There is no issue with continuous vs. categorical covariates here. See King & Nielsen (2016), who describe why propensity score matching can actually make balance worse, as in your example.
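A minimal sketch of that default workflow (the data frame dat, treatment treat, and covariates x1 and x2 are hypothetical placeholders):

```r
library(MatchIt)

# 1:1 nearest neighbor matching on a logistic-regression propensity
# score, i.e., the MatchIt defaults (all object names are placeholders)
m.out <- matchit(treat ~ x1 + x2, data = dat, method = "nearest")

summary(m.out)           # balance statistics before and after matching
md <- match.data(m.out)  # matched sample; unmatched controls are dropped
```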
2) MatchIt creates matches for the ATET, but the Matching package, which also implements genetic matching, allows you to specify that you want the ATE. After matching, you can simply perform the regression analysis you would have performed had you randomly assigned your units (assuming balance has been achieved).
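For example, with the Matching package (the outcome y, treatment treat, and covariate matrix are hypothetical placeholders):

```r
library(Matching)

# Genetic matching targeting the ATE rather than the ATT
# (all object names here are placeholders)
X   <- as.matrix(dat[, c("x1", "x2")])
gen <- GenMatch(Tr = dat$treat, X = X, estimand = "ATE")
m   <- Match(Y = dat$y, Tr = dat$treat, X = X,
             Weight.matrix = gen, estimand = "ATE")
summary(m)
```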
1) If your goal is to make a causal inference, balance is paramount. Although you may have improved balance, if it is not good then your causal inference may still be invalid (i.e., your estimate may still be biased). If you have untreated units that fall outside the range of your treated units, your causal inferences will not be valid for them unless you can justify extrapolation. You may want to delete these cases and limit your inferences to the region of overlap. (Overlap can be conceptualized as the overlap between the covariate distributions, e.g., the convex hull, or as common support on the estimated propensity scores.)
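If you are using MatchIt, one way to restrict the analysis to the region of overlap is its discard argument, which drops units falling outside the common support of the estimated propensity scores before matching (variable names below are placeholders):

```r
library(MatchIt)

# Discard both treated and control units outside the common support
# of the propensity score before matching
m.out <- matchit(treat ~ x1 + x2, data = dat,
                 method = "nearest", discard = "both")
summary(m.out)  # the sample size table reports how many were discarded
```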
2) You could randomly simulate, but I think a better approach would be to find the matched groups that yield the best balance and move forward with that single sample.
3) I'm not sure what your outcome is, but an "alternative to matching" is regression (to many, matching is an alternative to regression). If you are willing to make parametric assumptions and use regression of some kind (e.g., logistic, linear, count), you can use regression instead of or in addition to matching. With only 7 covariates that you want to control for, it shouldn't be too hard to run a regression and account for potentially relevant interactions and curvilinearities.
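As a sketch, such a regression adjustment might look like the following (the outcome y, the covariates, and the particular interaction and squared terms are all hypothetical placeholders; with 7 covariates the formula just gets longer):

```r
# Linear outcome model with one squared term and one interaction;
# swap lm() for glm() with an appropriate family for binary or
# count outcomes
fit <- lm(y ~ treat + x1 + x2 + I(x1^2) + x1:x2, data = dat)
summary(fit)
```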
4) Hypothesis tests are NOT appropriate for assessing balance. Balance is a sample property, so there is no sense in which a p-value will be more helpful than an effect size measure. Also, typically, balance is assessed by comparing your balance statistics to a chosen threshold, not by looking at the reduction in imbalance from your original sample. The fact that you've achieved balance slightly better than you started with doesn't mean you can move forward; you need to arrive at balance that permits an unbiased estimate of the treatment effect.
I recommend you try various methods of conditioning on the propensity scores. You've chosen matching, but there doesn't seem to be reason not to try weighting or full matching (really a form of weighting). Weighting using CBPS or by entropy balancing can be very effective. If you want to compare balance across multiple methods of conditioning, you can use the cobalt package which interfaces with some of the other packages and offers some additional tools for balance assessment. I also recommend you combine propensity score conditioning with regression on the treatment and your covariates. That technique is preferred in the literature and can reduce the remaining imbalance in your adjusted sample.
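A sketch of that comparison, assuming the WeightIt package (which implements both CBPS and entropy balancing, and is not mentioned above) alongside cobalt:

```r
library(WeightIt)
library(cobalt)

# Entropy balancing weights for the ATE (use method = "cbps" for CBPS);
# variable names are hypothetical placeholders
w.out <- weightit(treat ~ x1 + x2, data = dat,
                  method = "ebal", estimand = "ATE")

bal.tab(w.out, un = TRUE)  # balance before and after weighting
love.plot(w.out)           # graphical balance summary
```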
Best Answer
This happens when you have (at least) two individuals that have the same propensity score. MatchIt randomly selects one to include in the matched set. My recommendation would be to select one matched set and carry out your analysis with it. I agree that trying other conditioning methods such as full matching and IPW would be a good idea. You could report results of various analyses in a sensitivity analysis section.
Edit: This is probably the wrong answer. See Viktor's answer for what is likely the actual cause.
Edit 2020-12-07: For MatchIt versions less than 4.0.0, the only random selection that would occur during nearest neighbor matching was when ties were present or when m.order = "random", which is not the default. If few variables were used in matching, and especially if they were all categorical or took few values, ties are possible. As of version 4.0.0, there are no longer any random processes unless m.order = "random"; all ties are broken deterministically based on the order of the data.