R Propensity Scores – Why Is V.Threshold Showing ‘Not Balanced’ in PSM Balancing?

matching, propensity-scores, r

I ran a full match for a PSM, with the balance output summarized below:

  Call:
matchit(formula = buyout_flag ~ tsale + sfincs_avg + logprox + 
    tenure + age + percent_ethwhite_origin + percent_poverty_origin + 
    percent_hs_origin + percent_owner_origin + house_medval_origin + 
    percap_origin, data = hcad_floodp, method = "full", distance = "glm", 
    link = "probit", caliper = 0.1)

  Summary of Balance for Matched Data:
                            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max Std. Pair Dist.
    distance                        0.365         0.365           0.000      0.999     0.002    0.023           0.013
    tsale                           1.881         1.862           0.034      0.595     0.082    0.222           1.316
    sfincs_avg                      6.275         6.487          -0.059      0.842     0.022    0.064           0.735
    logprox                         3.923         3.886           0.041      0.892     0.017    0.068           0.958
    tenure                         13.054        13.026           0.003      0.983     0.015    0.049           1.068
    age                            60.878        60.423           0.030      0.971     0.017    0.043           1.148
    percent_ethwhite_origin        41.500        39.986           0.062      0.953     0.024    0.054           0.947
    percent_poverty_origin         15.090        15.590          -0.049      1.035     0.026    0.073           1.003
    percent_hs_origin              81.379        80.997           0.021      1.083     0.026    0.088           0.941
    percent_owner_origin           57.988        56.722           0.062      0.968     0.021    0.062           1.069
    house_medval_origin        146647.059    145763.842           0.015      1.241     0.022    0.074           0.935
    percap_origin               29349.670     28864.395           0.043      0.954     0.026    0.069           1.010

I am trying to interrogate the balance a bit more thoroughly (working through this paper/guide) using bal.tab(., m.threshold = 0.1) (which turned out fine) and bal.tab(., v.threshold = 1), where I'm a bit confused by the results.

Call
 matchit(formula = buyout_flag ~ tsale + sfincs_avg + logprox + 
    tenure + age + percent_ethwhite_origin + percent_poverty_origin + 
    percent_hs_origin + percent_owner_origin + house_medval_origin + 
    percap_origin, data = hcad_floodp, method = "full", distance = "glm", 
    link = "probit", caliper = 0.1)

Balance Measures
                            Type Diff.Adj V.Ratio.Adj      V.Threshold
distance                Distance    0.000       0.999                 
tsale                    Contin.    0.034       0.595 Not Balanced, >1
sfincs_avg               Contin.   -0.059       0.842 Not Balanced, >1
logprox                  Contin.    0.041       0.892 Not Balanced, >1
tenure                   Contin.    0.003       0.983 Not Balanced, >1
age                      Contin.    0.030       0.971 Not Balanced, >1
percent_ethwhite_origin  Contin.    0.062       0.953 Not Balanced, >1
percent_poverty_origin   Contin.   -0.049       1.035 Not Balanced, >1
percent_hs_origin        Contin.    0.021       1.083 Not Balanced, >1
percent_owner_origin     Contin.    0.062       0.968 Not Balanced, >1
house_medval_origin      Contin.    0.015       1.241 Not Balanced, >1
percap_origin            Contin.    0.043       0.954 Not Balanced, >1

To me, it looks like the V.Ratio is under 1 for everything except percent_poverty_origin, percent_hs_origin, and house_medval_origin. What am I missing?

Best Answer

The variance ratio ranges from 0 to infinity. Perfect balance on the variance means the variance ratio is equal to 1. A variance ratio of .5 means the same thing as a variance ratio of 2; they are just in opposite directions. So when you supply a threshold to bal.tab() for variance ratios, it works in both directions; that is, setting thresholds = c(v = 2) will flag any variance ratio that is greater than 2 or less than .5.
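The two-directional rule can be sketched in a few lines of base R. This is only an illustration of the logic described above, not cobalt's internal code: `fold_ratio` is a hypothetical helper, the example ratios are made up, and the strict `>` comparison is an assumption based on the ">1" label in the output.

```r
# Hypothetical helper (not part of cobalt): fold a variance ratio so that
# ratios below 1 are replaced by their reciprocal. After folding, 0.5 and 2
# are treated as the same amount of imbalance.
fold_ratio <- function(vr) pmax(vr, 1 / vr)

# Made-up variance ratios on both sides of 1.
example_ratios <- c(0.4, 0.5, 1, 2, 2.5)

folded <- fold_ratio(example_ratios)

# Assumed rule: a covariate is flagged when its folded ratio exceeds the threshold.
flagged <- folded > 2

print(data.frame(ratio = example_ratios, folded = folded, flagged = flagged))
```

With a threshold of 2, only the ratios 0.4 and 2.5 are flagged, which is the "greater than 2 or less than .5" behaviour in the paragraph above.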

A variance ratio of 1 means the variances are exactly equal; setting a threshold of 1 means that any variance ratio greater than 1 or less than 1 will be flagged, so any covariate that is not exactly balanced on the variances will be marked as not balanced. This would be like setting a threshold on the standardized mean difference (SMD) of 0; any SMD greater than or less than 0 will be flagged. This can be helpful if you are trying to detect departures from perfect balance, but in most cases, small imbalances are okay, and the thresholds should be set slightly away from the value that indicates perfect balance. Using a variance ratio threshold of, for example, 1.25 will detect even moderate departures from perfect balance.
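To make this concrete, here is a rough base R check applied to the variance ratios copied from the question's output, comparing a threshold of 1 with a threshold of 1.25. The `exceeds_v_threshold` helper is hypothetical and only mimics the rule described above (reciprocal for ratios below 1, then a strict comparison); it is not how cobalt is necessarily implemented.

```r
# Variance ratios copied from the V.Ratio.Adj column in the question.
v_ratio <- c(tsale = 0.595, sfincs_avg = 0.842, logprox = 0.892,
             tenure = 0.983, age = 0.971, percent_ethwhite_origin = 0.953,
             percent_poverty_origin = 1.035, percent_hs_origin = 1.083,
             percent_owner_origin = 0.968, house_medval_origin = 1.241,
             percap_origin = 0.954)

# Hypothetical helper (not part of cobalt): TRUE when the ratio, taken in
# whichever direction is >= 1, exceeds the threshold.
exceeds_v_threshold <- function(vr, threshold) pmax(vr, 1 / vr) > threshold

# Threshold of 1: anything not exactly equal to 1 is flagged.
flagged_at_1 <- names(which(exceeds_v_threshold(v_ratio, 1)))

# Threshold of 1.25: only ratios above 1.25 or below 1 / 1.25 = 0.8 are flagged.
flagged_at_1.25 <- names(which(exceeds_v_threshold(v_ratio, 1.25)))

print(flagged_at_1)
print(flagged_at_1.25)
```

Under these assumptions, the threshold of 1 flags all eleven covariates, while 1.25 flags only tsale (0.595 is below 0.8). In cobalt itself, the corresponding call would be something like bal.tab(m.out, thresholds = c(m = 0.1, v = 1.25)), where m.out is a placeholder name for the matchit object.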
