Logistic regression does not make any assumptions about the distribution of the independent variables (neither does OLS regression, but that's another post).
However, if there are a lot of variables and a lot of zero inflation, then I think the potential for complete or quasi-complete separation increases.
Another concern is the precision of the estimates: as far as I know, the computed standard errors and so on will be correct, but I suspect they could be quite large.
More details (the number of IVs, the sample size, the nature of the variables, the degree of correlation among the IVs) will help you get more detailed answers. Actually running the regression and posting results would also help.
I think the model is more appropriately a left-censored Gaussian, since the process you describe discards information below some value (here the censoring point is known to be 0, which is simpler than the case of an unknown censoring value). In other words, there's some real quantity which can (hypothetically) be measured, but that quantity is not recorded. We need a modeling tool that reflects that there is some true, non-censored value, but that this value is not available to us.
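A minimal sketch of this idea, with hypothetical numbers and all names mine: simulate a latent Gaussian, left-censor it at 0, and fit by maximizing the censored likelihood (normal density for observed values, normal CDF mass for censored ones), using only `numpy` and `scipy`.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical latent Gaussian, left-censored at 0: values below 0
# are recorded as 0, so for them we only know P(latent <= 0).
mu_true, sigma_true, n = 1.0, 2.0, 5000
latent = rng.normal(mu_true, sigma_true, n)
y = np.maximum(latent, 0.0)
censored = latent <= 0.0

def neg_loglik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # optimize log(sigma) to keep sigma > 0
    ll = norm.logpdf(y[~censored], mu, sigma).sum()     # observed values
    ll += censored.sum() * norm.logcdf(0.0, mu, sigma)  # censored probability mass
    return -ll

res = minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], float(np.exp(res.x[1]))
```

Note the contrast with the naive approach: the mean of the non-zero observations alone overestimates $\mu$, because dropping (or zeroing) the censored tail truncates the distribution; the censored likelihood uses the zeros as evidence rather than discarding them.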
One resource I happen to have on my bookshelf is Gelman et al., Bayesian Data Analysis (3rd edition). Censoring and truncation models are discussed starting on page 224. The authors write
Suppose an object is weighed 100 times on an electronic scale with a known measurement distribution $\mathcal{N}(\theta,1^2)$, where $\theta$ is the true weight of the object....
[T]he scale has an upper limit of 200 kg for reports: all values above 200 kg are reported as "too heavy." The complete data are still $\mathcal{N}(\theta,1^2)$, but the observed data are censored; if we observe "too heavy," we know that it corresponds to a weighing with a reading above 200.
This is very similar to the problem stated by the OP, except that the data are censored above 200 rather than below 0, and that each item is weighed repeatedly with some instrument error.
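The scale example can be sketched the same way, just with the censoring on the other side. This is my own illustration of the setup quoted above, with a hypothetical true weight: numeric readings contribute the normal density, and each "too heavy" report contributes the survival probability $P(\text{reading} > 200)$.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(1)

# Hypothetical setup: true weight near the 200 kg reporting limit,
# 100 weighings with N(theta, 1) measurement error.
theta_true, limit, n = 199.0, 200.0, 100
readings = rng.normal(theta_true, 1.0, n)
too_heavy = readings > limit      # reported only as "too heavy"
observed = readings[~too_heavy]   # numeric readings we actually see

def neg_loglik(theta):
    # Numeric readings contribute the N(theta, 1) density; each
    # "too heavy" report contributes P(reading > limit) via the
    # normal survival function.
    return -(norm.logpdf(observed, theta, 1.0).sum()
             + too_heavy.sum() * norm.logsf(limit, theta, 1.0))

res = minimize_scalar(neg_loglik, bounds=(limit - 10, limit + 10),
                      method="bounded")
theta_hat = res.x
```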
One R package that seems relevant is censReg:

Arne Henningsen, "Estimating Censored Regression Models in R using the censReg Package":

We demonstrate how censored regression models (including standard Tobit models) can be estimated in R using the add-on package censReg. This package provides not only the usual maximum likelihood (ML) procedure for cross-sectional data but also the random-effects maximum likelihood procedure for panel data using Gauss-Hermite quadrature.
I haven't used it, so I can't vouch for its quality or utility for this problem. There are probably lots of other options. The approach taken in Bayesian Data Analysis is to just code up your own model, either using the base library or using Stan. That gives you the greatest flexibility, at the cost of having to do the coding yourself.
Best Answer
There are a variety of solutions to the case of zero-inflated (semi-)continuous distributions:
The R package glmmTMB will also do 'zero-inflated'/hurdle models for Beta or Gamma responses.

Or, if your data structure is simple enough, you could just use linear models with permutation tests or some other robust approach to make sure that your inference isn't being messed up by the interesting distribution of the data.
There are R packages/solutions available for most of these cases.
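To illustrate the permutation-test route on zero-inflated data, here is a sketch with made-up data (the sampler, group sizes, and effect size are all hypothetical): shuffle the group labels many times and see how often the shuffled difference in means is at least as extreme as the observed one.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical zero-inflated samples: a point mass at zero mixed
# with an exponential positive part.
def zi_sample(n, p_zero, mean):
    x = rng.exponential(mean, n)
    x[rng.random(n) < p_zero] = 0.0
    return x

a = zi_sample(200, 0.4, 1.0)  # group A
b = zi_sample(200, 0.4, 2.0)  # group B, larger positive part

obs_diff = b.mean() - a.mean()
pooled = np.concatenate([a, b])

# Permutation test: reshuffle group labels and recompute the
# difference in means under the null of no group effect.
n_perm, extreme = 5000, 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    if abs(perm[200:].mean() - perm[:200].mean()) >= abs(obs_diff):
        extreme += 1
p_value = (extreme + 1) / (n_perm + 1)
```

The appeal here is that the test's validity doesn't depend on any distributional assumption about the response, only on exchangeability under the null, which is exactly why it is robust to the spike at zero.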
There are other questions on SE about zero-inflated (semi)continuous data (e.g. here, here, and here), but they don't seem to offer a clear general answer ...
See also Min & Agresti, 2002, Modeling Nonnegative Data with Clumping at Zero: A Survey for an overview.