Thomas Lumley's 'survey' package for R is an excellent tool for specifying survey designs in R. Here is a link to Lumley describing how to specify a sample design. His examples in this link are with schools; your question is about hospitals -- very similar. The author of this package has generously produced many many tutorials (+ a book), which should be easy enough for you to find. Just Google 'lumley survey.'
However, as noted in the comments, there is some information lacking here for us to give very specific advice. Really important is how these three hospitals were chosen and whether the hospitals have the same number of clinicians. If you select three hospitals and sample 50 clinicians regardless of the size of the hospital, then I think your question isn't really about cluster effects -- this is really about stratification. In this case, what you will need to do, for the calculation of means or other descriptives, is to specify a strata variable, and you will need to specify stratum sizes (as larger hospitals should be given more weight if you sampled 50 from each hospital regardless of size). Also, if you sampled a very large fraction of clinicians in each hospital (say greater than 5%), then you should also make sure that you account for a finite population correction -- the fpc argument of the svydesign() function. (fpc and stratum sizes are often equal, so you only need to specify one if you don't have your own probability or post-stratification weights.)
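To make the arithmetic concrete, here is a minimal sketch in plain Python (not the 'survey' package itself) of what weighting by stratum size and applying a finite population correction do to a mean and its standard error. It assumes simple random sampling within each hospital; the function name, hospital labels, scores and hospital sizes are all made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def stratified_mean(samples, sizes):
    """Stratified estimate of a population mean and its standard error.

    samples: dict mapping stratum (hospital) -> list of sampled values
    sizes:   dict mapping stratum (hospital) -> population size N_h
    Assumes simple random sampling without replacement within each stratum.
    """
    total = sum(sizes.values())
    est = 0.0
    var = 0.0
    for h, y in samples.items():
        n_h, N_h = len(y), sizes[h]
        w_h = N_h / total          # stratum weight: larger hospitals count more
        fpc = 1 - n_h / N_h        # finite population correction
        est += w_h * mean(y)
        var += w_h ** 2 * fpc * variance(y) / n_h
    return est, sqrt(var)


# Hypothetical data: equal sample sizes drawn from hospitals of unequal size.
scores = {"A": [3, 4, 5, 4], "B": [6, 7, 5, 6], "C": [8, 9, 7, 8]}
clinicians = {"A": 40, "B": 200, "C": 760}

est, se = stratified_mean(scores, clinicians)
naive = mean(v for y in scores.values() for v in y)
print(f"stratified mean = {est:.3f} (SE {se:.3f}); unweighted mean = {naive:.3f}")
```

The unweighted mean treats every sampled clinician alike, while the stratified estimate lets each hospital count in proportion to its size. The fpc term shrinks a hospital's variance contribution as the sampled fraction n_h / N_h grows.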
Best case scenario -- these three hospitals are all the hospitals in a specific city, or you sampled hospitals with probability proportionate to size. In which case, move on, your sample is fine -- just account for stratification. If hospitals were selected in pretty much any other way, however, you need to do some reading about first-stage stratification, as without more information about your design, we won't be able to help. Or you can unambiguously caveat that the hospitals were a convenience sample.
But if your intention is to regress something-on-something with this data set, then I would simply include hospital as a covariate -- or depending on your question, consider specifying hospital as a random effect. There are some arguments for including survey weights in a regression like this, but anecdotally I would say this is not the norm.
*As a final aside, if you do opt to use the 'survey' package, don't forget to use the "~" where you have to. For good reasons, this is required when you are referring to specific variables in your dataset, but in my experience when people run into problems with this package, it's because they forgot to include a "~" somewhere. Most other packages don't require this, so it throws some people off.
Best Answer
The essence of this question is statistical, although as phrased it is not likely to mean much to non-Stata users.
Yes, there is a reason, and it doesn't have to be called "theoretical" as it is practical too.
The reasons for using `pweights` with `proportion` would be, and could only be, that you
1. Have survey data so declared, which is why you are using `pweights` at all.
2. Cannot do that calculation in `ci`.
`proportion` has other uses too.
To spell out a little what is meant by "survey data": Survey design characteristics often include sampling weights, one or more stages of clustered sampling, and stratification.
If that's so, there is not really any question of choice or shopping around: `ci` is not the command to use for survey data.