> I do not see how this is an implementation of no prior knowledge. Instead, it roughly corresponds to any value being equally likely, which is completely different compared to complete uncertainty about the prior as is for example achieved by minimax methods.

It appears that you are confusing prior ignorance *about the parameter value*, with prior ignorance *about the prior distribution itself*. The latter is not a requirement of Bayesian analysis. Clearly, you cannot specify the mathematical form of a prior distribution and also claim ignorance about that distribution --- any specification of a prior distribution constitutes perfect knowledge *of the prior*. So, the goal here is to specify a prior that captures ignorance *about the parameter value*. A weakly informative prior has the following general benefits:

**It represents genuine prior ignorance:** A weakly informative prior gives a reasonable representation of genuine ignorance about the parameter. For example, if you use an improper uniform prior for a mean parameter (over all the real numbers) then every value has equal density. This representation comes from the principle of insufficient reason (formulated by Jacob Bernoulli, but more commonly associated with Laplace).$^ \dagger$ Uniformity of distribution on an appropriate measurement scale means that the prior does not strongly favour particular values of the parameter.

**It does not contribute strongly to the posterior:** The prior and likelihood functions both contribute to the posterior. There are various techniques to measure the contribution of each of these functions. For example, when using a conjugate prior, the contribution of the prior can be measured as a number of pseudo data points. With a weakly informative prior the number of pseudo data points in the prior is low (usually one or less). In such cases we sometimes say that this small contribution from the prior "lets the data speak for itself".
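To make the pseudo-data-point idea concrete, here is a minimal sketch using a beta-binomial model, where a Beta(a, b) prior acts like a + b pseudo-observations. The data (7 successes in 10 trials), the function names, and the two priors are assumptions chosen purely for illustration.

```r
# Beta-binomial sketch: a Beta(a, b) prior behaves like a + b pseudo-observations.
# All names and numbers here are illustrative assumptions.

# Posterior mean of the success probability under a Beta(a, b) prior
posterior_mean <- function(successes, n, a, b) {
  (a + successes) / (a + b + n)
}

# Share of the posterior "sample size" that comes from the prior
prior_weight <- function(n, a, b) {
  (a + b) / (a + b + n)
}

successes <- 7
n <- 10

# Weakly informative prior: Beta(0.5, 0.5), one pseudo-observation in total
weak_mean   <- posterior_mean(successes, n, a = 0.5, b = 0.5)
weak_weight <- prior_weight(n, a = 0.5, b = 0.5)

# Informative prior: Beta(20, 20), forty pseudo-observations centred on 0.5
strong_mean   <- posterior_mean(successes, n, a = 20, b = 20)
strong_weight <- prior_weight(n, a = 20, b = 20)

print(c(weak_mean = weak_mean, weak_weight = weak_weight))
print(c(strong_mean = strong_mean, strong_weight = strong_weight))
```

Under the weak prior the posterior mean stays close to the sample proportion of 0.7, whereas the informative prior pulls it most of the way towards 0.5.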

**It allows us to make *objective* inferences:** In objective Bayesian analysis we formulate a method of prior selection that leads to a unique prior (i.e., it does not have variable hyperparameters). Virtually every approach to objective Bayesian analysis formulates the prior based on some argument to ignorance, yielding a weakly informative prior.$^ {\dagger \dagger}$

Of course, you should bear in mind that all of these arguments apply to a situation where we want to **avoid** adding information about the parameter into the prior. If we have genuine prior information that we want to incorporate into the analysis then we will generally want to eschew a weakly informative prior in favour of one that captures that information.

$^ \dagger$ Note that when the "principle of insufficient reason" is applied to continuous random variables, the variable should be on a proper scale where uniformity is an appropriate representation. A nonlinear transform of a uniform random variable is non-uniformly distributed, which means that one must decide *which* representation of the variable is uniform. For a mean parameter we generally take this to be uniform on its initial scale, but for a variance parameter we usually take this to be uniform on its log-scale (i.e., after a logarithmic transformation).
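As a worked instance of the variance case (a sketch, writing $\phi = \log \sigma^2$, a symbol introduced here only for illustration): if the prior is uniform on the log-scale, so that $p(\phi) \propto 1$, then the change-of-variables rule gives the implied density on the original scale as

$$p(\sigma^2) = p(\phi) \left| \frac{d\phi}{d\sigma^2} \right| \propto \frac{1}{\sigma^2},$$

which is clearly not uniform in $\sigma^2$.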

$^ {\dagger \dagger}$ Note that there are different theories here of what prior is appropriate (e.g., the Jeffreys prior, Jaynes' max-ent prior), so there are still multiple competing priors *at a theoretical level*. However, once you subscribe to a particular theory, you can then formulate how ignorance is represented objectively in particular cases. (In any case, most of the competing theories use very similar priors, so there is usually very little difference in the posterior under any of these theories.)

## Best Answer

While that $6.3$ seems not to mean the standard deviation in Stan, if that's what you want to do...

...do it in multiple steps.

Simulate from the t-distribution with the appropriate degrees of freedom $\nu$, using `rt`.

Divide by the population standard deviation, $\sqrt{\frac{ \nu }{ \nu-2 }}$. Now the population standard deviation is $1$.

Multiply by your desired standard deviation.

Add your desired mean, since the population mean is $0$.

You can combine these steps in one function.
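A minimal sketch of such a function follows. The name `rt_scaled`, its arguments, and the example values are hypothetical and chosen only for illustration. It assumes $\nu > 2$, since the t-distribution has no finite standard deviation otherwise.

```r
# Hypothetical helper: the name and arguments are illustrative assumptions.
# Draws n values from a t-distribution with df degrees of freedom,
# rescaled and shifted to have the requested population mean and sd.
rt_scaled <- function(n, df, mean = 0, sd = 1) {
  # The standard deviation of a t-distribution is finite only for df > 2
  stopifnot(df > 2, sd >= 0)
  x <- rt(n, df = df)             # simulate from the standard t-distribution
  x <- x / sqrt(df / (df - 2))    # rescale so the population sd is 1
  mean + sd * x                   # apply the desired sd, then the desired mean
}

set.seed(1)
draws <- rt_scaled(1e5, df = 5, mean = 10, sd = 6.3)
mean(draws)  # should be close to 10
sd(draws)    # should be close to 6.3
```

The sample mean and standard deviation will only approximate the targets, and for small $\nu$ the sample standard deviation converges slowly because of the heavy tails.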