In univariate interval estimation, the set of possible actions is the set of ordered pairs specifying the endpoints of the interval. Let an element of that set be represented by $(a, b),\text{ } a \le b$.
Highest posterior density intervals
Let the posterior density be $f(\theta)$. The highest posterior density intervals correspond to the loss function that penalizes an interval that fails to contain the true value and also penalizes intervals in proportion to their length:
$L_{HPD}(\theta, (a, b); k) = I(\theta \notin [a, b]) + k(b - a), \quad 0 < k \le \max_{\theta} f(\theta)$,
where $I(\cdot)$ is the indicator function. This gives the expected posterior loss
$\tilde{L}_{HPD}((a, b); k) = 1 - \Pr(a \le \theta \le b|D) + k(b - a)$.
Setting $\frac{\partial}{\partial a}\tilde{L}_{HPD} = \frac{\partial}{\partial b}\tilde{L}_{HPD} = 0$ yields the necessary condition for a local optimum in the interior of the parameter space: $f(a) = f(b) = k$ – exactly the rule for HPD intervals, as expected.
The form of $\tilde{L}_{HPD}((a, b); k)$ gives some insight into why HPD intervals are not invariant to a monotone increasing transformation $g(\theta)$ of the parameter. The $\theta$-space HPD interval transformed into $g(\theta)$ space is different from the $g(\theta)$-space HPD interval because the two intervals correspond to different loss functions: the $g(\theta)$-space HPD interval corresponds to a transformed length penalty $k(g(b) - g(a))$.
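The $f(a) = f(b) = k$ rule can be checked numerically. Below is a minimal sketch, assuming (for illustration only) a Beta(2, 5) posterior: the shortest interval with 95% posterior mass is found by searching over the left endpoint, and the posterior density at the resulting endpoints comes out equal.

```python
# Minimal sketch of the HPD rule f(a) = f(b) = k, assuming an
# illustrative Beta(2, 5) posterior.
from scipy import stats, optimize

post = stats.beta(2, 5)
coverage = 0.95

def width(a):
    # Choose b so that [a, b] contains exactly `coverage` posterior mass,
    # then return the interval length b - a.
    b = post.ppf(min(post.cdf(a) + coverage, 1.0))
    return b - a

# The HPD interval is the shortest interval with the required coverage.
res = optimize.minimize_scalar(width, bounds=(0.0, post.ppf(1 - coverage)),
                               method="bounded")
a = res.x
b = post.ppf(post.cdf(a) + coverage)

# At the optimum the density is (numerically) equal at both endpoints.
print((a, b), (post.pdf(a), post.pdf(b)))
```

Minimizing length at fixed coverage and minimizing $\tilde{L}_{HPD}$ at fixed $k$ trace out the same family of intervals: each value of $k$ picks out some coverage level.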
Quantile-based credible intervals
Consider point estimation with the loss function
$L_q(\theta, \hat{\theta};p) = (1-p)(\hat{\theta} - \theta)I(\theta < \hat{\theta}) + p(\theta - \hat{\theta})I(\theta \ge \hat{\theta}), \text{ } 0 \le p \le 1$.
The posterior expected loss is
$\tilde{L}_q(\hat{\theta};p)=(1-p)\Pr(\theta < \hat{\theta}|D)\left(\hat{\theta}-\text{E}(\theta|\theta < \hat{\theta}, D)\right) + p\Pr(\theta \ge \hat{\theta}|D)\left(\text{E}(\theta | \theta \ge \hat{\theta}, D)-\hat{\theta}\right)$.
Setting $\frac{d}{d\hat{\theta}}\tilde{L}_q=0$ yields the implicit equation
$\Pr(\theta < \hat{\theta}|D) = p$,
that is, the optimal $\hat{\theta}$ is the $(100p)$% quantile of the posterior distribution, as expected.
Thus to get quantile-based interval estimates, the loss function is
$L_{qCI}(\theta, (a,b); p_L, p_U) = L_q(\theta, a;p_L) + L_q(\theta, b;p_U)$.
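As a sketch (again with an illustrative Beta(2, 5) posterior): taking $p_L = 0.025$ and $p_U = 0.975$ in $L_{qCI}$ yields the familiar equal-tailed 95% credible interval, read directly off the posterior quantiles; with MCMC output one would use empirical quantiles of the draws.

```python
# Equal-tailed credible interval from the loss L_qCI with
# p_L = 0.025, p_U = 0.975; the Beta(2, 5) posterior is illustrative.
import numpy as np
from scipy import stats

post = stats.beta(2, 5)
p_L, p_U = 0.025, 0.975

# Exact posterior quantiles: the optimal (a, b) under L_qCI.
a, b = post.ppf(p_L), post.ppf(p_U)

# The same interval estimated from posterior draws (as with MCMC output).
draws = post.rvs(size=200_000, random_state=0)
a_mc, b_mc = np.quantile(draws, [p_L, p_U])
print((a, b), (a_mc, b_mc))
```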
As mentioned in the comments, the model you're looking for is Bayesian linear regression (BLR). And since we can use BLR to calculate the posterior predictive distribution $p(r_t|t, \mathcal{D}_\text{ref})$ for any time $t$, we can numerically evaluate the distribution $p(\text{CAR}|\mathcal{D}_\text{event}, \mathcal{D}_\text{ref})$.
The thing is, I don't think a distribution over $\text{CAR}$ is what you really want. The immediate problem is that the event $\text{CAR} = 0$ has posterior probability zero under a continuous distribution. The underlying problem is that the "Bayesian version of hypothesis tests" is comparing models via their Bayes factor, but that requires you to define two competing models, and $\text{CAR} = 0$, $\text{CAR} \neq 0$ are not models (or at least, they're not models without some extremely unnatural number juggling).
From what you've said in the comments, I think what you actually want to answer is
Are $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ better explained by the same model or by different ones?
which has a neat Bayesian answer: define two models
$M_0$: all the data in $\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}$ are drawn from the same BLR. To calculate the marginal likelihood $p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_0)$ of this model, you'd calculate the marginal likelihood of a BLR fit to all the data.
$M_1$: the data in $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ are drawn from two different BLRs. To calculate the marginal likelihood $p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_1)$ of this model, you'd fit BLRs to $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ independently (though using the same hyperparameters!), then take the product of the two BLR marginal likelihoods.
Having done that, you can then calculate the Bayes factor
$$\frac{p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_1)}{p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_0)}$$
to decide which model is more believable.
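A sketch of this comparison in one convenient special case: a conjugate BLR with Gaussian weight prior $w \sim N(0, \tau^2 I)$ and known noise variance $\sigma^2$, for which the marginal likelihood is available in closed form as $y \sim N(0, \sigma^2 I + \tau^2 X X^\top)$. The data and hyperparameters below are illustrative assumptions; here the event window follows a clearly different line, so $M_1$ wins.

```python
# Bayes-factor sketch with a conjugate BLR: w ~ N(0, tau^2 I),
# noise N(0, sigma^2), so marginally y ~ N(0, sigma^2 I + tau^2 X X^T).
import numpy as np
from scipy import stats

def log_marginal(X, y, tau=1.0, sigma=0.5):
    # Closed-form log marginal likelihood of the conjugate BLR.
    cov = sigma**2 * np.eye(len(y)) + tau**2 * X @ X.T
    return stats.multivariate_normal(mean=np.zeros(len(y)), cov=cov).logpdf(y)

rng = np.random.default_rng(0)
# Illustrative reference and event windows; the event data follow a
# different line (intercept 2 instead of 0).
X_ref = np.column_stack([np.ones(40), np.linspace(0.0, 1.0, 40)])
y_ref = X_ref @ np.array([0.0, 1.0]) + 0.5 * rng.standard_normal(40)
X_evt = np.column_stack([np.ones(20), np.linspace(1.0, 1.5, 20)])
y_evt = X_evt @ np.array([2.0, 1.0]) + 0.5 * rng.standard_normal(20)

# M0: one BLR for all the data.  M1: independent BLRs, same hyperparameters.
lm0 = log_marginal(np.vstack([X_ref, X_evt]), np.concatenate([y_ref, y_evt]))
lm1 = log_marginal(X_ref, y_ref) + log_marginal(X_evt, y_evt)
log_bayes_factor = lm1 - lm0  # > 0 favours separate models (M1)
print(log_bayes_factor)
```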
Best Answer
First of all, frequentist methods also provide a distribution over possible answers; we just do not call them distributions, for a philosophical reason. Frequentists consider the parameters of a distribution to be fixed quantities: a parameter is not allowed to be random, so you cannot talk about distributions over parameters in a meaningful way. In frequentist methods we estimate confidence intervals, which can be thought of as distributions if we let go of the philosophical details. In Bayesian methods, by contrast, the parameters are treated as random, so we can talk about (prior and posterior) distributions over them.
Second, it is not always the case that only a single value is used in the end. Many applications require the entire posterior distribution in subsequent analysis. In fact, deriving a suitable point estimate requires the full distribution. A well-known example is risk minimization; another is model identification in the natural sciences in the presence of significant uncertainty.
Third, Bayesian inference has many benefits over a frequentist analysis (not just the one that you mention):
Ease of interpretation: It is hard to understand what a confidence interval is and why it is not a probability distribution; the reason is a philosophical one, as I explained briefly above. The probability distributions in Bayesian inference are easier to understand because that is how we typically tend to think in uncertain situations.
Ease of implementation: It is easier to obtain Bayesian probability distributions than frequentist confidence intervals. A frequentist analysis requires us to identify a sampling distribution, which is very difficult in many real-world applications.
Assumptions of the model are explicit in Bayesian inference: For example, many frequentist analyses assume asymptotic normality to compute the confidence interval, but no such assumption is required for Bayesian inference. Moreover, the assumptions made in Bayesian inference are stated explicitly.
Prior information: Most importantly, Bayesian inference allows us to incorporate prior knowledge into the analysis in a relatively simple manner. In frequentist methods, regularization is used to incorporate prior information, which is very difficult to do in many problems. This is not to say that incorporating prior information is easy in Bayesian analysis, but it is easier than in frequentist analysis.
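To illustrate the regularization remark, here is a small sketch (the data and hyperparameters are assumptions for illustration): the MAP estimate of a linear model under a Gaussian prior $w \sim N(0, \tau^2 I)$ coincides with the ridge solution with penalty $\alpha = \sigma^2/\tau^2$, which is one way prior information enters a frequentist-style analysis.

```python
# Sketch: Gaussian prior on regression weights <=> L2 (ridge) penalty.
# The MAP estimate under w ~ N(0, tau^2 I) with noise N(0, sigma^2) is
# the ridge solution with alpha = sigma^2 / tau^2.  Data are illustrative.
import numpy as np
from scipy import optimize

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(50)

sigma2, tau2 = 0.3**2, 1.0
alpha = sigma2 / tau2

# Closed-form MAP / ridge estimate: (X^T X + alpha I)^{-1} X^T y.
w_map = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)

# The same estimate obtained by minimizing the ridge objective directly.
def ridge_objective(w):
    return np.sum((y - X @ w) ** 2) + alpha * np.sum(w ** 2)

w_ridge = optimize.minimize(ridge_objective, np.zeros(3)).x
print(w_map, w_ridge)
```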
Edit: A particularly good example of the ease of interpretation of Bayesian methods is their use in probabilistic machine learning (ML). Several methods have been developed in the ML literature against the backdrop of Bayesian ideas, for example relevance vector machines (RVMs) and Gaussian processes (GPs).
As Richard Hardy pointed out, this answer gives the reasons why someone would want to use Bayesian analysis. There are good reasons to use frequentist analysis as well; in general, frequentist methods are computationally more efficient. I would suggest reading the first 3-4 chapters of 'Statistical Decision Theory and Bayesian Analysis' by James Berger, which gives a balanced view on this issue, though with an emphasis on Bayesian practice.
To elaborate on the use of the entire distribution, rather than a point estimate, to make a decision in risk minimization, a simple example follows. Suppose you must choose between different parameter values of a process to make a decision, and the cost of choosing wrong parameters is $L(\hat{\theta},\theta)$, where $\hat{\theta}$ is the parameter estimate and $\theta$ is the true parameter. Given the posterior distribution $p(\theta|D)$ (where $D$ denotes the observations), we can compute the expected loss $\int L(\hat{\theta},\theta)\,p(\theta|D)\,d\theta$ for every candidate value of $\hat{\theta}$, and the $\hat{\theta}$ with minimum expected loss can be used for decision making. This results in a point estimate, but the value of the point estimate depends on the loss function.
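A small sketch of this (the Beta(2, 5) posterior is an illustrative assumption): minimizing the expected posterior loss under squared-error loss recovers the posterior mean, while absolute-error loss recovers the posterior median, so the point estimate indeed depends on the loss function.

```python
# Sketch of risk minimization: the Bayes point estimate minimizes the
# expected posterior loss, and it changes with the loss function.
# The Beta(2, 5) posterior is an illustrative assumption.
import numpy as np
from scipy import stats, optimize

post = stats.beta(2, 5)
draws = post.rvs(size=100_000, random_state=0)  # stand-in for MCMC output

def expected_loss(theta_hat, loss):
    # Monte Carlo estimate of E[ L(theta_hat, theta) | D ].
    return np.mean(loss(theta_hat, draws))

losses = {
    "squared": lambda est, th: (est - th) ** 2,    # minimized by the mean
    "absolute": lambda est, th: np.abs(est - th),  # minimized by the median
}

estimates = {}
for name, loss in losses.items():
    res = optimize.minimize_scalar(expected_loss, bounds=(0.0, 1.0),
                                   args=(loss,), method="bounded")
    estimates[name] = res.x
print(estimates)
```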
Based on a comment by Alexis, here is why frequentist confidence intervals are harder to interpret. A confidence interval is, as Alexis has pointed out, a plausible range of estimates for a parameter given a Type I error rate. One naturally asks where this plausible range comes from. The frequentist answer is that it comes from the sampling distribution. But we only observe one sample; what are the other samples? The frequentist answer is that we infer what other samples could have been observed based on the likelihood function. But if we are inferring other samples from the likelihood function, those samples have a probability distribution over them, and consequently the confidence interval should be interpretable as a probability distribution. For the philosophical reason mentioned above, however, this last extension from probability distribution to confidence interval is not allowed. Compare this to the Bayesian statement: a 95% credible region contains the true parameter with 95% probability.
A side note on the philosophical difference between Bayesian and frequentist theory (based on a comment): in frequentist theory, the probability of an event is its relative frequency over a large number of repeated trials of the experiment in question. The parameters of a distribution are therefore fixed, because they stay the same across all repetitions of the experiment. In Bayesian theory, probabilities are degrees of belief that an event will occur in a single trial of the experiment in question. The problem with the frequentist definition of probability is that it cannot be used to define probabilities in many real-world situations. As an example, try to define the probability that I am typing this answer on an Android smartphone. A frequentist would say that the probability is either $0$ or $1$, while the Bayesian definition allows you to choose an appropriate number between $0$ and $1$.