I do not think I will be able to commit regular time to continue learning data analysis.
I don't think Casella & Berger is a place to learn much in the way of data analysis. It's a place to learn some of the tools of statistical theory.
My experience so far tells me that to be a statistician one needs to put up with a lot of tedious computation involving various distributions (Weibull, Cauchy, t, F, ...).
I've spent a lot of time as a statistician doing data analysis. It rarely (almost never) involves me doing tedious calculation. It sometimes involves a little simple algebra, but the common problems are usually solved and I don't need to expend any effort on replicating that each time.
The computer does all the tedious calculation.
If I am in a situation where I'm not prepared to assume a reasonably standard case (e.g. not prepared to use a GLM), I generally don't have enough information to assume any other distribution either, so the question of the calculations in LRT is usually moot (I can do them when I need to, they just either tend to be already solved or come up so rarely that it's an interesting diversion).
I tend to do a lot of simulation; I also frequently try to use resampling in some form either alongside or in place of parametric assumptions.
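For a concrete flavour of that, here's a minimal percentile-bootstrap sketch using only the Python standard library (the data and settings are made up for illustration -- this is a toy, not a recipe):

```python
import random
import statistics

random.seed(42)

# Hypothetical right-skewed sample (sum of two exponentials, i.e. gamma-shaped)
sample = [random.expovariate(2.0) + random.expovariate(2.0) for _ in range(200)]

# Bootstrap the sampling distribution of the median, with no
# parametric assumption about the data
n_boot = 2000
boot_medians = sorted(
    statistics.median(random.choices(sample, k=len(sample)))
    for _ in range(n_boot)
)

# Percentile bootstrap 95% interval for the median
lo = boot_medians[int(0.025 * n_boot)]
hi = boot_medians[int(0.975 * n_boot)]
print(f"median = {statistics.median(sample):.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

Note that no distributional algebra was needed anywhere; the computer does the tedious part.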
Will I need to spend 20+ hours per week on it like I used to?
It depends on what you want to be able to do and how soon you want to get good at it.
Data analysis is a skill, and it takes practice and a large base of knowledge. You'll have some of the knowledge you need already.
If you want to be a good practitioner at a wide variety of things, it will take a lot of time - but to my mind it's a lot more fun than the algebra and such of doing Casella and Berger exercises.
Some of the skills I built up on, say, regression problems are helpful with time series -- but a lot of new skills are needed. So learning to interpret residual plots and QQ plots is handy, but they don't tell me how much I need to worry about a little bump in a PACF plot, and they don't give me tools like the use of one-step-ahead prediction errors.
So, for example, I don't need to expend effort figuring out how to do ML estimation reasonably for typical gamma or Weibull models, because they're standard enough to be solved problems that have already been largely put into a convenient form.
If you come to do research, you'll need a lot more of the skills you pick up in places like Casella & Berger (but even with those kind of skills, you should also read more than one book).
Some suggested things:
You should definitely build up some regression skills, even if you do nothing else.
There are a number of quite good books, but perhaps Draper & Smith's Applied Regression Analysis plus Fox and Weisberg's An R Companion to Applied Regression; I'd also suggest you consider following with Harrell's Regression Modeling Strategies.
(You could substitute any number of good books for Draper and Smith - find one or two that suit you.)
The second book has a number of additional online chapters that are very much worth reading (and its own R package).
--
A good second serving would be Venables & Ripley's Modern Applied Statistics with S.
That's some grounding in a fairly broad swathe of ideas.
It may turn out that you need some more basic material in some topics (I don't know your background).
Then you'd need to start thinking about what areas of statistics you want/need -- Bayesian stats, time series, multivariate analysis, etc etc
Best Answer
Wikipedia has a page that lists many probability distributions with links to more detail about each distribution. You can look through the list and follow the links to get a better feel for the types of applications that the different distributions are commonly used for.
Just remember that these distributions are used to model reality, and, as Box said, "all models are wrong, but some are useful".
Here are some of the common distributions and some of the reasons that they are useful:
Normal: This is useful for looking at means and other linear combinations (e.g. regression coefficients) because of the CLT. Relatedly, if something is known to arise from the additive effects of many different small causes, then the normal may be a reasonable distribution: for example, many biological measures are the result of multiple genes and multiple environmental factors and are therefore often approximately normal.
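As a quick illustration of that additive-effects idea (a toy simulation, not a real biological model):

```python
import random
import statistics

random.seed(1)

# Each "measurement" is the sum of many small independent effects
# (a toy stand-in for many genes plus many environmental factors)
def measurement(n_effects=50):
    return sum(random.uniform(-1, 1) for _ in range(n_effects))

data = [measurement() for _ in range(20000)]
m = statistics.fmean(data)
s = statistics.stdev(data)

# A normal variable lands within 1 sd of its mean about 68.3% of the time
frac = sum(m - s < x < m + s for x in data) / len(data)
print(f"mean = {m:.2f}, sd = {s:.2f}, within 1 sd: {frac:.3f}")
```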
Gamma: Right skewed and useful for things with a natural minimum at 0. Commonly used for elapsed times and some financial variables.
Exponential: special case of the Gamma. It is memoryless and scales easily.
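Memorylessness is easy to check by simulation (the rate and cutoffs here are arbitrary):

```python
import random

random.seed(9)
draws = [random.expovariate(1.0) for _ in range(200000)]

# Memorylessness: P(X > s + t | X > s) should equal P(X > t)
s, t = 1.0, 0.5
survivors = [x for x in draws if x > s]
cond = sum(x > s + t for x in survivors) / len(survivors)
uncond = sum(x > t for x in draws) / len(draws)
print(round(cond, 3), round(uncond, 3))
```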
Chi-squared ($\chi^2$): special case of the Gamma. Arises as sum of squared normal variables (so used for variances).
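That sum-of-squared-normals construction is simple to verify by simulation (toy parameters):

```python
import random
import statistics

random.seed(7)
k = 5  # degrees of freedom

# A chi-squared(k) draw is a sum of k squared standard normals
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(50000)]

# Theory: mean k, variance 2k
print(statistics.fmean(draws), statistics.variance(draws))
```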
Beta: Defined between 0 and 1 (but could be transformed to be between other values), useful for proportions or other quantities that must be between 0 and 1.
Binomial: How many "successes" out of a given number of independent trials with same probability of "success".
Poisson: Common for counts. It has the nice property that if the number of events in a period of time or area follows a Poisson, then the number in twice the time or area still follows a Poisson (with twice the mean); this works for adding Poissons, or for scaling by values other than 2.
Note that if events occur over time and the time between occurrences follows an exponential then the number that occur in a time period follows a Poisson.
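That connection is straightforward to check by simulation (arbitrary rate, standard library only):

```python
import random
import statistics

random.seed(3)
rate = 4.0  # events per unit of time

def count_in_unit_time():
    # Add exponential waiting times until the clock passes 1
    t, n = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > 1.0:
            return n
        n += 1

counts = [count_in_unit_time() for _ in range(30000)]

# Poisson(rate): mean and variance should both be close to the rate
print(statistics.fmean(counts), statistics.variance(counts))
```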
Negative Binomial: Counts with minimum 0 (or other value depending on which version) and no upper bound. Conceptually it is the number of "failures" before k "successes". The negative binomial is also a mixture of Poisson variables whose means come from a gamma distribution.
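The gamma-mixture characterisation can be checked numerically (toy parameters; the Poisson sampler is Knuth's textbook method, written out because the standard library lacks one):

```python
import math
import random
import statistics

random.seed(11)

def poisson(mu):
    # Knuth's multiplication method for Poisson sampling
    limit, k, p = math.exp(-mu), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

shape, scale = 3.0, 2.0  # parameters of the gamma mixing distribution

# Each draw: pick a gamma-distributed mean, then a Poisson count with that mean
draws = [poisson(random.gammavariate(shape, scale)) for _ in range(40000)]

# Negative binomial theory: mean = shape*scale, variance = shape*scale*(1+scale)
print(statistics.fmean(draws), statistics.variance(draws))
```

The variance comes out well above the mean -- the overdispersion that makes the negative binomial a common substitute for the Poisson with messy count data.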
Geometric: special case of the negative binomial where it is the number of "failures" before the 1st "success". If you truncate (round down) an exponential variable to make it discrete, the result is geometric.
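That truncation fact is also easy to verify (arbitrary rate):

```python
import math
import random

random.seed(5)
lam = 0.7  # arbitrary exponential rate

# Floor of an exponential variable is geometric on {0, 1, 2, ...}
draws = [math.floor(random.expovariate(lam)) for _ in range(100000)]

# Theory: P(X = k) = p * (1 - p)**k with p = 1 - exp(-lam)
p = 1 - math.exp(-lam)
for k in range(4):
    emp = draws.count(k) / len(draws)
    print(k, round(emp, 4), round(p * (1 - p) ** k, 4))
```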