As has already been said, there is no golden trick here. However, it is sometimes useful to know some ways of manipulating series and functions.
Suppose we know
$$\frac{1}{1-x}=\sum_{k=0}^\infty x^k,\qquad|x|<1\tag{1}$$
and
$$e^x=\sum_{k=0}^\infty \frac{x^k}{k!}, \qquad\text{for all $x$}\tag{2}$$
Formally if we integrate (1) we get
$$-\log(1-x)=\sum_{k=0}^\infty \frac{x^{k+1}}{k+1}=\sum_{k=1}^\infty \frac{x^{k}}{k},\qquad|x|<1$$
and then
$$\log(1+x)=-\sum_{k=1}^\infty \frac{(-1)^kx^{k}}{k},\qquad|x|<1$$
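These formal expansions are easy to sanity-check numerically. Here is a short Python sketch (the helper name `log1p_series` is my own) comparing a partial sum of the series above against the library logarithm:

```python
import math

def log1p_series(x, terms=200):
    # Partial sum of -sum_{k>=1} (-1)^k x^k / k,
    # i.e. sum_{k>=1} (-1)^(k+1) x^k / k
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))

x = 0.5
assert abs(log1p_series(x) - math.log(1 + x)) < 1e-12
```

Inside $|x|<1$ the partial sums converge to $\log(1+x)$; near $x=\pm1$ convergence becomes very slow, consistent with the radius of convergence of (1).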
Computing $\arctan x$ is similar, since by (1)
$$(\arctan x)'=\frac{1}{1+x^2}=\sum_{k\geq0} (-1)^kx^{2k}$$
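Integrating this term by term, formally as before, gives $\arctan x=\sum_{k\geq 0}(-1)^k \frac{x^{2k+1}}{2k+1}$ for $|x|<1$. A quick numeric check of that series (Python sketch; `arctan_series` is my own name):

```python
import math

def arctan_series(x, terms=200):
    # Termwise integral of sum (-1)^k x^(2k): sum (-1)^k x^(2k+1)/(2k+1)
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

x = 0.5
assert abs(arctan_series(x) - math.atan(x)) < 1e-12
```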
Also, if we use $\sin x= (e^{ix}-e^{-ix})/(2i)$ in conjunction with (2) we get
$$2i\sin x = \sum_{k=0}^\infty \frac{(ix)^k}{k!}- \sum_{k=0}^\infty \frac{(-ix)^k}{k!} =
\sum_{k=0}^\infty \frac{i^k(1-(-1)^k)x^k}{k!}$$
Now $1-(-1)^{k} = 0$ for even $k$, and $1-(-1)^{k} = 2$ for odd $k$. So
$$2i\sin x = 2\sum_{k=0}^\infty \frac{i^{2k+1}x^{2k+1}}{(2k+1)!} = 2i\sum_{k=0}^\infty \frac{(-1)^{k}x^{2k+1}}{(2k+1)!}$$
where in the last step we used $i^{2k+1}=i\cdot i^{2k}=i\cdot (-1)^k$, and we reach the expansion
$$\sin x=\sum_{k=0}^\infty \frac{(-1)^{k}x^{2k+1}}{(2k+1)!} $$
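Since (2) converges for all $x$, this expansion can be checked numerically at any point. A short Python sketch (the name `sin_series` is mine):

```python
import math

def sin_series(x, terms=30):
    # Partial sum of sum_{k>=0} (-1)^k x^(2k+1) / (2k+1)!
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

assert abs(sin_series(1.0) - math.sin(1.0)) < 1e-12
assert abs(sin_series(3.0) - math.sin(3.0)) < 1e-10
```

Unlike the logarithm series, the factorials in the denominator make this converge quickly everywhere, not just on an interval.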
The expansion for cosine is obtained in the same way, starting from $\cos x=(e^{ix}+e^{-ix})/2$.
Note that these are formal derivations; a proof needs a justification of each step (for instance, that the term-by-term integration is valid inside the radius of convergence).
Sounds like you are in for quite an adventure.
Mathematics is a huge, vast forest of knowledge. Which foundations to master, and which path to take toward advanced topics, depends a great deal on precisely what interests you. Since it sounds like you are not yet sure yourself what that is, here is some general advice on self-learning mathematics, assuming you are interested in understanding it all and not just memorizing some techniques.
Disclaimer: These are only my humble opinions.
The language: Mathematics is written (mostly) in English, augmented by set theory and/or category theory. The former is indispensable, while the latter is only highly recommended. There are plenty of books on naive set theory (you certainly don't want to start with formal set theory unless you become seriously interested in logic and set theory). Halmos' "Naive Set Theory" is old but very good. There are also various texts on category theory (including notes on category theory for CS, which you might prefer). Category theory can be hard to digest, so you might want to take it slowly and read about it alongside other things.
It is safe to assume that for the topics that seem to interest you, you will certainly need a good dose of analysis. To save time, and if you are up for a bit of abstraction, look for textbooks that also cover general metric spaces (e.g., Lárusson's "Lectures on Analysis").
Linear algebra is also certainly going to be required. The book "Linear Algebra Done Wrong", despite its name, is a good text.
You should probably set this for yourself as a first goal. Since you won't have much time to put into it, it might take you a good year to reach that milestone, if not longer. Once that is done, you can think about how to proceed.
One thing to remember: even if it takes you a very long time to get where you want to go, the things you learn along the way are likely to assist you not so much in their own right, but rather through the analytic skills you will develop while working on challenging mathematics problems. Good luck!
Best Answer
Just to give you an idea of where we're headed, here's the punch line of everything I'm about to say:
That's pretty dry, though. I'm not really feeling it. So, let me set the scene.
You are piloting an airship across a vast, uncharted continent. In this place, dark clouds obscure the stars at all times, so you navigate the only way you can: using sensitive instruments to record every detail of your motion. Even as the airspeed indicator measures your velocity, an accelerometer is already reporting its rate of change, and a tower of even stranger and more sophisticated sensors track every jolt, snap, crackle, and pop. At the top of the sensor mast, ensconced in a snarl of cables, a powerful computer drinks up the flood of data, assembling a heavily redundant record of your journey.
The air here is so thin and smooth that the instrument readings barely change from hour to hour. If the accelerometer registers the force of a gentle tailwind, you can watch your velocity creep steadily upward for minutes on end, just as the accelerometer promised.
One night, you are woken by a flash of lightning—not just a flash, but a blinding sheet, pouring down every window. You listen for thunder, but none arrives. The sky is inky black; the glowing digits of the clock beside your bunk read 05:36:22. You go back to sleep.
When you wake up, the clock reads 05:36:22.
Cursing, you jump out of bed and dash to the course computer. The column of log data is frozen halfway down the screen; the last entry is timestamped with the same digits hovering on the face of the clock. You empty every drawer in your cabin looking for your old mechanical watch, which confirms that several hours have passed since the lightning strike. It could be days before you get the computer running again. By then, where will you be?
You have no way of knowing. But, looking at that final log entry, you can try to guess. Let's say $x(t)$ is the distance you'll drift in the first $t$ time units after the lightning strike. If the only thing you know is your velocity at the time of the strike, $x'(0)$, the best you can do is hope that you'll keep moving at about the same velocity, so $$x(t) \approx x'(0)\,t.$$ This guess is consistent with all the information you have, because the velocity given by your guess at the time of the strike matches the velocity recorded in the log. You can see this by taking the derivative of both sides of the approximate equation above and then setting $t$ to zero.
Based on your experience traveling in these parts, you can actually make a pretty strong guarantee that your guess isn't too far from the truth. Let's define a new function $$\epsilon_2(t) = x(t) - x'(0)\,t,$$ which measures the difference between your actual and estimated positions. Since $\epsilon_2(0)$ is zero, the fundamental theorem of calculus tells us that $$\epsilon_2(t) = \int_0^t \epsilon_2'(s)\;ds.$$ We saw earlier that $\epsilon_2'(0)$ is also zero, and you can easily check that $\epsilon_2''(t) = x''(t)$. Hence, $$\epsilon_2'(t) = \int_0^t \epsilon_2''(s)\;ds = \int_0^t x''(s)\;ds.$$ If you're confident that the magnitude of your acceleration $x''$ won't go above $M_2$ between times zero and $t$, you can be confident that $\left| \epsilon_2'(t) \right| \le M_2\,t$, and therefore that $$\left| \epsilon_2(t) \right| \le M_2\frac{t^2}{2}.$$ This guarantee is a baby version of Taylor's theorem.
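You can see this baby Taylor's theorem in action on a concrete function. Taking $x(t)=\sin t$ (my choice, purely for illustration), we have $x'(0)=1$ and $|x''(t)|=|\sin t|\le 1$, so $M_2=1$ and the guarantee says $|\sin t - t|\le t^2/2$:

```python
import math

# x(t) = sin(t): x'(0) = 1, and |x''(t)| = |sin(t)| <= 1, so M2 = 1.
# The guarantee: |x(t) - x'(0) t| <= M2 * t^2 / 2.
for t in [0.1, 0.5, 1.0, 2.0]:
    eps2 = abs(math.sin(t) - 1.0 * t)
    assert eps2 <= t * t / 2
```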
If you know more than just your velocity at the time of the lightning strike, you can make a better estimate of your course. If the last log entry tells you the first $n$ derivatives of your position at the time of the strike, $x'(0)$ through $x^{(n)}(0)$, you can guess that $$x(t) \approx x'(0) \frac{t}{1!} + x''(0) \frac{t^2}{2!} + x'''(0) \frac{t^3}{3!} + \ldots + x^{(n)}(0) \frac{t^n}{n!}.$$ Just as before, this guess is consistent with all the information you have, because its first $n$ derivatives at the time of the strike match the derivatives recorded in the log.
With more information available, you can make a stronger guarantee about the accuracy of your guess. Once again, define a function $$\epsilon_{n+1}(t) = x(t) - \left[ x'(0) \frac{t}{1!} + x''(0) \frac{t^2}{2!} + x'''(0) \frac{t^3}{3!} + \ldots + x^{(n)}(0) \frac{t^n}{n!} \right]$$ measuring the difference between your guess and the truth. Using the same repeated integration technique as before, you can be confident that $$\left| \epsilon_{n+1}(t) \right| \le M_{n+1} \frac{t^{n+1}}{(n+1)!}$$ if you're confident that the magnitude of $x^{(n+1)}$ won't go above $M_{n+1}$ between times zero and $t$.
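The same experiment works at higher degree. Again taking $x(t)=\sin t$ for illustration, every derivative is bounded by $1$, so each $M_{n+1}=1$ and the error of the degree-$n$ guess should be at most $t^{n+1}/(n+1)!$:

```python
import math

def taylor_sin(t, n):
    # Degree-n Taylor polynomial of sin at 0. Since sin(0) = 0, this matches
    # the guess x'(0) t/1! + x''(0) t^2/2! + ... + x^(n)(0) t^n/n!.
    derivs = [0, 1, 0, -1]  # sin's derivatives at 0 cycle with period 4
    return sum(derivs[k % 4] * t ** k / math.factorial(k)
               for k in range(1, n + 1))

t = 1.5
for n in range(1, 8):
    err = abs(math.sin(t) - taylor_sin(t, n))
    bound = t ** (n + 1) / math.factorial(n + 1)  # M_{n+1} = 1
    assert err <= bound
```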
Now we're ready to hear the punch line again:
For some functions, you can make your guess as accurate as you want—for times close to the starting time, at least—just by using a Taylor polynomial with more terms. If you use all the terms, extending the Taylor polynomials to an infinite Taylor series, you'll be able to guess the function perfectly for a short period of time! Functions like this are called analytic. A classic example is the function $$x(t) = \frac{1}{1+t^2}.$$ Its Taylor series, $$1 - t^2 + t^4 - t^6 + \ldots,$$ predicts its behavior perfectly when $|t| < 1$.
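You can watch this convergence, and its failure outside $|t|<1$, in a few lines of Python (the helper name is mine):

```python
def geom_partial(t, n):
    # Partial sum 1 - t^2 + t^4 - t^6 + ... with n terms
    return sum((-1) ** k * t ** (2 * k) for k in range(n))

t = 0.5
assert abs(geom_partial(t, 50) - 1 / (1 + t * t)) < 1e-12
# Outside |t| < 1 the partial sums blow up instead of converging:
assert abs(geom_partial(2.0, 50) - 1 / (1 + 4.0)) > 1e10
```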
The Taylor polynomials of a non-analytic function are still good guesses, but there's a limit to how good they can get. Consider, for instance, the function $$x(t) = \begin{cases}e^{-1/t^2} & t \neq 0 \\ 0 & t = 0. \end{cases}$$ At $t = 0$, all the derivatives of this function are zero! Based on its derivatives at $t = 0$, the best you can do is guess that $x(t) \approx 0$. For times close to zero, this guess is actually really good: when $|t|$ is less than $0.2$, $\left|x(t)\right|$ is less than $10^{-10}$, and when $|t|$ is less than $0.1$, $\left|x(t)\right|$ is less than $10^{-40}$. On the other hand, the guess $x(t) \approx 0$ definitely isn't perfect, and using a Taylor polynomial with more terms won't make it any better. The only way to squeeze more accuracy out of it is to look at times closer to zero.
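Those tiny values are easy to confirm directly (Python sketch; `flat` is my name for the function):

```python
import math

def flat(t):
    # e^(-1/t^2), extended by 0 at t = 0; all its derivatives vanish there.
    return math.exp(-1.0 / (t * t)) if t != 0 else 0.0

assert flat(0.0) == 0.0
assert flat(0.2) < 1e-10   # e^(-25) is about 1.4e-11
assert flat(0.1) < 1e-40   # e^(-100) is about 3.7e-44
```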
We saw earlier that the $n$th-degree Taylor polynomial of a function will stay accurate for as long as the function's $(n+1)$st derivative stays small. Thus, you might suspect that something funny must be going on with the higher derivatives of $e^{-1/t^2}$—and you'd be right. The higher derivatives of this function stay small for a while, but then spike to enormous levels, with each derivative going more berserk than the last.
Strange as it may sound, this kind of behavior is pretty common in nature. Many solutions to the heat equation, for example, are non-analytic, and the energy levels of a hydrogen atom depend non-analytically on the ambient electric field. Oh, and that function $e^{-1/t^2}$ we've been playing with? It shows up all the time in quantum field theory, as that second link attests.
Analytic functions are common in nature too—so common that, in some basic science classes, they're the only kind of function you'll ever use. In many situations, their ubiquity is explained by a theorem developed by a series of great 19th-century analysts, starting with Augustin Cauchy and ending with Sofia Kovalevskaya. The Cauchy-Kovalevskaya theorem describes an enormous class of partial differential equation problems guaranteed to yield analytic solutions. Using a tiny fraction of its power, you can prove that solutions of the following equations are always analytic:
This ought to give you some idea of why analytic functions might be common in fields like physics, chemistry, biology, and cosmology.