To start, let's clear up two points of confusion:
(1) "[W]e use the t-distribution if the sample size is small."
Not exactly. If the variances $\sigma_1^2,\, \sigma_2^2$ are unknown
and estimated by $S_1^2,\, S_2^2,$ respectively, then you always
use the t distribution. (If sample sizes are large enough for the
degrees of freedom to exceed 30, then in some circumstances
it is OK to use a normal approximation. But with modern software
or printed t tables,
the normal approximation is not necessary. The approximation works
best for tests at the 5% level, not so well at 1%.)
(2) "[A]ssuming that the true standard deviations are not equal, ... then the degrees of freedom is given [by the Welch–Satterthwaite equation]."
No. This equation works whether or not $\sigma_1 = \sigma_2.$ However, if the variances are not equal, you must use the Welch–Satterthwaite equation (not the pooled-variance equation with
degrees of freedom $\nu = n_1 + n_2 - 2$).
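On point (1), the remark about the 5% and 1% levels can be checked by comparing two-sided critical values directly. The following is a minimal sketch using base R's `qt` and `qnorm`; the choice of exactly 30 degrees of freedom is only the rule-of-thumb threshold mentioned above, and the variable names are illustrative.

```r
# Two-sided critical values: Student's t with 30 df versus standard normal.
df <- 30
alpha <- c(0.05, 0.01)

t_crit <- qt(1 - alpha / 2, df)   # t critical values
z_crit <- qnorm(1 - alpha / 2)    # normal critical values

# Relative amount by which the normal value falls short of the t value.
rel_gap <- (t_crit - z_crit) / t_crit

print(data.frame(alpha, t_crit, z_crit, rel_gap))
```

The relative gap is larger at the 1% level than at the 5% level, which is why the normal approximation is less trustworthy for more stringent tests.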
Pooled 2-sample t test: If data are normal and population variances are equal, then the test statistic
for testing $H_0: \mu_1 = \mu_2$ against $H_a: \mu_1 \ne \mu_2$ is:
$$T = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}},$$
where $S_p^2 =\frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1 + n_2 - 2}.$
If $H_0$ is true, then $T$ has Student's t distribution with
degrees of freedom $\nu = n_1 + n_2 - 2.$
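As a sketch of how the pooled statistic is assembled, the formulas above can be computed by hand and compared with R's built-in pooled test. The data here are simulated purely for illustration (sample sizes, means, SDs, and the seed are assumptions, not from the question).

```r
# Hypothetical data: two normal samples with equal population SDs.
set.seed(1)
x1 <- rnorm(12, mean = 100, sd = 10)
x2 <- rnorm(15, mean = 100, sd = 10)

n1 <- length(x1); n2 <- length(x2)

# Pooled variance: weighted average of the two sample variances.
sp2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# Pooled t statistic and its degrees of freedom.
t_pooled <- (mean(x1) - mean(x2)) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
df_pooled <- n1 + n2 - 2

# Two-sided p-value from Student's t distribution.
p_pooled <- 2 * pt(-abs(t_pooled), df_pooled)

# Cross-check against the built-in pooled test.
fit <- t.test(x1, x2, var.equal = TRUE)
print(c(by_hand = t_pooled, built_in = unname(fit$statistic)))
```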
Welch 'separate variances' 2-sample t test: More generally, if $H_0$ is true, the test statistic
$$T^\prime = \frac{\bar X_1 - \bar X_2}{\sqrt{\frac{S_1^2}{n_1} +\frac{S_2^2}{n_2}}}$$
is approximately distributed according
to Student's t distribution with degrees of freedom $\nu$ given by the Welch-Satterthwaite equation. This is true whether or not
the population variances are equal.
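For reference, the Welch-Satterthwaite equation is usually written as follows, in the notation of this answer:

$$\nu = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}.$$

The result is generally not an integer, which is why software reports fractional degrees of freedom for the Welch test.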
One can show that the degrees of freedom $\nu$ given by the Welch-Satterthwaite equation satisfy
$$\min(n_1 - 1, n_2 - 1) \le \nu \le n_1 + n_2 - 2.$$ So if the smaller of the two sample sizes exceeds 30, then $\nu \ge 30$ and (testing at
the 5% level) it is OK to use a normal approximation for the
distribution of $T^\prime.$
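These bounds can be checked numerically. The sketch below assumes the standard form of the Welch-Satterthwaite equation, $\nu = (v_1 + v_2)^2 / \left(v_1^2/(n_1-1) + v_2^2/(n_2-1)\right)$ with $v_i = S_i^2/n_i$; the helper name `welch_df` is hypothetical. The first call uses the sample SDs and sample sizes printed in the R demonstration in this answer.

```r
# Hypothetical helper: Welch-Satterthwaite degrees of freedom
# from sample SDs and sample sizes.
welch_df <- function(s1, n1, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

# Summary statistics from the R demonstration in this answer.
nu <- welch_df(s1 = 31.34551, n1 = 10, s2 = 5.246149, n2 = 50)

# Bounds: min(n1 - 1, n2 - 1) <= nu <= n1 + n2 - 2.
lower <- min(10 - 1, 50 - 1)
upper <- 10 + 50 - 2
print(c(lower = lower, nu = nu, upper = upper))

# With equal sample SDs and equal sample sizes, nu reaches the upper bound.
nu_equal <- welch_df(s1 = 5, n1 = 20, s2 = 5, n2 = 20)
print(nu_equal)
```

With very unequal sample variances, as in the demonstration, $\nu$ sits close to the lower bound $n_1 - 1 = 9$ rather than the pooled value of 58.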
Whatever the sample size, $T^\prime$ has very nearly Student's t distribution with the Welch-Satterthwaite degrees of freedom.
(This is known from probability theory and from many simulation studies.)
Which to use? The bottom line is that most statisticians use the $T^\prime$-statistic and the Welch-Satterthwaite degrees of freedom to do
2-sample t tests unless they have very strong prior evidence that
population variances are equal (rarely the case). Most modern
statistical software packages use the Welch 2-sample t test by default. Some programs will use $T$ with the pooled SD $S_p$ if
the user overrides the default.
Notes: (a) If $n_1 = n_2,$ then one can show that $T = T^\prime$
numerically, but one should still use the Welch-Satterthwaite degrees of freedom unless the population variances are known to be equal.
(b) If sample variances $S_1^2$ and $S_2^2$ are nearly equal,
then the Welch-Satterthwaite $\nu$ is near $n_1 + n_2 - 2.$ If the
sample variances are far apart then $\nu$ may be considerably smaller, perhaps as small as $n_1 -1$ or $n_2 - 1.$
(c) Especially if $n_1 \ll n_2$ and $\sigma_2 \ll \sigma_1,$ then results from the pooled
2-sample test using $T$ and $S_p$ can be very misleading. (The
notation $\ll$ means 'much smaller than'.)
(d) It is not a good idea to test whether $\sigma_1^2 = \sigma_2^2$ in order to decide whether to use $T$ or $T^\prime.$ The test for equal variances has poor power, and simulation studies have shown
that the 'hybrid' test (using $T^\prime$ only if the equal-variances test rejects) can give misleading results.
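Notes (a) and (b) can be illustrated together with a small sketch. The data are simulated for illustration only (the sample size of 15 per group, the SDs, and the seed are assumptions).

```r
# Hypothetical data: equal sample sizes, very different population SDs.
set.seed(2)
x1 <- rnorm(15, mean = 50, sd = 12)
x2 <- rnorm(15, mean = 50, sd = 3)

welch  <- t.test(x1, x2)                    # Welch test (R's default)
pooled <- t.test(x1, x2, var.equal = TRUE)  # pooled test

# Note (a): with n1 = n2 the two t statistics coincide.
print(c(t_welch = unname(welch$statistic),
        t_pooled = unname(pooled$statistic)))

# Note (b): the degrees of freedom differ when sample variances differ.
print(c(df_welch = unname(welch$parameter),
        df_pooled = unname(pooled$parameter)))
```

The statistics agree, but the Welch degrees of freedom fall between $n_1 - 1 = 14$ and $n_1 + n_2 - 2 = 28$, so the two tests can still report different P-values.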
Demonstration of note (c). Using R statistical software:
Small sample from $\mathsf{Norm}(\mu_1=150,\sigma_1=30);$
larger sample from $\mathsf{Norm}(\mu_2=150,\sigma_2=5).$
The null hypothesis is true, and so should not be rejected.
x1 = rnorm(10, 150, 30); x2 = rnorm(50, 150, 5)
mean(x1); sd(x1)
[1] 139.3158
[1] 31.34551
mean(x2); sd(x2)
[1] 150.1088
[1] 5.246149
Welch 2-sample test properly fails to reject:
t.test(x1, x2)
Welch Two Sample t-test
data: x1 and x2
t = -1.0858, df = 9.1011, p-value = 0.3055
alternative hypothesis: true difference in means is not equal to 0
sample estimates:
mean of x mean of y
139.3158 150.1088
Pooled two-sample t test improperly rejects at the 5% level, 'finding' a
difference in population means that does not actually exist.
(The small sample with the large SD gives a misleading sample mean.)
t.test(x1, x2, var.equal=TRUE)
Two Sample t-test
data: x1 and x2
t = -2.3504, df = 58, p-value = 0.02217
alternative hypothesis: true difference in means is not equal to 0
sample estimates:
mean of x mean of y
139.3158 150.1088
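A single pair of samples is only one draw, so it may help to repeat the experiment many times and estimate how often each test rejects the (true) null hypothesis at the 5% level. This is a rough simulation sketch under the same population settings as above; the seed and the number of replications are arbitrary choices.

```r
# Estimate rejection rates of both tests when H0 is true.
set.seed(2024)
reps <- 5000

p_vals <- replicate(reps, {
  x1 <- rnorm(10, 150, 30)   # small sample, large SD
  x2 <- rnorm(50, 150, 5)    # large sample, small SD
  c(welch  = t.test(x1, x2)$p.value,
    pooled = t.test(x1, x2, var.equal = TRUE)$p.value)
})

# Proportion of replications with P-value below 0.05 for each test.
reject_rate <- rowMeans(p_vals < 0.05)
print(reject_rate)
```

If the tests behave as described, the Welch rejection rate should come out near the nominal 5%, while the pooled rate should be far above it.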
Best Answer
As we are dealing with unknown population variances (assumed equal), we first calculate the pooled sample variance, denoted by $$s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$$ where the denominator is just the sum of $(n_1-1)$ and $(n_2-1)$. You can think of it as a weighted average of the sample variances, $s_1^2$ and $s_2^2$.
Once this is calculated, the standard error is immediately $$\text{s.e.}=s_p\sqrt{\frac1{n_1}+\frac1{n_2}}$$ and you can obtain the lower limit of the $95\%$ confidence interval from $(\hat{\mu_1}-\hat{\mu_2})-t_{n_1+n_2-2,0.975}\,\text{s.e.}$
Note that $0.975=1-\frac{0.05}2$.
In your case where we have $n_1=n_2$, we get $\text{s.e.}=\sqrt{\dfrac{s_1^2+s_2^2}{n_1}}$.
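The steps above can be sketched in R from summary statistics alone. All numbers below are hypothetical placeholders, not values from the question; substitute your own sample sizes, means, and SDs.

```r
# Hypothetical summary statistics (placeholders, not from the question).
n1 <- 20; n2 <- 20
xbar1 <- 52.3; xbar2 <- 48.1
s1 <- 6.0; s2 <- 5.0

# Pooled variance and standard error of the difference in means.
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
se  <- sqrt(sp2) * sqrt(1 / n1 + 1 / n2)

# 95% confidence interval based on t with n1 + n2 - 2 degrees of freedom.
t_crit <- qt(0.975, df = n1 + n2 - 2)
d  <- xbar1 - xbar2
ci <- c(lower = d - t_crit * se, upper = d + t_crit * se)
print(ci)

# With n1 = n2 the standard error simplifies as stated above.
se_equal_n <- sqrt((s1^2 + s2^2) / n1)
print(all.equal(se, se_equal_n))
```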