It is firmly expected that for every $\epsilon > 0$ each arithmetic progression with difference $q$ and terms coprime with $q$ will contain a prime $\ll_{\epsilon} q^{1 + \epsilon}$.
This is a direct consequence of a conjecture of H. L. Montgomery (not the one on pair correlations, another one), which implies both the GRH and the Elliott-Halberstam conjecture. But the upper bound $\ll_{\epsilon} q^{1 + \epsilon}$ for the least prime in an arithmetic progression was conjectured by S. Chowla (in his book "The Riemann Hypothesis and Hilbert's Tenth Problem"), and probably by others independently of him, years before Montgomery made his conjecture, and presumably on the basis of the same kind of heuristic argument that moonface advances. In fact, I think that even an upper bound as strong as $q \log^2 q$ has been conjectured, though I won't swear to that. But it is definitely known that $q \log q$ won't work. The Montgomery conjecture seems reasonable, because it rests on the assumption that there is square root cancellation in a certain sum with Dirichlet characters as coefficients in an explicit formula; this ties in with moonface's comment about the proof that GRH implies $L \leq 2$ being "lossy". On the other hand, one cannot expect to get $q^{1 + \epsilon}$ out of the Elliott-Halberstam conjecture in any direct way, because that is an averaging kind of statement. You would not expect to get $L \leq 2$ out of the Bombieri-A. I. Vinogradov theorem either, for that is an averaged version of GRH. The point is that information is needed about every single one of the arithmetic progressions individually.
In Section 9 of
Diamond, Harold G., Elementary methods in the study of the distribution of prime numbers, Bull. Am. Math. Soc., New Ser. 7, 553-589 (1982). ZBL0505.10021.
an argument from
Diamond, Harold G.; Erdős, Paul, On sharp elementary prime number estimates, Enseign. Math., II. Sér. 26, 313-321 (1980). ZBL0453.10007.
is sketched, in which it is shown that for every $\varepsilon > 0$ there is a Chebyshev-style argument establishing
$$ (1-\varepsilon) x \leq \psi(x) \leq (1+\varepsilon) x$$
for all sufficiently large $x$. The idea is to compare $\Lambda = \mu * \log$ with a truncated version $\Lambda_T = \mu_T * \log$, where $\mu_T$ is $\mu$ truncated to a large interval $[1,T]$ (and modified at $T$ to ensure the normalisation $\sum_n \mu_T(n)/n=0$). Elementary methods can show that $\sum_{n \leq x} \Lambda_T(n) = m_1(T) x + o(x)$ for an explicitly computable quantity $m_1(T)$ (in fact $m_1(T) = \sum_{n<T} \log \frac{T}{n} \frac{\mu(n)}{n}$) that can be shown to converge to one as $T \to \infty$ in a suitably averaged sense. In particular $|\sum_{n \leq x} \Lambda_T(n) - x| \leq |m_1(T)-1| x + o(x)$. The Dirichlet hyperbola method can also be used to produce an upper bound $|\sum_{n \leq x} \Lambda_T(n) - \psi(x)| \leq \delta_T x + o(x)$ for some explicitly computable quantity $\delta_T$ depending on $T$; with the aid of the PNT (in the form $\sum_{n \leq x} \mu(n) = o(x)$) one can show that $\delta_T \to 0$ as $T \to \infty$, and this is enough to conclude the claim.

Amusingly, this gives a circular proof of the PNT assuming the PNT; however, if one only wishes to get the PNT up to a given error $\varepsilon$, then this argument does not require the PNT: one simply has to locate a concrete value of $T$ for which the upper bounds $|m_1(T)-1|$ and $\delta_T$ in the above inequalities (which can be explicitly computed by Chebyshev-type methods) are small enough.
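As a purely numerical sanity check (not part of the argument above), one can evaluate $m_1(T) = \sum_{n<T} \log \frac{T}{n} \frac{\mu(n)}{n}$ directly and watch it drift toward $1$; the choice of sieve and of the sample values of $T$ below is mine:

```python
from math import log

def mobius_sieve(limit):
    """Compute mu(1..limit) with a linear-style sieve over primes."""
    mu = [1] * (limit + 1)
    is_comp = [False] * (limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if not is_comp[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > limit:
                break
            is_comp[i * p] = True
            if i % p == 0:
                mu[i * p] = 0   # p^2 divides i*p, so mu vanishes
                break
            mu[i * p] = -mu[i]
    return mu

def m1(T, mu):
    """m_1(T) = sum_{n < T} log(T/n) * mu(n) / n."""
    return sum(log(T / n) * mu[n] / n for n in range(1, T))

mu = mobius_sieve(20000)
for T in (100, 1000, 10000):
    print(T, m1(T, mu))   # values approach 1, with slow oscillation
```

The convergence is slow (as the elementary bounds suggest), but already at moderate $T$ the value sits near $1$.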
It looks very likely that one can twist this argument by characters and conclude that for every $\varepsilon > 0$ and primitive residue class $a \hbox{ mod } q$ there is a Chebyshev style argument that shows that
$$ (1-\varepsilon) \frac{x}{\phi(q)} \leq \psi(x,q,a) \leq (1+\varepsilon) \frac{x}{\phi(q)}$$
for all sufficiently large $x$, where one again needs the prime number theorem in arithmetic progressions (presumably this time in the form $\sum_{n \leq x} \mu(n) \chi(n) = o(x)$) to guarantee that the elementary upper bounds for the error are indeed small. It seems plausible though that for specific residue classes like $1 \hbox{ mod } 4$ or $-1 \hbox{ mod } 4$ one could actually produce explicit numerical choices of the $T$ parameter that obey all the required properties and in particular produce non-trivial lower bounds of Chebyshev type.
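To see numerically what this statement targets (by brute force, not via the Chebyshev-style argument itself), one can tabulate $\psi(x,q,a) = \sum_{n \leq x,\, n \equiv a \bmod q} \Lambda(n)$ for the classes $\pm 1 \bmod 4$ mentioned above and compare with $x/\phi(4) = x/2$; the sieve-based implementation is a sketch of my own:

```python
from math import log

def psi_in_progression(x, q, a):
    """psi(x, q, a) = sum of Lambda(n) over n <= x with n = a mod q."""
    # Smallest-prime-factor sieve; Lambda(n) = log p when n = p^k, else 0.
    spf = list(range(x + 1))
    for i in range(2, int(x ** 0.5) + 1):
        if spf[i] == i:                      # i is prime
            for j in range(i * i, x + 1, i):
                if spf[j] == j:
                    spf[j] = i
    total = 0.0
    for n in range(2, x + 1):
        p = spf[n]
        m = n
        while m % p == 0:
            m //= p
        if m == 1 and n % q == a:            # n is a prime power p^k in the class
            total += log(p)
    return total

x = 10 ** 5
for a in (1, 3):
    print(a, psi_in_progression(x, 4, a) / (x / 2))   # both ratios near 1
```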
Best Answer
When $(a,m) = 1$, let $\pi(x;a \bmod m)$ be the number of primes $p\leq x$ such that $p \equiv a \bmod m$. The prime number theorem for arithmetic progressions mod $m$ says for all $a \in (\mathbf Z/m\mathbf Z)^\times$ that $\pi(x;a \bmod m) \sim (1/\varphi(m))x/\log x$.
Harold Shapiro, in the paper "Some assertions equivalent to the prime number theorem for arithmetic progressions", Comm. Pure Appl. Math. 2 (1949), 293-308, showed that this theorem for an integer $m \geq 1$ is equivalent to each of the following conditions for the same $m$: $$ \sum_{\substack{n \leq x \\ n \equiv a \bmod m}} \mu(n) = o(x) $$ for all $a$ with $(a,m) = 1$ and $$ \sum_{\substack{n \geq 1 \\ n \equiv a \bmod m}} \frac{\mu(n)}{n} \ \ {\rm converges} $$ for all $a$ with $(a,m) = 1$. This should address both of your questions.
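The equivalences themselves cannot be verified by computation, but the size of the quantities involved is easy to glimpse numerically. For instance, with $m = 4$ the partial sums $\sum_{n \leq x,\, n \equiv a \bmod 4} \mu(n)$ are tiny compared with $x$; the sieve below is an illustrative sketch:

```python
def mobius_sieve(limit):
    """mu(1..limit) via a linear-style sieve."""
    mu = [1] * (limit + 1)
    is_comp = [False] * (limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if not is_comp[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > limit:
                break
            is_comp[i * p] = True
            if i % p == 0:
                mu[i * p] = 0
                break
            mu[i * p] = -mu[i]
    return mu

x, m = 10 ** 5, 4
mu = mobius_sieve(x)
for a in (1, 3):
    s = sum(mu[n] for n in range(1, x + 1) if n % m == a)
    t = sum(mu[n] / n for n in range(1, x + 1) if n % m == a)
    # s/x should be tiny (first condition); t is a partial sum of the
    # series that the second condition asserts to converge.
    print(a, s, s / x, t)
```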
Concerning the second equivalent condition above, note that the prime number theorem is in many places expressed as being equivalent to the calculation $\sum \mu(n)/n = 0$, but it is also equivalent to the mere convergence of $\sum \mu(n)/n$, because Abel's theorem for Dirichlet series shows the series must be $0$ if it converges at all: convergence of $\sum \mu(n)/n$ implies this series equals $\lim_{s \to 1^+} \sum \mu(n)/n^s = \lim_{s \to 1^+} 1/\zeta(s) = 0$.
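The limit $\lim_{s \to 1^+} 1/\zeta(s) = 0$ invoked here is easy to observe numerically; the following sketch (truncation point and tail estimate $\int_N^\infty x^{-s}\,dx = N^{1-s}/(s-1)$ are my choices) approximates $\zeta(s)$ for $s$ slightly larger than $1$:

```python
def zeta_approx(s, N=10 ** 5):
    """Approximate zeta(s) for s > 1 by a partial sum plus the integral tail.
    The tail int_N^inf x^(-s) dx = N^(1-s)/(s-1) is accurate enough here."""
    return sum(n ** (-s) for n in range(1, N + 1)) + N ** (1 - s) / (s - 1)

for s in (1.5, 1.1, 1.01):
    print(s, 1 / zeta_approx(s))   # the reciprocal shrinks as s -> 1+
```

Since $\zeta(s) \sim 1/(s-1)$ near $s = 1$, the reciprocal decays roughly like $s - 1$, consistent with the printed values.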
I explained in my answer here how the prime number theorem in arithmetic progressions mod $m$ is equivalent to nonvanishing of $L(s,\chi)$ on the line ${\rm Re}(s) = 1$ for all Dirichlet characters $\chi \bmod m$. With this information, let's see how to derive the first Moebius analogue above. When $(a,m) = 1$, $$ \sum_{\substack{n \leq x \\ n \equiv a \bmod m}} \mu(n) = \sum_{n \leq x} \frac{1}{\varphi(m)}\left(\sum_{\chi \bmod m} \overline{\chi}(a)\chi(n)\right)\mu(n), $$ which is $$ \frac{1}{\varphi(m)}\sum_{\chi \bmod m} \overline{\chi}(a)\left(\sum_{n \leq x} \chi(n)\mu(n)\right). $$ To show this is $o(x)$, we show each inner sum is $o(x)$.
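The orthogonality relation used in this decomposition, namely that $\frac{1}{\varphi(m)}\sum_{\chi \bmod m} \overline{\chi}(a)\chi(n)$ equals $1$ when $n \equiv a \bmod m$ and $0$ otherwise (for $(n,m) = 1$), can be checked directly for a small modulus. Here is a sketch for $m = 5$, building the four characters from the generator $2$ of $(\mathbf Z/5\mathbf Z)^\times$:

```python
import cmath

m = 5
g = 2                                        # 2 generates (Z/5Z)^x: 2, 4, 3, 1
# discrete log table: unit u -> exponent k with g^k = u mod 5
dlog = {pow(g, k, m): k for k in range(4)}

def chi(idx, n):
    """The idx-th Dirichlet character mod 5: chi(g^k) = exp(2*pi*i*idx*k/4)."""
    if n % m == 0:
        return 0
    return cmath.exp(2 * cmath.pi * 1j * idx * dlog[n % m] / 4)

def indicator(a, n):
    """(1/phi(5)) * sum over characters of conj(chi(a)) * chi(n)."""
    return sum(chi(idx, a).conjugate() * chi(idx, n) for idx in range(4)) / 4

for a in (1, 2, 3, 4):
    row = [round(abs(indicator(a, n)), 6) for n in range(1, 10)]
    print(a, row)   # magnitude 1 exactly when n = a mod 5, else 0
```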
The inner sum at $\chi$ is the partial sum of the coefficients of the Dirichlet series $\sum \chi(n)\mu(n)/n^s$, which is $1/L(s,\chi)$ for ${\rm Re}(s) > 1$. The coefficients of this Dirichlet series are bounded in absolute value (by $1$) and $1/L(s,\chi)$ has an analytic continuation to the line ${\rm Re}(s) = 1$ (this is where we use the nonvanishing of every $L(s,\chi)$ on that line; when $\chi$ is the trivial character, $L(s,\chi)$ has a pole at $s = 1$, so its reciprocal is analytic at $s=1$ with a zero there), so by Newman's Tauberian theorem $$ \frac{1}{x}\sum_{n \leq x} \chi(n)\mu(n) \to {\rm Res}_{s=1} \frac{1}{L(s,\chi)} = 0. $$ The convergence of $\sum_{n \equiv a \bmod m} \mu(n)/n$ when $(a,m) = 1$ is proved by a similar approach: reduce the task to showing for all Dirichlet characters $\chi \bmod m$ that $\sum_{n \leq x} \chi(n)\mu(n)/n$ converges as $x \to \infty$.
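For a concrete instance of this last reduction, take the nontrivial character $\chi_4 \bmod 4$, with $\chi_4(n) = \pm 1$ for $n \equiv \pm 1 \bmod 4$. Then $\sum \chi_4(n)\mu(n)/n^s = 1/L(s,\chi_4)$, so the partial sums should approach $1/L(1,\chi_4) = 4/\pi$; a quick computation (the sieve and cutoff are my choices) is consistent with that:

```python
from math import pi

def mobius_sieve(limit):
    """mu(1..limit) via a linear-style sieve."""
    mu = [1] * (limit + 1)
    is_comp = [False] * (limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if not is_comp[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > limit:
                break
            is_comp[i * p] = True
            if i % p == 0:
                mu[i * p] = 0
                break
            mu[i * p] = -mu[i]
    return mu

def chi4(n):
    """The nontrivial Dirichlet character mod 4."""
    return 0 if n % 2 == 0 else (1 if n % 4 == 1 else -1)

x = 10 ** 5
mu = mobius_sieve(x)
partial = sum(chi4(n) * mu[n] / n for n in range(1, x + 1))
print(partial, 4 / pi)   # the two values should be close
```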