My feeling is that the biggest problem with the epsilon-delta definition is that this is the first time students have ever seen the universal and existential quantifiers. By the time you say, "For every epsilon there exists a delta," you have already lost 95% of your audience before you even get to the business end of the proposition.
And of course the other problem is with the lower-case Greek letters. Students have been seeing x, y, z, and t all their lives, and then out of nowhere you show them epsilon and delta.
In other words, it's the basic form of the definition that intimidates and confuses students, not so much the actual idea, which is simply that you can constrain the output arbitrarily tightly by suitably constraining the input.
Perhaps if instructors started with the conceptual understanding and then spent time explaining "for all" and "there exists" and giving them a gentle introduction to Greek letters used as variables, things would get better.
When $\delta(\varepsilon)$ is written as you have above, it is merely a notational reminder that our choice of $\delta$ has to depend on the $\varepsilon$ we're given -- nothing more. In fact, $\delta$ also depends on $f$, $a$, and $L$. Writing $\delta(\varepsilon)$ does not mean that $\delta$ is a function to which we may plug in $\varepsilon$ to get our limit-satisfying $\delta$-value. In that vein, the also-common notation $\delta_\varepsilon$ could be argued to be better. However, we could create an actual function which acts in the spirit of the aforementioned $\delta(\varepsilon)$ and addresses your objection that we're "throwing out" other perfectly good values of $\delta$. This is most vivid if we restrict our attention to the following setup.
Let $A \subseteq \mathbb R$ be open and $f \colon A \longrightarrow \mathbb R$ have limit $L$ at $a$:
$$
\lim_{x \to a} f(x) = L.
$$
We may define
\begin{align}
\begin{split}
\delta_*(\varepsilon) &= \sup\{ \delta > 0 : a-\delta < x < a \implies |f(x) - L| < \varepsilon \}, \\
\delta^*(\varepsilon) &= \sup\{ \delta > 0 : a< x < a + \delta \implies |f(x) - L| < \varepsilon \},
\end{split}
\tag{1}
\end{align}
with the idea that $\delta_*(\varepsilon)$ tells you how far left of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$, and $\delta^*(\varepsilon)$ tells you how far right of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$. (We know that $\delta_*, \delta^* > 0$ exist because those sets on the RHS of $(1)$ are nonempty according to the limit definition.) Hence the largest open $x$-interval for which $|f(x) - L| < \varepsilon$ is $$
X(\varepsilon) = \big( a - \delta_*(\varepsilon), a + \delta^*(\varepsilon) \big).
$$
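To make $(1)$ concrete, here is a small numerical sketch that approximates the two suprema by scanning outward from $a$ on a fine grid. The example function $f(x) = x^2$ with $a = 1$, $L = 1$, the value $\varepsilon = 1/2$, and the step size are all my illustrative choices, not part of the argument.

```python
# Hypothetical example: f(x) = x^2 with a = 1, L = 1, so lim_{x -> 1} f(x) = 1.
def f(x):
    return x * x

a, L = 1.0, 1.0

def scan(eps, side, step=1e-6, max_delta=2.0):
    """Grid approximation of the suprema in (1).

    side = -1 estimates delta_*(eps), how far LEFT of a we may go;
    side = +1 estimates delta^*(eps), how far RIGHT of a we may go.
    Grows delta until |f(x) - L| < eps first fails at the endpoint.
    """
    delta = 0.0
    while delta + step <= max_delta:
        x = a + side * (delta + step)
        if abs(f(x) - L) >= eps:
            break
        delta += step
    return delta

eps = 0.5
d_left = scan(eps, -1)   # exact value: 1 - sqrt(1/2) ≈ 0.29289
d_right = scan(eps, +1)  # exact value: sqrt(3/2) - 1 ≈ 0.22474
```

Note that the two one-sided values differ here, which already shows that $X(\varepsilon)$ need not be symmetric about $a$.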
An issue here is that $X(\varepsilon)$ is not (necessarily) symmetric about $a$, so it doesn't (necessarily) correspond to $|x - a| < \delta$ for any $\delta$. To remedy this, we define $\hat \delta (\varepsilon) = \min\{\delta_*(\varepsilon), \delta^*(\varepsilon)\}$; then any $x$ in the interval
$$
X'(\varepsilon) = \big( a - \hat \delta(\varepsilon), a + \hat \delta (\varepsilon) \big)
$$
will satisfy $|f(x) - L| < \varepsilon$. Note that $X'(\varepsilon) = \{ x : |x - a| < \hat \delta(\varepsilon)\}$, and hence any $\delta$ in the interval $I(\varepsilon) = \big( 0, \hat \delta(\varepsilon) \big]$ satisfies the $\varepsilon$-$\delta$ definition of our limit. Moreover, $I(\varepsilon)$ is the largest set of values of $\delta$ that will work for a given $\varepsilon$. In other words:
$\delta$ satisfies the $\varepsilon$-$\delta$ definition $\iff \delta \in I(\varepsilon)$.
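As a sanity check (not a proof), one can test this equivalence numerically for a hypothetical example, say $f(x) = x^2$, $a = 1$, $L = 1$, $\varepsilon = 1/2$, for which $\hat \delta(\varepsilon) = \sqrt{3/2} - 1$ exactly:

```python
import math

def works(delta, f, a, L, eps, n=100_000):
    """Brute-force check on a fine grid: does every x with |x - a| < delta
    satisfy |f(x) - L| < eps?  (A grid check, so a sketch, not a proof.)"""
    for i in range(1, n):
        x = a - delta + 2 * delta * i / n  # interior points of (a-delta, a+delta)
        if abs(f(x) - L) >= eps:           # (x = a is harmless: f(a) = L here)
            return False
    return True

f = lambda x: x * x
a, L, eps = 1.0, 1.0, 0.5
d_hat = math.sqrt(1.5) - 1  # exact value of delta-hat(eps) for this example

assert works(d_hat, f, a, L, eps)             # right endpoint of I(eps): works
assert works(0.1, f, a, L, eps)               # any smaller delta works too
assert not works(d_hat + 0.01, f, a, L, eps)  # delta outside I(eps): fails
```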
This answers your Question 1. A proof of this follows @grand_chat's answer. Note that $I(\varepsilon)$ depends on $a$, $f$, and $L$ implicitly.
One thing that may bother you is that $X(\varepsilon)$ may be much bigger than $X'(\varepsilon)$, so we're "throwing out perfectly good $x$'s". The $\varepsilon$-$\delta$ definition restricts $X(\varepsilon)$ to a symmetric interval ($X'(\varepsilon)$) about $a$. Does this help address your rigor question?
Of course, satisfying the definition of a limit only requires us to find one such $\delta$. The reason professors prefer this, as you describe, is that for complicated functions $f$, computing $I(\varepsilon)$ is very difficult: it amounts to solving $f(x) = L \pm \varepsilon$ for $x$, i.e., inverting $f$. Since they need only a single point of $I(\varepsilon)$ rather than the whole interval, they opt for less work.
Your example of a "linear" function happens to be one in which the imprecise $\delta(\varepsilon)$ that people often write coincides with $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \hat \delta (\varepsilon)$ in a quite canonical way, which may deceive people into believing that $\delta(\varepsilon)$ is somehow unique.
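To see this concretely, take a generic linear function $f(x) = mx + b$ with $m \neq 0$, so that $L = ma + b$ (the form $mx + b$ is just an illustrative stand-in for your example). Then
$$
|f(x) - L| = |m(x - a)| = |m|\,|x - a| < \varepsilon \iff |x - a| < \frac{\varepsilon}{|m|},
$$
so the sets in $(1)$ are symmetric about $a$, and $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \hat \delta(\varepsilon) = \varepsilon/|m|$, giving $I(\varepsilon) = \big( 0, \varepsilon/|m| \big]$. The usual choice $\delta(\varepsilon) = \varepsilon/|m|$ is merely the right endpoint of this interval, not a unique answer.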
Best Answer
A sequence $(x_n)_{n\in\mathbb N}$ of real numbers has limit $x$ iff for every $\epsilon>0$ there exists $n_0\in\mathbb N$ such that $n>n_0$ implies $|x_n-x|<\epsilon$.
A function $f\colon I\to \mathbb R$ is continuous at $x_0\in I$ if for every $\epsilon>0$ there exists a $\delta>0$ such that for all $x\in I$ with $|x-x_0|<\delta$ we have $|f(x)-f(x_0)|<\epsilon$.
These are different concepts. Then again, they are the same: Note that $|x-x_0|$ measures the distance between $x$ and $x_0$; we can define a metric $d$ on $\mathbb N\cup\{\infty\}$ such that a sequence $(x_n)_{n\in\mathbb N}$ converges to $x$ if and only if the function given by $$f(n)=\begin{cases}x_n&n\in\mathbb N\\x&n=\infty\end{cases}$$ is continuous at $\infty$. The metric I mentioned can be defined as $d(n,m)=\left|\frac 1n-\frac 1m\right|$ for $n,m\in\mathbb N$ and $d(n,\infty)=d(\infty,n)=\frac 1n$ and $d(\infty,\infty)=0$.
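This correspondence can be sketched in a few lines, representing $\infty$ as `math.inf` so that the convention $1/\infty = 0$ collapses the three cases of $d$ into one formula. The sequence $x_n = 1/n$ with limit $0$ is my illustrative choice.

```python
import math

def d(n, m):
    """The metric above on N ∪ {∞}: d(n, m) = |1/n - 1/m|, with 1/∞ = 0."""
    inv = lambda k: 0.0 if k == math.inf else 1.0 / k
    return abs(inv(n) - inv(m))

# Example sequence x_n = 1/n with limit x = 0, packaged as f on N ∪ {∞}.
def f(n):
    return 0.0 if n == math.inf else 1.0 / n

# Continuity of f at ∞: given eps, delta = eps works, since
# d(n, ∞) = 1/n < delta  implies  |f(n) - f(∞)| = 1/n < eps.
eps = 0.01
delta = eps
for n in list(range(1, 1000)) + [math.inf]:
    if d(n, math.inf) < delta:
        assert abs(f(n) - f(math.inf)) < eps
```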