Contrary to your claim near the end of your question, I claim that the time-derivative of the field is being treated as an "independent" argument of the Lagrangian. I'll try to convince you of this by showing you how this independence leads to everything working out the way you think it should. Some of the key points are at the end, so please read all the way through before you succumb to skepticism.
For the sake of simplicity, let's assume from the start that we are considering a classical theory of fields $\phi:\mathbb R^2\to\mathbb R$. Let $\mathcal F$ denote the set of admissible fields in this theory. We denote the first field argument with $t$ and the second argument with $x$, so we write $\phi(t,x)$ as usual.
Now let's turn to the Lagrangian. To describe it correctly, imagine holding the $x$ argument of a field in our theory fixed; this yields a real-valued function of a single real variable, $\phi(\cdot, x):\mathbb R\to\mathbb R$. Let $\mathcal G$ denote the set of such functions. Then the Lagrangian can be defined as a functional $L:\mathcal F\times\mathcal F\to\mathcal G$. In other words, it takes in two functions that map $\mathbb R^2\to\mathbb R$ and outputs a function that maps $\mathbb R\to\mathbb R$. We label the first argument suggestively by $\phi$ and the second by $\dot \phi$, but in principle one can evaluate $L$ on whatever fields $\phi$ and $\psi$ one chooses and write, for example, $L[\phi, \psi]$.
I claim that the definitions of the relevant functional derivatives are as follows:
\begin{align}
\frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t)
&= \lim_{\epsilon\to 0}\frac{L[\phi+\epsilon\Delta_x,\dot\phi](t)-L[\phi,\dot\phi](t)}{\epsilon} \\
\frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t)
&= \lim_{\epsilon\to 0}\frac{L[\phi,\dot\phi+\epsilon\Delta_x](t)-L[\phi,\dot\phi](t)}{\epsilon}
\end{align}
where I am using the notation
\begin{align}
\Delta_{x}(t,x') = \delta(x'-x)
\end{align}
Notice that this is essentially like taking partial derivatives because we vary the arguments of $L$ independently.
Now, suppose that we have a theory described by a Lagrangian density that is a local function of the field and its first derivatives. Then the Lagrangian density is defined as a function $\mathscr L:\mathbb R^3\to\mathbb R$, and, because we anticipate that we will be putting the values of the field and its derivatives into the arguments of the Lagrangian density, we label its three arguments with the symbols $\phi, \dot \phi, \phi'$. The symbols $\dot\phi$ and $\phi'$ are supposed to suggestively indicate that the arguments of the Lagrangian density are meant to be evaluated on the values of a field and its time and space derivative. This is, of course, a bit of an abuse of notation since $\phi$ is usually reserved as a symbol for the field, a function $\mathbb R^2\to\mathbb R$, not for the values of the field. But as long as we keep this abuse of notation in mind, we shouldn't get confused. Then we have
\begin{align}
L[\phi, \dot\phi](t) = \int dx' \,\mathscr L(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x'))
\end{align}
Now let's apply the definitions of the functional derivatives above and see what we get. For one thing, we have
\begin{align}
\frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t)
&= \lim_{\epsilon\to0}\frac{\int dx'\,\mathscr{L}(\phi(t,x'),\dot{\phi}(t,x')+\epsilon\delta(x'-x),\partial_{x'}\phi(t,x'))-\int dx'\,\mathscr{L}}{\epsilon} \\
&= \int dx'\,\frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x'))\,\delta(x'-x) \\
&= \frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x))
\end{align}
which is exactly what you said you should get in your question. Similarly, I'll leave it to you to show that the definition above yields
\begin{align}
\frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t)
&= \frac{\partial\mathscr L}{\partial \phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x)) \\ &\hspace{2cm}-\frac{\partial}{\partial x}\left[\frac{\partial\mathscr L}{\partial \phi'}(\phi(t,x),\dot\phi(t,x),\phi'(t,x))\right]
\end{align}
or, if we relax the notation a bit since we know what we're doing now, we can summarize this as
\begin{align}
\frac{\delta L}{\delta \dot\phi} = \frac{\partial\mathscr L}{\partial\dot\phi}, \qquad \frac{\delta L}{\delta \phi} = \frac{\partial \mathscr L}{\partial \phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'}
\end{align}
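For readers who want the step I left out, here is a sketch of the $\delta L/\delta\phi$ computation. It is formal, and it assumes that $\mathscr L$ is smooth and that boundary terms at spatial infinity vanish; neither assumption is stated explicitly above. Shifting $\phi\to\phi+\epsilon\Delta_x$ also shifts $\phi'$ by $\epsilon\,\partial_{x'}\delta(x'-x)$, so to first order in $\epsilon$
\begin{align}
L[\phi+\epsilon\Delta_x,\dot\phi](t)-L[\phi,\dot\phi](t)
&= \epsilon\int dx'\left[\frac{\partial\mathscr L}{\partial\phi}\,\delta(x'-x)+\frac{\partial\mathscr L}{\partial\phi'}\,\partial_{x'}\delta(x'-x)\right]+O(\epsilon^2) \\
&= \epsilon\left[\frac{\partial\mathscr L}{\partial\phi}-\frac{\partial}{\partial x}\frac{\partial\mathscr L}{\partial\phi'}\right]_{(t,x)}+O(\epsilon^2)
\end{align}
In the first line the partial derivatives of $\mathscr L$ are evaluated at $(\phi(t,x'),\dot\phi(t,x'),\phi'(t,x'))$, and the second line uses $\int dx'\,f(x')\,\partial_{x'}\delta(x'-x)=-f'(x)$. Dividing by $\epsilon$ and taking the limit reproduces the formula above.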
Now, suppose that we want to obtain the Euler-Lagrange equations. For this, we define the action for our theory as a functional $S:\mathcal F\to\mathbb R$ as follows:
\begin{align}
S[\phi]=\int dt \,L[\phi, \dot\phi](t)
\end{align}
Notice that here, the symbol $\dot\phi$ does denote the partial time-derivative of the field $\phi$, namely $\dot\phi = \partial_t\phi$. The key point is that even though the arguments of the Lagrangian are independent, we always have the freedom to evaluate them on a field and its time derivative, which are certainly not independent. In particular, this means that if we vary the action, then in the integral on the right-hand side we can perform the sort of integration by parts that you were worried we wouldn't be able to do. In fact, if you vary the action, then you'll find that
\begin{align}
\delta S[\phi]
&= \int dt\,dx\,\left[\frac{\delta L}{\delta\phi} - \frac{\partial}{\partial t}\frac{\delta L}{\delta\dot\phi}\right]\delta\phi
\end{align}
so setting the variation to zero, and using the results I derived above using the claimed definitions of the partial variational derivatives, we obtain the standard Euler-Lagrange equations
\begin{align}
\frac{\partial \mathscr L}{\partial \phi} -\frac{\partial}{\partial t}\frac{\partial \mathscr L}{\partial \dot\phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'}=0.
\end{align}
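The variation of the action quoted above can be sketched in two steps, assuming $\delta\phi$ vanishes at the endpoints of the time integration (an assumption not spelled out above). First vary both slots of $L$ independently, then use $\delta\dot\phi=\partial_t\delta\phi$, which holds because $S$ evaluates the second slot on $\partial_t\phi$:
\begin{align}
\delta S[\phi]
&= \int dt\,dx\left[\frac{\delta L}{\delta\phi}\,\delta\phi+\frac{\delta L}{\delta\dot\phi}\,\partial_t\delta\phi\right] \\
&= \int dt\,dx\left[\frac{\delta L}{\delta\phi}-\frac{\partial}{\partial t}\frac{\delta L}{\delta\dot\phi}\right]\delta\phi+\text{boundary terms}
\end{align}
As a quick check on a specific theory (a free massive field, chosen here purely for illustration and not taken from the question), let $\mathscr L=\tfrac12\dot\phi^2-\tfrac12\phi'^2-\tfrac12 m^2\phi^2$. Then $\partial\mathscr L/\partial\phi=-m^2\phi$, $\partial\mathscr L/\partial\dot\phi=\dot\phi$ and $\partial\mathscr L/\partial\phi'=-\phi'$, so the Euler-Lagrange equation becomes
\begin{align}
-m^2\phi-\ddot\phi+\phi''=0,
\end{align}
which is the Klein-Gordon equation in one space dimension.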
Let me give a pedagogical answer. You're confused about the functional derivative
$$ \frac{\delta}{\delta \phi(y)}\int dx\, \phi(x) \partial^2 \phi(x).$$
We can compute such derivatives by perturbing a functional:
$$ F[\phi + \epsilon \chi] = F[\phi] + \epsilon \int dx \, \frac{\delta F[\phi]}{\delta \phi(x)}\chi(x) + O(\epsilon^2).$$
Now let $F[\phi] = \int dx\, \phi(x) \partial^2 \phi(x)$. Then
$$ F[\phi + \epsilon \chi] - F[\phi] = \epsilon \int dx \left[ \chi(x) \partial^2 \phi(x) + \phi(x) \partial^2 \chi(x) \right] + O(\epsilon^2). $$
But this is not of the correct form, due to the $\partial^2 \chi(x)$ term. However, you can always integrate by parts (exercise):
$$ \int dx \, \phi(x) \partial^2 \chi(x) = \int dx \, \chi(x) \partial^2 \phi(x) \, + \, \text{boundary terms}.$$
By assumption, the boundary terms do not contribute. Bringing everything together, we can rewrite the formula above as
$$ F[\phi + \epsilon \chi] - F[\phi]= 2 \times \epsilon \int dx \, \chi(x) \partial^2 \phi(x) + O(\epsilon^2).$$
At this point, we conclude that
$$ \frac{\delta F[\phi]}{\delta \phi(x)} = 2 \partial^2 \phi(x). $$
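If you want to convince yourself numerically, the following is a minimal sketch, not a rigorous check. It assumes a periodic grid on $[0, 2\pi)$ (so that boundary terms vanish), the test field $\phi(x) = \sin x$, and a second-order finite-difference Laplacian; the names `F` and `functional_derivative` are made up for this illustration. The perturbation $\chi$ is a discrete delta function at one grid point, so the coefficient of $\epsilon$ is the functional derivative at that point.

```python
import math

def F(phi, h):
    """Discretized F[phi] = integral of phi(x) * phi''(x) on a periodic grid."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        # Second-order central difference for phi''; indices wrap around (periodic).
        lap = (phi[(i + 1) % n] - 2.0 * phi[i] + phi[i - 1]) / h**2
        total += h * phi[i] * lap
    return total

def functional_derivative(phi, h, j, eps=1e-4):
    """Approximate dF/dphi(x_j) by perturbing phi with eps * (discrete delta at x_j)."""
    plus = list(phi)
    minus = list(phi)
    plus[j] += eps / h    # a spike of height 1/h has unit integral on the grid
    minus[j] -= eps / h
    # Central difference in eps picks out the term linear in eps.
    return (F(plus, h) - F(minus, h)) / (2.0 * eps)

n = 200
h = 2.0 * math.pi / n
phi = [math.sin(i * h) for i in range(n)]

j = n // 4                        # grid point x_j = pi/2
numeric = functional_derivative(phi, h, j)
exact = -2.0 * math.sin(j * h)    # 2 * phi''(x_j) for phi = sin
print(numeric, exact)
```

The two printed numbers should agree up to the discretization error of the finite-difference Laplacian, which shrinks as `n` grows.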
Comment to the question (v2): P&S is using the notation of a 'same-spacetime' functional derivative. To illustrate this notation, let us for simplicity stay within first variations, and leave it to the reader to generalize to higher-order variations.
I) First of all, functional/variational derivatives should not be confused with partial derivatives. In practice, from an operational point of view (if we are not worried about mathematical details about existence and boundary terms), all we need to know are the following two rules:

1. The formula $$\tag{A} \frac{\delta \phi^{\beta}(y)}{\delta\phi^{\alpha}(x)} ~=~\delta^{\beta}_{\alpha}~\delta^n(x-y), $$ where $n$ is the spacetime dimension.

2. Appropriate generalizations of elementary rules in calculus, e.g., the chain rule, integration by parts, commutativity of derivatives, and the rules for the Dirac delta distribution.

For instance, by rules 1 & 2, we have that
$$ \frac{\delta}{\delta\phi^{\beta}(y)} \frac{\partial}{\partial x^{\mu_1}}\ldots \frac{\partial}{\partial x^{\mu_r}}\phi^{\alpha}(x) ~=~ \frac{\partial}{\partial x^{\mu_1}}\ldots \frac{\partial}{\partial x^{\mu_r}} \frac{\delta}{\delta\phi^{\beta}(y)}\phi^{\alpha}(x)$$ $$\tag{B}~\stackrel{(A)}{=}~\delta_{\beta}^{\alpha}~\frac{\partial}{\partial x^{\mu_1}}\ldots \frac{\partial}{\partial x^{\mu_r}}\delta^n(x-y). $$
Similarly, by rules 1 & 2, we can deduce that the action
$$\tag{C}S~=~\int d^nx ~{\cal L}(x) , \qquad {\cal L}(x)\equiv {\cal L}(\phi(x), \partial \phi(x), \ldots, x),$$
has the Euler-Lagrange expression as its functional derivative
$$ \tag{D}\frac{\delta S}{\delta\phi^{\alpha} (x)}~=~ \frac{\partial{\cal L}(x) }{\partial\phi^{\alpha} (x)} - d_{\mu} \left(\frac{\partial{\cal L}(x) }{\partial\partial_{\mu}\phi^{\alpha} (x)} \right)+\ldots.$$
The ellipsis $\ldots$ in eqs. (C) and (D) denotes possible contributions from higher-order spacetime derivatives.
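To make the step from (C) to (D) explicit, here is a sketch for a Lagrangian density that depends only on $\phi$ and its first derivatives, assuming boundary terms vanish. The chain rule and eq. (B) give
$$ \frac{\delta S}{\delta\phi^{\alpha}(x)} ~=~ \int d^ny \left[ \frac{\partial{\cal L}(y)}{\partial\phi^{\beta}(y)}\frac{\delta\phi^{\beta}(y)}{\delta\phi^{\alpha}(x)} + \frac{\partial{\cal L}(y)}{\partial\partial_{\mu}\phi^{\beta}(y)}\frac{\partial}{\partial y^{\mu}}\frac{\delta\phi^{\beta}(y)}{\delta\phi^{\alpha}(x)} \right] $$
$$ ~\stackrel{(A)}{=}~ \int d^ny \left[ \frac{\partial{\cal L}(y)}{\partial\phi^{\alpha}(y)}\,\delta^n(x-y) + \frac{\partial{\cal L}(y)}{\partial\partial_{\mu}\phi^{\alpha}(y)}\frac{\partial}{\partial y^{\mu}}\delta^n(x-y) \right] ~=~ \frac{\partial{\cal L}(x)}{\partial\phi^{\alpha}(x)} - d_{\mu}\left(\frac{\partial{\cal L}(x)}{\partial\partial_{\mu}\phi^{\alpha}(x)}\right), $$
where the last equality integrates the second term by parts so that the derivative acts on the prefactor instead of the delta function.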
II) From formula (A) it becomes clear that it does not make sense to consider the functional derivative $\frac{\delta {\cal L}(x)}{\delta\phi^{\alpha} (x)}$ wrt. the same spacetime argument $x$, because that would lead to infinities, cf. $\delta^n(0)=\infty$. Nevertheless, it is tempting to introduce the notation of a 'same-spacetime' functional derivative
$$\tag{E}\frac{\delta {\cal L}(x)}{\delta\phi^{\alpha} (x)}~:=~ \frac{\partial{\cal L}(x) }{\partial\phi^{\alpha} (x)} - d_{\mu} \left(\frac{\partial{\cal L}(x) }{\partial\partial_{\mu}\phi^{\alpha} (x)} \right)+\ldots.$$
We stress that eq. (E) is only a notational definition. It becomes meaningless if we try to interpret the lhs. of eq. (E) using the above rules 1 & 2.
III) Similarly, P&S talk about a second-order 'same-spacetime' functional derivative
$$\tag{F}\frac{\delta^2 {\cal L}(x)}{\delta\phi^{\alpha} (x)\delta\phi^{\beta}(x)}.$$
We recommend first working out the ordinary second-order functional derivative
$$\tag{G}\frac{\delta^2 S}{\delta\phi^{\alpha} (x)\delta\phi^{\beta}(y)}$$
using rules 1 & 2. Then it should be fairly straightforward to translate (G) into the 'same-spacetime' functional derivative language (F), if needed. [In particular, eq. (G) contains a $\delta^n(x-y)$ while eq. (F) does not.]
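As a concrete illustration of eq. (G), consider a free real scalar field with ${\cal L}=\frac{1}{2}\partial_{\mu}\phi\,\partial^{\mu}\phi-\frac{1}{2}m^2\phi^2$ in the $(+,-,\ldots,-)$ signature convention. This example is an assumption made here for illustration and is not taken from the question. Rules 1 & 2 then give
$$ \frac{\delta S}{\delta\phi(x)} ~=~ -\left(\partial_{\mu}\partial^{\mu}+m^2\right)\phi(x), \qquad \frac{\delta^2 S}{\delta\phi(x)\delta\phi(y)} ~=~ -\left(\frac{\partial}{\partial x^{\mu}}\frac{\partial}{\partial x_{\mu}}+m^2\right)\delta^n(x-y). $$
Stripping off the $\delta^n(x-y)$, the corresponding 'same-spacetime' expression (F) would be read as the differential operator $-(\partial_{\mu}\partial^{\mu}+m^2)$.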
IV) Finally we should mention that in field theory one often suppresses the spacetime indices $x,y,\ldots$, by using DeWitt's condensed notation.