In sequent calculus, Weakening (or Thinning) and Contraction are two of the so-called structural rules of inference.
The weakening rule introduces an extra formula $D$ into the antecedent:
$$\frac{\Gamma\vdash \Delta}{D,\Gamma\vdash \Delta} \text {LW}$$
or into the succedent:
$$\frac{\Gamma\vdash \Delta}{\Gamma\vdash \Delta,D} \text {RW}$$
of the sequent $\Gamma \vdash \Delta$, where $\Gamma, \Delta$ are sets of formulae.
The "meaning" of the rule is the following :
if we have a derivation of the sequent $\Gamma \vdash \Delta$, we can "add" a formula $D$ to the set of assumptions .
The contraction rule is:
$$\frac{D,D,\Gamma\vdash \Delta}{D,\Gamma\vdash \Delta} \text {LC}$$
and:
$$\frac{\Gamma\vdash \Delta,D,D}{\Gamma\vdash \Delta,D} \text {RC}$$
The "meaning" of the rule is the following :
we can always "cancel" redundant occurrences of a formula in the antecedent or succedent of a sequent.
Added
The basic semantic definitions for sequents are:
a sequent $\Gamma \vdash \Delta$ is satisfied in an interpretation if either some formula in $\Gamma$ is not satisfied, or some formula in $\Delta$ is satisfied
and:
a sequent is valid if it is satisfied in every interpretation.
This means that we have to "read" a sequent as follows:
$$(\gamma_1 \land \ldots \land \gamma_n) \rightarrow (\delta_1 \lor \ldots \lor \delta_m)$$
where $\Gamma = \{ \gamma_1, \ldots, \gamma_n \}$ and $\Delta = \{ \delta_1, \ldots, \delta_m \}$.
For simplicity, assume that $\Gamma = \{ \gamma_1, \gamma_2 \}$ and $\Delta = \{ \delta_1, \delta_2 \}$.
The semantic definition above says that the sequent is valid iff, for every interpretation, either some formula in the antecedent is false or some formula in the succedent is true, i.e.:
$$\vDash (\gamma_1 \land \gamma_2) \rightarrow (\delta_1 \lor \delta_2)$$
If so, then we can add any formula $D$ to the antecedent: in every interpretation where the conjunction $\gamma_1 \land \gamma_2$ is false, the new conjunction $(\gamma_1 \land \gamma_2) \land D$ is still false. Likewise, we can add any formula $D$ to the succedent: in every interpretation where the disjunction $\delta_1 \lor \delta_2$ is true, the new disjunction $(\delta_1 \lor \delta_2) \lor D$ is still true.
In both cases, if the upper sequent $\Gamma \vdash \Delta$ is valid, then so are the lower ones: $D,\Gamma \vdash \Delta$ and $\Gamma \vdash \Delta,D$.
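The semantic argument can be checked mechanically by brute force over all interpretations. The following sketch (the names `valid`, `antecedent`, `succedent` are mine, not standard) represents formulas as functions of a truth assignment and tests validity exactly as defined above:

```python
from itertools import product

def valid(antecedent, succedent, atoms):
    """A sequent is valid iff in every interpretation some formula in the
    antecedent is false or some formula in the succedent is true."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        # A counterexample: all antecedent formulas true, all succedent false.
        if all(f(env) for f in antecedent) and not any(f(env) for f in succedent):
            return False
    return True

# Formulas as functions of an interpretation (truth assignment).
g1 = lambda env: env["p"]                # gamma_1
g2 = lambda env: env["p"] or env["q"]    # gamma_2
d1 = lambda env: env["p"]                # delta_1
D  = lambda env: env["r"]                # an arbitrary extra formula

atoms = ["p", "q", "r"]

# p, (p or q) |- p is valid ...
assert valid([g1, g2], [d1], atoms)
# ... and stays valid after weakening on either side:
assert valid([D, g1, g2], [d1], atoms)   # LW
assert valid([g1, g2], [d1, D], atoms)   # RW
```

Adding `D` to the antecedent can only shrink the set of interpretations where all antecedent formulas hold, and adding it to the succedent can only grow the set where some succedent formula holds, so the assertions mirror the argument above.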
The usual way to prove that a given category is a model of some syntax is to show that it belongs to some category of structured categories and that the syntax defines the initial object of that category. But since substitution in syntax is strict, the obvious category in which it defines the initial object consists of categorical structures with strict substitution. Thus, in order to interpret syntax in a category without strictly associative pullbacks, one needs a coherence theorem replacing such a category by an equivalent one that does have strict substitution.
The reason this problem doesn't arise until you get to dependent type theory is that for first-order logic you can take the quotient by isomorphisms, interpreting predicates by isomorphism classes of monomorphisms rather than single ones. This produces a fibration of posets (not just preorders) with strictly associative substitution.
I am not sure what you mean by "substitution is interpreted by composition". If your category is an elementary topos, then you can also choose to interpret predicates by maps into the subobject classifier, in which case substitution is indeed interpreted by composition, which is strictly associative. But for more general categories this is not possible.
Note also that weakening is really just a special case of substitution, in the broad sense of substitution along a map of contexts.
Finally, as you may have noticed, the section on dependent type theory in Sketches of an Elephant is rather, well, sketchy. In particular, it ignores this issue of coherence.
The notation is confusing: you are using $\longrightarrow$ for two different things and then, arguably, conflating it with $\to$, which is a third thing. A more common notation, using weakening as an example, would be $$\dfrac{\Gamma\vdash p}{\Gamma,x\vdash p}$$ or $$\dfrac{\Gamma\implies p}{\Gamma,x\implies p}$$
The first thing to note is that these are rules, not axioms. They show how, given one derivation (e.g. a derivation of $\Gamma\vdash p$), you can extend it to a longer derivation (e.g. a derivation of $\Gamma,x\vdash p$). A proof system as a whole will be a collection of such rules. A (conditional) proof of a formula is then a derivation which ends in that formula and which will need to start from rules that have no premises, e.g. $\dfrac{}{\Gamma,p\vdash p}$.
All that is being said is that we're removing some of these rules as options. That means derivations can't use them. You then consider what derivations you can still make which will give you a (potentially) smaller set of provable formulas. Derivations are tree-like mathematical structures. Removing rules just changes what trees you can make.
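The point that derivations are tree-shaped objects, and that dropping a rule just restricts which trees count, can be sketched concretely. This representation is mine, purely for illustration, not a standard encoding:

```python
from dataclasses import dataclass, field

@dataclass
class Derivation:
    """A derivation is a tree: a rule name, the sequent it concludes,
    and the subderivations of its premises."""
    rule: str                  # e.g. "axiom", "LW", "RC"
    conclusion: str            # the concluded sequent, kept as a string
    premises: list = field(default_factory=list)

def uses_only(d, allowed):
    """Check that every rule occurring anywhere in the tree is allowed."""
    return d.rule in allowed and all(uses_only(p, allowed) for p in d.premises)

# Derive  B, A |- A  from the axiom  A |- A  by left weakening.
ax = Derivation("axiom", "A |- A")
wk = Derivation("LW", "B, A |- A", [ax])

assert uses_only(wk, {"axiom", "LW"})
# In a system with weakening removed, this particular tree is no longer
# a legal derivation -- though B, A |- A might still be derivable another way.
assert not uses_only(wk, {"axiom"})
```

Removing a rule shrinks the set of trees `uses_only` accepts, and hence (potentially) the set of provable formulas, exactly as described above.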
As an analogy, if you don't have flour, this (potentially) limits what you can cook. It doesn't mean that it is now impossible to cook anything for which you have a recipe that involves flour. Maybe you have an alternate recipe that doesn't use flour. Maybe you can produce your own flour. Maybe you couldn't follow any of the recipes using flour even if you did have flour.