Say you are standing on the side of a hill. Imagine somewhere beneath the hill, there is a flat $x,y$ plane that you can use to determine your position. Let's say $+x$ is east and $+y$ is north.
If the hill is smooth, then the height of the hill above this plane is some differentiable function $f(x,y)$.
The gradient of $f$ at any point tells you which direction is the steepest way uphill from that point and how steep it is. To find the direction of the gradient of $f$ where you are standing, decide which direction climbs most steeply. The answer could be "north" or "30 degrees west of south". There is no vertical component to the gradient: it tells you a direction with respect to the $x,y$ plane, which is your reference. The magnitude of the gradient is the slope of the hill in that direction.
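To make this concrete, here is a minimal sketch in Python. The hill `height` is a made-up example (an assumption, not anything from the question), and the gradient is estimated numerically with central differences rather than computed exactly.

```python
import math

def height(x, y):
    # Hypothetical hill, chosen only for illustration: peak at the origin.
    return 50.0 - x**2 - 2.0 * y**2

def gradient(f, x, y, h=1e-6):
    # Central-difference estimate of (df/dx, df/dy) at the point (x, y).
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

# Standing at the map position (1, 1), with +x east and +y north.
gx, gy = gradient(height, 1.0, 1.0)

# Magnitude of the gradient: the slope in the steepest uphill direction.
slope = math.hypot(gx, gy)

# Compass bearing of the gradient, in degrees clockwise from north.
bearing = math.degrees(math.atan2(gx, gy)) % 360

print(gx, gy, slope, bearing)
```

For this particular hill the gradient at $(1,1)$ is approximately $(-2,-4)$: a purely horizontal vector pointing roughly 27 degrees west of south, with slope about $4.47$. Note that the result has only two components, matching the remark that the gradient has no vertical part.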
The tangent plane is the plane that best approximates the shape of the hill where you are standing. The hill may be curved if you look at it from a distance, but maybe directly beneath your feet it is flat enough to set a pizza box down and have it be flush with the ground. The plane that the bottom of the pizza box defines would, roughly, be the "tangent" plane.
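If it helps to see the pizza box as a formula: assuming $f$ is differentiable at the point $(x_0,y_0)$ where you stand, and writing $f_x, f_y$ for its partial derivatives (notation introduced here, not in the question), the tangent plane is the set of points $(x,y,z)$ satisfying

$$z = f(x_0,y_0) + f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0).$$

The gradient is the two-component vector $\bigl(f_x(x_0,y_0),\, f_y(x_0,y_0)\bigr)$ living in the $x,y$ plane, while the tangent plane is a surface in three dimensions. The two are related (the partial derivatives are the coefficients of the plane), but they are different kinds of object.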
I'll try to convince you that they are geometrically quite obviously different, but when it comes to naming them, they begin to look alike :)
Geometrically you probably already have a good picture of what a point is: it's just the primitive notion of a point you have in geometry. That is, a single dimensionless location in space.
A vector should be thought of as a ray with two qualities: direction and magnitude. In basic vector algebra in $\Bbb R^n$, we learn that such a ray can slide all around $\Bbb R^n$, and as long as you aren't changing the direction or the length of the ray, it is still the same vector.
Now when it comes to naming these two things, they start to look alike! With Cartesian coordinates, points in $\Bbb R^n$ are labeled by their projections to the axes, and that creates a list of real numbers. Similarly, when we go about naming vectors, we have this convention of sliding the vector so that it is being emitted from the origin, and then we check to see what point is on its arrowhead. The vector is named after this point.
So in both cases, a similar list of real numbers is used to identify the object. Since this is the case, it's common to just start referring to any ordered $n$-tuple of things from a field (like $\Bbb R$) as a "vector," even if we aren't thinking of it as a ray in that application.
One example is that of vector fields. Since these are functions of position, the inputs they take are points of $\Bbb R^n$ (which look like an ordered $n$-tuple). The outputs are vectors (which again look like an ordered $n$-tuple), but we are interpreting these as the vectors they represent, slid over from the origin to the point we're at.
You can, of course, really have vector inputs! For instance, the length of a vector in $\Bbb R^n$ creates a function from vectors into $\Bbb R$. Of course, the same function could be reinterpreted as the distance-to-zero function on points of $\Bbb R^n$.
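Here is a small Python sketch of the two situations. The field `rotation_field` is a hypothetical example picked for illustration; the point is only that the same kind of tuple plays the role of a point in one place and a vector in another.

```python
import math

def rotation_field(point):
    # A vector field on R^2: the input is a point (a position),
    # the output is the vector attached at that position.
    px, py = point
    return (-py, px)

def length(vector):
    # A function with a vector input: its Euclidean length.
    return math.hypot(*vector)

p = (3.0, 4.0)            # read as a point
v = rotation_field(p)     # read as a vector sitting at p

print(v, length(v))       # length of the vector v
print(length(p))          # same function, reread as distance from p to the origin
```

Nothing in the code distinguishes `p` from `v`; both are pairs of floats. The distinction lives entirely in how you read them.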
So, the difference is all in how you are interpreting that particular list of numbers.
For #1 in your post, you are probably thinking of it as the line segment between points $x$ and $x'$. The addition that's going on is vector addition, though. Drawing the vectors that $x$ and $x'$ represent, you see you have two vectors extending from the origin to these two points. For any two vectors $v,w$, $v-w$ yields the vector which fits between the two tips of $w$ and $v$, and points to the tip of $v$. So, you can see that $x-x'$ has the point $x$ on its tip.
What does the $t$ contribute? If you multiply out $xt+(1-t)x'=x'+t(x-x')$, you can see that the vector $x-x'$ is being scaled by $t$ to something shorter, and then is being concatenated onto the tip of $x'$. The tip of this arrow gives another point on the segment. Ranging over all $t$ between 0 and 1, you get vectors pointing to all points on that segment.
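A quick numerical check of this, as a sketch in Python. The helper name `point_on_segment` and the sample points are made up for illustration.

```python
def point_on_segment(x, x_prime, t):
    # Computes x' + t (x - x') componentwise: start at the tip of x',
    # then move a fraction t of the way along the vector x - x'.
    return tuple(b + t * (a - b) for a, b in zip(x, x_prime))

x = (4.0, 1.0)
x_prime = (0.0, 3.0)

start = point_on_segment(x, x_prime, 0.0)   # t = 0 lands on x'
end = point_on_segment(x, x_prime, 1.0)     # t = 1 lands on x
mid = point_on_segment(x, x_prime, 0.5)     # halfway along the segment

# The same midpoint from the original form t x + (1 - t) x'.
t = 0.5
mid_convex = tuple(t * a + (1 - t) * b for a, b in zip(x, x_prime))

print(start, end, mid, mid_convex)
```

Both forms give the same point, which is just the algebraic identity $xt+(1-t)x'=x'+t(x-x')$ evaluated at $t=\tfrac12$.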
Best Answer
OK, from the discourse in the comments, I am now clear that the Binormal is generally used for curves. In the case of surfaces, the vector perpendicular to the normal and the first tangent is called a Bitangent. Specifically, if the surface is foliated with geodesics, the Binormals to these geodesics are in fact Bitangents.
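As a rough illustration, a vector perpendicular to both the normal and a tangent can be obtained with a cross product. The sketch below assumes unit, mutually perpendicular inputs and picks the order normal × tangent; the sign convention varies between sources, so treat the orientation as an assumption.

```python
def cross(a, b):
    # Cross product of two 3-vectors given as tuples.
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Hypothetical surface point: the plane z = 0, with a tangent along +x.
normal = (0.0, 0.0, 1.0)
tangent = (1.0, 0.0, 0.0)

# Assumed convention: bitangent = normal x tangent.
bitangent = cross(normal, tangent)

print(bitangent, dot(bitangent, normal), dot(bitangent, tangent))
```

The result lies in the tangent plane and is perpendicular to both inputs, which is all the definition above asks for.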