For example, if I drop a ball from a $50$-meter-tall building, I will assume that

  1. the ground is at $0$ meters

  2. downward is positive (which makes gravity positive, downward velocity positive, etc.)

So with that, if I use $X_f = X_i + V_i t + \frac{1}{2}at^2$, I get something like
$0 = 50 + 4.9t^2$, which has no real solution for $t$.

Instinctively I know what to do, but when I think more about sign conventions, it gets confusing.

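You want $X_i=-50$. The ground is zero and down is positive, so the top of the building, which is $50$ meters above the ground, is at $-50$. There's no universal convention. You have to pick an origin and a positive direction for each problem and then apply them consistently to every position, velocity, and acceleration. Fortunately, once you do it several times you'll get the hang of it.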
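To check that this fixes the equation, here is the calculation written out. It assumes the ball is dropped from rest (so $V_i = 0$) and takes $a = 9.8\ \text{m/s}^2$, which is what the $4.9$ in your equation implies:

$$0 = -50 + 0\cdot t + \tfrac{1}{2}(9.8)\,t^2 \quad\Longrightarrow\quad t^2 = \frac{50}{4.9} \approx 10.2 \quad\Longrightarrow\quad t \approx 3.19\ \text{s}$$

If you had instead kept the ground at zero but chosen up as positive, you would have $X_i = +50$ and $a = -9.8$, giving $0 = 50 - 4.9t^2$ and the same $t \approx 3.19\ \text{s}$. Either choice works as long as every quantity follows it.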

