For high-dimensional data, shared-nearest-neighbor distances have been reported to work well; see Houle et al., "Can Shared-Neighbor Distances Defeat the Curse of Dimensionality?", Scientific and Statistical Database Management, Lecture Notes in Computer Science 6187, p. 482, doi:10.1007/978-3-642-13818-8_34.
Fractional distances are known not to be metrics: $L_p$ is a metric only for $p \geq 1$, a restriction you will find in every proof of the metric properties of Minkowski norms.
$$\text{cos-dist}(A, B) = 1 - \text{cos-sim}(A, B)$$
$$\text{cos-sim}(A, B) = \frac{\langle A, B \rangle}{||A|| \cdot ||B||} = \frac{\sum\limits_{i=1}^n A_i \cdot B_i}{\sqrt{\sum\limits_{i=1}^n A_i^2} \cdot \sqrt{\sum\limits_{i=1}^n B_i^2}}$$
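The two definitions above translate directly into code. A minimal sketch in Python (the helper names `cos_sim` and `cos_dist` are mine, mirroring the formulas):

```python
import math

def cos_sim(a, b):
    """Cosine similarity: <a, b> / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cos_dist(a, b):
    """Cosine distance: 1 - cosine similarity."""
    return 1.0 - cos_sim(a, b)

# Parallel vectors give distance 0; orthogonal vectors give distance 1.
print(cos_dist([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cos_dist([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Note that the similarity depends only on direction, not magnitude, which is why scaling a vector leaves the distance unchanged.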
The triangle inequality for cosine distance would take the form (and, of course, it does not hold in general):
$$\text{cos-dist}(A,C) \nleq \text{cos-dist}(A, B) + \text{cos-dist}(B, C)$$
which is equivalent to:
$$1 - \text{cos-sim}(A,C) \nleq \bigl(1 - \text{cos-sim}(A, B)\bigr) + \bigl(1 - \text{cos-sim}(B, C)\bigr)$$
and, after a simple rearrangement:
$$1 + \text{cos-sim}(A, C) \ngeq \text{cos-sim}(A, B) + \text{cos-sim}(B, C)$$
Now we look for three vectors $A$, $B$, and $C$ such that:
$$1 + \text{cos-sim}(A, C) < \text{cos-sim}(A, B) + \text{cos-sim}(B, C)$$
Let $A, B, C \in \mathbb{R}^2$, all of unit length: $A = [1, 0]$, $B = \left[\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right]$, $C = [0, 1]$.
Note that $A$ and $C$ are orthogonal, so their cosine similarity is simply $0$:
$$\text{cos-sim}(A, C) = \frac{0}{\sqrt{1}\sqrt{1}} = 0$$
The pairs $A, B$ and $B, C$ both give the same value:
$$ \text{cos-sim}(A, B) = \frac{\frac{\sqrt{2}}{2} + 0}{\sqrt{1}\sqrt{1}} = \frac{\sqrt{2}}{2},~~~ \text{cos-sim}(B, C) = \frac{0+\frac{\sqrt{2}}{2}}{\sqrt{1}\sqrt{1}} = \frac{\sqrt{2}}{2}$$
Finally, this violates the triangle inequality, since:
$$ 1 + 0 < \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}$$
$$ 1 < \sqrt{2} \approx 1.41 \dots$$
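The counterexample above can be checked numerically. A quick sanity check in Python (the `cos_dist` helper is my own sketch of the formula from the question):

```python
import math

def cos_dist(a, b):
    """Cosine distance: 1 - <a, b> / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

A = [1.0, 0.0]
B = [math.sqrt(2) / 2, math.sqrt(2) / 2]
C = [0.0, 1.0]

lhs = cos_dist(A, C)                   # 1.0 (orthogonal vectors)
rhs = cos_dist(A, B) + cos_dist(B, C)  # 2 - sqrt(2), about 0.586
print(lhs > rhs)  # True: the triangle inequality is violated
```

The detour through $B$ is "shorter" than the direct path from $A$ to $C$, which is exactly what the triangle inequality forbids.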
Best Answer
First of all, in many applications you do not need a distance metric; a dissimilarity measure is enough. So make sure the triangle inequality is actually needed.
In mathematics, the triangle inequality is part of the definition of a metric, and "distance" is synonymous with "metric". In the database literature, however, distances are often not required to be metric.
Second, we cannot recommend a metric for your data, if we don't know your data.
Third, cosine is closely related to Euclidean distance. Assuming that all your data is normalized to unit length ($||x||=1=||y||$), then \begin{align*} \text{Euclid}^2(x,y)&=\sum_i (x_i-y_i)^2\\ &=\sum_i x_i^2+\sum_i y_i^2-2\sum_i x_iy_i\\ &=1+1-2\, x\cdot y\\ &=2(1-x\cdot y) \end{align*} Therefore, if your data is normalized to unit length, $$ \sqrt{1-x\cdot y} $$ is a metric, because, as just shown, $\sqrt{1-x\cdot y}=\sqrt{\tfrac{1}{2}}\,\text{Euclid}(x,y)$.
While this may get you overly excited that there is a metric based on the dot product, recall that it only holds if all your data lies on the unit sphere, and then it is just the Euclidean metric. If this is the behaviour you want, normalize your data and use Euclidean distance. Cosine distance is exactly this normalization: it divides by the lengths of the vectors to ensure they are of unit length.
If your data is sparse, and you can afford to keep all vector lengths in memory, then this may be a faster way to compute Euclidean distance. With a sparsity of $s$ (the fraction of nonzero entries), the expected sparsity of the dot product is $s^2$, so a good implementation can yield a substantial performance benefit of about $1/s$.
Update: it was pointed out to me that computing Euclidean distance this way can suffer from a numerical instability known as "catastrophic cancellation".