The Manhattan distance is based on absolute-value distance, as opposed to squared-error (read: Euclidean) distance. In practice, you should get similar results most of the time. Absolute-value distance tends to give more robust results, whereas Euclidean distance will be influenced more by unusual values.
This is a multivariate technique, and the "distance" between two points involves aggregating the distances along each variable. So if two points are close on most variables but more discrepant on one of them, Euclidean distance will exaggerate that discrepancy, whereas Manhattan distance will shrug it off, being influenced more by the closeness on the other variables.
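To make that concrete, here is a small sketch (my own toy numbers) comparing a point that is moderately off on every variable with one that is very close on four variables but far off on one:

import numpy as np

p  = np.zeros(5)
q1 = np.full(5, 2.0)                       # moderately off on every variable
q2 = np.array([0., 0., 0., 0., 10.])       # close on four variables, far on one

def manhattan(a, b):
    return np.abs(a - b).sum()

def euclidean(a, b):
    return np.sqrt(((a - b) ** 2).sum())

print(manhattan(p, q1), manhattan(p, q2))  # 10.0 10.0  -> equidistant
print(euclidean(p, q1), euclidean(p, q2))  # ~4.47 10.0 -> q2 looks much farther

Under Manhattan distance the two candidates are equidistant from p; under Euclidean distance the single large discrepancy makes q2 look more than twice as far away.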
According to Wikipedia, the k-medoid algorithm is not defined for Euclidean distance, which could explain why you have seen no examples of it. Presumably the reason for this is to have a robust clustering method.
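Note that since the medoid is always one of the data points, the algorithm only ever needs a pairwise dissimilarity matrix, which is why it works with Manhattan (or any other) distance. A minimal NumPy sketch of a naive alternating version (not the full PAM swap algorithm; the function and parameter names are my own):

import numpy as np

def k_medoids(X, k, n_iter=100, seed=0):
    """Naive alternating k-medoids; a sketch, not the full PAM algorithm."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # All the algorithm ever needs is this dissimilarity matrix --
    # Manhattan here, but any pairwise dissimilarity would do.
    D = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)      # assign to nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members) == 0:                  # guard against empty clusters
                continue
            # the medoid is the member minimising total distance to its cluster
            new_medoids[j] = members[D[np.ix_(members, members)].sum(axis=1).argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, labels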
begin(RantMode)
Thoughtless analysts often throw a whole bag of variables into an analysis, not all of which have much to do with the problem at hand, nor do those analysts wish to take the necessary time to discern which variables matter -- possibly by talking to subject matter experts. Such analysts (who may possibly call themselves Big Data specialists) would naturally favour a technique that was robust with respect to choice of variable. Statisticians, traditionally, go for small amounts of quality data, and thus favour squared error methods with their greater efficiency.
end(RantMode)
In my opinion your first approach isn't enough because of the difference between categorical and numerical variables. Standardisation might be more appropriate, but I don't have enough knowledge about it, so I recommend you treat those two types separately.
Your second proposition seems great because you use an appropriate distance for each type of data and combine them to obtain a final result. There is a lot to discuss about how to weight them.
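As an illustration of that second proposition, here is a sketch of a Gower-style combination: range-scaled Manhattan distance on the numeric columns, simple mismatch (Hamming) on the categorical ones, blended by a weight you would have to tune for your problem (the column values and the 0.5 default are just assumptions for the example):

import numpy as np

def mixed_distance(x_num, y_num, x_cat, y_cat, num_ranges, w=0.5):
    """Gower-style mix: range-scaled Manhattan on numeric parts,
    mean mismatch (Hamming) on categorical parts, blended by weight w.
    The weight is problem-specific; 0.5 is just a placeholder."""
    d_num = (np.abs(x_num - y_num) / num_ranges).mean()   # each term in [0, 1]
    d_cat = np.mean([a != b for a, b in zip(x_cat, y_cat)])
    return w * d_num + (1 - w) * d_cat

# toy example: two customers described by (age, income) and (city, plan)
num_ranges = np.array([60.0, 90000.0])   # observed range of each numeric column
d = mixed_distance(np.array([35., 52000.]), np.array([41., 48000.]),
                   ("Paris", "basic"), ("Lyon", "basic"),
                   num_ranges)
print(d)

Scaling each numeric term into [0, 1] before mixing is what keeps one type of variable from silently dominating the other.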
I encourage you to read this very interesting paper about categorical data, in which many distance measures are inspected:
Similarity Measures for Categorical Data: A Comparative Evaluation
by Shyam Boriah, Varun Chandola and Vipin Kumar
http://www-users.cs.umn.edu/~sboriah/PDFs/BoriahBCK2008.pdf
Some of these measures could be preferable to Hamming, depending on the case.
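Many of the measures surveyed there differ from plain Hamming by weighting matches or mismatches by how frequent the attribute values are. As a purely illustrative sketch of that idea (my own toy variant, not a specific measure from the paper):

from collections import Counter

def overlap(x, y):
    """Plain overlap/Hamming-style similarity: fraction of matching attributes."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def freq_weighted(x, y, data):
    """Illustrative variant: a match on a rare value counts for more
    than a match on a common one."""
    n = len(data)
    sim = 0.0
    for k, (a, b) in enumerate(zip(x, y)):
        if a == b:
            f = Counter(row[k] for row in data)[a] / n   # frequency of the shared value
            sim += 1 - f
    return sim / len(x)

data = [("red", "small"), ("red", "large"), ("blue", "small"), ("green", "small")]
print(overlap(data[0], data[2]))              # 0.5: they share "small"
print(freq_weighted(data[0], data[2], data))  # lower, since "small" is common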
Best Answer
I wouldn't always agree with that statement, but here is an intuition for why this should probably be the default.
But that is just an intuition, a "rule of thumb". For a discrete histogram, for example, I'd rather use a divergence-based distance.
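For instance, with SciPy (assuming the two histograms have already been normalised into probability vectors):

import numpy as np
from scipy.spatial.distance import jensenshannon, euclidean

p = np.array([0.5, 0.3, 0.2, 0.0])
q = np.array([0.0, 0.2, 0.3, 0.5])

print(jensenshannon(p, q))  # divergence-based distance on distributions
print(euclidean(p, q))      # treats the bins as independent coordinates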
Or consider "money" (I would not suggest treating it as discrete here) and the number of products bought. There will usually be some linear dependence between these attributes, in which case Euclidean is likely more appropriate (not that it would make much sense to apply a distance function to raw (quantity, cost) pairs without careful feature engineering).
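To sketch the kind of feature engineering meant here (hypothetical quantity and cost columns, toy numbers): when the two attributes are nearly collinear, raw Euclidean distance mostly measures overall "size", so you might separate the size of a purchase from its unit price before applying any distance.

import numpy as np

# hypothetical raw features: quantity of products and total cost
quantity = np.array([1, 2, 10, 50], dtype=float)
cost     = np.array([5.0, 11.0, 48.0, 260.0])

# cost is nearly a linear function of quantity, so instead of feeding
# (quantity, cost) to a distance directly, derive less redundant features:
size       = np.log1p(cost)    # overall scale of the purchase, compressed
unit_price = cost / quantity   # the part not explained by quantity alone

features = np.column_stack([size, unit_price])
print(features)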