A somewhat complicated answer to your first question about distances goes like this.
Rencher's Multivariate Analysis cites Lance and Williams (1967) who proposed a general (flexible beta) method underlying most of the existing hierarchical cluster analysis: for three objects A, B and C, be that clusters or individual points considered to be clusters of size 1, if clusters A and B are joined into AB, then the new distance from AB to C can be expressed by
$$
d(AB,C) = \alpha_A d(C,A) + \alpha_B d(C,B) + \beta d(A,B) + \gamma| d(C,A) - d(C,B) |.
$$
They suggested $\alpha_A + \alpha_B + \beta=1$, $\alpha_A = \alpha_B$, $\gamma=0$ and $\beta < 1$, although other choices are possible. E.g., single linkage is $\alpha_A = \alpha_B = 1/2$, $\beta=0$, $\gamma=-1/2$ (contracting the space); complete linkage is the same except for $\gamma=1/2$ (dilating the space); and Ward's method is $\alpha_A = (n_A + n_C)/(n_A + n_B + n_C)$, $\alpha_B = (n_B + n_C)/(n_A + n_B + n_C)$, $\beta= - n_C/(n_A + n_B + n_C)$, and $\gamma=0$ (somewhat space-contracting), where $n_A$, $n_B$ and $n_C$ are the cluster sizes. The book discusses the properties of the algorithms, such as monotonicity and space contraction/dilation, as functions of these parameters.
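As a concrete illustration, here is a minimal sketch of one Lance–Williams update step; the coefficient tables below just encode the single, complete, and Ward choices quoted above (the function name and defaults are my own):

```python
def lance_williams(d_ca, d_cb, d_ab, n_a=1, n_b=1, n_c=1, method="single"):
    """Distance from the merged cluster AB to C via the flexible-beta formula."""
    n = n_a + n_b + n_c
    params = {
        # (alpha_A, alpha_B, beta, gamma)
        "single":   (0.5, 0.5, 0.0, -0.5),
        "complete": (0.5, 0.5, 0.0,  0.5),
        "ward":     ((n_a + n_c) / n, (n_b + n_c) / n, -n_c / n, 0.0),
    }
    aA, aB, beta, gamma = params[method]
    return aA * d_ca + aB * d_cb + beta * d_ab + gamma * abs(d_ca - d_cb)

# Single linkage reduces to min(d_ca, d_cb), complete linkage to max:
print(lance_williams(2.0, 5.0, 4.0, method="single"))    # 2.0
print(lance_williams(2.0, 5.0, 4.0, method="complete"))  # 5.0
```

The $\gamma = \pm 1/2$ terms are what turn the weighted average into a min or a max, which is a nice way to see the space-contracting vs. space-dilating behavior.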
So an algorithm can be built just from the distances, and does not need the data matrix. However, the distance matrix for typical problems (sample size $n$ > dimension of the data $p$) takes more memory than the data matrix ($n(n-1)/2$ entries vs. $np$), so this may not be a very efficient way of approaching the problem, unless of course the distances are all you have.
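To make the distance-only point concrete, here is a sketch using SciPy (an assumption on my part; the two-blob toy data is also made up): `linkage` consumes only the condensed distance matrix, never the data matrix itself.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two well-separated blobs in 3 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(8, 1, (20, 3))])

d = pdist(X)                   # condensed distance matrix: n(n-1)/2 entries
Z = linkage(d, method="ward")  # clustering proceeds from distances alone
labels = fcluster(Z, t=2, criterion="maxclust")

print(len(d), X.size)  # 780 distances vs 120 data entries
```

Note the memory trade-off in the last line: with $n = 40$ and $p = 3$, the distances already outnumber the data entries by a factor of about six.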
I know of people who spatially cluster individual crime types: see the CrimeStat documentation for a number of applied examples. I don't see much utility in trying to separate different clusters based on the crime type though. Many places are crime generalists, such as a busy commercial area which will have many assaults, robberies, and thefts. These overlapping hot spots would be difficult to separate in any supervised clustering technique.
About the only crime type for which I'd expect this to be feasible is residential burglary; those hotspots tend to differ from areas where crime is elevated because more people are walking around and interacting.
I can see some utility in such a project though. A hotspot that has many different crime types and a hotspot that only has one crime type may require different strategies by the police department to address the crime problems. That might call for unsupervised classification though.
Best Answer
Epsilon is the local radius for expanding clusters. Think of it as a step size: DBSCAN never takes a step larger than this, but by chaining many such steps a DBSCAN cluster can become much larger than eps.
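A quick sketch of this chaining effect, assuming scikit-learn is available (the 1-D chain of points is invented for illustration):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Points spaced 0.5 apart along a line: each step is within eps=0.6,
# so DBSCAN chains them into one cluster of total extent 9.5 >> eps.
X = np.arange(0, 10, 0.5).reshape(-1, 1)
labels = DBSCAN(eps=0.6, min_samples=2).fit_predict(X)
print(set(labels))  # {0}: a single cluster far larger than eps
```

This is exactly why eps bounds the step size but not the cluster radius.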
If you want your "clusters" to have a maximum radius, that is a set-cover type of problem, so you will probably want a greedy approximation. It's not a clustering problem, because you do not allow the algorithm to discover structure larger than that radius. You want to approximate your data with a cover, ignoring structure.
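A minimal sketch of such a greedy approximation (my own illustration, not a standard library routine): repeatedly pick the point whose ball of radius `r` covers the most still-uncovered points, until everything is covered.

```python
import numpy as np

def greedy_cover(X, r):
    """Greedy set cover with balls of radius r; returns chosen center indices."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    within = D <= r                       # within[i, j]: j lies in i's r-ball
    uncovered = np.ones(len(X), dtype=bool)
    centers = []
    while uncovered.any():
        gain = (within & uncovered).sum(axis=1)  # uncovered points each ball grabs
        c = int(gain.argmax())
        centers.append(c)
        uncovered &= ~within[c]
    # every point now lies within r of some chosen center
    return centers

X = np.array([[0.0], [0.4], [0.9], [5.0], [5.3]])
print(greedy_cover(X, r=0.5))  # [1, 3]
```

Every point ends up within `r` of a center, so the radius bound is guaranteed by construction, which a clustering algorithm will not promise.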
But there are some clustering algorithms where you can bound the cluster radius (but they probably won't try hard enough to optimize for your problem):