[GIS] Would you consider online geocoding a breach of privacy?

geocoding security

Suppose I have a set of addresses of individuals participating in a study (most likely health-related, where privacy and ethical considerations are always important).

Nowadays, providers like Google or Yahoo offer decent results in terms of positional accuracy.

The North American Association of Central Cancer Registries (NAACCR) lists such options in their 'Geocoding Best Practices: Review of Eight Commonly Used Geocoding Systems' and 'A Geocoding Best Practices Guide' guides.

Cinnamon and Schuurman (2010), for example, used the BatchGeocode service as part of their tool for investigating injuries in low-resource settings.

Would you consider geocoding such addresses using online services, like Google Maps or OpenStreetMap, a breach of privacy?

A possibly related question is "Geocoding USA addresses that cannot be sent over internet?"

Epidemiology (one of the leading peer-reviewed journals in the field) published a short communication with detailed instructions on how to geocode using the Google Maps and Places APIs. Interestingly, it did not say a word about security or privacy.

Best Answer

There is definitely a privacy implication here, particularly if you are working with small batches of data. Anyone mining the data stream can infer that all requests in the same batch have something in common, even if the medical condition or other personal information is never disclosed over the wire.

A better technique is to batch up lots of unrelated data / patients for bulk geocoding.

For example, combine the addresses you need geocoded with those of other researchers; the more unrelated the studies, the better. Randomize the order of the requests, and once per day process the whole queue in a single run.

Now it becomes vastly harder to mine the data, even if an attacker is able to overhear the geocoding requests.
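A minimal Python sketch of that pooling idea follows. The function names, the study labels, and the `geocode` callable are all hypothetical; `geocode` stands in for whatever provider client you actually use, and no real provider API is shown. The only thing passed to it is the address string, in shuffled order, while the mapping back to each study stays on your own machine.

```python
import secrets
from typing import Callable


def pool_and_shuffle(batches: dict[str, list[str]]) -> tuple[list[str], list[tuple[str, int]]]:
    """Merge per-study address lists into one randomly ordered queue.

    Returns the shuffled addresses and a parallel list of (study, index)
    keys. The keys are meant to stay local and never be sent anywhere.
    """
    entries = [
        (study, i, address)
        for study, addresses in batches.items()
        for i, address in enumerate(addresses)
    ]
    # SystemRandom draws from the operating system's entropy source,
    # so the order is not reproducible from a guessable seed.
    secrets.SystemRandom().shuffle(entries)
    queue = [address for _, _, address in entries]
    keys = [(study, i) for study, i, _ in entries]
    return queue, keys


def geocode_pooled(batches: dict[str, list[str]], geocode: Callable[[str], object]) -> dict[str, list]:
    """Geocode the pooled queue and route each result back to its study."""
    queue, keys = pool_and_shuffle(batches)
    results: dict[str, list] = {study: [None] * len(a) for study, a in batches.items()}
    for (study, i), address in zip(keys, queue):
        # Hypothetical geocoder call: it only ever sees the address string.
        results[study][i] = geocode(address)
    return results
```

Calling `geocode_pooled({"study_a": [...], "study_b": [...]}, my_geocoder)` once per day on the accumulated queue would match the schedule described above. Note that this only hides which addresses belong together; each individual address is still sent to the provider.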
