By Sam Leach
The fourth edition in our new series exploring tracking in the digital world considers how the storage, processing and publication of location data can have significant privacy impacts on individuals and groups. Just witness the many high-profile cases where the analysis of publicly released location data has impacted individual and group privacy, potentially exposing people to unforeseen or unintended threats and harm.
These types of location privacy breaches, leakages or attacks usually share some common causes relating to flaws in the way anonymization – both ‘aggregation’ (generalisation) and ‘redaction’ (masking) – has been performed, if at all.
Mathematically speaking, precise location data is a “high dimensional” data set, especially when timestamped or tagged with other properties. This means, roughly speaking, it contains a lot of information and variations, which makes naive anonymization difficult. In other words, your location data can be like a ‘fingerprint’.
The problems start when the idea of privacy by design, or data protection by design, has not been considered, or where privacy impact assessments have focussed too narrowly on technology, not people.
In this post we list eleven privacy impacts of location data that, when ignored, can lead to tension with your people at best and real harm to them at worst.
1. Using location when location isn’t needed or expected
Does your app or service need to store and share the precise, street-level location of the people using it? If it does, are you informing those people in an understandable and transparent way? Is there a simple, explainable purpose behind this location-based feature?
2. Too precise a location when all you need is approximate location
Does your weather app need to know a precise street-level location in order to give you accurate local weather conditions, or would an approximate location (say, to the nearest five kilometers) be good enough? For cases where an approximate location is good enough, Apple introduced a ‘precise location’ toggle for all apps in iOS 14, giving users control.
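As a rough illustration of what ‘approximate location’ can mean in practice, the sketch below snaps a coordinate to the centre of a grid cell about five kilometers wide. The function name `coarsen_location`, the default cell size and the 111.32 km-per-degree constant are assumptions made for this example, not part of any platform API.

```python
import math
from typing import Tuple

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def coarsen_location(lat: float, lon: float, cell_km: float = 5.0) -> Tuple[float, float]:
    """Snap a coordinate to the centre of a grid cell roughly cell_km wide."""
    lat_step = cell_km / KM_PER_DEG_LAT
    snapped_lat = (math.floor(lat / lat_step) + 0.5) * lat_step
    # A degree of longitude shrinks towards the poles, so the longitude step
    # is derived from the snapped latitude (shared by every point in the row).
    km_per_deg_lon = KM_PER_DEG_LAT * max(math.cos(math.radians(snapped_lat)), 0.01)
    lon_step = cell_km / km_per_deg_lon
    snapped_lon = (math.floor(lon / lon_step) + 0.5) * lon_step
    return snapped_lat, snapped_lon
```

Two people standing a few streets apart will usually receive the same coarse coordinate, which is normally all a weather lookup needs. Snapping to a fixed grid, rather than adding fresh random noise on every request, also means repeated requests cannot simply be averaged to recover the precise position.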
3. Too precise a location when all you need is relative distance
Machines like to work with numbers like longitude and latitude to give the ‘absolute’ location of a person or event in a standard format. But people tend to prefer to use place names or relative distances and times, for example: “How far away are you?” or “How long until you arrive home?”
A common privacy approach in social apps, like dating apps, is to implement a proximity filter to help you meet people in your vicinity without giving them your precise current location.
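A proximity filter of this kind might look something like the sketch below: the service computes the great-circle distance between two people, but only ever exposes a coarse label. The bucket boundaries (1, 5 and 25 km) and the function names are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_label(me, other, buckets_km=(1, 5, 25)):
    """Return a coarse distance label instead of coordinates or an exact distance."""
    distance = haversine_km(me[0], me[1], other[0], other[1])
    for limit in buckets_km:
        if distance <= limit:
            return f"within {limit} km"
    return f"more than {buckets_km[-1]} km away"
```

The design choice here is that the exact distance never leaves the function; only the label is shown to the other person.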
4. Too precise relative location that can be measured repeatedly over time or space
Even if a ‘proximity filter’ is applied, if the relative distance can be measured or sampled repeatedly and accurately over time, then you can potentially ‘triangulate’ the location of another person or a specific place like their home. As an example, the Global Positioning System (GPS) essentially performs ‘triangulation in the skies’ to locate your position, using your distances from several satellites (strictly speaking, this is ‘trilateration’).
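To see why accurate relative distances are risky, consider the simplified sketch below. It works on a flat plane (coordinates in kilometers) and assumes an observer who has measured exact distances to the same target from three different positions. The function name `trilaterate` is hypothetical, and real attacks have to cope with noise and the curvature of the Earth.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def trilaterate(observers: Sequence[Point], distances: Sequence[float]) -> Point:
    """Recover a 2D position from three observer positions and exact distances."""
    (x1, y1), (x2, y2), (x3, y3) = observers
    r1, r2, r3 = distances
    # Subtracting the first circle equation from the other two leaves
    # two linear equations in the unknown position (x, y).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("observers must not lie on one straight line")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

Three readings are enough in this idealised setting. Reporting only coarse distance buckets should make the recovered position less precise, although it is unlikely to remove the risk entirely if many readings can be taken.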
5. Too precise timestamps over long time periods when they’re not needed
When we think of the word ‘tracking’, one of the things we intuitively have in mind is that the location data also has a ‘timestamp’ (time and date) associated with it.
Imagine you had the location and timestamps of a group of people over some observation period. This information alone is enough to make ‘contact tracing’ inferences like ‘person A and person B met repeatedly at location X’ or ‘person C was exposed to person D’.
So the more accurate the timestamps, the greater the privacy impacts of the inferences you could make about meetings between people, and the more important it becomes to limit data retention periods.
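One simple way to reduce timestamp precision is to round each timestamp down to the start of a fixed bucket before it is stored. The sketch below assumes timezone-aware timestamps; the one-hour default and the function name `coarsen_timestamp` are choices made for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def coarsen_timestamp(ts: datetime, bucket: timedelta = timedelta(hours=1)) -> datetime:
    """Round a timezone-aware timestamp down to the start of its bucket."""
    elapsed = ts - EPOCH
    # Integer division of timedeltas counts whole buckets since the epoch.
    return EPOCH + (elapsed // bucket) * bucket
```

With hourly buckets, ‘person A and person B were both at location X at 14:37:52’ becomes the much weaker statement that both were there at some point between 14:00 and 15:00.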
6. Too long-lived personal identifiers when they should be rotated
When we talk about ‘retention’ of personal and location data, it is most often implicitly assumed that this data has a ‘unique identifier’ for each person that may extend far beyond the retention timescale.
This need not actually be the case if we want to preserve and engineer privacy in some cases.
For example, even if the unique identifier of a person is ‘rotated’ (randomly updated) once a day, then we can still make useful analyses about traffic conditions on the roads, or footfall on the high street, using smartphone data.
This type of rotation would prevent a person’s data from being linked across days, and so would block conclusions such as ‘X repeatedly passes by location Y at time Z each day’.
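A minimal sketch of daily rotation, assuming a keyed hash (HMAC-SHA256) of the real identifier and the calendar date, is shown below. The function name `daily_pseudonym`, the 16-character truncation and the key handling are assumptions for illustration only.

```python
import hashlib
import hmac
from datetime import date


def daily_pseudonym(user_id: str, day: date, secret_key: bytes) -> str:
    """Derive an identifier that is stable within a day but changes every day."""
    message = f"{user_id}|{day.isoformat()}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an assumption of this sketch
```

Within one day, all of a person’s records share a pseudonym, so traffic or footfall analysis still works. Note that anyone holding `secret_key` and the real identifiers could recompute the pseudonyms and relink the days, so the key needs protecting, or rotating and discarding, as well.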
7. Not redacting repeated characteristics
Care needs to be taken with the repeated location characteristics of people. For example, most people can be uniquely identified from their home and office locations. It’s not a perfect ‘fingerprint’, but it is close.
One way to help increase privacy is therefore to carefully redact the start and end of journey data, or carefully add random noise using special privacy techniques.
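As an illustration of the first option, the sketch below drops every point of a journey that lies within a fixed radius of its start or end. The 200 metre default, the function names and the flat-Earth distance approximation are assumptions of this example; it does not cover noise-based techniques.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_M = 6_371_000.0


def approx_distance_m(a: LatLon, b: LatLon) -> float:
    """Equirectangular approximation; adequate over a few kilometers."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def trim_journey(points: List[LatLon], radius_m: float = 200.0) -> List[LatLon]:
    """Remove all points within radius_m of the journey's first or last point."""
    if not points:
        return []
    start, end = points[0], points[-1]
    return [
        p for p in points
        if approx_distance_m(p, start) > radius_m and approx_distance_m(p, end) > radius_m
    ]
```

A fixed radius centred exactly on the home can itself leak information, because the hidden zone forms a circle around it. Production systems may therefore want to offset or randomise the hidden zone, which this sketch does not attempt.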
8. Not redacting rare characteristics
Rare characteristics in location data sets can be exploited if combined with other ‘external’ data. For instance, timestamped photos of a person at a known location could be used together with journey data to figure out their unique identifier in the tracking data, and hence all their other trips in that data set. Thus a person is reidentified from ‘pseudonymised’ data.
Rare characteristics ‘add information’ because they ‘narrow down the size of the haystack’ in which a needle (identity of a person) can be found.
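One common-sense way to redact rare characteristics before publishing aggregate data is to suppress any location cell visited by fewer than some minimum number of distinct people. The sketch below assumes visits arrive as `(person_id, cell)` pairs; the threshold `k=5` and the function name are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def publishable_counts(visits: Iterable[Tuple[str, str]], k: int = 5) -> Dict[str, int]:
    """Count distinct people per cell, dropping cells with fewer than k people."""
    people_by_cell = defaultdict(set)
    for person_id, cell in visits:
        people_by_cell[cell].add(person_id)
    return {
        cell: len(people)
        for cell, people in people_by_cell.items()
        if len(people) >= k
    }
```

Counting distinct people rather than raw observations matters: one person who visits a remote place a hundred times is still a needle in a very small haystack.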
9. Too complicated privacy settings
Many organisations provide people with privacy ‘settings’ or ‘choices’ in order to give ‘control’ to users of their service. However, the more settings there are, the harder it becomes for an everyday person with limited time to understand the privacy impact of each setting.
People ultimately want control over their information, not over app settings.
10. Remote server features when local smartphone features will do
Do you need to transfer a person’s location from their phone to a remote server for storage and processing, or can you just perform a service ‘locally’ on their phone? An example here is ‘geofence’ or place-based features on some safety apps. Besides, there can often be many benefits to locally processed app features, such as graceful handling of offline conditions (when there is no internet connection).
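The logic of a locally evaluated geofence can be sketched as below. It is written in Python for consistency with a server-side codebase, whereas a real app would use the phone platform’s own geofencing facilities. The class and function names are hypothetical. The point is that the coordinates are only compared on the device, and at most a short event string such as ‘entered home’ would need to be sent anywhere.

```python
import math
from dataclasses import dataclass
from typing import Optional

EARTH_RADIUS_M = 6_371_000.0


@dataclass(frozen=True)
class Geofence:
    name: str
    lat: float
    lon: float
    radius_m: float


def is_inside(fence: Geofence, lat: float, lon: float) -> bool:
    """Check a position against a circular geofence (short-range approximation)."""
    mean_lat = math.radians((fence.lat + lat) / 2)
    dx = math.radians(lon - fence.lon) * math.cos(mean_lat)
    dy = math.radians(lat - fence.lat)
    return EARTH_RADIUS_M * math.hypot(dx, dy) <= fence.radius_m


def geofence_event(fence: Geofence, was_inside: bool, lat: float, lon: float) -> Optional[str]:
    """Return an event only when the inside/outside state changes."""
    now_inside = is_inside(fence, lat, lon)
    if now_inside and not was_inside:
        return f"entered {fence.name}"
    if was_inside and not now_inside:
        return f"left {fence.name}"
    return None
```

Because nothing here needs a network connection, the same check keeps working when the phone is offline.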
11. Using location data which is highly correlated with protected characteristics
If your system makes a prediction relying upon location data (such as a home postcode) that is highly correlated with some protected characteristic (such as ethnicity), then your system may be making unfair or illegal interventions, reinforcing stereotypes, or causing real harm.
For privacy specialists, the above list of eleven impacts is not meant to be a comprehensive framework for the technical aspects of engineering privacy and preventing harm. We know privacy specialists use techniques such as ‘differential privacy’ and ‘federated learning’, which are being explored around the world in specific product domains as well as in national censuses. We also can’t ignore the need for organisational privacy controls as well as the need for education.
Rather, we hope this list offers a set of now-common-sense intuitions about the ways in which things most often go wrong for people and organisations when handling location data.
We believe that individuals, businesses and organisations will become even more aware of the privacy impacts of location data, and we support the need for accountability and trust in this domain. We also welcome the continued iteration and development of frameworks and legislation around the world, such as the GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act), and the implementation of products and services that use ‘privacy by design’.
To find out more about the work of Track24, visit our main website here: www.track24.com
- Strava Privacy Settings: https://support.strava.com/hc/en-us/articles/216918777-Privacy-Settings
- Strava Engineering heat map blog post (November 2017): https://medium.com/strava-engineering/the-global-heatmap-now-6x-hotter-23fc01d301de
- Nathan Ruser tweetstorm (January 27th): https://twitter.com/Nrg8000/status/957318498102865920
- Washington Post article: https://www.washingtonpost.com/world/a-map-showing-the-users-of-fitness-devices-lets-the-world-see-where-us-soldiers-are-and-what-they-are-doing/2018/01/28/86915662-0441-11e8-aa61-f3391373867e_story.html
- Example of 1: Famously, up until 2015, Facebook Messenger automatically shared your location with your friends when you sent them a message, until they came under fire for letting your friends ‘stalk’ you (https://www.tripwire.com/state-of-security/security-data-protection/cyber-security/stalk-location-facebook-messenger/). The feature was withdrawn and has now been reworked as the more privacy-conscious and time-limited Live Location feature of Messenger.