Each Wi-Fi- or Bluetooth-enabled device, such as a smartphone or a wireless printer, transmits messages to connect with the nearest cell tower or Wi-Fi router and/or communicate with other devices. These messages contain, amongst other things, information that identifies the device sending the message and the strength of the signal transmitted by the device. Using this identification information, one can determine how many unique devices are present at a certain location. Moreover, by comparing the lists of unique devices of two locations, one can determine how many devices moved from one location to the next. This is important information when monitoring large crowds.
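The counting logic described above can be illustrated with a minimal sketch. It assumes that each location delivers a plain list of device identifiers; the function names and the sample identifiers are hypothetical and are not taken from the CityFlows system.

```python
def count_unique_devices(identifiers):
    """Number of distinct devices seen at one location."""
    return len(set(identifiers))


def count_moved_devices(origin_identifiers, destination_identifiers):
    """Number of devices seen at both locations.

    A device that appears in both lists is assumed to have moved
    from one location to the other.
    """
    return len(set(origin_identifiers) & set(destination_identifiers))


# Hypothetical identifiers; "b2" was detected twice at location A.
location_a = ["a1", "b2", "b2", "c3"]
location_b = ["b2", "c3", "d4"]

print(count_unique_devices(location_a))             # 3
print(count_moved_devices(location_a, location_b))  # 2
```

Using sets removes repeated detections of the same device, so each device is counted once per location.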
A Wi-Fi/Bluetooth sensor detects the communication of these Wi-Fi- and Bluetooth-enabled devices with the nearest cell tower or Wi-Fi router and extracts the unique media access control (MAC) addresses of these mobile devices. It is important to realize that only part of the messages is detected: the sensor captures roughly 20% of the devices in its vicinity. Moreover, some types of mobile devices shield their MAC address. The messages of these devices can consequently not be used to track their movements through the sensor network.
The MAC address, which is part of each message, is privacy-sensitive information, as it can be used to identify one particular Wi-Fi-enabled device that belongs to a particular person. It is therefore important to safeguard citizens' right to privacy. To this end, the Wi-Fi/Bluetooth sensors applied in the CityFlows project anonymize this data as soon as possible, in such a way that a person is ‘forgotten’ at least once per day, while retaining the properties of the data that allow one to analyze the crowd’s movement behaviour. The anonymization process consists of four steps, which are performed each minute:
- Capturing the messages and extracting the MAC address and timestamp from each message.
- Scrambling the MAC addresses by applying a hashing algorithm. All sensors within the monitoring system apply the same hash at the same time, and this hash changes at least once per day.
- Deleting a part of each hashed address, which ensures that even if one knows the exact hashing algorithm, one cannot recreate the original MAC address.
- Transmitting the list of shortened addresses to the CM-DSS database via a secure connection.
After this process, the CM-DSS database contains a list of hashed identifiers that can be used to analyze the crowd’s movements, but cannot be used to identify one particular mobile device or one particular person.
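The first three anonymization steps can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the hashing algorithm, so SHA-256 is used here as a stand-in; the daily change of the hash is modelled as a salt shared by all sensors; and the number of retained characters (`KEEP_HEX_CHARS`) is a hypothetical value. The secure transmission to the CM-DSS database is left out.

```python
import hashlib

# Hypothetical: how many hex characters of the hash are kept.
KEEP_HEX_CHARS = 12


def anonymize(messages, daily_salt, keep=KEEP_HEX_CHARS):
    """Turn (mac_address, timestamp) pairs into (short_id, timestamp) pairs.

    daily_salt is assumed to be a byte string shared by all sensors
    and replaced at least once per day.
    """
    anonymized = []
    for mac_address, timestamp in messages:
        # Step 1: keep only the MAC address and timestamp of each message.
        normalized = mac_address.lower().encode("ascii")
        # Step 2: scramble the MAC address with the salted hash (SHA-256 assumed).
        digest = hashlib.sha256(daily_salt + normalized).hexdigest()
        # Step 3: delete part of the hash so it cannot be matched in full.
        anonymized.append((digest[:keep], timestamp))
    return anonymized


# Hypothetical captured messages and salts.
messages = [("AA:BB:CC:DD:EE:01", 1000), ("aa:bb:cc:dd:ee:01", 1060)]
today = anonymize(messages, daily_salt=b"salt-day-1")
tomorrow = anonymize(messages, daily_salt=b"salt-day-2")
```

Because every sensor applies the same salt on a given day, a device receives the same shortened identifier at each sensor, which keeps movement analysis possible. Once the salt changes, the identifiers no longer match those of the previous day, which corresponds to a person being ‘forgotten’.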