Device fingerprinting is a technique that got popular at the end of the 90s by websites, to identify and track users. Apps have in contrast to websites often access to a much wider property range usable for fingerprinting. The unique device identification via fingerprinting across sandbox borders of multiple apps is a relevant privacy issue as it increase the possibilities for tracking users, even without requiring permissions. Many smartphone users are unaware of such practices and possibilities because such activities happen unnoticeably in the background.
Just like each person has an individual fingerprint, electronic devices are also uniquely identifiable. Through manufacturing tolerances, different hardware configurations, software properties and individual software configurability such devices become uniquely identifiable. Depending on the number and the variance of properties used, more or less unique fingerprints can be generated. Typically, several hard- and software device properties are combined through a hashing algorithm to an ideally unique bit sequence.
Motivations for app developers to include tracking libraries are: advertisement, bot detection, account takeover, spam, fraud detection and secure environment detection for payments. Mobile operating systems such as Android and iOS are aware of the user’s transparency through fingerprinting and take steps to empower the user to regain privacy whilst trying to maintain legitimate tracking objectives. The advertisement ID was introduced especially to target device tracking with a resettable ID provided by iOS. It allows advertisers to personalize ads and at the same time give the user options to reset it or hide it for individual apps. Also, Privacy Labels are added to the app store to better explain the app’s data usage to the user. Permissions and Privacy Labels should shed light on the usage of tracking, and it’s purpose during app install and usage. Apple AppStore’s licence agreement also states clear rules on data and identifier usage. However, as shown by Deng et al., many apps exist in the AppStore infringing such rules, working their way around permissions and being dishonest with their Privacy Labels. One increasing way to bypass such restrictions is to use fingerprinting techniques not requiring user consent.
Tracking Possibilities with Fingerprinting
The fingerprint generated by each app containing the same fingerprinting SDK are ideally the same on the same device. The user becomes very transparent by correlating the fingerprint with other data provided during app usage. As an example, one could think of a scenario, pictured in the following figure, where a shopping app, a search engine app and a navigation app all contain the same fingerprinting SDK. All user input of each app like recent purchases, search terms and the device location could be accumulated in the cloud under the same fingerprint. The data collection of each app alone is already a privacy risk to the user, but the combination of different app data makes users fully transparent. Such data is extremely valuable for example for advertisement companies to personally tailor advertisements for individual users.
Fingerprinting SDKs and Commonly used Properties
To gain insight into the current state of iOS fingerprinting and the latest techniques being used, we identified real iOS fingerprinting SDKs through systematic internet research and conducted manual static and dynamic analysis. The respective SDKs source-code was manually analysed with Ghidra and if possible, the SDK was tested in a minimal app construct. During the manual analysis we judged the SDK as fingerprinter or non-fingerprinter based on the observed behaviour and used properties. This leaves 13 SDKs that we classify as fingerprinters and further analyse in more detail. The SDKs are listed in the following table. We found that all 13 SDKs use the native iOS API to collect device characteristics. Through dynamic analysis of custom-built test apps, we have found that the collection of features shows up through temporal spikes. Most SDKs send the collected device characteristics to a server for further processing, except for OpenIDFA, DFID and DFIDSwift, which are non-commercial products. Additional passive fingerprinting may be performed on the server side. The SDKs that do not include server communication generate a hash on the device and return it. Nevertheless, we argue that fingerprinting is generally only useful when the fingerprint leaves the device. Therefore, it can be assumed that in these cases the caller of the fingerprinting framework handles the network communication. It is obvious from the following table that fingerprinting SDKs share some
collected properties, while other properties are exclusively queried by some SDKs. Very commonly used properties are: device model, identifier for vendor (similar to advertisement ID), device name, storage space properties, battery level and different locale identifiers. Other properties like device fonts, commonly used in web based fingerprinters, is rather seldom used. We suppose, due to the low entropy of this value on iOS devices. In conclusion, one can say that fingerprinters usually use multiple properties and don’t rely on a unique identifier provided through the advertisement ID in iOS.
Fingerprinting Access Pattern
We were able to observe access to common iOS APIs gathered in the previous step. Frida hooks were written to log access to respective APIs in a dynamic app analysis on a jailbroken iPhone X (iOS 14.4.2). In the next step, we analysed several top apps from the app store to see if fingerprinting can be observed based on API access pattern during execution.
Firstly, let’s have a look at an example of an app without fingerprinting in the following figure. One can see that different properties were used during the app execution, some less, some more frequently. The accesses to the observed properties came from different caller modules (packages or libraries) from inside the app. Also, the properties were accessed throughout the whole runtime. Access to properties such as timezone and locale are typically required to properly display and run the app and totally legitimate.
Now, let’s look at an app which we judge as fingerprinting in the following figure. One can see that from second 15 to 17, many properties are accessed in a very short timespan. Furthermore, the accesses come from the same caller module called ForterSDK, which is a known fingerprinting SDK. We observed such behaviour for all fingerprinting SDKs: Many properties are collected by the same caller module in a short time frame.
As a proof of concept, we developed a fingerprinting detector to detect fingerprinting on the traces of a dynamic app analysis. The analyser is implemented as a sliding window of a certain width. The window of four seconds width slides over the event timeline and counts the number of accesses to known fingerprinting APIs per caller module. Access numbers above a certain threshold are then justified as fingerprinting. In a first test, we were able to correctly detect fingerprinting in 85% of the cases just by analysing access pattern. As further refinements, machine learning could improve accuracy as well as the inclusion of known fingerprinting SDK properties through a static analysis. Further insights on our used detection mechanism can be read in our extensive paper at ACM, published at the EICC conference 2023.
Fingerprinting as of today is used in many apps. Smartphone users are mostly unaware of the existence and the privacy issues related with such. This research lays the fundamentals to identify device fingerprinting in iOS apps. Future work will be to come up with remediation ideas to balance the privacy interests of smartphone users and fingerprinters.
This research work was supported by the National Research Center for Applied Cybersecurity ATHENE. ATHENE is funded jointly by the German Federal Ministry of Education and Research and the Hessian Ministry of Higher Education, Research and the Arts.
Note: Fingerprinting detection is currently work in progress and not (yet) part of the Appicaptor service.