What data is transferred by business apps, and how secure is their processing? Our research shows: If your employees use apps arbitrarily, you put your company’s security at risk.
At it-sa 2024, we present our app analysis framework Appicaptor. You can use it to automatically check whether apps are compliant with your company’s IT security demands. New results within the Athene funded research project FiDeReMA will improve Appicaptor analysis techniques. Among other things, the goal is to identify and evaluate privacy implications when using arbitrary apps.
Our current efforts to improve Appicaptor revolve around different privacy aspects:
Device fingerprinting
App DSGVO Consent Banners
Permission piggybacking of Android in app third-party libraries
Device Fingerprinting
Device fingerprinting is a technique used to uniquely identify devices and therewith typically also users. Mobile apps make use of different device properties such for example device name, software version and others. They combine such values mostly with a hashing function to a unique identifier. Use cases for device fingerprinting are app usage statistics, fraud detection and mostly targeted advertisement.
When employees install apps with device fingerprinting on their mobile corporate devices, it can lead to the loss of sensitive business data for companies. Attackers can acquire the collected data and potentially identify the devices of company management, spy on trade secrets, and ascertain customer contacts. In practice, the Cambridge Analytica case shows how sufficient data from various sources can be used to analyze users and manipulate them through targeted advertising, thereby influencing even election results.
Our analysis results show that more than 60% of the top 1,000 apps on Google Play employ device fingerprinting techniques. This allows a unique device identification even across app borders.
We extracted 30,000 domains from the most popular 2,000 iOS and Android apps. Afterwards, we filtered out domain names for the most prevalent device fingerprinters covering 90% market share. We were able to prevent device fingerprinting in 40% of the cases by using publicly available domain blocklists. These lists are specialized on tracking and advertisement domains and can be easily applied to for example a firewall. Another 40% would easily be blockable by updating the blocklist: Some fingerprinters use random subdomains for their communication in the form of https://8726481.iFingerprinter.org giving many possibilities to be blocked.
We also proposed an approach to randomize return values of popular properties used for device fingerprinting. Our results prove that this technique dramatically reduces the uniqueness of the device fingerprint. Nevertheless, respective APIs remain accurate enough for their intended use cases. The hope is that our proposed techniques are included in future Android and iOS releases.
In-App DSGVO Consent Banners
Many of the previously mentioned device fingerprinting properties would typically require user consent as stated by the DSGVO. According to recent court decisions and EU regulations, rejecting must be just as easy as accepting consent banners. However, many companies don’t obey such regulations. Those who obey mostly apply dark pattern to push the user into accepting all tracking.
We are currently working on detecting if apps provide a reject option on first instance, without having to click through several submenus. In our attempt, we are employing artificial intelligence to analyse such consent banners for reject options on first instance. The results yield a success rate of 82%. Some apps try hard to remain within legal boundaries but make it hard for users to reject data usage. Can you find the first level reject option in the following consent banner? – Yes there is an option.
We have collected the most interesting consent banners and bundled them into a game. With the app “Reject all cookies” you need to overcome several consent banners and keep your data private. Having rejected all “cookies” in the app you can win real cookies at our it-sa 2024 booth (6-314).
Permission piggybacking in Android Apps
Traditionally, Android apps comprise a main application and several supporting libraries. These libraries often inherit permissions from the main app, granting them unnecessary access beyond their core functions. This leads to permission piggybacking, where libraries exploit inherited permissions to gather data without requesting them directly. Advertisement and tracking libraries particularly leverage this tactic to collect extensive user data.
We have developed an analysis tool to detect third-party app libraries only probing for already granted permissions, without ever requesting permissions themselves.
Analysing the top 1,000 apps on Google Play reveals that 50% of the libraries exhibit this permission probing behaviour. Presumably, these libraries then adapt their behaviour according to the granted permissions of the main app. Accordingly, they collect more or less data, as available. Most of the identified libraries exhibiting such behaviour are well-known advertising and tracking libraries. This fact underlines the urge for a finer granular permission system to separate main app and third-party libraries.
Visit us at it-sa 2024 and have a cookie with us!
You’ll find us in hall 6, booth number 6-314 for a demonstration and discussions.
Since May 2024, the inclusion of an iOS Privacy Manifest has been a requirement for app submissions with newly added third-party SDKs. We analyzed first results about data collection practices, compliance issues with Apple’s guidelines, and privacy risks posed by SDK providers.
Apple mandates that all app submissions with specific newly added third-party SDKs have to include a comprehensive iOS Privacy Manifest. This manifest outlines the data collection and usage practices of apps, providing users with transparency and empowering them to make informed decisions about their digital privacy. As outlined in our previous blog post, Appicaptor correlates the privacy manifest contents with additional privacy findings through static analysis.
In our evaluation, we analyzed the iOS Privacy Manifests definitions of our app sample set, that consists of the most popular 2,000 free iOS apps available on the German Apple App Store in June 2024.
Apple mandates that apps with newly added third-party software development kits (SDK) of one of the 86 listed third-party SDKs have to include a iOS Privacy Manifest (see Apple Developer Support). Our analysis of the app sample set revealed the presence of all 86 SDKs in at least one app. We evaluated whether apps integrating these SDKs adhered to the privacy manifest requirement: Our findings indicate that only 47 out of the 86 SDKs were associated with at least one app that defined a privacy manifest. Conversely, the remaining 39 SDKs were utilized at least within one app within the sample set, but no app employing these SDKs defined a privacy manifest. This discrepancy may stem from the following factors: either the SDK developers may have not provided a privacy manifest template for their SDKs, or the app developers have not incorporated SDK versions that include privacy manifest template or explicitly deleted the manifest entry, or the app update does not include a newly added third-party SDK from the list. So our analysis can only investigate the Privacy Manifest declarations of about half of the libraries that Apple has selected. However, these first results already provide interesting insights in the data analytics industry.
What data is requested in real apps and why?
As first evaluation on the manifest content, we evaluated the data types declared within the iOS Privacy Manifests of our app sample set. Our analysis revealed that 50% of the apps declare in the manifest that they collect data types such as OtherDiagnosticData, CrashData, and DeviceID. Especially the data type DeviceID is privacy-sensitive as it provides the possibility to correlate actions of users across the sandbox boundaries of individual apps.
The full list of collected data types and the number of apps in which such data is collected according to the privacy manifest is given in the following graph. It can be seen, that a wide range of data types are defined in the app set, including very sensitive data types as Health (Health and medical data), PhysicalAddress, PreciseLocation and SensitiveInfo (racial or ethnic data, sexual orientation, pregnancy or childbirth information, disability, religious or philosophical beliefs, trade union membership, political opinion, genetic information, or biometric data).
Although Apple clearly specifies how data type classes should be defined in the iOS Privacy Manifest, we found 205 apps within our app sample set that included malformed or self-defined data types in iOS Privacy Manifests. Consequently, these do not align with Apple’s specified requirements and the discrepancy highlights issues in the implementation and adherence to Apple’s guidelines.
Following the evaluation of data types, we proceeded to assess the given purposes. The next graph presents the distribution of purposes defined in the iOS Privacy Manifests per app. Our analysis indicates that collecting data is declared for the purposes of AppFunctionality or Analytics in over 60% of the apps of the sample set. Consistent with our findings in the data type evaluation, we observed that a substantial number of apps (235 apps) which included purpose definitions deviate from Apple’s expected values. These are summarizes in the category Malformed / Self-defined Purpose in the chart below.
Do the declared data type and their reasoning align?
Further analysis was conducted to determine which data types are collected for which specific purpose and in how many apps of the app sample set this occurs. For this evaluation, we analyzed the iOS Privacy Manifest definitions of the 2,000 apps in our sample set, focusing on tuples of data types and associated purposes. The following graphic illustrates which data types are accessed for which purposes. The size of the circles corresponds to the number of apps in which that specific data type-purpose tuple was defined in the iOS Privacy Manifest. It is interesting to see, that collecting certain sensitive data types such as SensitiveInfo, PhysicalAddress, PreciseLocation and Health are declared for purposes besides AppFunctionality and almost all data types are used also for the purpose of Analytics, which could have an effect on user’s privacy.
Are there component providers that stand out in terms of data type and usage?
Data types and purposes are defined for specific components in the iOS Privacy Manifest. For that reason, our next focus was to investigate if specific purposes to collected data types are specific to certain component providers. Therefore, the following analysis examines the purposes for collected data types grouped for each component provider within the iOS Privacy Manifest. The following graph highlights the relationship between purposes and the SDK provider. To do so, we associated the providers to their SDKs and extracted the 20 most frequently contained SDK providers in the Privacy Manifests within the evaluated app set. We analyzed in how many apps each purpose is defined for these top 20 providers within the app set. In the diagram it can be seen that certain providers, such as Firebase, concentrate on a few specific and targeted purposes. In contrast, others, like Adjust, request data for a wide array of purposes. As the purposes relate to the activities of the SDK providers, this can be seen as summary on the functionality aspects provided by the SDK components and their providers.
Similar to the evaluation of the specific purposes related to component providers, we expanded the view to cluster the information according to the data types accessed. To do so, we again took the 20 most frequently mentioned SDK component providers in the Privacy Manifests within the evaluated app set and counted how many apps declare a data type and purpose tuple for these top 20 component providers. In the following graph the size of the circles in the graph corresponds to the number of apps in which each specific data type-purpose tuple was declared in the iOS Privacy Manifest. Like the former evaluation, this graph shows, that certain component providers (like Firebase) focus on certain data type and purpose tuples, whereas other component providers define various data type and purpose tuples (e.g, Google and Facebook).
The data type definition may additionally contain boolean processing flags, that should specify external usage of the data. The processing flag linkedspecifies that the data type is linked to the user’s identity, whereas trackedspecifies that the data type is used to track users. Our final evaluation of the Privacy Manifest data of the app sample set focus on the aspect whether certain data providers specify data types with these processing flags or not.
Data types flagged with tracked can be shared with a data broker. Therefore, if the processing flag tracked or linked/tracked is set, this may threaten the user’s privacy significantly. Therefore, we examined what processing flags are set by different component providers for requested data types. We took the ten most frequently mentioned SDK component providers in the Privacy Manifests within the evaluated app set and analyzed how many apps declare a data type, processing type and purpose tuple for these top ten component providers. The size of the circles in the following graph corresponds to the number of apps in which each specific data type-processing flag-purpose tuple was defined in the iOS Privacy Manifest. The graph groups the data types in relation to the requested purpose in circle groups. The circle group’s label states which processing flags are set to true: If the circle group is labeled with linked, then only the data type is linked to the user’s identity. If circle group is labeled with linked/tracked, then the data type is linked to the user’s identity, and it is used to track users. Elements of a circle group that has no label are neither linked nor tracked.
A significant difference in the processing flag usage can be seen when comparing the results for Google and Facebook. Google defines the processing flag linked or no processing flag for most data types and purposes. In contrast, Facebook sets the processing flag with the most possible extent linked/tracked for most data types and purposes.
Conclusion
The analysis of iOS Privacy Manifests for the 2,000 most popular free iOS apps on the German Apple App Store in June 2024 reveals several insights about data collection practices and compliance with Apple’s guidelines.
About half of the Privacy Manifests in apps have declared to collect data type OtherDiagnosticData, CrashData or DeviceID, with various sensitive data types also being collected. However, the presence of apps with malformed or self-defined data types indicates inconsistencies in adhering to Apple’s guidelines. Additionally, the observed entries for data collection purposes were found to violate Apple’s specification. This undermines the effectiveness of the privacy manifest and hopefully will be addressed with checks by Apple during the app review process.
However, even in this preliminary analysis with a lot of data missing for SDKs, the benefit of the Privacy Manifests can be seen. This way, it is possible to inspect the relations between collected data types, purposes and components, showing that in some apps sensitive data types are collected for purposes not related to the app’s functionality.
The examination of specific data type-purpose tuples and their association with component providers revealed that certain SDK providers, focus on targeted purposes, while others, request data for a broader range of purposes. Notably, the processing flags for data types, particularly those flagged as tracked, pose significant privacy risks. The contrast between providers, which primarily uses the linked flag, and those which extensively use the linked/tracked flag, may underscore the varying levels of privacy impact across different SDK providers.
Staying informed about how apps collect and use your data is crucial in digital privacy. Addressing this, Appicaptor finds privacy-related app issues and extracts contents of the iOS Privacy Manifest. All information is correlated with results from static and dynamic analysis to provide enhanced privacy insights.
In today’s interconnected digital landscape concerns related to data privacy are in focus. App developers, platform providers and smartphone OS manufacturers face pressure to prioritize user privacy and transparency. In response to these concerns Apple has introduced an initiative: the iOS Privacy Manifest. This feature serves as a documentation how apps access, utilize and transmit privacy-related data. Appicaptor uses the information from the iOS Privacy Manifest, enabling users and companies to make well-informed decisions.
Understanding the iOS Privacy Manifest
iOS Privacy Manifests are attached to the app’s binary as property list (plist) and hold information regarding the collection and usage of privacy-related data by the app and included third-party SDKs. Not providing valid information within the iOS Privacy Manifest may lead to notifications or rejection throughout the app submission process. Strict enforcement in the App Store is scheduled starting in May 2024.
Privacy Manifest Components
When it comes to populating a iOS Privacy Manifest, there are several key data structures that developers must adhere to:
Required Reason APIs: Apple introduced a class of APIs referred to as Required Reason APIs to address concerns regarding fingerprinting. Any app or SDK utilizing an API from this list must explicitly state a valid purpose for their usage.
Data Usage Categories: Developers must provide a breakdown of data usage categories in their privacy manifests. This includes specifying the data collected, whether it’s linked to end-users’ identities, its usage for tracking purposes, and a list of reasons justifying. Apple provides a predefined set of purposes that developers have to reference.
External Domain Usage: All external tracking domains used in the app or third-party SDKs must be listed in the iOS Privacy Manifest. This provides users with clear visibility into the presence of all domains utilized for tracking purposes. When tracking permission (via the App Tracking Transparency framework) is not granted by the user, network requests to these domains will be blocked by iOS.
Appicaptor: iOS Privacy Manifest as additional App Evaluation Source
Appicaptor extracts the contents of the iOS Privacy Manifest and presents them in human-readable format. But beyond that, Appicaptor correlates the manifest’s content with privacy-related findings discovered from the app binary through static analysis. Therefore, Appicaptor offers users a comprehensive understanding of how their privacy-related data is handled by the apps:
Privacy Transparency: By presenting the iOS Privacy Manifest contents in a human-readable format, Appicaptor users gain transparency into the app’s data collection practices.
Informed Decision-Making Elevated: Armed with in-depth understanding of an app’s privacy practices, Appicaptor users can make informed decisions about whether to engage with the app. Correlation of the manifest contents with static analysis findings in coordination with Appicaptor’s ability to define custom rulesets empowers companies to assess the privacy risks associated with using the app to make choices aligned with their privacy preferences and concerns.
Enhanced Privacy Insights: Appicaptor’s static analysis enables an examination of the app’s codebase, revealing privacy insights in addition to what is disclosed in the iOS Privacy Manifest. By correlating these findings, users gain a more granular understanding of how their data is collected, used, and shared by the app.
Accountability and Trust Building: Appicaptor’s evaluation reports promotes accountability and trust by ensuring alignment between stated privacy practices and actual implementation. By correlating the results, Appicaptor users can identify and address any discrepancies between the documentation and implementation of privacy-related data usage.
Conclusion
In an era where trust between users and tech companies is important, transparency regarding data collection practices is non-negotiable. Using the iOS Privacy Manifest as an additional source of evaluation, Appicaptor enhances its app evaluation capabilities and gives users a more thorough understanding of how their data is handled by various apps. This integration enforces Appicaptor’s commitment empowering users with transparent and informed decisions regarding app security.
Appicaptor’s solution contributes to the overall integrity of the digital ecosystem by promoting transparency, accountability, and responsible data usage. By empowering users with comprehensive privacy insights and encouraging developers to uphold best practices and that the implementation is inline with documentation, Appicaptor’s approach supports a culture of trust and integrity that benefits everyone involved.
Device fingerprinting is a technique that got popular at the end of the 90s by websites, to identify and track users. Apps have in contrast to websites often access to a much wider property range usable for fingerprinting. The unique device identification via fingerprinting across sandbox borders of multiple apps is a relevant privacy issue as it increase the possibilities for tracking users, even without requiring permissions. Many smartphone users are unaware of such practices and possibilities because such activities happen unnoticeably in the background.
Just like each person has an individual fingerprint, electronic devices are also uniquely identifiable. Through manufacturing tolerances, different hardware configurations, software properties and individual software configurability such devices become uniquely identifiable. Depending on the number and the variance of properties used, more or less unique fingerprints can be generated. Typically, several hard- and software device properties are combined through a hashing algorithm to an ideally unique bit sequence.
Motivations for app developers to include tracking libraries are: advertisement, bot detection, account takeover, spam, fraud detection and secure environment detection for payments. Mobile operating systems such as Android and iOS are aware of the user’s transparency through fingerprinting and take steps to empower the user to regain privacy whilst trying to maintain legitimate tracking objectives. The advertisement ID was introduced especially to target device tracking with a resettable ID provided by iOS. It allows advertisers to personalize ads and at the same time give the user options to reset it or hide it for individual apps. Also, Privacy Labels are added to the app store to better explain the app’s data usage to the user. Permissions and Privacy Labels should shed light on the usage of tracking, and it’s purpose during app install and usage. Apple AppStore’s licence agreement also states clear rules on data and identifier usage. However, as shown by Deng et al., many apps exist in the AppStore infringing such rules, working their way around permissions and being dishonest with their Privacy Labels. One increasing way to bypass such restrictions is to use fingerprinting techniques not requiring user consent.
Tracking Possibilities with Fingerprinting
The fingerprint generated by each app containing the same fingerprinting SDK are ideally the same on the same device. The user becomes very transparent by correlating the fingerprint with other data provided during app usage. As an example, one could think of a scenario, pictured in the following figure, where a shopping app, a search engine app and a navigation app all contain the same fingerprinting SDK. All user input of each app like recent purchases, search terms and the device location could be accumulated in the cloud under the same fingerprint. The data collection of each app alone is already a privacy risk to the user, but the combination of different app data makes users fully transparent. Such data is extremely valuable for example for advertisement companies to personally tailor advertisements for individual users.
Fingerprinting SDKs and Commonly used Properties
To gain insight into the current state of iOS fingerprinting and the latest techniques being used, we identified real iOS fingerprinting SDKs through systematic internet research and conducted manual static and dynamic analysis. The respective SDKs source-code was manually analysed with Ghidra and if possible, the SDK was tested in a minimal app construct. During the manual analysis we judged the SDK as fingerprinter or non-fingerprinter based on the observed behaviour and used properties. This leaves 13 SDKs that we classify as fingerprinters and further analyse in more detail. The SDKs are listed in the following table. We found that all 13 SDKs use the native iOS API to collect device characteristics. Through dynamic analysis of custom-built test apps, we have found that the collection of features shows up through temporal spikes. Most SDKs send the collected device characteristics to a server for further processing, except for OpenIDFA, DFID and DFIDSwift, which are non-commercial products. Additional passive fingerprinting may be performed on the server side. The SDKs that do not include server communication generate a hash on the device and return it. Nevertheless, we argue that fingerprinting is generally only useful when the fingerprint leaves the device. Therefore, it can be assumed that in these cases the caller of the fingerprinting framework handles the network communication. It is obvious from the following table that fingerprinting SDKs share some collected properties, while other properties are exclusively queried by some SDKs. Very commonly used properties are: device model, identifier for vendor (similar to advertisement ID), device name, storage space properties, battery level and different locale identifiers. Other properties like device fonts, commonly used in web based fingerprinters, is rather seldom used. We suppose, due to the low entropy of this value on iOS devices. In conclusion, one can say that fingerprinters usually use multiple properties and don’t rely on a unique identifier provided through the advertisement ID in iOS.
Fingerprinting Access Pattern
We were able to observe access to common iOS APIs gathered in the previous step. Frida hooks were written to log access to respective APIs in a dynamic app analysis on a jailbroken iPhone X (iOS 14.4.2). In the next step, we analysed several top apps from the app store to see if fingerprinting can be observed based on API access pattern during execution.
Non-fingerprinting App
Firstly, let’s have a look at an example of an app without fingerprinting in the following figure. One can see that different properties were used during the app execution, some less, some more frequently. The accesses to the observed properties came from different caller modules (packages or libraries) from inside the app. Also, the properties were accessed throughout the whole runtime. Access to properties such as timezone and locale are typically required to properly display and run the app and totally legitimate.
Fingerprinting App
Now, let’s look at an app which we judge as fingerprinting in the following figure. One can see that from second 15 to 17, many properties are accessed in a very short timespan. Furthermore, the accesses come from the same caller module called ForterSDK, which is a known fingerprinting SDK. We observed such behaviour for all fingerprinting SDKs: Many properties are collected by the same caller module in a short time frame.
Fingerprinting Detection
As a proof of concept, we developed a fingerprinting detector to detect fingerprinting on the traces of a dynamic app analysis. The analyser is implemented as a sliding window of a certain width. The window of four seconds width slides over the event timeline and counts the number of accesses to known fingerprinting APIs per caller module. Access numbers above a certain threshold are then justified as fingerprinting. In a first test, we were able to correctly detect fingerprinting in 85% of the cases just by analysing access pattern. As further refinements, machine learning could improve accuracy as well as the inclusion of known fingerprinting SDK properties through a static analysis. Further insights on our used detection mechanism can be read in our extensive paper at ACM, published at the EICC conference 2023.
Conclusion
Fingerprinting as of today is used in many apps. Smartphone users are mostly unaware of the existence and the privacy issues related with such. This research lays the fundamentals to identify device fingerprinting in iOS apps. Future work will be to come up with remediation ideas to balance the privacy interests of smartphone users and fingerprinters.
Acknowledgements
This research work was supported by the National Research Center for Applied Cybersecurity ATHENE. ATHENE is funded jointly by the German Federal Ministry of Education and Research and the Hessian Ministry of Higher Education, Research and the Arts.
Note: Fingerprinting detection is currently work in progress and not (yet) part of the Appicaptor service.
With Apple’s release of iOS 14.5 at the end of April, iOS app developers are required to request permission in order to track their users beyond the app’s border. While there are already complaints about the low opt-in rate published by the app analytics company Flurry, which are updated weekly, we were curious to see how the overall adoption rate for app implementations would look like.
Developers need to provide a tracking description in the apps info.plist file – together with localized versions in file InfoPlist.strings of language’s project directory – called NSUserTrackingUsageDescription that informs the user why an app is requesting permission to use data for tracking the user or the device. However, in 57.3 % of the current German Top 2000 iOS apps no tracking description was provided by the developers although the app contains tracking code. On devices with iOS 14.5 or later this causes iOS to deny the request for access to the identifier for advertisers (IDFA). So the opt-in rate of users could be higher if those 57.3% of the developers would have provided a description, so that the user is at least presented with a decision dialog.
Property
%
No tracking description included but tracking code detected
57.3 %
Tracking description included and tracking code detected
33.5 %
No tracking description included and no tracking code detected
9.0 %
Tracking description included but no tracking code detected
0.2 %
Evaluation of Tracking Descriptions in German Top 2000 iOS Apps of all Categories except Games and Stickers (Appicaptor, July 2021)
Another effect we observed is a missing individualization and usefullness of the description. The table below lists the Top 10 used descriptions in the Top 2000 Apps, with 77 Apps just repeating the example text provided by Apple. Other just include placeholder text, such as “YOUR TEXT”, “NSUserTrackingUsageDescription”, “none” or even “-“.
Description
#
This identifier will be used to deliver personalized ads to you.
77
Your data will be used to deliver personalized ads to you.
9
Dadurch können wir Ihnen relevantere Werbung anzeigen, ohne deren Anzahl zu erhöhen.
7
Dies wird verwendet, um den Dienst zu identifizieren, der dich weitergeleitet hat, um die ein individuelles Erlebnis zu bieten.
6
Diese Kennung wird verwendet, um Ihnen personalisierte Anzeigen zu liefern.
6
Ihre Daten werden verwendet, um Ihnen personalisierte Werbung zu zeigen.
6
Your data will be used to deliver personalized ads.
5
Datenerhebung zur Verbesserung der App und für Werbezwecke zulassen
4
Deine Aktivitäten werden verwendet, um Dein Nutzererlebnis und Werbung zu personalisieren.
4
Mithilfe dieser ID können wir Dir für Dich ausgewählte Werbung anzeigen.
4
Top 10 Tracking Descriptions by Occurence in German Top 2000 iOS Apps of all Categories except Games and Stickers (Appicaptor, July 2021)
This leaves us with the impression, that creating a fitting description is currently only deamed important for 1/3 of the developers. Obviously the motivation depends on the benefits the developers gain from providing a tracking description.
There are many business cases which rely on or have a benefit from cross-app user tracking. The players of these business cases (e.g., ad providers and app developers who generate ad revenue) have an interest to achieve high opt-in rates. Fitting or at least reasonable descriptions for the permission dialogue will be the key for broad acceptance rates. The current evaluation shows that (1) only a minority of apps have at least a description included and (2) that they are very unspecific.
But there are use cases which currently integrate cross-app user tracking, however it does not have a beneficial effect for the using party. For example, this is the case when an app developer integrates a runtime diagnostic library. As he is only interested in the telematic data of his app, cross-app tracking would not be his interest and for that reason he may not include the description for the permission dialogue. In this case Apple’s initiative would help to reduce user tracking from companies that provide a runtime diagnostic services with a business model of selling retrieved analytics data sets to third parties or similar use cases.
Retrieving a unique identifier may allow app developers, advertisers, analytic companies and others to identify the user’s device or the user himself. Furthermore, most of these identifiers are persistent means for tracking, advertising and marketing activities on devices. Unique identifiers might however be also necessary for certain app functionality to work as expected.
Appicaptor tracks app’s access to various unique identifiers that can be categorized in three different groups:
The first group refers to mobile communication relevant IDs. Examples of this category are access to the phone’s IMEIs and MAC addresses, country code of the mobile network provider, as well as the phone / voice mail number, serial number of the SIM card and mobile subscriber ID (IMSI / TIMSI) of the user.
The second group is identification information about the hardware or operating system given by the operating system itself. When the mobile operating system is compiled different parameters for model, hardware, serial and display size are included in the operating system build. Furthermore, a build fingerprint can distinguish different operating system builds even if the operating system version is equal.
The third group consists of identifiers
like the Android Device ID, Advertisement ID,
properties that the user could configure like font size / type, audio volume, timezone, display orientation lock and screen brightness, Bluetooth pairings, power saving mode configuration, audio singnal output port (speaker, headphones, Bluetooth, etc.)
installed app list
hardware parameters like cpu and set of available hardware sensors (gyroscope, barometer, …)
other parameters like battery or device memory (RAM and data) usage.
Every month Appicaptor evaluates the IT security quality of thousands of Android and iOS apps. The following two charts depict for each month which identifier usage is rising and which is falling. The charts plot the identifier usage (total number of apps within the 2,000 most popular apps in German Google Play Store that accesses an identifier) relatively to the identifier usage in January 2020.
As the relative change (given in the two charts before) does not give the perspective, which identifiers are commonly utilized and to which extent, the following table provides the absolute numbers. This table shows how many apps within the 2,000 most popular apps in German Google Play Store access an identifier in the Appicaptor analysis runs of January 2020 and February 2021. Furthermore, based on every monthly analysis run between January 2020 and February 2021 we predict a trend if the identifier usage is rising or falling based on our data.
Name
Identifier Uage (in January 2020)
Identifier Uage (in February 2021)
Trend
Unique Android ID
1,944
1,947
stable
Build model
1,947
1,945
stable
Build manufacturer
1,916
1,941
▲
Build fingerprint
1,679
1,873
▲
Build product
1,633
1,760
▲
Build brand
1,537
1,712
▲
Build hardware
1,179
1,632
▲
Build display
1,478
1,486
stable
Country Code + Mobile Network Code
922
1,158
▲
Build serial
877
1,016
▲
Mobile Country Code
862
1,000
▲
Wifi-MAC address
754
717
▼
IMEI/MEID
689
591
▼
MAC address(es)
547
380
▼
Phone number
281
264
▼
Subscriber ID (IMSI)
312
258
▼
SIM card serial
243
178
▼
Voice mail number
69
62
stable
Total number of apps that access an identifier according to Appicaptor analysis of the 2,000 most popular apps in German Google Play Store
The analysis of Appicaptor shows that the access to (generally speaking) unspecific unique identifiers (like the build related parameters) is currently rising. One might think that the access to unspecific unique identifiers (like the build brand or hardware) may be not an privacy issue since they are equal at thousands of devices/users. And that the access to a more specific unique identifier (like the SIM serial or phone number) should be more an privacy issue. However, there is more to take into consideration.
A detailed manual inspection of access patterns and looking on the landscape of the mobile value-chain shows that most of the accesses of unspecific unique identifiers are executed in 3rd party libraries, which are included in the app by the developer. Furthermore each of these unspecific information portions (if seen alone) can not be utilized to identify a specific device or person. But certain libraries access a magnitude of these unspecific unique identifiers, creating a device fingerprint from all them and transmit the data to a server backend. As an other example, an open source library of this type can be found here. It claimes to create a device identifier from all available Android platform signals, that is fully stateless and will remain the same after reinstalling or clearing application data.
The further manual inspection of other identified libraries shows as well, that libraries which probably execute device fingerprinting are utilized in many apps of different app types. A linkage between the device fingerprint and your person is possible, when you think of an app that utilizes an library that joins identifiers as device fingerprint and you give that app information about your person (name, email address, etc.). That would bring the provider of the library in the position to track your identity throughout the usage of different apps, based solely on unspecific unique identifiers.
So what can we learn from these numbers?
The usage of almost all specific unique identifiers are currently falling. That trend is supposed to be related to privacy preserving functions in the mobile operating systems that limit the app’s access to correct values of these identifiers. If you enable these privacy preserving functions, fake random values are provided.
The usage of unspecific unique identifiers is currently rising throughout all identifiers. From our perspective that rising is based on the reasoning outlined above (device fingerprinting) as well as to facilitate user identification in the presence of the current drawbacks (uncertainty if correct or fake specific unique identifiers are reported to the app by the operating system).
Therefore, in the app evaluation process one should take a look at the composition and magnitude of the list of accessed unique identifiers of an app: if many unspecified unique identifiers are accessed, this should draw one’s attention the same way as the access of an specified unique identifier should do.