iOS Privacy Manifest: Data Collection and Usage in Top Free iOS Apps

Since May 2024, the inclusion of an iOS Privacy Manifest has been a requirement for app submissions with newly added third-party SDKs. We analyzed first results about data collection practices, compliance issues with Apple’s guidelines, and privacy risks posed by SDK providers.

Apple mandates that all app submissions with specific newly added third-party SDKs have to include a comprehensive iOS Privacy Manifest. This manifest outlines the data collection and usage practices of apps, providing users with transparency and empowering them to make informed decisions about their digital privacy. As outlined in our previous blog post, Appicaptor correlates the privacy manifest contents with additional privacy findings through static analysis.

In our evaluation, we analyzed the iOS Privacy Manifests definitions of our app sample set, that consists of the most popular 2,000 free iOS apps available on the German Apple App Store in June 2024.

Apple mandates that apps with newly added third-party software development kits (SDK) of one of the 86 listed third-party SDKs have to include a iOS Privacy Manifest (see Apple Developer Support). Our analysis of the app sample set revealed the presence of all 86 SDKs in at least one app. We evaluated whether apps integrating these SDKs adhered to the privacy manifest requirement: Our findings indicate that only 47 out of the 86 SDKs were associated with at least one app that defined a privacy manifest. Conversely, the remaining 39 SDKs were utilized at least within one app within the sample set, but no app employing these SDKs defined a privacy manifest. This discrepancy may stem from the following factors: either the SDK developers may have not provided a privacy manifest template for their SDKs, or the app developers have not incorporated SDK versions that include privacy manifest template or explicitly deleted the manifest entry, or the app update does not include a newly added third-party SDK from the list. So our analysis can only investigate the Privacy Manifest declarations of about half of the libraries that Apple has selected. However, these first results already provide interesting insights in the data analytics industry.

What data is requested in real apps and why?

As first evaluation on the manifest content, we evaluated the data types declared within the iOS Privacy Manifests of our app sample set. Our analysis revealed that 50% of the apps declare in the manifest that they collect data types such as OtherDiagnosticData, CrashData, and DeviceID. Especially the data type DeviceID is privacy-sensitive as it provides the possibility to correlate actions of users across the sandbox boundaries of individual apps.

The full list of collected data types and the number of apps in which such data is collected according to the privacy manifest is given in the following graph. It can be seen, that a wide range of data types are defined in the app set, including very sensitive data types as Health (Health and medical data), PhysicalAddress, PreciseLocation and SensitiveInfo (racial or ethnic data, sexual orientation, pregnancy or childbirth information, disability, religious or philosophical beliefs, trade union membership, political opinion, genetic information, or biometric data).

Although Apple clearly specifies how data type classes should be defined in the iOS Privacy Manifest, we found 205 apps within our app sample set that included malformed or self-defined data types in iOS Privacy Manifests. Consequently, these do not align with Apple’s specified requirements and the discrepancy highlights issues in the implementation and adherence to Apple’s guidelines.

Following the evaluation of data types, we proceeded to assess the given purposes. The next graph presents the distribution of purposes defined in the iOS Privacy Manifests per app. Our analysis indicates that collecting data is declared for the purposes of AppFunctionality or Analytics in over 60% of the apps of the sample set. Consistent with our findings in the data type evaluation, we observed that a substantial number of apps (235 apps) which included purpose definitions deviate from Apple’s expected values. These are summarizes in the category Malformed / Self-defined Purpose in the chart below.

Do the declared data type and their reasoning align?

Further analysis was conducted to determine which data types are collected for which specific purpose and in how many apps of the app sample set this occurs. For this evaluation, we analyzed the iOS Privacy Manifest definitions of the 2,000 apps in our sample set, focusing on tuples of data types and associated purposes. The following graphic illustrates which data types are accessed for which purposes. The size of the circles corresponds to the number of apps in which that specific data type-purpose tuple was defined in the iOS Privacy Manifest. It is interesting to see, that collecting certain sensitive data types such as SensitiveInfo, PhysicalAddress, PreciseLocation and Health are declared for purposes besides AppFunctionality and almost all data types are used also for the purpose of Analytics, which could have an effect on user’s privacy.

Are there component providers that stand out in terms of data type and usage?

Data types and purposes are defined for specific components in the iOS Privacy Manifest. For that reason, our next focus was to investigate if specific purposes to collected data types are specific to certain component providers. Therefore, the following analysis examines the purposes for collected data types grouped for each component provider within the iOS Privacy Manifest. The following graph highlights the relationship between purposes and the SDK provider. To do so, we associated the providers to their SDKs and extracted the 20 most frequently contained SDK providers in the Privacy Manifests within the evaluated app set. We analyzed in how many apps each purpose is defined for these top 20 providers within the app set. In the diagram it can be seen that certain providers, such as Firebase, concentrate on a few specific and targeted purposes. In contrast, others, like Adjust, request data for a wide array of purposes. As the purposes relate to the activities of the SDK providers, this can be seen as summary on the functionality aspects provided by the SDK components and their providers.

Similar to the evaluation of the specific purposes related to component providers, we expanded the view to cluster the information according to the data types accessed. To do so, we again took the 20 most frequently mentioned SDK component providers in the Privacy Manifests within the evaluated app set and counted how many apps declare a data type and purpose tuple for these top 20 component providers. In the following graph the size of the circles in the graph corresponds to the number of apps in which each specific data type-purpose tuple was declared in the iOS Privacy Manifest. Like the former evaluation, this graph shows, that certain component providers (like Firebase) focus on certain data type and purpose tuples, whereas other component providers define various data type and purpose tuples (e.g, Google and Facebook).

The data type definition may additionally contain boolean processing flags, that should specify external usage of the data. The processing flag linked specifies that the data type is linked to the user’s identity, whereas tracked specifies that the data type is used to track users. Our final evaluation of the Privacy Manifest data of the app sample set focus on the aspect whether certain data providers specify data types with these processing flags or not.

Data types flagged with tracked can be shared with a data broker. Therefore, if the processing flag tracked or linked/tracked is set, this may threaten the user’s privacy significantly. Therefore, we examined what processing flags are set by different component providers for requested data types. We took the ten most frequently mentioned SDK component providers in the Privacy Manifests within the evaluated app set and analyzed how many apps declare a data type, processing type and purpose tuple for these top ten component providers. The size of the circles in the following graph corresponds to the number of apps in which each specific data type-processing flag-purpose tuple was defined in the iOS Privacy Manifest. The graph groups the data types in relation to the requested purpose in circle groups. The circle group’s label states which processing flags are set to true: If the circle group is labeled with linked, then only the data type is linked to the user’s identity. If circle group is labeled with linked/tracked, then the data type is linked to the user’s identity, and it is used to track users. Elements of a circle group that has no label are neither linked nor tracked.

A significant difference in the processing flag usage can be seen when comparing the results for Google and Facebook. Google defines the processing flag linked or no processing flag for most data types and purposes. In contrast, Facebook sets the processing flag with the most possible extent linked/tracked for most data types and purposes.

Conclusion

The analysis of iOS Privacy Manifests for the 2,000 most popular free iOS apps on the German Apple App Store in June 2024 reveals several insights about data collection practices and compliance with Apple’s guidelines.

About half of the Privacy Manifests in apps have declared to collect data type OtherDiagnosticData, CrashData or DeviceID, with various sensitive data types also being collected. However, the presence of apps with malformed or self-defined data types indicates inconsistencies in adhering to Apple’s guidelines. Additionally, the observed entries for data collection purposes were found to violate Apple’s specification. This undermines the effectiveness of the privacy manifest and hopefully will be addressed with checks by Apple during the app review process.

However, even in this preliminary analysis with a lot of data missing for SDKs, the benefit of the Privacy Manifests can be seen. This way, it is possible to inspect the relations between collected data types, purposes and components, showing that in some apps sensitive data types are collected for purposes not related to the app’s functionality.

The examination of specific data type-purpose tuples and their association with component providers revealed that certain SDK providers, focus on targeted purposes, while others, request data for a broader range of purposes. Notably, the processing flags for data types, particularly those flagged as tracked, pose significant privacy risks. The contrast between providers, which primarily uses the linked flag, and those which extensively use the linked/tracked flag, may underscore the varying levels of privacy impact across different SDK providers.

Security Implications of Apple ID Switching and Keychain Merging in COPE Mobile Environments

The ability to switch Apple IDs on iOS devices poses a security risk for Corporate-Owned, Personally Enabled (COPE) mobile environments. When users opt to retain Keychain entries while transitioning from a corporate Apple ID to a private Apple ID, it results in the merging of passwords to the private Apple ID. This merging poses a significant challenge for organizations that need to maintain strict control over corporate credentials, as enterprise passwords can then be synced with all private devices using the private Apple ID.

Switching Apple IDs on an iOS device involves opening the “Settings” app on an iOS device, navigate to the Apple ID section and opt to sign out of the current Apple ID. This prompts a confirmation step, asking users if they wish to keep a copy of their data on the device or to delete it. This decision is crucial to consider because if the user opts to keep the Keychain data to be stored on the device, this decision will have implications for the passwords associated with the Apple ID currently signing out. Following the sign-out, users can proceed to sign in with the new Apple ID. Once the necessary information is entered and the setup is complete, the iOS device becomes associated with the new Apple ID. The device syncs with the new account, applying apps, data, and settings associated with the updated Apple ID. However, apps installed with the previous Apple ID are still usable.

Saved passwords of personal Apple ID
Saved passwords of corporate Apple ID

Keychain Merging and Its Effects

When logging into a website using Safari, a pop-up typically appears, asking if the user likes to store the password for AutoFill. This method simplifies the process of adding accounts to the password store. Alternatively, when creating a new account, users receive an automatically generated strong password. In either case, users can add accounts and passwords in the password store. The passwords are stored in the Keychain. If the user chooses to keep the Keychain data stored on the device during the Apple ID switching, this results in a merging of the stored passwords of the former and the latter configured Apple ID on the device.

COPE environments prioritize the separation of work and personal data, but the Keychain merging while switching Apple IDs on iOS devices undermines this separation. This has to be considered in situations where employees use their personal Apple IDs alongside corporate profiles on the same device. If one Apple ID during the Apple ID switching process is the corporate ID and the other is the private ID, that leads to merging of corporate and personal credentials compromising the security posture of COPE mobile environments and/or the privacy of the user’s private credentials.

Data retention dialogue when signing out from one Apple ID: When the “Keychain” option is selected in the dialogue whilst signing out from corporate Apple ID and before signing in to the personal Apple ID, stored corporate passwords are merged with stored personal passwords.
Corporate passwords from signed out corporate Apple ID are merged to saved passwords of personal Apple ID. All shown passwords are available in personal Apple ID.

Lack of Mitigation Through MDM Restrictions

Currently, there is an absence of Mobile Device Management restrictions to mitigate the unintended consequences of password merging during Apple ID switching. The inability to prevent Keychain merging in COPE environments can have a significant impact for organizations. The only measure is to enforce employee education and awareness regarding the risks associated with Apple ID switching and Keychain merging. Training programs should emphasize the importance of choosing secure options during account transitions and the potential impact on corporate and private data security.

Conclusion

The merging of Keychain entries during Apple ID switching on iOS devices poses a significant security challenge, particularly in COPE mobile environments. Organizations currently need to be proactive in addressing this issue through employee education and awareness programs. Collaboration between organizations, MDM providers and Apple, is needed to develop comprehensive solutions that safeguard corporate credentials and mitigate the risks associated with Apple ID switching.

No Patch, no Trust: Consequences of iOS Data Flow Restriction Bypasses for Enterprises

The number of ways to bypass iOS data flow restrictions meanwhile has further increased, but Apple still does not bother to fix them. So, the question is: How trustworthy are iOS MDM restrictions if even simple tricks to bypass them are not closed by Apple?

In many enterprises, there is the need to separate data of different origins or trust levels. In COPE (company owned, personally enabled) deployments this typically is the separation between private and business apps. Whereas in COBO (company owned, business only) deployments often data of supporting tools (such as travel management) should be separated from confidential enterprise data or other data with higher protection requirements.

Apple acknowledges this demand of controlling the data flow, “to keep it protected from both attacks and user missteps” by applying the Mobile Device Management (MDM) settings “Managed Open In” and “Managed pasteboard”. However, we have already demonstrated that it is possible to circumvent these MDM restrictions with the pre-installed iOS Shortcuts App. Others also reported issues already in 2020 with the possibility to use the “Quick Look” functionality of the native iOS email client for a bypass, also demonstrated in a video.

New Findings

The recently new discovered and reported issues are related to the iOS Files app. The first issue is related to the “Recents” Tab of the Files app. When a user had opened any document in a managed app before and now open an unmanaged document in the “Recents” tab of the Files app, the share action incorrectly presents a list of managed apps instead of unmanaged apps. The opposite direction for the bypass (managed to unmanaged app) is the second newly reported issue. Deleting a managed document with the Files app and copying it to an unmanaged app folder (instead of restoring it) removes the MDM restriction and permits the user to open the document in the unmanaged app. This is also shown in our extended demonstration video, now demonstrating five unfixed issues with the iOS data flow restrictions.

Reactions to Reported Issues

We again reported the two new issues to Apple, but also this time, Apple argued that “MDM profiles provide configuration management but do not establish additional security boundaries beyond what iOS and iPadOS have to offer.” A similar argument was also given for the reported MDM restriction bypass of SySS GmbH.

For all these six bypass issues, there are currently no mitigations available. Taking all the current conversations with Apple into account, and as reported issues of 2020 are still not fixed, there seems to be no justifiable reason to believe that our findings will be fixed sooner.

Consequences for Enterprises

MDM restrictions are a vital instrument for securing the usage of iOS devices in enterprise environments. Only thanks to these restrictions, the requirements of enterprises can be fulfilled for a device that is primarily developed for the consumer market. Even with effective MDM restrictions it is still a challenging task for enterprises to keep up with constantly changed and extended features of mobile smartphone operating systems. And often enough, MDM restrictions that would make new features acceptable for enterprises are released only major versions later, if any (e.g., still no possibility to prevent a mix-up of private and enterprise passwords when switching between private and enterprise Apple IDs in COPE deployments, discussed in an upcoming article).

Experiencing the lack of support for fixing multiple reported issues in advertised security features now raises the question: Which other MDM restrictions are insecure? When one follows Apple’s argument, for any other bypass of such MDM restrictions a patch will be rejected and probably already have been rejected before, as MDM restrictions are always just configuration options per definition. This isn’t then only a technical issue, it becomes an issue of trust in supporting enterprises with reliable solutions.

In consequence, the missing patches reduce the possibilities for a secure iOS enterprise deployment in COPE and COBO scenarios.

Trivial Bypass of iOS MDM Restrictions Apple doesn’t fix

Apple promotes MDM restrictions that should prevent data flows between managed and unmanaged apps. These restrictions can be bypassed easily and effortlessly using an app that is pre-installed on every new iOS device: the Shortcuts App. This is a finding of our practical tests on iOS 17.1 that should concern all enterprises that rely on these restrictions to use iOS devices in a compliant way with their enterprise policy for personal enabled usage.

Update: Problem is not only related to iOS Shortcuts App. MDM Restrictions can also be bypass with iOS Files App and iOS E-Mail client.

Apple promotes the data flow control between managed and unmanaged apps as a solution for data separation between personal and corporate data, “to keep it protected from both attacks and user missteps“. To achieve this protection, many enterprises configure the following Mobile Device Management (MDM) restrictions for the managed iOS devices:

Managed pasteboard. In iOS 15 and iPadOS 15 or later, this restriction helps control the pasting of content between managed and unmanaged destinations. When the restrictions above are enforced, pasting of content is designed to respect the Managed Open In boundary between third-party or Apple apps like Calendar, Files, Mail, and Notes. And with this restriction, apps can’t request items from the pasteboard when the content crosses the managed boundary.

Allow documents from managed sources in unmanaged destinations. Enforcing this restriction helps prevent an organization’s managed sources and accounts from opening documents in a user’s personal destinations. This restriction could prevent a confidential email attachment in your organization’s managed mail account from being opened in any of the user’s personal apps.

(see Apple Document “Managing Devices and Corporate Data”, accessed October 9th, 2023)

With the first restriction, a user should not be able to paste the content of the pasteboard to an unmanaged app, if the content was copied from a managed app, or vice versa. The second restriction applies the same restriction to documents. And in our tests, iOS prevented us from performing such actions, after the MDM restrictions for the test devices have been activated.

However, the iOS Shortcuts App, which is an integral part since iOS 13 and can be used to automate almost any task on iOS, can be abused to bypass such restrictions. The first shortcut (below on the left side) simply reads the clipboard and copies the content back to the global clipboard, which is then also synced with all other devices that the user has logged in with the same Apple account. When the user activates this shortcut, it strips of the information, that the original content was copied from a managed app and so iOS does no longer prevent the paste action to an unmanaged app. Each shortcut can also be combined with a trigger action that can automated the execution of it. So, the bypass of iOS MDM Restrictions can also be automated.

The second example shortcut uses the input of the share sheet to save the content of a document from a managed app. When using this shortcut in the share sheet, iOS prompts for name and location of the file and stores it without the information, that the content originated from a managed app. This way the document can be opened also from unmanaged apps. We created a video for both bypass methods to show the environment and the results in action. But these are only simple examples of the problem. There are many more possibilities to use actions of the Shortcuts App to bypass the iOS MDM restrictions.

Apparently, the Shortcuts App has privileges to access any data from managed apps, although it is not configured as a managed app through the MDM profile. This way the app can access the data, but when the data is saved by the Shortcuts App, it is saved as an unmanaged app. The fix therefore should be to keep the origin information of the content the Shortcuts app processes. This way the app could still access content from managed apps, but the processed content would be marked as originated from a managed app and this way the data separation would be kept intact.

We reported this flaw via Apple’s responsible disclosure process. However, the issue was rejected. We provided a detailed description, screenshots and videos, but also on a second try, where we explained again that the bypass circumvents an advertised security feature of iOS, it was not accepted. We also reached out to Apple via feedbackassistant.apple.com, but did not receive any answer in 2 months. As the current Shortcuts App version 6.1.2 is still vulnerable, we decided to publish the findings to warn enterprises that they cannot rely on iOS data flow restrictions.

A short-term solution would be to delete the iOS Shortcuts App from the devices and block the installation with the MDM blacklist via the bundle ID “com.apple.shortcuts” or to exclude devices with installed Shortcuts App from the domain as non-compliant, depending on the provided functionality of the MDM solution used. However, in the VMware Workspace ONE UEM (AirWatch) MDM tested, the Shortcuts App blacklisting did not prevent the installation nor was the device marked as non-compliant.

Without a proper blocking of the Shortcuts App, company data can currently be sent unintentionally in unmanaged apps through manually used or automated shortcuts and thus leave the company in an uncontrolled manner.

You can visit us on it-sa Expo&Congress, 10. – 12. October in Hall 6, Booth 210 for further details.

Device Fingerprinting on iOS Apps

Device fingerprinting is a technique that got popular at the end of the 90s by websites, to identify and track users. Apps have in contrast to websites often access to a much wider property range usable for fingerprinting. The unique device identification via fingerprinting across sandbox borders of multiple apps is a relevant privacy issue as it increase the possibilities for tracking users, even without requiring permissions. Many smartphone users are unaware of such practices and possibilities because such activities happen unnoticeably in the background.

Just like each person has an individual fingerprint, electronic devices are also uniquely identifiable. Through manufacturing tolerances, different hardware configurations, software properties and individual software configurability such devices become uniquely identifiable. Depending on the number and the variance of properties used, more or less unique fingerprints can be generated. Typically, several hard- and software device properties are combined through a hashing algorithm to an ideally unique bit sequence.

Motivations for app developers to include tracking libraries are: advertisement, bot detection, account takeover, spam, fraud detection and secure environment detection for payments. Mobile operating systems such as Android and iOS are aware of the user’s transparency through fingerprinting and take steps to empower the user to regain privacy whilst trying to maintain legitimate tracking objectives. The advertisement ID was introduced especially to target device tracking with a resettable ID provided by iOS. It allows advertisers to personalize ads and at the same time give the user options to reset it or hide it for individual apps. Also, Privacy Labels are added to the app store to better explain the app’s data usage to the user. Permissions and Privacy Labels should shed light on the usage of tracking, and it’s purpose during app install and usage. Apple AppStore’s licence agreement also states clear rules on data and identifier usage. However, as shown by Deng et al., many apps exist in the AppStore infringing such rules, working their way around permissions and being dishonest with their Privacy Labels. One increasing way to bypass such restrictions is to use fingerprinting techniques not requiring user consent.

Tracking Possibilities with Fingerprinting

The fingerprint generated by each app containing the same fingerprinting SDK are ideally the same on the same device. The user becomes very transparent by correlating the fingerprint with other data provided during app usage. As an example, one could think of a scenario, pictured in the following figure, where a shopping app, a search engine app and a navigation app all contain the same fingerprinting SDK. All user input of each app like recent purchases, search terms and the device location could be accumulated in the cloud under the same fingerprint. The data collection of each app alone is already a privacy risk to the user, but the combination of different app data makes users fully transparent. Such data is extremely valuable for example for advertisement companies to personally tailor advertisements for individual users.

Example of tracking possibilities with fingerprinting SDKs

Fingerprinting SDKs and Commonly used Properties

To gain insight into the current state of iOS fingerprinting and the latest techniques being used, we identified real iOS fingerprinting SDKs through systematic internet research and conducted manual static and dynamic analysis. The respective SDKs source-code was manually analysed with Ghidra and if possible, the SDK was tested in a minimal app construct. During the manual analysis we judged the SDK as fingerprinter or non-fingerprinter based on the observed behaviour and used properties. This leaves 13 SDKs that we classify as fingerprinters and further analyse in more detail. The SDKs are listed in the following table. We found that all 13 SDKs use the native iOS API to collect device characteristics. Through dynamic analysis of custom-built test apps, we have found that the collection of features shows up through temporal spikes. Most SDKs send the collected device characteristics to a server for further processing, except for OpenIDFA, DFID and DFIDSwift, which are non-commercial products. Additional passive fingerprinting may be performed on the server side. The SDKs that do not include server communication generate a hash on the device and return it. Nevertheless, we argue that fingerprinting is generally only useful when the fingerprint leaves the device. Therefore, it can be assumed that in these cases the caller of the fingerprinting framework handles the network communication. It is obvious from the following table that fingerprinting SDKs share some
collected properties, while other properties are exclusively queried by some SDKs. Very commonly used properties are: device model, identifier for vendor (similar to advertisement ID), device name, storage space properties, battery level and different locale identifiers. Other properties like device fonts, commonly used in web based fingerprinters, is rather seldom used. We suppose, due to the low entropy of this value on iOS devices. In conclusion, one can say that fingerprinters usually use multiple properties and don’t rely on a unique identifier provided through the advertisement ID in iOS.

Used device properties by different fingerprinting SDKs

Fingerprinting Access Pattern

We were able to observe access to common iOS APIs gathered in the previous step. Frida hooks were written to log access to respective APIs in a dynamic app analysis on a jailbroken iPhone X (iOS 14.4.2). In the next step, we analysed several top apps from the app store to see if fingerprinting can be observed based on API access pattern during execution.

Non-fingerprinting App

Firstly, let’s have a look at an example of an app without fingerprinting in the following figure. One can see that different properties were used during the app execution, some less, some more frequently. The accesses to the observed properties came from different caller modules (packages or libraries) from inside the app. Also, the properties were accessed throughout the whole runtime. Access to properties such as timezone and locale are typically required to properly display and run the app and totally legitimate.

Fingerprinting App

Now, let’s look at an app which we judge as fingerprinting in the following figure. One can see that from second 15 to 17, many properties are accessed in a very short timespan. Furthermore, the accesses come from the same caller module called ForterSDK, which is a known fingerprinting SDK. We observed such behaviour for all fingerprinting SDKs: Many properties are collected by the same caller module in a short time frame.

Fingerprinting Detection

As a proof of concept, we developed a fingerprinting detector to detect fingerprinting on the traces of a dynamic app analysis. The analyser is implemented as a sliding window of a certain width. The window of four seconds width slides over the event timeline and counts the number of accesses to known fingerprinting APIs per caller module. Access numbers above a certain threshold are then justified as fingerprinting. In a first test, we were able to correctly detect fingerprinting in 85% of the cases just by analysing access pattern. As further refinements, machine learning could improve accuracy as well as the inclusion of known fingerprinting SDK properties through a static analysis. Further insights on our used detection mechanism can be read in our extensive paper at ACM, published at the EICC conference 2023.

Conclusion

Fingerprinting as of today is used in many apps. Smartphone users are mostly unaware of the existence and the privacy issues related with such. This research lays the fundamentals to identify device fingerprinting in iOS apps. Future work will be to come up with remediation ideas to balance the privacy interests of smartphone users and fingerprinters.

Acknowledgements

This research work was supported by the National Research Center for Applied Cybersecurity ATHENE. ATHENE is funded jointly by the German Federal Ministry of Education and Research and the Hessian Ministry of Higher Education, Research and the Arts.

Note: Fingerprinting detection is currently work in progress and not (yet) part of the Appicaptor service.

Insecure Cryptography Usage: Tracing Cryptographic Agility in Android and iOS Apps

How has cryptography quality of the top 2000 Android and iOS applications evolved over the past three years? We show an overview of used hashing functions and symmetric encryption algorithms now and then. The results indicate that the majority of apps still use insecure cryptography.

Cryptography algorithms are applicable in many use cases such as for example encryption, hashing, signing. Cryptography has been used since centuries, some cryptography algorithms have been proven to be easily breakable (under certain configurations or conditions) and should thus be avoided. It is not easy for a developer with little cryptographic background to choose secure algorithms and configurations from the plenitude of options. Cryptographic agility is the ability exchange (insecure) cryptography algorithms with secure ones in computer programs.

Analysis Environment & Apps

The analysis results are based on the Appicaptor analysis results of the top 2000 Android and iOS apps. Appicaptor analyzed the current versions of the top 2000 apps along with the three-year-old counterpart of the respective apps. The apps are grouped into top apps from the top 2000 list and business apps uploaded or requested by Appicaptor customers.

Used Hashing Functions

Hashing functions such as MD5 and its predecessors as well as SHA1 are long known to be insecure and prone to collision attacks. It is advised by NIST to move to more secure alternatives like SHA224 or up to SHA512.

The used hashing functions in business apps and the top apps for iOS and Android were analyzed to see the current situation. Afterwards, the 2019 version of the apps is compared to the 2022 version to show the trend for cryptographic agility.

Percentage of apps using the specified hashing functions in top and business apps (2019 VS 2022)

Surprisingly, outdated SHA1 and MD5 hashing is still found in 70% to 80% of the analyzed apps and thus the most used hashing algorithms in iOS and Android in both, the top and business app groups. Even the long outdated MD2 algorithm is still used in 5% to 10% of all apps. These are alarming news regarding security. SHA256 is the only used, yet secure algorithm which is as widespread as MD5 and SHA1.

Comparing hashing in top and business apps, one can see that business apps use less hashing functionality in general.

Looking at the evolution of the top apps on Android and iOS from 2019 to 2022, one can see that the usage of MD5 and SHA1 mostly remains constant with only slight variations. On Android, the SHA2 family and especially SHA512 usage increased. In the case of SHA512, the usage in apps doubled, which is at first sight a positive trend. However, since the usage of outdated algorithms remains constant, one must say, that only more hashing algorithms are used and secure algorithms are not replacing the outdated ones. On iOS, the situation is vice versa: The usage of the SHA2 family even declines which leads to the assumption that less hashing is used on iOS.

In conclusion, one can say that even though Android developers embracing the SHA2 family, outdated hashing functions constantly and heavily remain in Android and iOS apps.

Used Symmetric Encryption Algorithms

As one would expect, AES is the most widely used symmetric encryption algorithm. DES and 3DES are used in around 3% of the analyzed apps. One exception is DES in Android which still seems very popular with a usage in around 12% of the tested apps. Throughout the years, DES and 3DES usage remains mostly constant. However, looking at AES usage over time, one can see that the usage in Android increases in the latest app versions, while at the same time the AES encryption in iOS decreases.
Especially on Android, a discrepancy between business and top apps becomes obvious. Business apps seem to use less AES and slightly more DES encryption.

Percentage of apps using the specified encryption functions in top and business apps (2019 VS 2022)

Usage AES in Insecure ECB Mode

Usage of the ECB mode is a very common weakness when applying cryptography. ECB mode outputs the same ciphertext for the same plaintext (when the same key is used). This means that pattern are not hidden very well and one could draw conclusions on the plaintext. With other techniques like CBC or CTR mode, succeeding block’s encryption depend on one another, which introduces randomness and hides pattern.
We are aware that under certain conditions the usage of ECB mode is fine, but we advise against using it since secure conditions might easily become insecure during app upgrades, code restructuring or new requirements.

We visualize the usage of insecure ECB mode versus other modes. In the visualization we lay focus on the explicit transition of used secure and insecure modes from 2019 to 2022.

ECB mode usage transitions from 2019 to 2022

Looking at the transitions for Android top apps, we see that 48.9% ECB mode usage in 2019 shrinks to 32.3% in 2022, which is very positive. One can also see that many apps shift from ECB mode to other secure modes. However, a small percentage of apps used secure cryptography in 2019, are now using ECB in 2022.

The situation on iOS looks much different. The transition diagram shows that out of the top apps on iOS, only 12% use ECB mode in 2019 and the majority uses secure alternatives. However, after three years, things didn’t turn out well for iOS apps. With 16% for top apps and 12% on business apps in 2022, more apps are using insecure ECB mode compared to 2019. Even though numbers increased on iOS, ECB usage on Android is still far more widespread, but decreases.

Causes of ECB mode usage

From all observations, we find the transitions from secure (non-ECB) to insecure (ECB) cryptography and vice versa very interesting. Understanding reasons for the transitions could give hints on how developers could be lead towards better cryptography standards. The transitions from ECB to non-ECB and non-ECB to ECB is significantly strong in Android top apps.

Triggers for a change from secure encryption (non-ECB) to insecure encryption (ECB) on Android and iOS

A more in-depth analysis of these apps reveals that in around 90% of the cases, the transition from ECB or towards ECB is triggered by an included third-party library. On iOS in 70% of the cases the transition is triggered through third-party libraries and in 30% of the cases through code changes of the app developer.

Libraries triggering transition insecure cryptography (non-ECB) to secure cryptography (ECB) on Android

Different Android third-party libraries which cause the transition from an insecure to a secure cryptography mode and vice versa were analyzed. The transition from ECB to non-ECB is pretty clear, 98% of the apps discontinued using ECB due to not using Google GMS Advertisement library anymore. In 2% of the apps, the respective library was not identifiable due to obfuscation. The pie charts also show which libraries triggered the ECB usage in 2022 apps. Leading with 29% is Google GMS Advertisement library followed by Icelink (12%), Microsoft Identity (10%) and Apache Commons (10%). Respective apps were deeper analyzed, to see if ECB was introduced through a third-party library update or just by adding a new third-party library with ECB usage. In fact, in 92% of the cases libraries with ECB usage were added and only in 8% of the cases a third-party library update introduced ECB.

Conclusion

This analysis has proven that the majority of apps still use insecure cryptography. The trend over the past years unfortunately shows no significant drift towards secure algorithms on the broad front. Some single aspects like ECB usage on Android point into the right direction. The detailed analysis in finding causes of the ECB usage on Android showed that this flaw is mostly introduced through the usage of third-party libraries during app development.

Sources

The contents of this blog post is a condensed version of the award-winning paper published at the international ICISSP 2023 Conference. The full paper can be viewed at Scitepress.