iOS App Tracking Transparency – Adoption Rate of App Implementations

With Apple’s release of iOS 14.5 at the end of April, iOS app developers are required to request permission in order to track their users beyond the app’s border. While there are already complaints about the low opt-in rates published (and updated weekly) by the app analytics company Flurry, we were curious to see what the overall adoption rate for app implementations looks like.

Developers need to provide a tracking description called NSUserTrackingUsageDescription in the app’s Info.plist file (together with localized versions in the InfoPlist.strings file of each language’s project directory). It informs the user why an app is requesting permission to use data for tracking the user or the device. However, in 57.3 % of the current German Top 2000 iOS apps the developers provided no tracking description although the app contains tracking code. On devices with iOS 14.5 or later, this causes iOS to deny the request for access to the identifier for advertisers (IDFA) without asking the user. So the opt-in rate of users could be higher if the developers of those 57.3 % of apps had provided a description, so that the user is at least presented with a decision dialog.

Property                                                          %
No tracking description included but tracking code detected      57.3 %
Tracking description included and tracking code detected         33.5 %
No tracking description included and no tracking code detected    9.0 %
Tracking description included but no tracking code detected       0.2 %
Evaluation of Tracking Descriptions in German Top 2000 iOS Apps of all Categories except Games and Stickers (Appicaptor, July 2021)
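
The check for a tracking description can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Appicaptor’s implementation: the function names are hypothetical, the Info.plist content is assumed to be available as bytes, and the result of the tracking code detection is simply passed in as a flag.

```python
import plistlib

TRACKING_KEY = "NSUserTrackingUsageDescription"


def tracking_description(info_plist_bytes):
    """Return the non-empty tracking description of an Info.plist, or None."""
    info = plistlib.loads(info_plist_bytes)
    value = info.get(TRACKING_KEY)
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def classify(info_plist_bytes, tracking_code_detected):
    """Map an app to one of the four table rows (hypothetical helper)."""
    has_description = tracking_description(info_plist_bytes) is not None
    if has_description:
        if tracking_code_detected:
            return "Tracking description included and tracking code detected"
        return "Tracking description included but no tracking code detected"
    if tracking_code_detected:
        return "No tracking description included but tracking code detected"
    return "No tracking description included and no tracking code detected"


if __name__ == "__main__":
    # Minimal Info.plist without a tracking description, built for the demo.
    demo_plist = plistlib.dumps({"CFBundleName": "Demo"})
    print(classify(demo_plist, tracking_code_detected=True))
```

Localized descriptions in InfoPlist.strings are not covered by this sketch.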

Another effect we observed is that many descriptions are neither individualized nor useful. The table below lists the 10 most frequently used descriptions in the Top 2000 apps, with 77 apps just repeating the example text provided by Apple. Others just include placeholder text, such as “YOUR TEXT”, “NSUserTrackingUsageDescription”, “none” or even “-”.

#    Description
77   This identifier will be used to deliver personalized ads to you.
9    Your data will be used to deliver personalized ads to you.
7    Dadurch können wir Ihnen relevantere Werbung anzeigen, ohne deren Anzahl zu erhöhen.
6    Dies wird verwendet, um den Dienst zu identifizieren, der dich weitergeleitet hat, um die ein individuelles Erlebnis zu bieten.
6    Diese Kennung wird verwendet, um Ihnen personalisierte Anzeigen zu liefern.
6    Ihre Daten werden verwendet, um Ihnen personalisierte Werbung zu zeigen.
5    Your data will be used to deliver personalized ads.
4    Datenerhebung zur Verbesserung der App und für Werbezwecke zulassen
4    Deine Aktivitäten werden verwendet, um Dein Nutzererlebnis und Werbung zu personalisieren.
4    Mithilfe dieser ID können wir Dir für Dich ausgewählte Werbung anzeigen.
Top 10 Tracking Descriptions by Occurrence in German Top 2000 iOS Apps of all Categories except Games and Stickers (Appicaptor, July 2021)
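
Spotting such non-individualized descriptions is easy to automate. The following Python sketch is a rough illustration, not the method used for the evaluation: it flags the placeholder texts quoted above and Apple’s example text. The function names and the three rating labels are assumptions made for this example.

```python
from collections import Counter

# Example text provided by Apple, as listed in the table above.
APPLE_EXAMPLE = "This identifier will be used to deliver personalized ads to you."
# Placeholder texts observed in the evaluation (compared case-insensitively).
PLACEHOLDERS = {"your text", "nsusertrackingusagedescription", "none", "-"}


def rate_description(text):
    """Rate a tracking description as placeholder, Apple example or individual."""
    normalized = text.strip()
    if not normalized or normalized.lower() in PLACEHOLDERS:
        return "placeholder"
    if normalized == APPLE_EXAMPLE:
        return "apple example"
    return "individual"


def summarize(descriptions):
    """Count how often each rating occurs in a list of descriptions."""
    return Counter(rate_description(d) for d in descriptions)


if __name__ == "__main__":
    sample = ["YOUR TEXT", APPLE_EXAMPLE, "We use this to fund our free weather maps."]
    print(summarize(sample))
```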

This leaves us with the impression that creating a fitting description is currently deemed important by only about one third of the developers. Obviously, the motivation depends on the benefits the developers gain from providing a tracking description.

There are many business cases which rely on or benefit from cross-app user tracking. The players in these business cases (e.g., ad providers and app developers who generate ad revenue) have an interest in achieving high opt-in rates. Fitting or at least reasonable descriptions for the permission dialogue will be the key to broad acceptance. The current evaluation shows that (1) only a minority of apps include a description at all and (2) the included descriptions are very unspecific.

But there are also use cases which currently integrate cross-app user tracking although it has no benefit for the party using it. This is the case, for example, when an app developer integrates a runtime diagnostic library. As the developer is only interested in the telemetry data of their own app, cross-app tracking is not in their interest, and for that reason they may not include the description for the permission dialogue. In this case Apple’s initiative would help to reduce user tracking by companies that provide runtime diagnostic services with a business model of selling the retrieved analytics data sets to third parties, and in similar use cases.

Rise and Fall of Specific Unique Identifiers

Retrieving a unique identifier may allow app developers, advertisers, analytics companies and others to identify the user’s device or the user. Furthermore, most of these identifiers are persistent means for tracking, advertising and marketing activities on devices. Unique identifiers might, however, also be necessary for certain app functionality to work as expected.

Appicaptor tracks apps’ access to various unique identifiers, which can be categorized into three different groups:

  • The first group refers to mobile communication relevant IDs. Examples of this category are access to the phone’s IMEIs and MAC addresses, country code of the mobile network provider, as well as the phone / voice mail number, serial number of the SIM card and mobile subscriber ID (IMSI / TIMSI) of the user.
  • The second group is identification information about the hardware or operating system given by the operating system itself. When the mobile operating system is compiled, different parameters for model, hardware, serial and display size are included in the operating system build. Furthermore, a build fingerprint can distinguish different operating system builds even if the operating system version is equal.
  • The third group consists of
    • identifiers like the Android Device ID and Advertisement ID
    • properties that the user could configure like font size / type, audio volume, timezone, display orientation lock and screen brightness, Bluetooth pairings, power saving mode configuration, audio signal output port (speaker, headphones, Bluetooth, etc.)
    • installed app list
    • hardware parameters like CPU and set of available hardware sensors (gyroscope, barometer, …)
    • other parameters like battery or device memory (RAM and data) usage.

Every month Appicaptor evaluates the IT security quality of thousands of Android and iOS apps. The following two charts depict for each month which identifier usage is rising and which is falling. The charts plot the identifier usage (total number of apps within the 2,000 most popular apps in the German Google Play Store that access an identifier) relative to the identifier usage in January 2020.

Rising Unique Identifiers: identifier usage within the 2,000 most popular apps in the German Google Play Store relative to the identifier usage in January 2020
Falling Unique Identifiers: identifier usage within the 2,000 most popular apps in the German Google Play Store relative to the identifier usage in January 2020

As the relative change shown in the two charts does not reveal which identifiers are commonly used and to what extent, the following table provides the absolute numbers. It shows how many of the 2,000 most popular apps in the German Google Play Store accessed an identifier in the Appicaptor analysis runs of January 2020 and February 2021. Furthermore, based on all monthly analysis runs between January 2020 and February 2021, we predict whether the usage of an identifier is rising or falling.

Name                                 Identifier Usage    Identifier Usage     Trend
                                     (in January 2020)   (in February 2021)
Unique Android ID                    1,944               1,947                stable
Build model                          1,947               1,945                stable
Build manufacturer                   1,916               1,941
Build fingerprint                    1,679               1,873
Build product                        1,633               1,760
Build brand                          1,537               1,712
Build hardware                       1,179               1,632
Build display                        1,478               1,486                stable
Country Code + Mobile Network Code   922                 1,158
Build serial                         877                 1,016
Mobile Country Code                  862                 1,000
Wifi-MAC address                     754                 717
IMEI/MEID                            689                 591
MAC address(es)                      547                 380
Phone number                         281                 264
Subscriber ID (IMSI)                 312                 258
SIM card serial                      243                 178
Voice mail number                    69                  62                   stable
Total number of apps that access an identifier according to Appicaptor analysis of the 2,000 most popular apps in German Google Play Store
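
To relate the absolute numbers to the relative view of the charts, the February 2021 value can be divided by the January 2020 value. The following Python sketch does this for a few rows of the table. It only compares the two end points and is not the trend prediction over all monthly runs mentioned above. The function name is chosen for this example.

```python
# (usage in January 2020, usage in February 2021), taken from the table above
USAGE = {
    "Build fingerprint": (1679, 1873),
    "Build hardware": (1179, 1632),
    "MAC address(es)": (547, 380),
    "SIM card serial": (243, 178),
}


def relative_usage(january_2020, february_2021):
    """Usage in February 2021 relative to January 2020 (1.0 means unchanged)."""
    return february_2021 / january_2020


if __name__ == "__main__":
    for name, (before, after) in USAGE.items():
        print(f"{name}: {relative_usage(before, after):.0%} of the January 2020 usage")
```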

The analysis of Appicaptor shows that access to (generally speaking) unspecific unique identifiers (like the build-related parameters) is currently rising. One might think that access to unspecific unique identifiers (like the build brand or hardware) is not a privacy issue, since their values are identical on thousands of devices, and that only access to a more specific unique identifier (like the SIM serial or phone number) is a privacy issue. However, there is more to take into consideration.

A detailed manual inspection of access patterns and a look at the landscape of the mobile value chain show that most accesses to unspecific unique identifiers are executed in 3rd party libraries, which the developer includes in the app. Each of these unspecific pieces of information, seen alone, cannot be used to identify a specific device or person. But certain libraries access a multitude of these unspecific unique identifiers, create a device fingerprint from all of them and transmit the data to a server backend. As a further example, one open source library of this type claims to create a device identifier from all available Android platform signals that is fully stateless and remains the same after reinstalling the app or clearing application data.

Further manual inspection of other identified libraries also shows that libraries which probably perform device fingerprinting are used in many apps of different app types. The device fingerprint can be linked to your person as soon as an app that uses such a library also receives information about you (name, email address, etc.). That would put the provider of the library in the position to track your identity across different apps, based solely on unspecific unique identifiers.
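
How several unspecific values add up to a stable identifier can be illustrated with a short Python sketch. It does not reproduce the algorithm of any particular library. The signal names, the example values and the choice of SHA-256 are assumptions made only for this illustration.

```python
import hashlib


def device_fingerprint(signals):
    """Derive a stable identifier from a dict of unspecific device values.

    Sorting the keys makes the result independent of the order in which the
    values were collected. No state is stored, so the same device yields the
    same fingerprint again after an app reinstall.
    """
    canonical = "\n".join(f"{key}={signals[key]}" for key in sorted(signals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical values; each one alone is shared by many devices.
    device = {
        "build_brand": "examplebrand",
        "build_hardware": "examplechip",
        "build_fingerprint": "examplebrand/model/1.0",
        "timezone": "Europe/Berlin",
        "font_scale": "1.15",
    }
    print(device_fingerprint(device))
```

Each additional value narrows down the set of devices that share the same combination, which is why the number of accessed identifiers matters as much as their individual specificity.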

So what can we learn from these numbers?

  • The usage of almost all specific unique identifiers is currently falling. That trend is presumably related to privacy-preserving functions in the mobile operating systems that limit apps’ access to the correct values of these identifiers. If these privacy-preserving functions are enabled, fake random values are provided instead.
  • The usage of unspecific unique identifiers is currently rising across all identifiers. From our perspective, this rise is explained by the reasoning outlined above (device fingerprinting) and by the aim to keep identifying users despite the current drawbacks (an app cannot be certain whether the operating system reports correct or fake values for specific unique identifiers).

Therefore, in the app evaluation process one should take a look at the composition and size of the list of unique identifiers an app accesses: if many unspecific unique identifiers are accessed, this should draw one’s attention in the same way as access to a specific unique identifier does.

Known TwitterKit Vulnerability, still an Issue?

Nearly 18 months ago we reported in our Appicaptor blog that the Twitter Kit framework for iOS does not properly validate the TLS certificate. This vulnerability can be exploited for man-in-the-middle attacks.

Twitter stated that there will be no fixed version of the vulnerable library, because it is deprecated and no longer supported. Therefore, developers should have taken action and removed TwitterKit from their apps.

Every month Appicaptor evaluates the IT security quality of thousands of Android and iOS apps. So one might ask: Do the German Top 2000 iOS apps still use the vulnerable Twitter Kit library version, and is a trend visible? The following chart depicts for each month the number of apps within the 2,000 most popular apps in the iOS App Store that are prone to the vulnerability.

The chart shows that the number of apps with the vulnerable library decreases slowly and quite steadily throughout the year, reducing the risks for users. Although there is quite some fluctuation in the considered top list, a clear trend is visible, even if it took much longer than expected. Nevertheless, from our point of view, none of the most popular apps should have such a massive vulnerability. Therefore we would like to advise again that app developers switch to alternative APIs.

iOS: 40% Critical Log Statement Usage in Apps

Developers commonly need log statements in their apps to track down problems by printing out information about the current program state. However, this can also lead to serious information disclosure to third parties, as about 40% of the Top 2000 German iOS apps still use the old NSLog statement. Many developer sites state that information logged with NSLog will not be persisted on the device and that its usage is therefore not critical. However, that’s not correct, as we will demonstrate in the following for current iOS devices.

Over the years, Apple has changed a lot under the hood of iOS. Likewise, the logging mechanism has changed in multiple aspects. One major change was the introduction of unified logging with iOS 10, which provides log levels, information hiding for sensitive entries and many other configuration capabilities. However, these new features are only usable if the new os_log macro is used.

When using NSLog, the log messages are written with the default log level and stored persistently for a certain time. We tested this with iOS 12.4, iOS 13.3 and 13.4 on non-jailbroken devices. Depending on the usage intensity of the iOS device, the stored log messages can go back days or months.

Log entries on these iOS devices are stored system-wide in the directory /var/db/diagnostics/Persist in files of the tracev3 binary format, which can be made readable again, e.g. with the macOS log tool or, platform independently, with UnifiedLogReader. The stored database files are protected by the iOS Data Protection class NSFileProtectionCompleteUntilFirstUserAuthentication, i.e. with the device passcode until the user unlocks the device for the first time after boot, and can only be read by the administrative user root.

This means, in a lost-device scenario for an iOS device without passcode, these logging outputs can be read directly via USB using the iOS sysdiagnose function. If a passcode is set, the passcode is required to read the logging outputs.

However, users are often asked to send the sysdiagnose data to Apple or app developers (see instructions, e.g. https://faq.pdfviewer.io/en/articles/1458505-ios-how-to-send-a-sysdiagnose). In this case third parties get access to the log messages of all used apps (within the persisted time frame).

Among much other debug information, the transmitted file sysdiagnose_[date]_iPhone_OS_[device].tar.gz contains the file system_logs.logarchive. It is compressed and needs to be converted first to make use of it. This can be done quite easily on macOS.

The file system_logs.logarchive can be viewed on macOS with the log command:

log show system_logs.logarchive --info --debug > logs.txt

To use the UnifiedLogReader instead, one first has to extract the files from the system_logs.logarchive to a folder and start the python script inside this folder like this, e.g. on Windows systems:

python [path_to]\UnifiedLogReader.py .\ .\timesync\ .\Persist [output_folder]

In the output file logs.txt, one can search for the app binary name to find the related log messages. In the following example the binary is called TestApp:

2020-03-25 15:18:19.707346+0100 0x517db  Default  0x0  737 0 TestApp: My NSLog example GPS: {"geoData": {"latitude": 49.xxx, "longitude": 8.xxx, "radius": 2.818104}, "filters": {}, "exclude": []}
2020-03-25 15:18:25.635420+0100 0x517db  Default  0x0  737 0 TestApp: My NSLog example GPS: {"geoData": {"latitude": 49.xxx, "longitude": 8.xxx, "radius": 5.515101}, "filters": {}, "exclude": []}
2020-03-25 15:18:28.657474+0100 0x517db  Default  0x0  737 0 TestApp: My NSLog example GPS: {"geoData": {"latitude": 49.xxx, "longitude": 8.xxx, "radius": 10.625031}, "filters": {}, "exclude": []} 

In these log messages, we often find GPS positions along with email addresses, generated encryption keys, full credit card information and much other content entered by the user. Even for apps that do not primarily store sensitive data, the log can reveal sensitive information such as the names of sensitive apps, their installation dates and how and what was used inside the apps.
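
Searching the exported text for an app’s messages can also be scripted. The following Python sketch assumes the line layout of the example output above, where the message follows the binary name and a colon. The two regular expressions are rough illustrative patterns chosen for this example, not a complete detector for sensitive data.

```python
import re

# Rough illustrative patterns; real log content needs broader checks.
SENSITIVE_PATTERNS = {
    "geo position": re.compile(r'"(latitude|longitude)"\s*:'),
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def app_messages(lines, binary_name):
    """Yield the message part of all log lines written by the given binary."""
    marker = f" {binary_name}: "
    for line in lines:
        if marker in line:
            yield line.split(marker, 1)[1].strip()


def flag_sensitive(message):
    """Return the labels of all patterns that match the message."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(message)]


if __name__ == "__main__":
    sample = [
        '2020-03-25 15:18:19.707346+0100 0x517db  Default  0x0  737 0 '
        'TestApp: My NSLog example GPS: {"geoData": {"latitude": 49.xxx, '
        '"longitude": 8.xxx, "radius": 2.818104}, "filters": {}, "exclude": []}',
    ]
    for message in app_messages(sample, "TestApp"):
        print(flag_sensitive(message), message)
```

For a real export, the lines would be read from logs.txt instead of the inline sample.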

From a user’s perspective, it should now be clear:

  • Do not send sysdiagnose data to anyone!
  • Deny Apple access to this data (see https://support.apple.com/en-us/HT202100). However, this neither disables the logging nor prevents the creation of sysdiagnose files.
  • Use a good passcode to make unauthorized access to these files harder.

But the best protection is to use apps that don’t store sensitive data in log files!

So, run a test yourself and inspect your sysdiagnose file to learn more about your apps and the things they store.

Developers should take a look at iOS Unified Logging with the os_log macro. It can be used to programmatically enable persistent storage only in cases where it is needed for remote debugging (if that’s necessary at all). For all other cases it can be configured to use only console output, preventing data leakage via persistent log files.

Vulnerable Library Warning: TwitterKit for iOS

The Twitter Kit framework through 3.4.2 for iOS does not properly validate the TLS certificate for api.twitter.com. We found this issue with our static binary analysis and reported it to Twitter.

There will be no fixed version, as this library is no longer supported by Twitter, but the vulnerable library was still found in many apps. App developers are urgently advised to switch to alternative APIs.

Such issues are likewise common, which illustrates the need to check for vulnerable or outdated 3rd party code.

What makes this issue a bit special is the way the developers broke the validation of TLS certificates. Apparently, they wanted to increase the security by implementing a public key pinning of trusted root certificate authorities (CA), such as VeriSign, DigiCert and GeoTrust. So they created the following array with entries of 21 public key hashes for the CAs:

"1a21b4952b6293ce18b365ec9c0e934cb381e6d4",  // "VERISIGN_CLASS3_G2"
"2343d148a255899b947d461a797ec04cfed170b7",  // "VERISIGN_CLASS1"
"5519b278acb281d7eda7abc18399c3bb690424b5",  // "VERISIGN_CLASS1_G3"
"1237ba4517eead2926fdc1cdfebeedf2ded9145c",  // "VERISIGN_CLASS2_G2"   
"5abec575dcaef3b08e271943fc7f250c3df661e3",  // "VERISIGN_CLASS2_G3"    
"22f19e2ec6eaccfc5d2346f4c2e8f6c554dd5e07",  // "VERISIGN_CLASS3_G3"
"ed663135d31bd4eca614c429e319069f94c12650",  // "VERISIGN_CLASS3_G4"
"b181081a19a4c0941ffae89528c124c99b34acc7",  // "VERISIGN_CLASS3_G5"
"3c03436868951cf3692ab8b426daba8fe922e5bd",  // "VERISIGN_CLASS4_G3"
"bbc23e290bb328771dad3ea24dbdf423bd06b03d",  // "VERISIGN_UNIVERSAL"
"c07a98688d89fbab05640c117daa7d65b8cacc4e",  // "GEOTRUST_GLOBAL"
"713836f2023153472b6eba6546a9101558200509",  // "GEOTRUST_GLOBAL2"
"b01989e7effb4aafcb148f58463976224150e1ba",  // "GEOTRUST_PRIMARY"
"bdbea71bab7157f9e475d954d2b727801a822682",  // "GEOTRUST_PRIMARY_G2"
"9ca98d00af740ddd8180d21345a58b8f2e9438d6",  // "GEOTRUST_PRIMARY_G3"
"87e85b6353c623a3128cb0ffbbf551fe59800e22",  // "GEOTRUST_UNIVERAL"
"5e4f538685dd4f9eca5fdc0d456f7d51b1dc9b7b",  // "GEOTRUST_UNIVERSAL2"
"d52e13c1abe349dae8b49594ef7c3843606466bd",  // "DIGICERT_GLOBAL_ROOT"
"83317e62854253d6d7783190ec919056e991b9e3",  // "DIGICERT_EV_ROOT"
"68330e61358521592983a3c8d2d2e1406e7ab3c1",  // "DIGICERT_ASSUREDID_ROOT"
"56fef3c2147d4ed38837fdbd3052387201e5778d",  // "TWITTER1"

On every new connection, TwitterKit for iOS checks in the method evaluateServerTrust whether the received certificate chain contains a certificate whose public key matches an entry of the list above. This way, certificates for api.twitter.com issued by possibly untrusted CAs should be blocked. However, this approach lacks a very important verification: the domain name of the leaf certificate is not verified by iOS, as TwitterKit for iOS implements its own delegate method for its public key pinning functionality. In this case, iOS only verifies that the certificate chain is valid regarding signatures. All other checks have to be performed by the delegate method, to provide the flexibility for alternative verification methods. A simple fix would have been to additionally call the iOS function SecTrustEvaluate and use its result value to reject certificates for other domains.

Because of the missing domain name verification, any valid certificate chain containing a certificate with a public key hash from that list is accepted by the app. An attacker with a valid certificate for their own domain, issued by one of these CAs, can use this certificate for man-in-the-middle attacks against apps communicating with api.twitter.com via the Twitter Kit for iOS. As the implementation does not check the position inside the chain, the matching public key could also be in the middle of the chain, as in the case of an intermediate certificate.
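
The logic error can be reduced to a few lines. The following Python sketch is a simplified model and not the TwitterKit code: certificates are represented only by a host name and a public key hash, the class and function names are hypothetical, and signature validation of the chain is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    host: str             # subject name, simplified to a single host name
    public_key_hash: str


# One entry of the pinning list above (DIGICERT_GLOBAL_ROOT).
PINNED_HASHES = {"d52e13c1abe349dae8b49594ef7c3843606466bd"}


def flawed_pin_check(chain):
    """Accept the chain if ANY certificate matches a pinned hash."""
    return any(cert.public_key_hash in PINNED_HASHES for cert in chain)


def pin_check_with_hostname(chain, expected_host):
    """Additionally require the leaf certificate to match the contacted host."""
    if not chain or chain[0].host != expected_host:
        return False
    return flawed_pin_check(chain)


if __name__ == "__main__":
    # Legitimately issued certificate for the attacker's own domain.
    attacker_chain = [
        Certificate("attacker.example", "00" * 20),  # hypothetical leaf key hash
        Certificate("DigiCert Global Root CA", "d52e13c1abe349dae8b49594ef7c3843606466bd"),
    ]
    print(flawed_pin_check(attacker_chain))
    print(pin_check_with_hostname(attacker_chain, "api.twitter.com"))
```

In the model, the attacker’s chain passes the pin-only check but is rejected as soon as the leaf certificate is compared with the contacted host. Real host name matching (wildcards, subject alternative names) should be left to the platform’s trust evaluation rather than reimplemented.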

To verify the impact of the vulnerability, we used a matching legitimate certificate, issued by DigiCert for a domain under our control. We redirected the traffic for api.twitter.com to our server, which answers the request with our own certificate. The received content is logged and forwarded to the ‘real’ Twitter servers. The server’s response is also logged and passed back to the app. However, the login process for Twitter involves a WebView, which does not use the vulnerable pinning functionality and therefore would not accept our certificate. As the WebView loads its content via the same domain name, we had to distinguish TLS connections based on differences in the ALPN extension of the TLS Client Hello and route WebView connections without interception in order to create a fully working proof-of-concept attack.

During the Login with Twitter process of a vulnerable news app, our man-in-the-middle proxy recorded the OAuth 1.0 oauth_token_secret together with the authorized oauth_token. This enabled us to fully use the provided Twitter API with these long-term secrets. Attackers could perform actions like changing the content of the profile, creating fake tweets and direct messages or abusing the account to push tweets via fake likes. It would also be possible to read private direct messages sent or received within the last 30 days. We could neither retrieve the password nor set it to a known value, so an attacker could not use the vulnerability to lock a user out of the account. However, changing the Twitter password would not invalidate the sniffed OAuth tokens either. For this, the victim has to revoke the app’s authorization within the Twitter settings.

Further, on every app start the vulnerable app checks the validity of the Twitter account by invoking the Twitter API account/verify_credentials.json. If the credentials are valid, the response contains detailed information about the victim’s Twitter profile, such as ID, name, location and last activities. As the response can be read in our attack scenario, it can be used to collect information about the victim in order to track them or to dynamically create targeted phishing attacks.

We will demonstrate the vulnerability and its detection at it-sa 2019 fair in Nuremberg, Germany at Hall 9, Booth 234.