Researchers from Princeton University, conducting a privacy survey of the top one million web sites, discovered a variety of tracking and identification techniques in use, including a novel tactic that uses audio signals to fingerprint machines and browsers.
The Princeton study examined a slew of stateful and stateless tracking techniques, with the goal of measuring how the most popular sites track visitors. User tracking is one of the more controversial topics in the privacy and security communities, and site owners and publishers are looking for new ways to identify and make money from users as more and more of them employ ad blockers. The idea of tracking users across the Web with cookies and third-party identifiers is not new, but as users continue to learn about and resist these kinds of techniques, site owners and technology providers are searching for other methods to accomplish the task.
In the study, the Princeton researchers ran a set of measurements on the home pages of the Alexa top one million sites. In addition to finding typical cookie tracking, the researchers discovered a small number of sites that are using the HTML5 AudioContext API to fingerprint visitors. There are at least two different ways that sites are doing this fingerprinting, including one technique that produces an audio signal and then uses a script to process it.
“In the simplest case, a script from the company Liverail checks for the existence of an AudioContext and OscillatorNode to add a single bit of information to a broader fingerprint. More sophisticated scripts process an audio signal generated with an OscillatorNode to fingerprint the device. This technique appears conceptually similar to that of canvas fingerprinting,” the Princeton study says.
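The "simplest case" the paper describes can be sketched roughly as follows. This is an illustration only, not the LiveRail script itself: the helper names `audioSupportBit` and `buildFingerprint` are hypothetical, and the idea of joining feature bits into a string is an assumption made to show how a single bit might feed a broader fingerprint.

```typescript
// Hypothetical helper: returns 1 if the environment exposes both audio
// interfaces named in the paper, 0 otherwise. `env` defaults to the global
// object; passing a plain object makes the check testable outside a browser.
function audioSupportBit(
  env: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): 0 | 1 {
  // Older Safari versions exposed the constructor as webkitAudioContext.
  const hasContext =
    typeof env["AudioContext"] === "function" ||
    typeof env["webkitAudioContext"] === "function";
  const hasOscillator = typeof env["OscillatorNode"] === "function";
  return hasContext && hasOscillator ? 1 : 0;
}

// Hypothetical "broader fingerprint": a list of feature bits joined into a
// string. A real script would presumably mix in many other signals.
function buildFingerprint(bits: Array<0 | 1>): string {
  return bits.join("");
}
```

On its own, a bit like this says very little about a visitor. It only becomes useful when combined with other attributes.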
Among the sites the researchers found to have audio fingerprinting scripts running are Expedia, Hotels.com, Travelocity, and several other travel sites.
LiveRail, a company owned by Facebook, bills itself as a “monetization platform” and works with publishers. The researchers found that the LiveRail script was present on 512 of the one million sites they surveyed in March. That script simply checks for the presence of the AudioContext API and adds the result to the overall fingerprint. The more sophisticated scripts the researchers discovered go further: to fingerprint the browser and machine visiting a site, they hash the signal produced by the OscillatorNode and use the result as an identifier.
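A rough sketch of the hash-the-signal approach is below. The study does not say which hash function or oscillator settings the scripts use, so everything specific here is an assumption: the FNV-1a hash is swapped in purely for illustration, the function names `hashSamples` and `renderOscillator` are hypothetical, and the triangle waveform, 1000 Hz frequency and buffer length are arbitrary choices.

```typescript
// FNV-1a (32-bit) over the raw bytes of the sample buffer. The choice of
// hash is an assumption; any stable hash would serve the same purpose.
function hashSamples(samples: Float32Array): string {
  const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  // Left-pad to eight hex digits.
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Renders a short oscillator signal without playing it, using the browser's
// OfflineAudioContext. Returns null where that API is unavailable (for
// example, outside a browser).
async function renderOscillator(): Promise<Float32Array | null> {
  const Ctor = (globalThis as any).OfflineAudioContext;
  if (typeof Ctor !== "function") return null;
  const ctx = new Ctor(1, 4410, 44100); // 1 channel, 4410 frames, 44.1 kHz
  const osc = ctx.createOscillator();
  osc.type = "triangle";        // arbitrary waveform for this sketch
  osc.frequency.value = 1000;   // arbitrary frequency in Hz
  osc.connect(ctx.destination);
  osc.start(0);
  const rendered = await ctx.startRendering();
  return rendered.getChannelData(0);
}
```

In a browser, something like `renderOscillator().then(s => s && hashSamples(s))` would yield the identifier. Because the rendered samples depend on the audio stack, the same code can produce different hashes on different machine and browser combinations.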
“Audio signals processed on different machines or browsers may have slight differences due to hardware or software differences between the machines, while the same combination of machine and browser will produce the same output,” the paper says.
“We tested the output of the scripts on a small sample of machines, and confirmed the values returned are largely stable on the same machine and different for different machines.”
The use of audio to track devices and users is becoming a trend. Companies such as SilverPush are using audio beacons in TV ads to track users across multiple devices. Regulators have taken notice of the practice, and in March the FTC warned Android app developers about the use of the SilverPush code.
The Princeton researchers plan to continue their privacy survey on a monthly basis and will be reporting the results on their Web Census page.