A couple of weeks ago, a lobby group representing big internet providers such as Verizon and Comcast attacked a set of online privacy regulations that, according to the group, are far too strict. In a filing to the Federal Communications Commission, the group stated that providers should be able to sell customers' internet history without their authorization, because such information should not be considered sensitive. The group also argued that web traffic encryption is increasing significantly, making it impossible for providers to get access to this information.
Web traffic encryption is indeed rising. Statistics from Mozilla show that more than 50% of web pages use HTTPS, the standard way of encrypting web traffic. When websites like The Atlantic use HTTPS, a lock icon appears in the user's browser, meaning that the information sent to and from the site's servers is scrambled and can't be read by third parties that intercept it, including ISPs.
However, even if all websites were encrypted, ISPs would still be able to extract a large amount of detailed information about their customers' online activities. This matters a great deal in light of a bill that passed Congress this week, allowing ISPs to sell their customers' browsing history without their permission.
Even though a provider cannot see the exact URL of a page accessed through HTTPS, it can still see the domain the URL is on. For instance, if you visit a news website that uses HTTPS, your ISP cannot tell exactly which story you are reading, but it can still tell which site you are visiting. If you visit a page that doesn't use HTTPS, however, the ISP can extract much more sensitive information.
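As a rough illustration of this distinction, the following Python sketch splits a URL into the part an on-path observer such as an ISP can typically still learn and the part that HTTPS hides. It assumes the hostname leaks through the DNS lookup and the unencrypted server name in the TLS handshake. The function name and the example URLs are hypothetical, and the sketch ignores other signals such as IP addresses and traffic volume.

```python
from urllib.parse import urlsplit


def visible_to_isp(url: str) -> dict:
    """Return a simplified view of what an on-path observer can read.

    Assumption: over HTTPS only the hostname is exposed; over plain HTTP
    the full request, including path and query string, is readable.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Path and query travel inside the encrypted channel.
        return {"host": parts.hostname, "path": None, "query": None}
    # Unencrypted HTTP exposes the whole URL.
    return {"host": parts.hostname, "path": parts.path, "query": parts.query}


# Hypothetical URLs, for illustration only.
print(visible_to_isp("https://news.example.com/politics/story-123?ref=home"))
print(visible_to_isp("http://health.example.org/conditions/insomnia"))
```

In the first call only the site is revealed; in the second, the specific page is revealed as well.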
This example comes from a 2016 report by Upturn, a think tank that focuses on technology and civil rights. The report sets out some quite sneaky methods by which users' activity can be decoded based only on the unencrypted metadata that accompanies web traffic, also known as “side-channel information.” These tactics may not be widely used at the moment, but ISPs could deploy them if they decide to learn more about encrypted web traffic.
For instance, website fingerprinting uses the unique characteristics of a web page to reveal when it is being accessed. When a user visits a site, the browser pulls data from several servers in a particular order. Using these patterns, the internet provider may be able to tell what page the user is accessing without having access to any of the actual data being transported. For this to work, the ISP would have to have analyzed the page's loading pattern in advance.
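The idea can be sketched in a few lines of Python. This is only a toy model, not a method taken from the Upturn report: it assumes the observer has recorded in advance the ordered list of servers each page contacts, and it scores a new observation with the standard library's `difflib.SequenceMatcher`. The catalog, the hostnames, and the `min_score` cutoff are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical catalog built in advance: page -> ordered servers contacted.
KNOWN_PAGES = {
    "news.example.com/story-a": [
        "news.example.com", "cdn.example.net", "ads.example.org", "cdn.example.net",
    ],
    "news.example.com/story-b": [
        "news.example.com", "video.example.net", "cdn.example.net",
    ],
}


def guess_page(observed, catalog=KNOWN_PAGES, min_score=0.8):
    """Return the cataloged page whose load pattern best matches, or None."""
    best_page, best_score = None, 0.0
    for page, pattern in catalog.items():
        # ratio() is 1.0 for identical sequences and 0.0 for no overlap.
        score = SequenceMatcher(None, observed, pattern).ratio()
        if score > best_score:
            best_page, best_score = page, score
    # Refuse to guess when nothing in the catalog is similar enough.
    return best_page if best_score >= min_score else None


observed = ["news.example.com", "video.example.net", "cdn.example.net"]
print(guess_page(observed))
```

A real attack would likely also use transfer sizes and timing, which this sketch leaves out.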
In November last year, a group of researchers from Israel's Ben-Gurion and Ariel universities found a way to extend the website fingerprinting idea to YouTube videos. The researchers were able to tell which video from a limited set a particular user was watching by matching the encrypted data patterns created when the user viewed that video against an index they had created previously. The technique was 98% accurate.
The author of the research paper is Ran Dubin, a Ph.D. candidate at Ben-Gurion. Dubin says the discovery came about while he was working on optimizing video streaming. He wanted to know whether it was possible to figure out the quality at which users were watching YouTube videos, so he analyzed the way devices received data as they streamed. He ended up finding something big.
“The network patterns that belong to each video title have very, very strong meaning,” Dubin said. “I found out that I could actually recognize each stream.”
The giveaway, Dubin found, was embedded in the way devices choose a bitrate (an indicator of video quality) at which to stream the video. When a stream begins, the player receives spurts of data, which become more spaced out once the video has been playing for a while and the player has settled on a bitrate. The pattern of these spikes can be used to identify each video.
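To make the notion of a spike pattern concrete, here is a minimal Python sketch that turns a packet trace into a list of burst sizes. It is an assumption about how such a pattern might be extracted, not the researchers' procedure: the `burst_sizes` function, the `(timestamp, size)` packet format, and the half-second `max_gap` threshold are all hypothetical.

```python
def burst_sizes(packets, max_gap=0.5):
    """Group (timestamp_seconds, size_bytes) packets into bursts.

    A new burst starts whenever the gap since the previous packet
    exceeds max_gap seconds. Returns the total bytes in each burst.
    """
    bursts = []
    last_time = None
    for timestamp, size in sorted(packets):
        if last_time is None or timestamp - last_time > max_gap:
            bursts.append(0)  # open a new burst
        bursts[-1] += size
        last_time = timestamp
    return bursts


# Hypothetical trace: two packets, a pause, two more, a pause, one more.
trace = [(0.0, 1000), (0.1, 1500), (2.0, 800), (2.2, 700), (5.0, 1200)]
print(burst_sizes(trace))
```

Only packet sizes and arrival times are needed, and both remain visible to a network observer even when the payload is encrypted.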
The researchers took fingerprints from 100 YouTube videos, using a browser crawler to automatically download each video, and then cataloged the resulting data patterns. Then they analyzed the traffic patterns a device created while playing one of 2,000 videos, including the 100 target ones. By using an algorithm to match the stream to the nearest fingerprint, the researchers were able to tell when one of the target videos was being watched.
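The matching step can be approximated with a simple nearest-neighbor search. The sketch below is not the researchers' algorithm, whose details are not given here. It assumes each fingerprint is a fixed-length list of burst sizes, uses Euclidean distance (`math.dist`, available in Python 3.8 and later), and adds a hypothetical `max_distance` cutoff so that videos outside the target set are reported as unknown.

```python
import math

# Hypothetical fingerprints: video title -> burst sizes in kilobytes.
FINGERPRINTS = {
    "video-a": [900, 850, 400, 420, 410],
    "video-b": [300, 310, 305, 700, 690],
}


def identify(stream, fingerprints=FINGERPRINTS, max_distance=100.0):
    """Return the title of the nearest fingerprint, or None if too far."""
    title, distance = min(
        ((name, math.dist(stream, pattern)) for name, pattern in fingerprints.items()),
        key=lambda pair: pair[1],
    )
    # Streams far from every fingerprint are treated as non-target videos.
    return title if distance <= max_distance else None


print(identify([910, 840, 395, 430, 400]))   # close to video-a
print(identify([100, 100, 100, 100, 100]))   # matches nothing in the set
```

The cutoff matters in a setting like the one described, where most of the videos being played are not in the fingerprinted set.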
Dubin says that this technique could be used by law enforcement to identify users who are watching ISIS propaganda. However, it could also be used to compile users' viewing data and sell it to advertisers. This is where the privacy bill that just passed Congress comes in. If President Donald Trump signs the bill, ISPs will be able to sell their customers' data without having to ask for their permission.