This is an old concern I'm trying to revive to figure out a path forward on. I'm going to do my best to summarize this issue. The issue at hand is that codec support is not uniform across all users. The 'why' for this is that ffmpeg is not installed on all Linux machines, and that certain editions of Windows (the N editions sold in Europe and the KN editions sold in Korea) don't ship with the Media Feature Pack, which provides Windows Media Foundation (WMF).
I don't know for certain of any other situations where unusual codecs might be present and usable that allow fingerprinting or situations where common codecs would be missing or disabled (excepting, of course, about:config changes.) But it seems likely? If a user has installed a weird and unusual codec, the browser might still be able to play the video because the codec's on the system? Or perhaps a user is on an old Linux that has an old ffmpeg that doesn't support a newer codec?
I don't think we have data about how common/uncommon it is that Linux users don't have ffmpeg or Windows users are missing WMF; although an approximate answer was '90% or more people have it.' Firefox detects when a user tries to play a video with a codec and it doesn't work; MediaWMFNeeded is the most relevant; it redirects people to a page that tells them to install WMF:
https://support.mozilla.org/en-US/kb/fix-video-audio-problems-firefox-window...
https://searchfox.org/mozilla-central/source/dom/locales/en-US/chrome/dom/do...
https://searchfox.org/mozilla-central/rev/bee8cf15c901b9f4b0c074c9977da4bbeb...
So codec support is the issue, and it gets exposed through the platform in three ways:
- canPlayType - an 'old' API that asks if a <video> element can play a particular codec
- The Media Capabilities API - the new style API
- Just trying to play videos with different codecs and seeing if it works
In addition to exposing codec information; it also exposes things like whether certain codecs are hardware accelerated or if a machine can handle high resolution video or not.
My understanding is that the third one is essentially impossible to prevent; absent requiring the user to manually click play for each video.
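To make the concern concrete, here is a minimal Python sketch of how per-codec yes/no answers fold into an identifier, whichever of the three channels produces them. The codec list and the `probe` callbacks are hypothetical stand-ins for illustration; in a browser the probe would be canPlayType, Media Capabilities, or trial playback.

```python
from typing import Callable, Iterable

# Hypothetical probe list; a real test page would need a much longer one.
CODECS = [
    'audio/mp4; codecs="mp4a.40.2"',    # AAC
    'video/mp4; codecs="avc1.42E01E"',  # h264
    'video/webm; codecs="vp9"',         # vp9
]


def codec_bitmap(probe: Callable[[str], bool],
                 codecs: Iterable[str] = CODECS) -> str:
    """Return one '1'/'0' character per codec, in list order."""
    return "".join("1" if probe(mime) else "0" for mime in codecs)


# Simulated machines (assumptions, not measurements).
def linux_without_ffmpeg(mime: str) -> bool:
    return "vp9" in mime


def typical_desktop(mime: str) -> bool:
    return True


print(codec_bitmap(linux_without_ffmpeg))  # 001
print(codec_bitmap(typical_desktop))       # 111
```

Any two users whose bitmaps differ can be told apart on this signal alone, which is why the rare configurations matter more than the common one.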
As far as a path forward; there are a few ideas:
- Maybe we can allowlist codecs so only certain codecs are usable?
- In the short term; we can modify canPlayType and Media Capabilities to accurately report (effectively) whether or not the user has WMF/ffmpeg or not but lie about hardware acceleration and all other aspects. (We have a patch for this written.)
- Alternately; we could lie and say no one has it installed (this would be a bad experience for most users) or everyone has it installed (bad for a small percentage of users)
- Tor Browser could detect whether or not you have ffmpeg / WMF and inform you in some capacity that installing this software will make you less identifiable
- Tor can probably bundle ffmpeg in some capacity; but it can't bundle WMF due to licensing.
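The reporting options in that list can be compared side by side. The following Python sketch is only a model of the trade-off, not the patch mentioned above; the `Policy` names and the consequence strings are invented for illustration.

```python
from enum import Enum


class Policy(Enum):
    TRUTHFUL = "truthful"      # report real codec support
    CLAIM_NONE = "claim_none"  # pretend nobody has ffmpeg/WMF
    CLAIM_ALL = "claim_all"    # pretend everybody has ffmpeg/WMF


def reported_support(policy: Policy, really_supported: bool) -> bool:
    """What the query APIs would tell the page under each policy."""
    if policy is Policy.CLAIM_NONE:
        return False
    if policy is Policy.CLAIM_ALL:
        return True
    return really_supported


def consequence(policy: Policy, really_supported: bool) -> str:
    """Compare the reported answer with what the machine can actually do."""
    reported = reported_support(policy, really_supported)
    if reported and not really_supported:
        return "page is told yes, playback then fails"
    if not reported and really_supported:
        return "page is told no, playable media may never be offered"
    return "report matches reality"


for policy in Policy:
    for has_codecs in (True, False):
        print(policy.value, has_codecs, "->", consequence(policy, has_codecs))
```

Note that either lie only covers the query APIs: a page that simply tries to play a file still learns the real answer.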
All of these have some drawbacks, and we probably want to use data to make an informed decision. I don't think we know which codecs we care about, or which are installed by ffmpeg/WMF. I (personally) don't know if Firefox will pass non-standard codec requests out to the system to see "Hey do you happen to have a codec for 'randomweirdthing'?" I don't think we know what the install rate for ffmpeg/WMF is. We have some data about codec use; however I don't fully understand it: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_d...
There is a bug on file here https://bugzilla.mozilla.org/show_bug.cgi?id=1461454 ; however most discussion over the months has taken place in an email thread.
-tom
On 1/16/19 5:26 PM, Tom Ritter wrote:
> I don't know for certain of any other situations where unusual codecs might be present and usable that allow fingerprinting or situations where common codecs would be missing or disabled (excepting, of course, about:config changes.) But it seems likely? If a user has installed a weird and unusual codec, the browser might still be able to play the video because the codec's on the system? Or perhaps a user is on an old Linux that has an old ffmpeg that doesn't support a newer codec?
Last I cared about this, Firefox would happily use libgstreamer if it was installed as well, though my notes (read: code comments) indicate that upstream was deprecating support for this particular nightmare.
I solved the problem by explicitly whitelisting the codec providers that I bind mounted into the container (a handful of versions of libavcodec-ffmpeg and libavcodec), which likely wasn't perfect, but did solve the libgstreamer issue.
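A rough Python sketch of that allowlisting idea follows. The patterns and library filenames are illustrative assumptions, not the ones actually used in the container setup described here.

```python
from fnmatch import fnmatch
from typing import Iterable, List

# Hypothetical allowlist: only these codec providers may be exposed.
ALLOWED_PATTERNS = ["libavcodec.so.*", "libavcodec-ffmpeg.so.*"]


def allowed_providers(candidates: Iterable[str]) -> List[str]:
    """Keep only the library names that match an allowlisted pattern."""
    return [
        name for name in candidates
        if any(fnmatch(name, pattern) for pattern in ALLOWED_PATTERNS)
    ]


found = ["libavcodec.so.58", "libavcodec-ffmpeg.so.56", "libgstreamer-1.0.so.0"]
print(allowed_providers(found))
```

Anything not matched, libgstreamer included, would simply not be made visible to the browser.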
> In addition to exposing codec information; it also exposes things like whether certain codecs are hardware accelerated or if a machine can handle high resolution video or not.
Force disable hardware acceleration? It's what I did.
> - Tor can probably bundle ffmpeg in some capacity; but it can't
> bundle WMF due to licensing.
Time to pay money to MPEG-LA to ship something actually useful.
Regards,
Hi Tom!
Sorry for the delay in answering.
Tom Ritter:
> This is an old concern I'm trying to revive to figure out a path forward on. I'm going to do my best to summarize this issue.
Thanks for the great write-up!
[snip]
> As far as a path forward; there are a few ideas:
> - Maybe we can allowlist codecs so only certain codecs are usable?
> - In the short term; we can modify canPlayType and Media Capabilities to accurately report (effectively) whether or not the user has WMF/ffmpeg or not but lie about hardware acceleration and all other aspects. (We have a patch for this written.)
> - Alternately; we could lie and say no one has it installed (this would be a bad experience for most users) or everyone has it installed (bad for a small percentage of users)
> - Tor Browser could detect whether or not you have ffmpeg / WMF and inform you in some capacity that installing this software will make you less identifiable
> - Tor can probably bundle ffmpeg in some capacity; but it can't bundle WMF due to licensing.
> All of these have some drawbacks; and we probably want to use data to make an informed decision.
Yes, I agree with that, but I could imagine Tor Browser generally shipping the short-term solution you already have a patch for, to try it out and see how big the fallout is. We might not be able to help much, though, with the Media Capabilities API given that we are on ESR 60.
Skimming the spec a bit, following the advice of the spec in §4.1 in
https://wicg.github.io/media-capabilities/#security-privacy-considerations
seems not unreasonable to me at first glance. But I am still a bit unsure about the trade-offs here as I'd need to look closer at all the things besides the codec being exposed. Do we have a table somewhere showing the entropy those things add? CanPlayType and actually trying to play it give just a yes/no back per codec, right? How many bits get exposed by the Media Capabilities API?
When the spec is saying:
""" This information is expected to have a high correlation with other information already available to the web pages as a given class of device is expected to have very similar decoding/encoding capabilities. """
what other information available to web pages is it talking about? Do we spoof/deal with that already? And if so, would it help to take the decisions made for those into account while developing a defense here? And if not, would it help to tackle those other vectors together with the codec one?
> I don't think we know what codecs we care about; or are installed by ffmpeg/WMF. I (personally) don't know if Firefox will pass non-standard codec requests out to the system to see "Hey do you happen to have a codec for 'randomweirdthing'" I don't think we know what the install rate for ffmpeg/WMF is. We have some data about codec use; however I don't fully understand it: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_d...
I wonder how we could get the answer to all of those questions to actually make an informed decision as you suggest.
Georg
On Thu, 31 Jan 2019 at 10:51, Georg Koppen gk@torproject.org wrote:
> Skimming the spec a bit, following the advice of the spec in §4.1 in
> https://wicg.github.io/media-capabilities/#security-privacy-considerations
> seems not unreasonable to me at first glance. But I am still a bit unsure about the trade-offs here as I'd need to look closer at all the things besides the codec being exposed. Do we have a table somewhere showing the entropy those things add? CanPlayType and actually trying to play it give just a yes/no back per codec, right?
canPlayType can return yes, no, or maybe (literally the strings "probably", "", and "maybe"). The maybe case *appears* to happen when the container format of the media allows the codecs parameter but the parameter is not provided. Someone trying to fingerprint could induce the maybe case, but doing so would get them no benefit when they could get a definitive answer, so it is effectively 'yes/no'.
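The behaviour described above can be sketched with a toy model in Python. This is not Firefox's implementation; the `SUPPORTED` table and the parsing are simplified assumptions that only mirror the observed pattern: no codecs parameter gives "maybe", a codecs parameter gives a definite answer.

```python
# Toy stand-in for a browser's decoder table (hypothetical contents).
SUPPORTED = {("video/mp4", "avc1.42E01E"), ("video/webm", "vp9")}
CONTAINERS = {container for container, _ in SUPPORTED}


def can_play_type(mime: str) -> str:
    """Return 'probably', 'maybe', or '' like HTMLMediaElement.canPlayType."""
    container, _, params = mime.partition(";")
    container = container.strip()
    params = params.strip()
    if not params.startswith("codecs="):
        # Container known, codec unspecified: no definite answer possible.
        return "maybe" if container in CONTAINERS else ""
    codec = params[len("codecs="):].strip('"')
    return "probably" if (container, codec) in SUPPORTED else ""


def effective_answer(mime: str) -> bool:
    """Collapse the tri-state into the yes/no a fingerprinter cares about."""
    return can_play_type(mime) == "probably"


print(repr(can_play_type("video/mp4")))
print(repr(can_play_type('video/mp4; codecs="avc1.42E01E"')))
print(repr(can_play_type('video/mp4; codecs="hev1.1.6.L93.B0"')))
```

A fingerprinting script would always include the codecs parameter, so only the "probably" versus empty-string distinction carries information.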
> How many bits get exposed by the Media Capabilities API?
It exposes three bits of information about a request: supported, smooth, and powerEfficient. powerEfficient maps to whether the codec is hardware accelerated. smooth is always true except for non-hardware-accelerated VP9 on low power devices.
Effectively this means that Media Capabilities exposes:
a) If you don't have hardware-accelerated VP9, a fairly precise CPU benchmark
b) Whether certain codecs are hardware accelerated
c) The codecs supported
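As a back-of-the-envelope check on "three bits", the Python sketch below counts the distinct answers a single query can give. It assumes (this should be checked against the Media Capabilities spec) that an unsupported configuration always reports smooth and powerEfficient as false, and it gives only an upper bound, since real answers are strongly correlated across codecs and machines.

```python
import math
from itertools import product


def valid_states():
    """Enumerate the (supported, smooth, power_efficient) triples a query can yield.

    Assumption: unsupported implies smooth=False and power_efficient=False,
    so all unsupported triples collapse into a single state.
    """
    return [
        (supported, smooth, efficient)
        for supported, smooth, efficient in product([False, True], repeat=3)
        if supported or not (smooth or efficient)
    ]


def max_bits(num_queries: int) -> float:
    """Upper bound on entropy if every query were independent and uniform."""
    return num_queries * math.log2(len(valid_states()))


print(len(valid_states()))    # 5
print(round(max_bits(1), 2))  # 2.32
```

Under that assumption a single query yields at most about 2.3 bits rather than a full 3, and hardcoding smooth and powerEfficient would reduce it to the 1 bit carried by supported.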
I tried running down (b). It's pretty tough to get precise info. I found one case in WMFVideoMFTManager relating to .wmf playback but I couldn't easily figure out what hardware feature was involved. On Mac it appears that all modern Macs (2010 and newer) have hardware acceleration of h264. HEVC/h265 is hardware accelerated on some models. [0]
For the rest I couldn't figure out how we get that information from Mozilla's code =/ Wikipedia says: https://en.wikipedia.org/wiki/VP9#Hardware_implementations There's also opus, vp8...
AFAIK no one's built a media codec fingerprinting test page. It requires a listing of codecs to test for and I don't know an exhaustive list yet.
[0]
* MacBookPro October 2016 and newer: yes
* iMac 5K, Late 2015: yes
* iMac 21.5, 4K and 5K, June 2017: yes
* iMac Pro: yes
* any Mac mini, MacBook Air, Mac Pro or earlier releases of iMac or MacBook Pro are not
> When the spec is saying:
> """ This information is expected to have a high correlation with other information already available to the web pages as a given class of device is expected to have very similar decoding/encoding capabilities. """
> what other information available to web pages is it talking about? Do we spoof/deal with that already?
I imagine it's referring to the number of CPU cores as well as a rough CPU benchmark that could be done by WebGL/javascript...
We don't really intentionally try to protect against an attacker benchmarking your computer's performance; aside from making WebGL click to play, turning off the JIT optimizations....
> And if so, would it help here taking the decisions that got made for them into account while developing defense? And if not, would it help tackling those other vectors together with the codec one?
/shrug
Preventing benchmarking by someone who wants to do it against Tor Browser seems very difficult...
> > I don't think we know what codecs we care about; or are installed by ffmpeg/WMF. I (personally) don't know if Firefox will pass non-standard codec requests out to the system to see "Hey do you happen to have a codec for 'randomweirdthing'" I don't think we know what the install rate for ffmpeg/WMF is. We have some data about codec use; however I don't fully understand it: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_d...
> I wonder how we could get the answer to all of those questions to actually make an informed decision as you suggest.
I suppose step one is to make a fingerprinting test that identifies what codecs you can handle...
-tom
On Wed, 6 Feb 2019 at 11:33, Tom Ritter tom@ritter.vg wrote:
> I suppose step one is to make a fingerprinting test that identifies what codecs you can handle...
I started work on this; but got some tribal knowledge from jya and others: https://mozilla.logbot.info/media/20190206#c15931239-c15932085
Summarizing:
- Only AAC, h264, and vp9 may give different reports
- AAC Mac - always there
- AAC Windows - May not be present on Windows N or KN
- AAC Linux - Missing if ffmpeg is missing
- h264 Mac - always there. Almost always accelerated, except "hackintosh with nvidia cards and the mac pro 2013 trashcan"
- h264 Windows - always there; but not hardware accelerated if your GPU card is blocklisted
- h264 Linux - Missing if ffmpeg is missing or not compiled with h264 support. never hardware accelerated.
- vp9 Mac - Always present
- vp9 Windows - Always present, but may or may not be hardware accelerated. If not hardware accelerated, you can roughly benchmark the machine using different video size inputs and the resulting output of the smooth value
- vp9 Linux - Always present
So by hardcoding a response for hardware acceleration and the smooth value; the only data we would leak would be information about your ffmpeg install (or lack thereof) on Linux; and on Windows whether you're missing the Media Framework (which is a good indicator of being a European user AIUI).
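That conclusion can be checked mechanically. The Python sketch below transcribes the summary into a table of what can vary per codec and platform, then removes the signals that would be hardcoded. The labels "presence", "acceleration", and "smooth" are names chosen here for illustration.

```python
# (codec, platform) -> signals that can differ between users, per the summary.
MATRIX = {
    ("aac", "mac"): set(),
    ("aac", "windows"): {"presence"},       # missing on Windows N / KN
    ("aac", "linux"): {"presence"},         # missing without ffmpeg
    ("h264", "mac"): {"acceleration"},
    ("h264", "windows"): {"acceleration"},  # GPU blocklist
    ("h264", "linux"): {"presence"},        # ffmpeg missing or built without h264
    ("vp9", "mac"): set(),
    ("vp9", "windows"): {"acceleration", "smooth"},
    ("vp9", "linux"): set(),
}

# Signals the proposed defense would answer with fixed values.
HARDCODED = {"acceleration", "smooth"}


def residual_leaks(matrix, hardcoded):
    """Return the entries that still vary once hardcoded signals are removed."""
    return {
        key: varies - hardcoded
        for key, varies in matrix.items()
        if varies - hardcoded
    }


for (codec, platform), signals in sorted(residual_leaks(MATRIX, HARDCODED).items()):
    print(codec, platform, sorted(signals))
```

Only the presence entries for AAC on Windows and for AAC and h264 on Linux remain, which matches the ffmpeg and Media Framework leaks named above.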
-tom
Tom Ritter:
> On Wed, 6 Feb 2019 at 11:33, Tom Ritter tom@ritter.vg wrote:
> > I suppose step one is to make a fingerprinting test that identifies what codecs you can handle...
> I started work on this; but got some tribal knowledge from jya and others: https://mozilla.logbot.info/media/20190206#c15931239-c15932085
Thanks for getting this started, really appreciated! What's the story on Android (given that we are not far away from shipping stable Tor Browser bundles to mobile users)?
> Summarizing:
> - Only AAC, h264, and vp9 may give different reports
I think it would still be good to verify that for other codecs, if only to catch possible regressions.
> - AAC Mac - always there
> - AAC Windows - May not be present on Windows N or KN
> - AAC Linux - Missing if ffmpeg is missing
> - h264 Mac - always there. Almost always accelerated, except "hackintosh with nvidia cards and the mac pro 2013 trashcan"
> - h264 Windows - always there; but not hardware accelerated if your GPU card is blocklisted
> - h264 Linux - Missing if ffmpeg is missing or not compiled with h264 support. never hardware accelerated.
> - vp9 Mac - Always present
> - vp9 Windows - Always present, but may or may not be hardware accelerated. If not hardware accelerated, you can roughly benchmark the machine using different video size inputs and the resulting output of the smooth value
> - vp9 Linux - Always present
> So by hardcoding a response for hardware acceleration and the smooth value; the only data we would leak would be information about your ffmpeg install (or lack thereof) on Linux; and on Windows whether you're missing the Media Framework (which is a good indicator of being a European user AIUI).
Yep. So, let's get started with that approach.
FWIW: I am not sure where we are with the Display Capabilities section of the spec but §4.2 seems to show the way forward with respect to the resist fingerprinting mode.
Georg
On Thu, 7 Feb 2019 at 10:08, Georg Koppen gk@torproject.org wrote:
> Tom Ritter:
> > On Wed, 6 Feb 2019 at 11:33, Tom Ritter tom@ritter.vg wrote:
> > > I suppose step one is to make a fingerprinting test that identifies what codecs you can handle...
> > I started work on this; but got some tribal knowledge from jya and others: https://mozilla.logbot.info/media/20190206#c15931239-c15932085
> Thanks for getting this started, really appreciated! What's the story on Android (given that we are not far away from shipping stable Tor Browser bundles to mobile users)?
I'm told that it's approximately the same situation as Linux. h264 and AAC are technically provided by the OS, but they're "always" there... which means that there is some small fraction of users who are probably looking pretty unique...
https://mozilla.logbot.info/developers/20190208#c15942787-c15942846
> > Summarizing:
> > - Only AAC, h264, and vp9 may give different reports
> I think it would still be good to verify that for other codecs and be it just to catch possible regressions.
Yes; although this would be best checked with fpcentral; since we'd be looking for a machine with weird hardware rather than our very similar testing machines.
-tom
Tom Ritter:
> - Tor can probably bundle ffmpeg in some capacity
I've always wondered why Linux TB doesn't bundle way more libraries than it does. For example, there was a ticket where IIRC a font fingerprinting issue was traced back to libm, of all things.
Obviously download size is a factor, but maybe not such a major one these days, compared to the benefits of permanently squashing a ton of intra-Linux fingerprinting issues, present and future.
Has anyone experimented with a Linux TB package that just ships *everything*, right down to libX11/libxcb, libc, etc.? Like almost a container filesystem, although it wouldn't necessarily make use of any kernel namespace stuff. What size would it be?
Rusty