- From: Mike West <mkwst@google.com>
- Date: Tue, 9 Dec 2014 12:34:33 +0100
- To: Brian Smith <brian@briansmith.org>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>, Jochen Eisinger <eisinger@google.com>
- Message-ID: <CAKXHy=dxbBCsnuAUiOOSjFdWHQ97FGAwMJZbVhPCHSMZ_yEyVw@mail.gmail.com>
On Wed, Dec 3, 2014 at 10:23 PM, Brian Smith <brian@briansmith.org> wrote:
> Hi,
>
> I've now written down my ideas on how to improve browsers' handling of
> the Referer header field:
>
> https://briansmith.org/referrer-01.html

Thanks for putting this together, Brian! It's a well thought-out plan, and I think there are a lot of good ideas there to dig through. That said, I've coincidentally been chatting with some folks inside Google about a similar proposal since early November (see chrome://flags/#reduced-referrer-granularity for the public bits of that work).

My TL;DR is that the proposal as written would be quite harmful to the way folks make money on the web. That's a strong claim, so I'll elaborate a bit. :)

# Advertising + Full URL for subresource requests

Your proposal simplifies the model by limiting cross-origin subresource referrer information to 'origin' (or 'none') by default, and disallowing an opt-in above that default. I don't believe that's feasible in the short-to-medium term.

The central issue that I've heard internally is that our ads teams have a hard requirement to know the URL of the page embedding an advertisement. We certainly use this information for content targeting and building user profiles, and dropping it would cost some amount of money. However, the most critical driver of this requirement actually comes from policy enforcement: we must not serve ads on certain kinds of pages (see [1] for examples), and if we can't crawl the page to determine its content, we can't make these policy guarantees. Aside from just being a bad thing to do, there are regulatory implications to lapses here.

I talked with folks about alternative mechanisms for transmitting the URL (e.g. GET parameters, postMessage, etc).
I think that's an area still worth exploring, but two aspects were raised as concerning:

* URLs are limited to ~2k, and we already see truncation on a semi-regular basis (in the low hundreds of millions of requests a day). Cramming an encoded URL into a GET parameter would bring us above that limit more frequently.

* For a variety of reasons, JavaScript-driven embedding is significantly less reliable than parser-driven embedding. I heard claims of >1% loss between a JS-requested image and a plain <img> tag (I'm following up on those claims for details).

The conclusion internally is that any change to the defaults would need a mechanism for publishers to opt in to the status quo for subresource requests (and would result in many/most doing so).

# Navigational requests

The proposal blocks cross-origin navigational referrer information entirely (with a carveout for HTTPS->HTTPS opt-in to 'origin'). A similar aspect of my internal proposal worried our fraud detection team, who often use navigational referrer information to discover and analyze some groups of scammers. Dropping that data would cause some amount of financial impact.

# Compatibility

In your proposal, you mention sites that use referrer information as part of an access control scheme. Folks internally raised similar concerns. It's difficult to estimate what percentage of sites would be affected, but it's certainly >0; it's not clear to me how your proposal addresses those concerns.

# Analytics

Obviously, this would have a substantial impact on folks' ability to analyze traffic patterns incoming to their sites. Equally obviously, this is a privacy impact we'd like to address. It seems like there's a balance to be struck, and I'd suggest that 'origin' is closer than 'none' to a reasonable position.

# Anyway:

I very much like the notion of separating subresource and navigational referrer information controls. We should do that.
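
A rough sketch of what separate subresource and navigational controls could look like, using only the defaults described in this message: cross-origin subresource requests send 'origin', cross-origin navigations send nothing, and navigations may opt in to 'origin' only for HTTPS->HTTPS. Everything else here is an assumption made for illustration and comes from neither proposal: the function names `origin_of` and `referrer_for`, the policy strings, the 'full' subresource opt-in (the status-quo escape hatch argued for above), and the choice to leave same-origin requests untouched.

```python
from typing import Optional
from urllib.parse import urldefrag, urlsplit

# Hypothetical policy names; 'full' on subresources models the publisher
# opt-in to the status quo, which the proposal as described would disallow.
ALLOWED = {
    "subresource": {"none", "origin", "full"},
    "navigation": {"none", "origin"},
}


def origin_of(url: str) -> str:
    """Return 'scheme://host[:port]' (no default-port normalisation)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def referrer_for(page_url: str, target_url: str, kind: str,
                 subresource_policy: str = "origin",
                 navigation_policy: str = "none") -> Optional[str]:
    """Pick the referrer value for one request; None means send nothing."""
    if kind not in ALLOWED:
        raise ValueError(f"unknown request kind: {kind!r}")
    policy = subresource_policy if kind == "subresource" else navigation_policy
    if policy not in ALLOWED[kind]:
        raise ValueError(f"policy {policy!r} not allowed for {kind}")

    full = urldefrag(page_url).url  # fragments are never sent
    # Assumption: same-origin requests keep the full URL.
    if origin_of(page_url) == origin_of(target_url):
        return full

    # Navigational 'origin' is only honoured for HTTPS->HTTPS.
    if kind == "navigation" and policy == "origin":
        both_https = (urlsplit(page_url).scheme == "https"
                      and urlsplit(target_url).scheme == "https")
        if not both_https:
            policy = "none"

    if policy == "none":
        return None
    if policy == "origin":
        return origin_of(page_url) + "/"
    return full  # policy == "full"
```

Keeping the two policies as separate parameters means a publisher could opt a page's ad requests back in to the full URL without changing what outgoing link clicks reveal.
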
I also very much like the idea of separating a page-level policy out from a policy for specific requests/links. We should explore that. `rel="whatever-referrer"` makes sense to me for links; I think a similar attribute would be valuable for other request-generating tags, like <iframe>, <form>, etc.

Thanks again for putting this together!

+Jochen, who might or might not agree with anything I've said above. :)

[1]: https://support.google.com/adsense/answer/1348688?hl=en&topic=1271507&rd=1

-mike

--
Mike West <mkwst@google.com>
Google+: https://mkw.st/+, Twitter: @mikewest, Cell: +49 162 10 255 91

Google Germany GmbH, Dienerstrasse 12, 80331 München, Germany
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Graham Law, Christine Elizabeth Flores
(Sorry; I'm legally required to add this exciting detail to emails. Bleh.)
Received on Tuesday, 9 December 2014 11:35:22 UTC