- From: Sergey Shekyan <shekyan@gmail.com>
- Date: Mon, 16 Jan 2017 13:47:30 -0800
- To: Jonathan Garbee <jonathan.garbee@gmail.com>
- Cc: Daniel Veditz <dveditz@mozilla.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAPkvmc_h54RpyVq4v4xN_2QZLMvbZP5su=ogy3agL5nuvVp7jQ@mail.gmail.com>
robots.txt is an on/off switch, while what I propose is more granular, allowing websites to choose how to respond.

On Sat, Jan 14, 2017 at 5:52 AM, Jonathan Garbee <jonathan.garbee@gmail.com> wrote:

> I don't see where having a header or something to help detect automated
> access would be beneficial. We can already automate browser engines;
> headless mode is just a native way to do it. So, if someone is already not
> taking your robots.txt into account, they'll just use another method or
> strip out whatever we add to say headless mode is in use. Sites don't gain
> any true benefit from having this kind of detection. If someone wants to
> automate tasks they do regularly, that's their prerogative. We have
> robots.txt as a respectful way to ask people automating things to avoid
> certain areas and actions, and that extends naturally to headless mode.
>
> On Sat, Jan 14, 2017, 4:28 AM Sergey Shekyan <shekyan@gmail.com> wrote:
>
>> I am talking about tools that automate user agents, e.g. headless
>> browsers (PhantomJS, SlimerJS, headless Chrome), Selenium, curl, etc.
>> I mentioned navigation requests because I don't yet see how advertising
>> automation on non-navigation requests would help.
>> Another option would be to advertise via a property on the navigator
>> object, which would defer any possible action by authors to the second
>> request.
>>
>> On Sat, Jan 14, 2017 at 12:56 AM, Daniel Veditz <dveditz@mozilla.com> wrote:
>>
>>> On Fri, Jan 13, 2017 at 5:11 PM, Sergey Shekyan <shekyan@gmail.com> wrote:
>>>
>>>> I think that attaching an HTTP request header to synthetically initiated
>>>> navigation requests (https://fetch.spec.whatwg.org/#navigation-request)
>>>> will help authors build more reliable mechanisms to detect unwanted
>>>> automation.
>>>
>>> I don't see anything in that spec about "synthetic" navigation requests.
>>> Where would you define that? How would you define it? Is a scripted
>>> window.open() in a browser "synthetic"? What about an iframe in a page?
>>> Does it matter whether the user expected the iframe to be there or not
>>> (such as ads)? What if the page had 100 iframes?
>>>
>>> Are you trying to solve the same problem robots.txt is trying to solve?
>>> If not, what kind of automation are you talking about?
>>>
>>> -
>>> Dan Veditz
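To make the header proposal concrete, here is a minimal server-side sketch of the "more granular" response Sergey contrasts with robots.txt's allow/deny. The header name `Navigation-Automated` and its semantics are invented for illustration; the thread never settles on a name or value grammar.

```typescript
// Hypothetical handling of an automation-advertising request header.
// "Navigation-Automated" is an assumed name, not anything specified
// in the thread or in the Fetch spec.
import * as http from "http";

const server = http.createServer((req, res) => {
  const automated = req.headers["navigation-automated"]; // hypothetical header

  if (automated !== undefined) {
    // Granular choices a site could make, rather than robots.txt's
    // binary allow/deny: serve a simplified page, rate-limit,
    // present a challenge, or just log and continue.
    res.setHeader("Cache-Control", "no-store");
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Simplified response for declared automation\n");
    return;
  }

  res.writeHead(200, { "Content-Type": "text/html" });
  res.end("<p>Full interactive page</p>");
});

server.listen(8080);
```

The point of the sketch is that the site, not the client, picks the policy: it can degrade, challenge, or ignore the declaration instead of only allowing or denying access.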
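And a client-side sketch of the navigator-property variant Sergey mentions. The property is a stand-in: `navigator.webdriver`, which the WebDriver spec defines for browsers under automation, is the closest shipped analogue, and the `/automation-report` endpoint is invented for illustration. Because the flag is only readable by script delivered in the first response, any author action is necessarily deferred to the second request, exactly as Sergey notes.

```typescript
// First response ships this script; it reads an automation flag and
// reports it, so the server can decide how to treat the *second* request.
// The cast avoids depending on a particular lib.dom version; the
// reporting endpoint is hypothetical.
const automated = (navigator as { webdriver?: boolean }).webdriver === true;

void fetch("/automation-report", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ automated }),
});

export {};
```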
Received on Monday, 16 January 2017 21:48:23 UTC