- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 6 Apr 2009 21:54:47 +0300
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: Simon Pieters <simonp@opera.com>, Doug Schepers <schepers@w3.org>, HTML WG <public-html@w3.org>, "www-svg@w3.org" <www-svg@w3.org>
On Apr 1, 2009, at 10:37, Jonas Sicking wrote:

>>>>>> Problems with 2:
>>>>>> Just stripping a leading and trailing "<![CDATA[" / "]]>" would
>>>>>> break markup like:
>>>>>> <style>
>>>>>> <![CDATA[
>>>>>> rect { fill: yellow; }
>>>>>> ]]>
>>>>>> <![CDATA[
>>>>>> circle { fill: blue; }
>>>>>> ]]>
>>>>>> </style>
>>>>>>
>>>>>> which probably happens occasionally due to copy-n-pasting.
>>>>
>>>> I don't like this, because it requires going back and modifying
>>>> buffers that had been already built instead of just tweaking
>>>> forward-only tokenizer state transitions, and it doesn't even work
>>>> in the case where there are multiple CDATA sections as shown above.
>>>> If we end up doing something other than what's currently in the
>>>> draft, I'd much rather have what Simon proposes as #4.
>>>
>>> The stripping doesn't happen at a tokenizer stage. It happens after
>>> all parsing is done when the inline data is taken from the DOM and
>>> passed to the serializer.
>>
>> Do you mean passed to the script engine?
>
> Yes, thanks.
>
>> So the string "<![CDATA[" would appear in the content of the text
>> node in the DOM?
>
> Yes

If "<![CDATA[" ends up in the DOM, I think the end result would be more robust if the step that hands DOM data to the CSS or JS parser didn't try to drop "<![CDATA[" and "]]>". Instead, the JS and CSS parsers could be changed to treat those strings as comments, i.e. like "/* */". This way, they wouldn't be dropped from within potentially existing string literals. This approach would cause notable leakage of the SVG-in-text/html feature into other parts of a browser engine, though, which isn't very nice.

Also, I'm a bit concerned that letting "<![CDATA[" and "]]>" reach the DOM would result in those strings being escaped as "&lt;![CDATA[" and "]]&gt;" if serialized to XML, so going back and forth a couple of times through a real serializer and via copying and pasting would result in some ugly cruft.
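To make the concerns above concrete, here is a minimal Python sketch. It is not how any browser implements this; the helpers strip_outer_cdata and strip_all_cdata are hypothetical names invented for illustration, and the text node content is modelled as a plain string. It shows that stripping only the outer markers leaves the inner ones behind when there are two sections, that dropping every marker damages string literals, and what an XML serializer's escaping does to the markers.

```python
from xml.sax.saxutils import escape

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def strip_outer_cdata(text: str) -> str:
    """Hypothetical helper: drop one leading and one trailing CDATA marker."""
    body = text.strip()
    if body.startswith(CDATA_OPEN) and body.endswith(CDATA_CLOSE):
        body = body[len(CDATA_OPEN):-len(CDATA_CLOSE)]
    return body


def strip_all_cdata(text: str) -> str:
    """Hypothetical helper: drop every CDATA marker, wherever it appears."""
    return text.replace(CDATA_OPEN, "").replace(CDATA_CLOSE, "")


# Case 1: two CDATA sections in one <style>, as in the quoted example.
two_sections = (
    "<![CDATA[\nrect { fill: yellow; }\n]]>\n"
    "<![CDATA[\ncircle { fill: blue; }\n]]>\n"
)
outer_stripped = strip_outer_cdata(two_sections)
# The inner "]]>" and "<![CDATA[" are still in what the CSS parser would see.
leftover_markers = CDATA_OPEN in outer_stripped and CDATA_CLOSE in outer_stripped

# Case 2: dropping every marker also rewrites script string literals.
script = 'var open = "<![CDATA["; var close = "]]>";'
damaged_script = strip_all_cdata(script)

# Case 3: markers that sit in a text node get escaped on XML serialization.
escaped_open = escape(CDATA_OPEN)
escaped_close = escape(CDATA_CLOSE)
```

Treating the markers as comment delimiters inside the CSS and JS parsers would avoid case 2, because the parser already knows when it is inside a string literal; a string replacement done before parsing does not.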
>> What about <![CDATA[ in SVG subtrees outside <script> and <style>?
>> It's useful for graceful degradation but still involves feedback to
>> the tokenizer unless supported anywhere outside foreign content as
>> well.
>
> I think that is mostly an orthogonal issue. But I would like
> <![CDATA[ ]]> to be parsed as in XML both in foreign content mode
> and in normal mode. To keep things consistent.

I think it's relevant in two ways:

1) If the syntax behaves as in XML outside <script> and <style> but not as in XML inside <script> and <style>, the result may be confusing.

2) Having CDATA sections that behave like XML CDATA sections in HTML5 parsers but like bogus comments in earlier browsers is useful for hiding SVG text from old browsers for graceful degradation. However, if this syntax causes feedback from the tree builder to the tokenizer, we haven't managed to completely eliminate the (non-trivial) feedback to the tokenizer, meaning the other efforts to do so wouldn't be very useful.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 6 April 2009 18:55:30 UTC