Reproduction steps: 1. Insert "foo bar _baz_" into a text field of contenteditable area 2. option+delete or control+backspace after _baz_ Expected result: _baz_ is deleted Actual result: bar _baz_ is deleted
I don't know why but the bug no longer reproduces on Safari / ToT WebKit for me. It appears that this is something to do with ICU shipped with Chromium.
Actually, I can reproduce this with r92646 on Lion.
(In reply to comment #2) > Actually, I can reproduce this with r92646 on Lion. Lion ships with a newer ICU, correct?
+mitz, +ap since they tend to know more about ICU than I do.
Actually, I cannot reproduce in any nightlies, but I can reproduce with local builds. Perhaps there is broken debug only code (like ASSERT(a = b) instead of ASSERT(a == b))?
After talking with ap on IRC, it appears that this is a bug inside findWordBoundary in platform/text/TextBoundaries.cpp
> Actually, I cannot reproduce in any nightlies, but I can reproduce with local builds My mistake - I had local changes in my tree. The problem occurs when using cross platform implementation of findWordBoundary() - perhaps code that's calling it expects Mac behavior.
http://crbug.com/92116
I have checked the issue with some other words and below is the observation: 1: foo bar -baz_ --> control+backspace --> foo bar - 2: foo bar @baz_ --> control+backspace --> foo bar @ 3: foo bar _baz_ --> control+backspace --> foo The below code in method "findNextWordFromIndex" is used to calculate the position of character after ctrl+bkspace:- position = textBreakPreceding(it, position); case 1 and 2 ============ returns the position text just after the special character ie (b) whose position is 9. So the isAlphanumeric check passes and returns the same position. case 3 ====== The position returned is 8, i.e position of (_) . So the isAlphanumeric check fails and it continues to find the next position. And it returns the position of 4 which is the start of 2nd word. So it deletes upto that point. The "findNextWordFromIndex" method uses the ICU library to determine the position. So I think the ICU library should also returns the correct position when underscore(_) is used. Also what should be the right output: foo bar _baz_ --> control+backspace --> foo bar OR foo bar _baz_ --> control+backspace --> foo bar _ Please provide your feedback.
(In reply to comment #9) > case 1 and 2 > ============ > returns the position text just after the special character ie (b) whose position is 9. So the isAlphanumeric check passes and returns the same position. > > case 3 > ====== > The position returned is 8, i.e position of (_) . So the isAlphanumeric check fails and it continues to find the next position. And it returns the position of 4 which is the start of 2nd word. So it deletes upto that point. > > The "findNextWordFromIndex" method uses the ICU library to determine the position. So I think the ICU library should also returns the correct position when underscore(_) is used. It would be interesting to know the difference between findWordBoundary used by mac port and findWordBoundary used by other ports.
Some more analysis on the output of different ports:- Windows port ============ 1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar - 2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @ 3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo GTK port ========= 1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar - 2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @ 3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo mac port ======== 1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar - 2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @ 3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo bar So it seems that all the ports (that I looked), returned same values. But the mac port is dispalying correct result. (should the output be "foo bar" or "foo bar _" ?). Also checked the mac port and found that there is no "isAlphanumeric" check for the characters on the position reurned.It simply returns the position. File: platform/text/mac/TextBoundaries.mm method: findNextWordFromIndex()
Created attachment 144743 [details] Reference Patch. I couldn't get much information about the "isAlphanumeric" check by looking at track. The "TextBoundaries.mm" was added in change-set "http://trac.webkit.org/changeset/13187" and the method "findNextWordFromIndex" is nearly identical. The original file was KWQTextUtilities.mm. Also the "TextBoundaries.cpp " was originally "TextBoundariesICU.cpp". Was moved in change-set "http://trac.webkit.org/changeset/50977". Also earlier the check was for "u_isalnum" and in this change-set it was changed to "isAlphanumeric". For testing, I removed the "isAlphanumeric" check and this issue was working. I am attaching the reference patch here to see its effect on other test cases (using EWS).
Attachment 144743 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/platform/text/TextBoundarie..." exit_code: 1 Source/WebCore/platform/text/TextBoundaries.cpp:69: An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement. [readability/control_flow] [4] Total errors found: 1 in 1 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 144743 [details] Reference Patch. Attachment 144743 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/12846865 New failing tests: editing/execCommand/toggle-unlink.html editing/execCommand/createLink.html editing/selection/caret-mode-paragraph-keys-navigation.html editing/execCommand/button.html editing/pasteboard/paste-blockquote-3.html editing/selection/4932260-2.html editing/execCommand/hilitecolor.html editing/spelling/spelling-backspace-between-lines.html editing/pasteboard/page-zoom.html editing/execCommand/remove-format-multiple-elements.html editing/pasteboard/merge-end-5.html editing/style/make-text-writing-direction-inline.html editing/pasteboard/do-not-copy-unnecessary-styles.html editing/pasteboard/copy-text-with-backgroundcolor.html editing/inserting/insert-before-link-1.html editing/execCommand/toggle-link.html editing/selection/extend-by-word-002.html editing/deleting/delete-cell-contents.html editing/style/5065910.html editing/deleting/smart-editing-disabled.html editing/execCommand/unlink.html editing/selection/4932260-3.html editing/deleting/delete-by-word-001.html editing/selection/extend-selection-enclosing-block.html editing/deleting/delete-by-word-002.html editing/pasteboard/merge-end-borders.html editing/pasteboard/paste-list-004.html editing/execCommand/query-command-state.html
Created attachment 144824 [details] Archive of layout-test-results from ec2-cr-linux-02 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-02 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
(In reply to comment #12) > For testing, I removed the "isAlphanumeric" check and this issue was working. > I am attaching the reference patch here to see its effect on other test cases (using EWS). Please run the tests locally before posting them on Bugzilla. We have finite resource for EWS, and spamming them with patches that may or may not work isn't a great way to utilize it.
(In reply to comment #16) > Please run the tests locally before posting them on Bugzilla. We have finite resource for EWS, and spamming them with patches that may or may not work isn't a great way to utilize it. yes, I will keep in mind next time. The only reason I used is I don't all the ports with me, and I wanted to test the other lay out tests with the change.
Similar fixed by Blink here - https://chromium.googlesource.com/chromium/src.git/+/d73a0995d91e0f059e43f7b0034b4d1badad0587
(In reply to Ahmad Saleem from comment #18) > Similar fixed by Blink here - > https://chromium.googlesource.com/chromium/src.git/+/ > d73a0995d91e0f059e43f7b0034b4d1badad0587 This compiles: f (forward) { position = ubrk_following(it, position); while (position != UBRK_DONE) { // We stop searching when the character preceeding the break is alphanumeric or underscore. if (static_cast<unsigned>(position) < text.length() && (u_isalnum(text[position - 1]) || text[position - 1] == lowLine)) return position; position = ubrk_following(it, position); } return text.length(); } else { position = ubrk_preceding(it, position); while (position != UBRK_DONE) { // We stop searching when the character following the break is alphanumeric or underscore. if (position && (u_isalnum(text[position]) || text[position] == lowLine)) return position; position = ubrk_preceding(it, position);