Bug 65898 - DeleteWordForward deletes 2 words
Summary: DeleteWordForward deletes 2 words
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Nobody
URL:
Keywords: HasReduction
Depends on:
Blocks:
 
Reported: 2011-08-08 22:13 PDT by Ryosuke Niwa
Modified: 2024-05-16 07:16 PDT (History)
15 users (show)

See Also:


Attachments
Reference Patch. (1.42 KB, patch)
2012-05-30 01:04 PDT, Antaryami Pandia (apandia)
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ec2-cr-linux-02 (482.06 KB, application/zip)
2012-05-30 08:28 PDT, WebKit Review Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ryosuke Niwa 2011-08-08 22:13:36 PDT
Reproduction steps:
1. Insert "foo bar _baz_" into a text field of contenteditable area
2. option+delete or control+backspace after _baz_

Expected result:
_baz_ is deleted

Actual result:
bar _baz_ is deleted
Comment 1 Ryosuke Niwa 2011-08-09 11:15:12 PDT
I don't know why but the bug no longer reproduces on Safari / ToT WebKit for me.  It appears that this is something to do with ICU shipped with Chromium.
Comment 2 Alexey Proskuryakov 2011-08-09 11:20:55 PDT
Actually, I can reproduce this with r92646 on Lion.
Comment 3 Ryosuke Niwa 2011-08-09 11:21:39 PDT
(In reply to comment #2)
> Actually, I can reproduce this with r92646 on Lion.

Lion ships with a newer ICU, correct?
Comment 4 Ryosuke Niwa 2011-08-09 11:22:35 PDT
+mitz, +ap since they tend to know more about ICU than I do.
Comment 5 Alexey Proskuryakov 2011-08-09 11:34:00 PDT
Actually, I cannot reproduce in any nightlies, but I can reproduce with local builds. Perhaps there is broken debug only code (like ASSERT(a = b) instead of ASSERT(a == b))?
Comment 6 Ryosuke Niwa 2011-08-09 13:34:10 PDT
After talking with ap on IRC, it appears that this is a bug inside findWordBoundary in platform/text/TextBoundaries.cpp
Comment 7 Alexey Proskuryakov 2011-08-09 13:49:04 PDT
> Actually, I cannot reproduce in any nightlies, but I can reproduce with local builds

My mistake - I had local changes in my tree. The problem occurs when using cross platform implementation of findWordBoundary() - perhaps code that's calling it expects Mac behavior.
Comment 9 Antaryami Pandia (apandia) 2012-05-28 18:34:20 PDT
I have checked the issue with some other words and below is the observation:

1: foo bar -baz_ --> control+backspace --> foo bar -
2: foo bar @baz_ --> control+backspace --> foo bar @
3: foo bar _baz_ --> control+backspace --> foo

The below code in method "findNextWordFromIndex" is used to calculate the position of character after ctrl+bkspace:-
position = textBreakPreceding(it, position);

case 1 and 2
============
returns the position text just after the special character ie (b) whose position is 9. So the isAlphanumeric check passes and returns the same position.

case 3
======
The position returned is 8, i.e position of (_) . So the isAlphanumeric check fails and it continues to find the next position. And it returns the position of 4 which is the start of 2nd word. So it deletes upto that point.

The "findNextWordFromIndex" method uses the ICU library to determine the position. So I think the ICU library should also returns the correct position when underscore(_) is used.

Also what should be the right output:
foo bar _baz_ --> control+backspace --> foo bar 
OR
foo bar _baz_ --> control+backspace --> foo bar _

Please provide your feedback.
Comment 10 Ryosuke Niwa 2012-05-28 22:11:17 PDT
(In reply to comment #9)
> case 1 and 2
> ============
> returns the position text just after the special character ie (b) whose position is 9. So the isAlphanumeric check passes and returns the same position.
> 
> case 3
> ======
> The position returned is 8, i.e position of (_) . So the isAlphanumeric check fails and it continues to find the next position. And it returns the position of 4 which is the start of 2nd word. So it deletes upto that point.
> 
> The "findNextWordFromIndex" method uses the ICU library to determine the position. So I think the ICU library should also returns the correct position when underscore(_) is used.

It would be interesting to know the difference between findWordBoundary used by mac port and findWordBoundary used by other ports.
Comment 11 Antaryami Pandia (apandia) 2012-05-29 00:38:55 PDT
Some more analysis on the output of different ports:-

Windows port
============
1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar -
2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @
3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo

GTK port
=========
1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar -
2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @
3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo

mac port
========
1: foo bar -baz_ --> control+backspace --> position returned 9 --> output: foo bar -
2: foo bar @baz_ --> control+backspace --> position returned 9 --> output: foo bar @
3: foo bar _baz_ --> control+backspace --> position returned 8 --> output: foo bar 

So it seems that all the ports (that I looked), returned same values.
But the mac port is dispalying correct result. (should the output be "foo bar" or "foo bar _" ?).

Also checked the mac port and found that there is no "isAlphanumeric" check for the characters on the position reurned.It simply returns the position.

File: platform/text/mac/TextBoundaries.mm
method: findNextWordFromIndex()
Comment 12 Antaryami Pandia (apandia) 2012-05-30 01:04:23 PDT
Created attachment 144743 [details]
Reference Patch.

I couldn't get much information about the "isAlphanumeric" check by looking at track.

The "TextBoundaries.mm" was added in change-set "http://trac.webkit.org/changeset/13187" and the method "findNextWordFromIndex" is nearly identical. The original file was KWQTextUtilities.mm.

Also the "TextBoundaries.cpp " was originally "TextBoundariesICU.cpp". Was moved in change-set "http://trac.webkit.org/changeset/50977". Also earlier the check was for "u_isalnum" and in this change-set it was changed to "isAlphanumeric".

For testing, I removed the "isAlphanumeric" check and this issue was working.
I am attaching the reference patch here to see its effect on other test cases (using EWS).
Comment 13 WebKit Review Bot 2012-05-30 01:07:46 PDT
Attachment 144743 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/platform/text/TextBoundarie..." exit_code: 1
Source/WebCore/platform/text/TextBoundaries.cpp:69:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 1 in 1 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 WebKit Review Bot 2012-05-30 08:28:42 PDT
Comment on attachment 144743 [details]
Reference Patch.

Attachment 144743 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/12846865

New failing tests:
editing/execCommand/toggle-unlink.html
editing/execCommand/createLink.html
editing/selection/caret-mode-paragraph-keys-navigation.html
editing/execCommand/button.html
editing/pasteboard/paste-blockquote-3.html
editing/selection/4932260-2.html
editing/execCommand/hilitecolor.html
editing/spelling/spelling-backspace-between-lines.html
editing/pasteboard/page-zoom.html
editing/execCommand/remove-format-multiple-elements.html
editing/pasteboard/merge-end-5.html
editing/style/make-text-writing-direction-inline.html
editing/pasteboard/do-not-copy-unnecessary-styles.html
editing/pasteboard/copy-text-with-backgroundcolor.html
editing/inserting/insert-before-link-1.html
editing/execCommand/toggle-link.html
editing/selection/extend-by-word-002.html
editing/deleting/delete-cell-contents.html
editing/style/5065910.html
editing/deleting/smart-editing-disabled.html
editing/execCommand/unlink.html
editing/selection/4932260-3.html
editing/deleting/delete-by-word-001.html
editing/selection/extend-selection-enclosing-block.html
editing/deleting/delete-by-word-002.html
editing/pasteboard/merge-end-borders.html
editing/pasteboard/paste-list-004.html
editing/execCommand/query-command-state.html
Comment 15 WebKit Review Bot 2012-05-30 08:28:46 PDT
Created attachment 144824 [details]
Archive of layout-test-results from ec2-cr-linux-02

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-02  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 16 Ryosuke Niwa 2012-05-30 10:11:23 PDT
(In reply to comment #12)
> For testing, I removed the "isAlphanumeric" check and this issue was working.
> I am attaching the reference patch here to see its effect on other test cases (using EWS).

Please run the tests locally before posting them on Bugzilla. We have finite resource for EWS, and spamming them with patches that may or may not work isn't a great way to utilize it.
Comment 17 Antaryami Pandia (apandia) 2012-05-30 21:28:12 PDT
(In reply to comment #16)
> Please run the tests locally before posting them on Bugzilla. We have finite resource for EWS, and spamming them with patches that may or may not work isn't a great way to utilize it.

yes, I will keep in mind next time. The only reason I used is I don't all the ports with me, and I wanted to test the other lay out tests with the change.
Comment 19 Ahmad Saleem 2024-05-16 07:16:35 PDT
(In reply to Ahmad Saleem from comment #18)
> Similar fixed by Blink here -
> https://chromium.googlesource.com/chromium/src.git/+/
> d73a0995d91e0f059e43f7b0034b4d1badad0587

This compiles:

f (forward) {
        position = ubrk_following(it, position);
        while (position != UBRK_DONE) {
            // We stop searching when the character preceeding the break is alphanumeric or underscore.
            if (static_cast<unsigned>(position) < text.length() && (u_isalnum(text[position - 1]) || text[position - 1] == lowLine))
                return position;

            position = ubrk_following(it, position);
        }

        return text.length();
    } else {
        position = ubrk_preceding(it, position);
        while (position != UBRK_DONE) {
            // We stop searching when the character following the break is alphanumeric or underscore.
            if (position && (u_isalnum(text[position]) || text[position] == lowLine))
                return position;

            position = ubrk_preceding(it, position);