
CX2: User created content triggers too much unmodified text error when typing and pasting
Closed, Resolved (Public)

Description

Content translation provides mechanisms, such as errors (T190283) and warnings (T190279), that encourage users to review the initial content added to the translation. These mechanisms apply when the user uses machine translation or the source content as the initial content (i.e., when "Use source text" or "Use Apertium/Yandex/X" from the Automatic translation card options is applied).

However, these mechanisms should not apply to cases where the content is initiated by the user such as when the "Don't use machine translation" option is selected and the user types content or pastes it from the clipboard. We should make sure that we are not counting that content as unreviewed, and that it does not block users from publishing.
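The intended behavior can be sketched roughly as follows. This is a minimal illustration, not the ContentTranslation implementation: the `origin` field and its values (`'scratch'`, `'mt'`, `'source'`), the word-overlap measure, and the 0.95 threshold are all assumptions made for the example.

```javascript
// Hypothetical sketch: none of these names come from the ContentTranslation code base.

// Split text into whitespace-separated tokens, ignoring empty strings.
function tokenize(text) {
  return text.trim().split(/\s+/).filter(Boolean);
}

// Fraction of the current tokens that also occur in the initial text.
// Returns 0 when the current text is empty.
function unmodifiedRatio(initialText, currentText) {
  const current = tokenize(currentText);
  if (current.length === 0) {
    return 0;
  }
  const initial = new Set(tokenize(initialText));
  const kept = current.filter((token) => initial.has(token)).length;
  return kept / current.length;
}

// A section blocks publishing only if it was seeded by MT or source text
// and is still (almost) identical to that seed. Content started from
// scratch (typed or pasted by the user) is never counted as unreviewed.
function blocksPublishing(section, threshold = 0.95) {
  if (section.origin === 'scratch') {
    return false;
  }
  return unmodifiedRatio(section.initialText, section.currentText) >= threshold;
}

const pasted = 'some pasted text from the clipboard';
console.log(blocksPublishing({ origin: 'scratch', initialText: pasted, currentText: pasted })); // false
console.log(blocksPublishing({ origin: 'mt', initialText: pasted, currentText: pasted })); // true
```

The point of the early return is that the origin of the content, not its similarity to any stored text, decides whether the check applies at all.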

Example issue with pasting:

  1. On the ContentTranslation page, start translating and set the "Don't use machine translation" option as the default.
  2. Copy some text (from somewhere) and paste it to the destination article panel.
  3. The MT abuse warning does not appear; the "Publish" button is active.
  4. Click the "Publish" button. The "Your translation cannot be published because it contains too much unmodified machine-translated text" warning appears.

In this case, typing the same text instead of pasting it allows publishing to work.

Example issue with typing:

Translating from Hebrew to English. The target title is "Rif Neeman test" (it's indeed a test and not supposed to be published).

"Don't use machine translation" is the default when translating from RTL to LTR and MT is not available. I see "Your translation cannot be published because it contains too much unmodified machine-translated text", which is wrong because the content was started from scratch.

Event Timeline

I had a similar issue, but both pasting and typing by hand didn't let me publish.

I was translating from Hebrew to English. The target title is "Rif Neeman test" (it's indeed a test and not supposed to be published).

"Don't use machine translation" is the default when translating from RTL to LTR and MT is not available. I see "Your translation cannot be published because it contains too much unmodified machine-translated text", which is probably wrong, because I didn't use any machine translation at all.

Pginer-WMF renamed this task from Pasted text triggers warning "Your translation cannot be published..." to CX2: User created content triggers too much unmodified text error when typing and pasting. (Oct 26 2018, 8:38 AM)
Pginer-WMF updated the task description.
Pginer-WMF raised the priority of this task from Medium to High. (Nov 9 2018, 9:16 AM)

Change 472916 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Fix check for modified section

https://gerrit.wikimedia.org/r/472916

Change 472916 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Fix check for modified section

https://gerrit.wikimedia.org/r/472916

Checked in cx2-testing and wmf.4. Both cases work: pasting text, and pasting/typing text for language pairs that do not have an MT option (e.g. he-en).
Also, the fix resolves T200748: CX2: The warning of unmodified machine-translated text is displayed when 'Use source text' option is used.

However, there is one minor issue: when 'Use source text' is used, the issue card correctly refers to the issue as 'too much unmodified text', but the publishing warning refers to 'too much unmodified machine-translation text', which is incorrect. Filed as

@Petar.petkovic Re-opening the ticket after additional testing (https://www.mediawiki.org/wiki/Topic:Upo25jnc3xhqqsjq). The warning "your translation contain 100% of unmodified text" appears when a translation with typed text is reloaded:

  • Start a translation with v2, he->en (I do not think it is specific to this pair).
  • With the 'Don't use machine translation' option set as the default, click to "translate" a paragraph. Nothing happens, which is expected.
  • Start typing some text for that paragraph.
  • For some other paragraph, do not click in the destination article, just type (to rule out that attempting to translate a paragraph somehow triggers the "too much unmodified text" message).
  • No warnings appear.
  • Click on 'All translations' and then click on the article again to go back to the translation. The warnings appear.
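The steps above suggest that something about how the content was created is lost, or interpreted differently, once the draft is saved and restored. The thread does not describe how the merged patch addresses this, so the following is only a hedged sketch of the general idea of keeping provenance across a save/load round trip. The draft format, the `origin` field, and all function names are hypothetical.

```javascript
// Hypothetical saved-draft format; field names are illustrative only and
// are not taken from the ContentTranslation code base.

// Serialize sections, explicitly storing how each one was started.
function saveDraft(sections) {
  return JSON.stringify(
    sections.map(({ id, origin, text }) => ({ id, origin, text }))
  );
}

// Restore sections. If no origin was stored, fall back to 'unknown'
// rather than assuming the content came from machine translation.
function loadDraft(json) {
  return JSON.parse(json).map((saved) => ({
    id: saved.id,
    origin: saved.origin || 'unknown',
    text: saved.text
  }));
}

// Only sections seeded by MT or by source text are eligible for the
// "too much unmodified text" check in this sketch.
function sectionsEligibleForWarning(sections) {
  return sections
    .filter((section) => section.origin === 'mt' || section.origin === 'source')
    .map((section) => section.id);
}

const draft = [
  { id: 's1', origin: 'scratch', text: 'Typed by hand.' },
  { id: 's2', origin: 'mt', text: 'Machine translated.' }
];
const restored = loadDraft(saveDraft(draft));
console.log(sectionsEligibleForWarning(restored)); // [ 's2' ]
```

With the origin stored alongside the text, the set of sections eligible for the warning is the same before and after a reload.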

Screen Shot 2018-12-04 at 4.10.55 PM.png (450×941 px, 84 KB)

I guess @Etonkovidova meant to link to T210113.

Yes, this is correct - thanks!

Change 478227 had a related patch set uploaded (by Petar.petkovic; owner: Petar.petkovic):
[mediawiki/extensions/ContentTranslation@master] Save content started from scratch as unmodified MT

https://gerrit.wikimedia.org/r/478227

Change 478227 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Save content started from scratch as unmodified MT

https://gerrit.wikimedia.org/r/478227

Tested in testwiki and cawiki (wmf.9): user-generated text (typed and pasted) does not trigger any warnings and is saved/loaded correctly.

@Petar.petkovic - one interesting thing is that I cannot copy the paragraph from the source article (displayed in the source panel) - i.e. select an entire paragraph in the source panel and try to copy it to the target article panel. The following error is displayed:

Error: Cannot adopt content from /paragraph nodes into cxSection nodes (at index 0)
writeElement
ve.dm.Document.prototype.fixupInsertion
ve.dm.Document.prototype.fixupInsertion
ve.dm.TransactionBuilder.static.newFromDocumentInsertion
ve.dm.SurfaceFragment.prototype.insertDocument
ve.ce.Surface.prototype.afterPasteInsertExternalData
ve.ce.Surface.prototype.afterPasteAddToFragmentFromExternal
ve.ce.Surface.prototype.afterPaste
ve.ce.Surface.prototype.onPaste/<

It feels like a feature, not a bug. I can copy portions of a paragraph without any problem; only attempting to copy the whole paragraph produces the console error above.

@Petar.petkovic - one interesting thing is that I cannot copy the paragraph from the source article (displayed in the source panel) - i.e. select an entire paragraph in the source panel and try to copy it to the target article panel.

I have tried translating en:Airspeed to Serbian in testwiki. The first paragraph was a simple one, with a couple of links. I clicked on the "Add translation" placeholder and got a Yandex translation. Then I selected the whole source paragraph and dragged and dropped it after the MT-generated content. It worked smoothly.

For the next paragraph, I switched to "Don't use Machine Translation". Then I selected the whole source paragraph, which has one reference and no links, just plain text. When attempting drag-and-drop or copy-paste, I get the following error:

ve.dm.AnnotationSet.js?b26ca:20 Uncaught Error: Annotation with hash h73bf93a1af42eedb not found in store
    at new VeDmAnnotationSet (ve.dm.AnnotationSet.js?b26ca:20)
    at VeDmElementLinearData.ve.dm.ElementLinearData.getAnnotationsFromOffset (ve.dm.ElementLinearData.js?6d4f2:482)
    at VeDmElementLinearData.ve.dm.ElementLinearData.sanitize (ve.dm.ElementLinearData.js?6d4f2:1404)
    at VeDmDocument.ve.dm.Document.newFromHtml (ve.dm.Document.js?6186f:1563)
    at VeUiHTMLStringTransferHandler.ve.ui.HTMLStringTransferHandler.process (ve.ui.HTMLStringTransferHandler.js?ae95d:41)
    at VeUiHTMLStringTransferHandler.ve.ui.DataTransferHandler.getInsertableData (ve.ui.DataTransferHandler.js?b9275:110)
    at VeCeSurface.ve.ce.Surface.handleDataTransferItems (ve.ce.Surface.js?5c9cf:2690)
    at VeCeSurface.ve.ce.Surface.handleDataTransfer (ve.ce.Surface.js?5c9cf:2654)
    at VeCeSurface.ve.ce.Surface.onDocumentDrop (ve.ce.Surface.js?5c9cf:1181)
    at HTMLDivElement.dispatch (jquery.js?6a07d:5183)

@Etonkovidova, please report this.