Bug 139974 - Data Corruption with more than 16367 characters in one paragraph in cell if edited
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.2.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:7.4.0 target:7.3.1 target:7.2....
Keywords:
Depends on:
Blocks: Cell-Formula
Reported: 2021-01-28 15:14 UTC by remasch
Modified: 2022-01-24 19:44 UTC

See Also:
Crash report or crash signature:


Attachments
Example file (8.92 KB, application/vnd.oasis.opendocument.spreadsheet)
2021-02-03 15:06 UTC, Telesto
example formula with more than 16368 characters (16.80 KB, text/plain)
2021-02-03 15:24 UTC, remasch
Updated unit test document / sample file. (8.44 KB, application/vnd.oasis.opendocument.spreadsheet)
2022-01-24 15:24 UTC, Eike Rathke

Description remasch 2021-01-28 15:14:10 UTC
Description:
When a cell contains more than 16368 characters, the 16369th character gets eaten and a line break is inserted into the formula.

Steps to Reproduce:
1. Create a formula with more than 16368 characters.
2. Click into the cell and try to copy the formula.
3. Compare the newly copied formula to the original.

Actual Results:
The 16369th character is replaced with a line break.

Expected Results:
There should be no change in data!


Reproducible: Always


User Profile Reset: No



Additional Info:
Very destructive bug.
Comment 1 Telesto 2021-01-28 21:59:04 UTC
Must it be a formula, or does it happen with any input? And would you mind giving an example formula, and saying what would be missing?

People around here confirming bugs, like me, are kind of lazy. With the backlog of issues we already have, we tend to experiment around a report in the hope that we encounter the issue too, which might even be platform specific: Linux, Windows, macOS.
Comment 2 remasch 2021-01-29 11:06:58 UTC
This bug happens for me under Fedora 33.

As an example you can create following formula:

=1234567890+1234567890+1234567890+1234567890+ [...]

until you have more than 16368 characters.

You can create this inside a Calc cell by copying and pasting those numbers, or create it outside Calc and then paste it in.
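
A formula of this shape can also be generated with a short script instead of by hand. The following is a minimal sketch in Python; the function name build_formula and its parameters are illustrative assumptions and not part of LibreOffice.

```python
def build_formula(min_length: int = 16369, term: str = "1234567890") -> str:
    """Return '=term+term+...' that is at least min_length characters long."""
    # With n terms the text is '=' + n terms + (n - 1) plus signs,
    # which comes to n * (len(term) + 1) characters in total.
    per_term = len(term) + 1
    count = -(-min_length // per_term)  # ceiling division
    return "=" + "+".join([term] * count)


formula = build_formula()
print(len(formula))  # length of the generated text, leading '=' included
```

The printed text of formula can then be pasted into a cell to follow the steps from the description.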
Comment 3 remasch 2021-01-31 14:10:54 UTC
Addition: I was also able to reproduce this under Windows 10 Pro with the same version of LO (7.0.4.2).
Comment 4 remasch 2021-02-03 13:06:58 UTC
Can someone please change the importance to critical for this one?
Comment 5 Telesto 2021-02-03 15:06:47 UTC
Created attachment 169437
Example file

So copying and pasting F1 within the same file should go wrong?
Comment 6 remasch 2021-02-03 15:24:25 UTC
Created attachment 169439
example formula with more than 16368 characters

The formula needs more than 16368 characters. Please test with the attached text file.
Comment 7 remasch 2021-02-09 15:40:45 UTC
Hello, I really don't want to stress but due to the severity of this bug I want to kindly ask someone to review this in a timely manner. Thank you
Comment 8 Telesto 2021-02-09 22:17:39 UTC
(In reply to remasch from comment #7)
> Hello, I really don't want to stress but due to the severity of this bug I
> want to kindly ask someone to review this in a timely manner. Thank you

The bug tracker is managed by volunteers, so it is slightly understaffed. The issue also appears to be not too common (based on the number of reports). And there are a lot of issues, and only so many developers :-(

I am not intending to downplay the importance of the issue. These comments are not exactly to the point of what you're asking; they are only an attempt to put it in context.
Comment 9 Telesto 2021-02-09 22:18:23 UTC
@Buovjaga
Do you have some time to test this one?
Comment 10 Buovjaga 2021-02-10 10:49:01 UTC
The behaviour has been the same since 4.2. Before that, it just truncated the formula in a random position.
Comment 11 Eike Rathke 2022-01-18 17:52:37 UTC
This is a limitation of the EditEngine and not restricted to formulas; it happens with any cell content while editing. See the MAXCHARSINPARA define at
https://opengrok.libreoffice.org/xref/core/editeng/source/editeng/impedit.hxx?r=bc413e15#88
and its use throughout editeng/source/editeng/impedit2.cxx and editeng/source/editeng/impedit4.cxx
https://opengrok.libreoffice.org/s?refs=MAXCHARSINPARA&project=core

Apparently this is also related to VCL, but MAXCHARSINPARA seems not to be used there (anymore?), already since the OOo times; or the header was more or less copied from editeng/. It has been there since the initial import.

Not sure if that actually is still a requirement for EditEngine. Processing overly long paragraphs also has performance penalties. Blindly munging that last character and replacing it with a linefeed is of course bad. Maybe breaking at the previous word boundary instead would be an option.
Comment 12 Eike Rathke 2022-01-19 16:06:28 UTC
Odd, there is no corruption when pasting such a long data string (or pasting behind 16367 characters) into a cell. When editing the cell again, a line break is inserted and the last character of the first line is not displayed (hence also not copied when the content is copied), but appears to be still part of the original data. If the content is modified and the cell is closed with Enter, the character is munged.

Btw, the condition is more than 16367 characters in one paragraph; it is the 16368th character that is munged.

Investigating.
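
The difference between munging the character and keeping it can be modelled in a few lines. This is only an illustrative sketch in Python, not the EditEngine code: the constant PARA_LIMIT and both function names are assumptions, the limit is taken from the number given in this comment, and only a single forced break is modelled.

```python
PARA_LIMIT = 16367  # assumption: paragraph limit as stated in this comment


def break_munging(text: str, limit: int = PARA_LIMIT) -> str:
    """Model of the reported behaviour: the character after the limit is lost."""
    if len(text) <= limit:
        return text
    # The line break takes the place of the character at index `limit`.
    return text[:limit] + "\n" + text[limit + 1:]


def break_preserving(text: str, limit: int = PARA_LIMIT) -> str:
    """Model of a forced break that keeps every original character."""
    if len(text) <= limit:
        return text
    # The line break is inserted in addition to the existing characters.
    return text[:limit] + "\n" + text[limit:]


sample = "0123456789" * 1700  # 17000 characters, longer than the limit
munged = break_munging(sample)
kept = break_preserving(sample)
print(munged.replace("\n", "") == sample)  # False: one character is gone
print(kept.replace("\n", "") == sample)    # True: content is unchanged
```

In this model the munged text keeps its original length because the break replaces a character, while the preserving variant grows by exactly one character.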
Comment 13 Commit Notification 2022-01-19 21:55:34 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2cfa04cbf03fe5c2ce32a7384082cdc5de5a4785

Resolves: tdf#139974 Do not munge character after forced line break

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2022-01-20 02:16:35 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-3":

https://git.libreoffice.org/core/commit/2b8946e52fa170c7df4bf71440e2ed63474db28f

Resolves: tdf#139974 Do not munge character after forced line break

It will be available in 7.3.1.

Comment 15 Eike Rathke 2022-01-20 12:38:02 UTC
Pending review https://gerrit.libreoffice.org/c/core/+/128668 for 7-3-0
Comment 16 Commit Notification 2022-01-20 15:37:26 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/81a5ba3cfc8b0d95724b38e7cc7cafdd83fb870d

Related: tdf#139974 Try to find boundary for forced line break

It will be available in 7.4.0.

Comment 17 Commit Notification 2022-01-20 19:33:25 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/36a186f28757eadb8fddac5887f5ebecebfe8229

tdf#139974: sc: Add UItest

It will be available in 7.4.0.

Comment 18 Commit Notification 2022-01-21 01:04:02 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ea933cc3b6d72b1d7ff09c9e85286f4d3343f335

Related: tdf#139974 Keep HYPHEN-MINUS with a number to the right

It will be available in 7.4.0.

Comment 19 Commit Notification 2022-01-21 14:32:57 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/8bbe0eacd826f33f9cd4f1aec77220b29d4a4e7b

Resolves: tdf#139974 Do not munge character after forced line break

It will be available in 7.2.6.

Comment 20 Commit Notification 2022-01-24 10:50:27 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-3-0":

https://git.libreoffice.org/core/commit/b4de5e9f55ed9a9d1d161bab5e96a66c88fc1236

Resolves: tdf#139974 Do not munge character after forced line break

It will be available in 7.3.0.

Comment 21 Eike Rathke 2022-01-24 15:24:00 UTC
Created attachment 177756
Updated unit test document / sample file.

The attached document contains in A1 the overly long =1234567890+1234567890+... formula from attachment 169439. It is 17204 characters long (including the leading '=') and has the result 1930864179960 (1564*1234567890). Check the length with =LEN(FORMULA(A1)) in any other cell.

On A1, the sequence F2 (in-cell edit, which inserts a line break), Ctrl+A (select all), Ctrl+C (copy), go to A2, F2 (in-cell edit), Ctrl+V (paste), Enter
reproduces the same formula in A2 but with a line break inserted after the 16357th (a '+') character, hence length 17205. Check the length with =LEN(FORMULA(A2)); the result is the same, 1930864179960.
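
The figures quoted above can be cross-checked outside Calc. The following Python sketch rebuilds a formula of the same shape from 1564 terms; the variable names are illustrative, and the script only checks the arithmetic without touching the attached file.

```python
TERM = "1234567890"
COUNT = 1564  # number of terms, from the result 1564*1234567890

# '=' followed by COUNT terms joined with '+'
formula = "=" + "+".join([TERM] * COUNT)

length = len(formula)             # 1 + 1564*10 + 1563 = 17204
result = COUNT * int(TERM)        # 1930864179960
length_with_break = length + 1    # one inserted line break: 17205

print(length, result, length_with_break)
```

The computed length and result agree with the values reported for A1, and adding one line break gives the length reported for A2.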
Comment 22 Commit Notification 2022-01-24 19:44:57 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2c22c95d41cd67ab865d40292e659abdd04a1b3e

tdf#139974: sc: fix test

It will be available in 7.4.0.
