Bug 129372 - FILEOPEN: PPTX: CRASH: File format error found at SfxBaseModel::storeToStorage: 0x20d(row,col) – works w/ PowerPoint 2013
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Impress
Version (earliest affected): 5.2 all versions
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.0.0 target:6.4.4 target:6.3.6
Keywords: bibisected, bisected, haveBacktrace, regression
Depends on:
Blocks:
 
Reported: 2019-12-13 15:32 UTC by Tobias Burnus
Modified: 2020-04-14 18:18 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
gdb bt (238.22 KB, text/plain)
2019-12-13 22:22 UTC, Julien Nabet
Flight management course pptx version (2.57 MB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2020-04-05 14:58 UTC, William Gathoye
Flight management course odp version converted from pptx to odp using Office 365 - March 2020 (2.04 MB, application/vnd.oasis.opendocument.presentation)
2020-04-05 14:59 UTC, William Gathoye
Reduced example (33.72 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2020-04-08 08:37 UTC, Julien Nabet
Minimal example (33.07 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2020-04-08 09:35 UTC, Julien Nabet
bt (4.71 KB, text/plain)
2020-04-08 18:47 UTC, Julien Nabet

Description Tobias Burnus 2019-12-13 15:32:52 UTC
I tried to open the following PPTX file with LibreOffice on openSUSE (Version: 6.3.3.2.0+):

http://www.cosmo-model.org/content/consortium/generalMeetings/general2019/wg7/Gebhardt_GM2019_parallelWG1-WG7.pptx

Result: File format error found at SfxBaseModel::storeToStorage: 0x20d(row,col)

I can open that file with PowerPoint 2013 under Windows.
Comment 1 Julien Nabet 2019-12-13 22:22:13 UTC
Created attachment 156581 [details]
gdb bt

On pc Debian x86-64 with master sources updated today, I could reproduce this.
Comment 2 Zach Schoen 2020-03-10 18:09:08 UTC
I can reproduce this as well, even in the latest LibreOffice 6.4.1.2 on Windows (32-bit and 64-bit). Let me know if my sample is needed, but Tobias' file looks like the same thing: the exact same "SfxBaseModel::storeToStorage: 0x20d(row,col)" message, and similar content in the slides (it's the pages with calculus math symbols).
Comment 3 William Gathoye 2020-04-05 14:58:06 UTC
Created attachment 159341 [details]
Flight management course pptx version

Issue reported via Twitter in DM.

This flight management course PPTX has the exact same problem.
Comment 4 William Gathoye 2020-04-05 14:59:24 UTC
Created attachment 159342 [details]
Flight management course odp version converted from pptx to odp using Office 365 - March 2020

Comment 5 Julien Nabet 2020-04-08 08:37:12 UTC
Created attachment 159418 [details]
Reduced example

I reduced the example to a smaller file that still reproduces the bug.
It may help the investigation.
Comment 6 Julien Nabet 2020-04-08 08:38:38 UTC
I noticed these warnings on the console:
warn:legacy.osl:4632:5136:sax/source/expatwrap/saxwriter.cxx:399: lone 2nd Unicode surrogate
warn:legacy.osl:4632:5136:sax/source/expatwrap/saxwriter.cxx:424: illegal Unicode character

The problem seems to be related to incorrect handling of surrogates (a Unicode notion) at some point. Surrogates always come in pairs (a high surrogate followed by a low surrogate). Most of the time this is fine, but sometimes one of them appears without the other half of the pair.
For the moment, I don't know why.
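
To illustrate what a "lone" surrogate means, here is a minimal Python sketch that scans a sequence of UTF-16 code units and reports unpaired surrogates. The function name find_lone_surrogates is hypothetical and this is not the check used in saxwriter.cxx; it only demonstrates the pairing rule described above.

```python
def find_lone_surrogates(units):
    """Return the indices of unpaired surrogates in a list of UTF-16 code units."""
    lone = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: valid only if a low surrogate follows directly.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # skip the complete pair
                continue
            lone.append(i)
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate that was not consumed as part of a pair.
            lone.append(i)
        i += 1
    return lone


# A complete pair is fine; either half on its own is reported.
print(find_lone_surrogates([0xD835, 0xDF15]))
print(find_lone_surrogates([0xDF15]))
```

A string that was cut between the two halves of a pair would show up here as a low surrogate at index 0, which matches the "lone 2nd Unicode surrogate" wording of the warning.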
Comment 7 Julien Nabet 2020-04-08 09:16:24 UTC
Symbols from the reduced slide and their UTF-16 escapes:
𝑋 %uD835%uDC4B
𝑢 %uD835%uDC62
≡ %u2261
𝜕 %uD835%uDF15
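
The escapes above follow from the standard UTF-16 encoding rule. The sketch below recomputes them in Python; the helper names to_utf16_units and escape are hypothetical, and the input is assumed to be a valid Unicode scalar value.

```python
def to_utf16_units(code_point):
    """Encode one Unicode scalar value as a list of UTF-16 code units."""
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single code unit, no surrogates.
        return [code_point]
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


def escape(ch):
    """Format a single character in the %uXXXX notation used in this comment."""
    return "".join("%u{:04X}".format(unit) for unit in to_utf16_units(ord(ch)))


for ch in ("\U0001D44B", "\U0001D462", "\u2261", "\U0001D715"):
    print(ch, escape(ch))
```

Three of the four symbols lie outside the Basic Multilingual Plane and share the high surrogate D835; only ≡ fits in a single code unit.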
Comment 8 Julien Nabet 2020-04-08 09:35:32 UTC
Created attachment 159420 [details]
Minimal example

So the problem is triggered by the character "𝜕" (%uD835%uDF15).
I don't know yet why the surrogate mechanism fails for it but works for the others.
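
One thing that can be checked independently of LibreOffice is how the Unicode database classifies the four characters from the reduced slide. The sketch below uses Python's standard unicodedata module; classify is a hypothetical helper name. It only reports properties and is not a confirmed explanation of why the parser treats "𝜕" differently.

```python
import unicodedata


def classify(ch):
    """Return (code point, general category, needs surrogates in UTF-16)."""
    code_point = ord(ch)
    return (code_point, unicodedata.category(ch), code_point >= 0x10000)


for ch in ("\U0001D44B", "\U0001D462", "\u2261", "\U0001D715"):
    code_point, category, needs_pair = classify(ch)
    print("U+{:04X} category={} surrogates={}".format(code_point, category, needs_pair))
```

In the Unicode data, 𝑋 and 𝑢 are letters (Lu and Ll), while 𝜕 is a math symbol (Sm) that also needs a surrogate pair; ≡ is Sm as well but fits in a single code unit. Whether this combination matters for the bug is an assumption, not something established in this report.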
Comment 9 Julien Nabet 2020-04-08 18:47:32 UTC
Created attachment 159432 [details]
bt

This is the part of the backtrace where the partial differential character is analyzed.
In ICU, I found this line:
source/data/unidata/confusables.txt:1115:1D715 ;	2202 ;	MA	#* ( 𝜕 → ∂ ) MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL	#
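
For readers unfamiliar with the confusables.txt layout, here is a small Python sketch that splits such an entry into its source code points, target code points, and type field. The function parse_confusable is hypothetical, and the input is the data line itself, without the file name and line number that grep prepends.

```python
import unicodedata


def parse_confusable(line):
    """Split one confusables.txt data line into (source, target, type)."""
    data = line.split("#", 1)[0]                    # drop the trailing comment
    fields = [field.strip() for field in data.split(";")]
    source = [int(cp, 16) for cp in fields[0].split()]
    target = [int(cp, 16) for cp in fields[1].split()]
    return source, target, fields[2]


entry = "1D715 ;\t2202 ;\tMA\t#* ( \U0001D715 \u2192 \u2202 ) ..."
source, target, kind = parse_confusable(entry)
print([unicodedata.name(chr(cp)) for cp in source])
print([unicodedata.name(chr(cp)) for cp in target])
```

The entry only states that U+1D715 is visually confusable with U+2202; it does not by itself say anything about how starmath or i18npool parse the character.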
Comment 10 Julien Nabet 2020-04-08 18:49:19 UTC
Eike: I'm a bit lost between starmath/i18npool and icu.
Any thoughts on why this character, which uses surrogates, is wrongly parsed while other characters that also use surrogates are not?
Comment 11 Julien Nabet 2020-04-08 21:25:42 UTC
I gave it a try with:
https://gerrit.libreoffice.org/c/core/+/91941

At least all the attached files open with it.
However, I'm still confused about why some surrogate pairs are handled correctly and others are not...
Comment 12 Julien Nabet 2020-04-09 07:04:50 UTC
Locally, it builds without problems, but on Jenkins it fails :-(
Comment 13 Julien Nabet 2020-04-09 07:54:45 UTC
Thanks to Stephan Bergmann, I could simplify the patch; now waiting for Jenkins results.
Comment 14 Commit Notification 2020-04-09 09:39:37 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/11b57129b53e1e2d71a5f969e2417226b4e2ddd9

tdf#129372: PPTX: error at SfxBaseModel::storeToStorage: 0x20d(row,col)

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Julien Nabet 2020-04-09 09:44:50 UTC
I'm not sure whether the patch is OK, but I submitted it so that it will be in the daily builds within 24-48 hours and people can test it.

Meanwhile, I cherry-picked it to the 6.4 branch and left it for review (I can't submit it myself anyway, since it must be validated by at least one other person).
Comment 16 Xisco Faulí 2020-04-09 10:56:25 UTC
Also reproduced in

Version: 5.4.0.0.alpha1+
Build ID: 9feb7f7039a3b59974cbf266922177e961a52dd1
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); Calc: group
Comment 17 Xisco Faulí 2020-04-09 10:58:23 UTC
but not in

Version: 5.2.0.0.alpha0+
Build ID: 3ca42d8d51174010d5e8a32b96e9b4c0b3730a53
Threads 4; Ver: 4.19; Render: default;
Comment 18 Xisco Faulí 2020-04-09 11:19:14 UTC
For the record, the document opens fine up to https://cgit.freedesktop.org/libreoffice/core/commit/?id=d81d104833f0ee9349ebcd0d79d2de84ba9a7262

author	Michael Stahl <mstahl@redhat.com>	2016-02-12 18:22:51 +0100
committer	Michael Stahl <mstahl@redhat.com>	2016-02-12 18:54:33 +0100
commit	d81d104833f0ee9349ebcd0d79d2de84ba9a7262 (patch)
tree	20069a32b56b52b9b8cdb4d37c5a0b22bfeb5c82
parent	e2bfae9006e6adc4de17d0167dac6661b002f126 (diff)
sfx2: related tdf#56270: loss of embedded objects imported from DOCX

After this commit, LibreOffice reports a General Error: General input/output error.
LibreOffice started to crash after https://cgit.freedesktop.org/libreoffice/core/commit/?id=178f5306979ef55a5682191dcdafb9e926e57cde

Bisected with bibisect-linux-64-5.2
Comment 19 Commit Notification 2020-04-09 11:38:20 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/67af725a8623a509960a8463f7876fcd680565ad

tdf#129372: PPTX: error at SfxBaseModel::storeToStorage: 0x20d(row,col)

It will be available in 6.4.4.

Comment 20 Julien Nabet 2020-04-09 11:41:50 UTC
Backport to the 6.3 branch, waiting for Jenkins:
https://gerrit.libreoffice.org/c/core/+/91902
Comment 21 Commit Notification 2020-04-09 14:41:05 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/commit/ab0078509c352ee5d7b8ae5334d49f7c14fc26a5

tdf#129372: PPTX: error at SfxBaseModel::storeToStorage: 0x20d(row,col)

It will be available in 6.3.6.

Comment 22 Commit Notification 2020-04-09 15:26:07 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ed3b44ef622bc87da2425322521c293c2a46a1c5

tdf#129372: Add unittest

It will be available in 7.0.0.

Comment 23 Xisco Faulí 2020-04-14 18:18:37 UTC
Verified in

Version: 7.0.0.0.alpha0+
Build ID: 35fc5ef0a759884b24ed8b83cd05702a0fab64cc
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

@Julien, thanks for fixing this issue!