Bug 75914 - Opening document properties dialog removes line breaks from custom properties
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: framework
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: lowest trivial
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: File-Properties
Reported: 2014-03-08 17:45 UTC by sergio.callegari
Modified: 2019-03-03 17:06 UTC

See Also:
Crash report or crash signature:


Attachments
A document with a testcase for the issue (9.38 KB, application/vnd.oasis.opendocument.text)
2016-04-18 10:07 UTC, sergio.callegari

Description sergio.callegari 2014-03-08 17:45:23 UTC
This is seen on current 4.1.5 and 4.2.2 RC1. It may ultimately be inherited from OpenOffice.

This may look like a corner case, or even a slight abuse of properties, but the current behavior breaks an extension. Furthermore, the current behavior may not conform to the OpenDocument standard.


Issue description.
------------------

LibreOffice documents can have custom properties associated with them. These properties can be edited using the File->Properties dialog and the Custom Properties tab. Alternatively, custom properties may be set by macros via the available interfaces.

Custom properties have a name, a type and a value, that is set according to the type. Among the possible types, there is the string type (named Text in the user interface dialog).

The programmatic interface allows setting a string type custom property whose value is a string containing line breaks, tabs, etc. When a value containing a line break is set programmatically, it can be retrieved programmatically and used by other macros without any issues.

The user interface for setting custom properties has an issue with this, though. It is sufficient to open the File->Properties dialog (without actually editing any custom property) to have all the line breaks removed from custom properties.

This breaks extensions that need to store pieces of text that may include line breaks as custom properties. One known example is the TexMaths extension, which stores the so-called 'LaTeX preamble' as the 'TexMathsPreamble' property. This bit of text needs to include line breaks. As long as the extension is used without touching the document properties, everything is fine. As soon as one edits the document properties (e.g. to associate a title with the document), everything that depends on the extension is broken.


To reproduce the issue with the texmaths extension:
---------------------------------------------------

1) Install the TexMaths extension
2) Open a new presentation or drawing
3) Click the "curly pi" icon used to enter TexMaths equations
4) When the TexMaths dialog opens, press the preamble button
5) Note how the default preamble is organized on multiple lines
6) Close the preamble dialog by pressing the 'save' button, to ensure that a custom document property is created
7) Close the TexMaths dialog by pressing 'ESC'
8) Open the File->Properties dialog and go to the 'Custom Properties' tab. Do not edit anything, just note that there is a TexMathsPreamble custom property
9) Move to the 'Description' tab. Assign a title to the document (e.g. 'foobar') and close the dialog by pressing the OK button
10) Click the "curly pi" icon used to enter TexMaths equations again
11) When the TexMaths dialog opens, press the preamble button
12) Note that the preamble is now corrupted, with all the line breaks removed


Relationship to the OpenDocument file format:
---------------------------------------------

I am not an expert on this, but I believe that, according to the standard, custom properties that contain text should be of type 'string' and need not be 'normalizedString'. If this is the case, text type custom properties should be allowed to contain characters like line breaks, tabs, leading spaces, etc.

If this is the case (please confirm), the current behavior does not conform to the standard.

Note that there is no need to have a GUI allowing line breaks to be entered. But if line-breaks are being entered programmatically, the GUI should not mess with them.


Questions
---------

1) In case I am wrong and the standard forbids line break characters in custom properties, please let me know. In that case, I'll suggest that the TexMaths author use base64 encoding to store the preamble.

2) Please let me know if, apart from custom properties, there is any other mechanism that extensions can use to store their own data together with a document. If there is a better mechanism, this can be suggested to the texmaths extension author.
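
The base64 fallback mentioned in question 1 could look roughly like the following Python sketch. The helper names and the sample preamble are illustrative assumptions, not TexMaths code; the sketch only shows that the encoded value is a single line, so nothing is lost if line breaks are stripped from it.

```python
import base64


def encode_property(text: str) -> str:
    """Encode text as one-line base64 (b64encode adds no line breaks)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_property(stored: str) -> str:
    """Recover the original text, line breaks included."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")


# Hypothetical multi-line preamble, used only for illustration.
preamble = "\\usepackage{amsmath}\n\\usepackage{amssymb}\n"
stored = encode_property(preamble)
restored = decode_property(stored)
```

The cost of this approach is that the property value is no longer readable in the Custom Properties tab.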

Sorry for the long post, thanks in advance
Comment 1 Joel Madero 2014-03-08 19:15:28 UTC
Setting version to at least 4.1.5, as the version field holds the oldest version on which we can reproduce the problem
Comment 2 Thomas Hackert 2014-04-06 14:04:14 UTC
Hello Sergio, *,
I cannot confirm this bug with either LO Version: 4.1.5.3 Build-ID: 1c1366bba2ba2b554cd2ca4d87c06da81c05d24 or LO Version: 4.2.3.3
Build ID: 6c3586f855673fa6a1576797f575b31ac6fa0ba3 (parallel installed, following the instructions from https://wiki.documentfoundation.org/Installing_in_parallel), both with installed Germanophone lang- as well as helppack under Debian Testing i686, sorry ... :(

What I did:
1. Downloaded and installed http://extensions.libreoffice.org/extension-center/texmaths-1/releases/0.39/texmaths-0-39.oxt
2. Created a Writer (but later I tried it also with Impress) document
3. Followed your instructions from point 1 to point 12 ... ;)

But when I reach your point 12, my preamble is not corrupted at all. It looks the same as the first time.

Which version of the extension did you use, btw.? And would you be so kind as to tell us which OS/architecture you used to find this bug, please?
TIA
Thomas.
Comment 3 sergio.callegari 2014-04-06 14:20:44 UTC
This is because texmaths 0.39 has a workaround for the issue.
It substitutes a rarely used char for the line break.

You can test with texmaths 0.38 or writing a macro that programmatically sets a custom property to a string with a line break.
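
A workaround of the kind described above (substituting a rarely used character for the line break) might be sketched in Python as follows. The placeholder character chosen here is an assumption made for illustration; this comment does not say which character TexMaths 0.39 actually uses.

```python
# Assumption: the pilcrow (U+00B6) stands in for the line break.
# The character really used by TexMaths 0.39 is not stated in this report.
PLACEHOLDER = "\u00b6"


def protect(text: str) -> str:
    """Replace line breaks with the placeholder before storing the value."""
    if PLACEHOLDER in text:
        # The substitution would not be reversible in this case.
        raise ValueError("text already contains the placeholder character")
    return text.replace("\n", PLACEHOLDER)


def restore(stored: str) -> str:
    """Turn placeholders back into line breaks after reading the value."""
    return stored.replace(PLACEHOLDER, "\n")


original = "This is a string\nwith a line break"
stored = protect(original)
```

The weakness of any such scheme is that the placeholder itself can no longer appear in the stored text.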
Comment 4 Joel Madero 2015-02-25 15:44:49 UTC
If the latest TexMaths has a workaround - I'm wondering if we shouldn't just close this bug. It seems strange to go test old extensions to show an issue in LibreOffice.

@Sergio - thoughts about this?

Also changing version to inherited from OOo as I just saw in the description that that's the correct version (again version is oldest version not latest tested on)

Thanks all
Comment 5 sergio.callegari 2015-02-26 12:37:41 UTC
My feeling is that the odf standard should be the rule here.

In principle, if it allows custom properties to contain special characters, then the UI of LibO should not remove them.  According to the standard, a custom property of type 'string' ('text' in the LibO interface) should allow

#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
(any Unicode character, excluding the surrogate blocks, FFFE, and FFFF).

If LibO decides to remove any of these, the same incident that happened with texmaths could happen in the future with new extensions, or one might end up breaking documents made with other production tools compatible with the odf standard.
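
To make the character range quoted above concrete, here is a small Python sketch that checks a string against it. The function names are illustrative; the sketch says nothing about how LibO validates values internally.

```python
def is_allowed_char(ch: str) -> bool:
    """True if ch falls in the range quoted above:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def disallowed_chars(value: str) -> list:
    """Return the characters of value that fall outside the allowed range."""
    return [ch for ch in value if not is_allowed_char(ch)]


# A line break and a tab are inside the range, so nothing is reported.
problems = disallowed_chars("This is a string\nwith a\ttab")
```

Under this reading, tab, line feed and carriage return are all legal content for a 'string' property, while other control characters such as NUL are not.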

In fact, the LibO roundtrip is already OK, because when LibO loads+saves a file, it preserves these chars. The only issue is that when opening the document-properties dialogue and editing /any/ custom property, /all/ the custom properties get deprived of some of the special characters.

So, the whole issue is confined to the management of an interface dialog, and the point is how to fix it without complicating the codebase too much. IMHO, there is no need to provide interface elements to let one enter special characters in the custom properties, and it can even be acceptable to remove special characters from those entries that are edited. So the only thing that should be avoided is the cleaning of properties that are not being edited.
It would probably be enough to have a "modified" flag associated with each custom property row in the interface and to restrict the update of custom properties inside the document to those that have the "modified" flag set.
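
The "modified" flag idea can be illustrated with a toy model in Python. This is only a sketch of the proposal, not LibO code: the Row class, the function names and the way the dialog drops line breaks are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Row:
    name: str
    text: str               # what the single-line edit field displays
    modified: bool = False  # set only when the user edits this row


def open_dialog(props: dict) -> list:
    """Build dialog rows; the single-line field drops line breaks (modelled here)."""
    return [Row(name, value.replace("\n", "")) for name, value in props.items()]


def edit(row: Row, new_text: str) -> None:
    """Simulate the user typing into one row."""
    row.text = new_text
    row.modified = True


def apply_dialog(props: dict, rows: list) -> None:
    """On OK, write back only the rows that were actually edited."""
    for row in rows:
        if row.modified:
            props[row.name] = row.text


props = {"TexMathsPreamble": "line one\nline two", "note": "draft"}
rows = open_dialog(props)
edit(rows[1], "final")
apply_dialog(props, rows)
```

After this run the edited "note" property is updated, while "TexMathsPreamble" keeps its line break because its row was never touched. Added and removed rows are not modelled here.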

Since no known extension currently depends on special chars in custom properties, this can be a very low priority task. But I would confirm the bug and keep it open as a reminder of the minor issue with the odf standard (e.g. Priority -> lowest, Severity -> minor or trivial).
Comment 6 sergio.callegari 2015-02-26 12:55:03 UTC
A couple of side notes:

1) A regression test for this can be easily implemented; there is no need to test against an old extension. It is enough to define a LibO macro that puts a string containing a line termination character into a custom property.

2) IMHO custom properties should be used for tasks where properties need to be stored by the user and there is no other obvious interface for the user to enter them.  In the case of extensions such as texmaths that need to store big chunks of data that is 'private' to the extension and for which the extension offers its own user interface, custom properties seem not very appropriate to me. In fact, big chunks of data tend to clutter the document custom properties significantly (think of an extension that decides to use 1000 custom properties for which it is the only user... this would make the custom property dialog very polluted and difficult to search).

If I read the odf standard properly, there is the possibility to store "Text content being used as RDF metadata. Sec. 4.2.1" see http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#InContentMetadata.
Now, I wonder if LibO provides an API to macro writers for this. If this is the case, extension writers could be encouraged to use this option when they have relatively large chunks of data to store. If there is currently no API for this, it may be worth providing one.

Custom metadata (custom XML elements within meta.xml). Sec. 4.3.1 http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#CustomMetadataElements could be an alternative, but I see it is being deprecated.
Comment 7 tommy27 2016-04-16 07:28:52 UTC Comment hidden (obsolete)
Comment 8 sergio.callegari 2016-04-18 10:06:22 UTC
The bug is still present as of 5.1.2.
Comment 9 sergio.callegari 2016-04-18 10:07:17 UTC
Created attachment 124453
A document with a testcase for the issue
Comment 10 sergio.callegari 2016-04-18 10:17:31 UTC
Just added a document with a test case for the issue.

Document foo.odt is a text document. Looks empty, but contains two macros:

1) Open the document and enable macros

2) Run macro "set_test_property". This programmatically sets a custom property in the document to the string "This is a string with a line break", with a line break between the words 'string' and 'with'

3) Test that this is the case by running the macro "print_test_property". You should see a message box with the text

This is a string
with a line break

4) Open the properties editor: File->Properties->Custom Properties. See that there is a "test" property. Notice how the text associated to it reads "This is a stringwith a line break". Press OK.

5) Run again the macro "print_test_property". Now, you see a message box with the text

This is a stringwith a line break

This proves that opening the properties editor has broken the custom property by removing the line break.
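
The source of the two macros is not included in this comment. As a rough sketch under that caveat, equivalent Python macros could look like the following. The function bodies are my own assumptions; only the property name "test" and the string come from the steps above. Inside LibO, doc would typically be obtained with XSCRIPTCONTEXT.getDocument().

```python
# Numeric value of com.sun.star.beans.PropertyAttribute.REMOVEABLE,
# written out so that the sketch does not need to import uno.
REMOVEABLE = 128

PROP_NAME = "test"
PROP_VALUE = "This is a string\nwith a line break"


def _user_props(doc):
    """Return the container holding the document's custom properties."""
    return doc.getDocumentProperties().getUserDefinedProperties()


def set_test_property(doc):
    """Create the custom property, or overwrite it if it already exists."""
    props = _user_props(doc)
    if props.getPropertySetInfo().hasPropertyByName(PROP_NAME):
        props.setPropertyValue(PROP_NAME, PROP_VALUE)
    else:
        props.addProperty(PROP_NAME, REMOVEABLE, PROP_VALUE)


def get_test_property(doc):
    """Read the custom property back as a string."""
    return _user_props(doc).getPropertyValue(PROP_NAME)


def line_break_survived(doc):
    """True while the stored value still contains its line break."""
    return "\n" in get_test_property(doc)
```

Calling line_break_survived before and after opening the properties editor would give the same before/after comparison as the message boxes in steps 3 and 5.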

Now, to the best of my understanding, the odf standard allows line breaks and other special chars in custom properties, namely:

#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
(any Unicode character, excluding the surrogate blocks, FFFE, and FFFF).

Hence, the document properties editor should not break properties containing any of these chars, otherwise it may end up breaking macros and extensions that rely on setting these properties.
Comment 11 QA Administrators 2019-03-02 03:50:36 UTC Comment hidden (obsolete)
Comment 12 edera 2019-03-02 10:31:04 UTC
Still there in
Version: 6.1.3.2
Locale: en-US (en_US.UTF-8);

sergio.callegari test document
http://bugs.documentfoundation.org/attachment.cgi?id=124453
macro set_test_property sets a test property where the line break was removed:
"This is a stringwith a line break"

The print_test_property macro does nothing here.
Comment 13 QA Administrators 2019-03-03 03:41:43 UTC Comment hidden (obsolete)
Comment 14 edera 2019-03-03 17:06:49 UTC
Still there in
Version: 6.1.3.2
Locale: en-US (en_US.UTF-8);

sergio.callegari test document
http://bugs.documentfoundation.org/attachment.cgi?id=124453
macro set_test_property sets a test property where the line break was removed:
"This is a stringwith a line break"

The print_test_property macro does nothing here.