Bug 100525 - [FILEOPEN] New lines missing in field (from custom docprop) in specific DOCX
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: interoperability
Keywords: filter:docx
Depends on:
Blocks: DOCX-Fields
Reported: 2016-06-21 16:57 UTC by Cor Nouws
Modified: 2022-03-14 13:31 UTC
CC List: 5 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
Screen shot showing correct field display on page 1 in Word (54.54 KB, image/png)
2016-06-21 16:57 UTC, Cor Nouws
Image showing current situation of fields without space in LibreOffice (20.34 KB, image/png)
2016-06-21 16:58 UTC, Cor Nouws
Minimized document from Word (16.64 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2019-10-10 12:59 UTC, NISZ LibreOffice Team
Screenshot of the minimized document in Writer and Word (70.14 KB, image/png)
2019-10-10 13:03 UTC, NISZ LibreOffice Team

Description Cor Nouws 2016-06-21 16:57:43 UTC
Created attachment 125811 [details]
Screen shot showing correct field display on page 1 in Word

Open the file https://bugs.documentfoundation.org/attachment.cgi?id=125792 (from bug 100513).

In LibreOffice, the second field shows the text on a single line:
  "Aan de Voorzitter van de Tweede Kamer der Staten-GeneraalPostbus 200182500 EA DEN HAAG" 

However, it should be on four lines:
  "Aan de Voorzitter van de Tweede Kamer 
   der Staten-Generaal
   Postbus 20018
   2500 EA DEN HAAG"

The field shows the custom document property "adres".

Also see the PDF from Word: https://bugs.documentfoundation.org/attachment.cgi?id=125793
Comment 1 Cor Nouws 2016-06-21 16:58:56 UTC
Created attachment 125812 [details]
Image showing current situation of fields without space in LibreOffice
Comment 2 Buovjaga 2016-06-26 11:39:20 UTC
Confirmed.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.3.0.0.alpha0+
Build ID: ff25ea3d5ccf3a990767cbb1ef99037d3f84b072
CPU Threads: 8; OS Version: Linux 4.6; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8)
Built on June 26th 2016
Comment 3 QA Administrators 2018-10-10 03:05:39 UTC Comment hidden (obsolete)
Comment 4 Cor Nouws 2018-10-10 06:03:14 UTC
The bug is still present in Version: 6.2.0.0.alpha0+
Build ID: e46f8a9a4e3c5b0542c0813b476b449f3af8d607
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk2; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2018-10-07_21:58:32
Locale: nl-NL (nl_NL.UTF-8); Calc: threaded
Comment 5 NISZ LibreOffice Team 2019-10-10 12:59:35 UTC
Created attachment 154897 [details]
Minimized document from Word
Comment 6 NISZ LibreOffice Team 2019-10-10 13:03:43 UTC
Created attachment 154898 [details]
Screenshot of the minimized document in Writer and Word

Bibisected using bibisect-win32-4.3 to:

author	Michael Stahl <mstahl@redhat.com>	2014-03-02 00:32:17 +0100
committer	Michael Stahl <mstahl@redhat.com>	2014-03-03 13:53:23 +0100

fdo#47811: RTF import: fix Database field content

Before that commit, only one field was imported; since that commit, all the field names are merged into that one.

We probably can't say it worked fine before, so I am not setting the regression tag.
Comment 7 Justin L 2022-03-14 13:31:42 UTC
The idea is that the information comes from a variable - which excludes any kind of formatting etc. (In fact, I have no idea how MS Word allowed this "adres" variable to contain a carriage return because the UI doesn't seem to allow it.)

However, in this document the displayed data doesn't even exactly match the variable's content, as can be seen in MS Word by pressing F9 with the field selected: the field then becomes three lines long instead of four.

But yes, when pressing F9 on the field, MS Word still maintains several paragraphs of content. So somehow the variable accepts carriage returns. In LO those are stripped out and all the text becomes one long variable string. So the first question is - can we import a \n into a variable string?

<property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="3" name="adres">
  <vt:lpwstr>
    Aan de Voorzitter van de Tweede Kamer der Staten-Generaal_x000d_Postbus 20018_x000d_2500 EA DEN HAAG_x000d_ _x000d_
  </vt:lpwstr>
</property>

And then secondly (and much less important), since the displayed content doesn't match the variable, how do we keep the displayed value and still maintain some kind of connection to the variable behind it? (This happens in DOC's read-only field, but the DOCX import emulates it as an editable content field.)
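
For reference, the `_x000d_` sequences in the "adres" property above follow the `_xHHHH_` escape convention that OOXML uses for characters in string values; `000d` is the hex code of a carriage return. The following is only a minimal Python sketch of how such a value could be decoded and split into lines. It is not the LibreOffice import code, and the helper names `decode_ooxml_escapes` and `property_lines` are hypothetical, made up for this illustration.

```python
import re

# Matches an OOXML-style escape: _x followed by four hex digits and a closing _.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_ooxml_escapes(text: str) -> str:
    """Replace each _xHHHH_ escape with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def property_lines(raw: str) -> list[str]:
    """Decode escapes, split on carriage returns, drop blank trailing lines."""
    decoded = decode_ooxml_escapes(raw.strip())
    lines = [line.strip() for line in decoded.split("\r")]
    while lines and not lines[-1]:
        lines.pop()
    return lines


# Property value copied from the custom.xml fragment quoted in this comment.
raw = (
    "Aan de Voorzitter van de Tweede Kamer der Staten-Generaal_x000d_"
    "Postbus 20018_x000d_2500 EA DEN HAAG_x000d_ _x000d_"
)

for line in property_lines(raw):
    print(line)
```

With the value from this document, the sketch yields three lines, which agrees with the three-line result Word shows after F9 rather than with the four lines of the cached field display.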