Bug 118465 - RTF import does not repeat header / repeat heading / repeat as header row for table
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.5.0 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:rtf, preBibisect, regression
Duplicates: 134002 153450
Depends on:
Blocks: RTF-Tables
 
Reported: 2018-06-29 22:55 UTC by Michael J. Evans
Modified: 2023-05-31 17:24 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Example RTF file, as created by MS Word 2010 (171.76 KB, application/rtf)
2018-06-29 22:55 UTC, Michael J. Evans
MS Word 2010 PDF export of the example (342.14 KB, application/pdf)
2018-06-29 22:55 UTC, Michael J. Evans
LibreOffice6 PDF export of the example (INCORRECT RENDERING) (254.99 KB, application/pdf)
2018-06-29 22:56 UTC, Michael J. Evans

Description Michael J. Evans 2018-06-29 22:55:06 UTC
Created attachment 143215 [details]
Example RTF file, as created by MS Word 2010

I'm trying to use LibreOffice to batch-convert some documents (RTF in this case) to PDF. After investigating, it became clear that the repeating table header row isn't handled correctly.

I am attaching an example .rtf file which was created by MS Word 2010, as well as the PDF renderings made by that same version of MS Word and LibreOffice 6.
Comment 1 Michael J. Evans 2018-06-29 22:55:50 UTC
Created attachment 143216 [details]
MS Word 2010 PDF export of the example
Comment 2 Michael J. Evans 2018-06-29 22:56:27 UTC
Created attachment 143217 [details]
LibreOffice6 PDF export of the example (INCORRECT RENDERING)
Comment 3 Roman Kuznetsov 2018-06-30 10:34:38 UTC
Confirmed with the attached RTF file in:

Version: 6.1.0.0.beta2+ (x64)
Build ID: fe1a23b5c49c94410a604c8d4a6f50f43d575403
CPU threads: 4; OS: Windows 10.0; UI render: default; 
TinderBox: Win-x86_64@42, Branch:libreoffice-6-1, Time: 2018-06-17_06:31:41
Locale: ru-RU (ru_RU); Calc: CL
Comment 4 Jacques Guilleron 2018-06-30 13:30:01 UTC
Hi Michael, kompilainenn,

I can confirm this too, with 6.2.0.0.alpha0+ Build ID: 4a82543b3419339ae554485c582a80c41a57c417
CPU threads: 2; OS: Windows 6.1.

With the file open in Word and the whole table selected, the table properties (Line tab, Line Options) show "Repeat at top of each page as header line" ticked.
With the file open in Writer, Table Properties, Text Flow tab shows "Repeat heading" not ticked. Once it is ticked, the table has the same appearance as in Word.
Comment 5 Xisco Faulí 2018-07-03 00:01:22 UTC Comment hidden (obsolete)
Comment 6 Michael J. Evans 2018-07-03 02:17:35 UTC
(In reply to Xisco Faulí from comment #5)
> Also reproduced in
> 
> Version: 5.2.0.0.alpha0+
> Build ID: 3ca42d8d51174010d5e8a32b96e9b4c0b3730a53
> Threads 4; Ver: 4.10; Render: default; 
> 
> Version: 4.3.0.0.alpha1+
> Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e
> 
> Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)
> 
> but not in
> 
> LibreOffice 3.3.0 
> OOO330m19 (Build:6)
> tag libreoffice-3.3.0.4

Given your investigation, I believe the bug was most likely introduced by the switch described in comment 25 of the RTF meta bug:

https://bugs.documentfoundation.org/show_bug.cgi?id=81234#c25

It seems like a bunch of /other/ bugs were all quashed with that switch, so it's no surprise that a smaller corner case like this was missed.
Comment 7 Michael J. Evans 2018-07-07 01:02:55 UTC
After giving up on creating my /own/ example file by hand (MS Word wound up in an infinite loop generating the table over and over again), I have identified a more likely cause of the issue.

http://www.biblioscape.com/rtf15_spec.htm

\trhdr

"Table row header. This row should appear at the top of every page the current table appears on."

These look like the parts of the RTF filter that are broken.

https://cgit.freedesktop.org/libreoffice/core/tree/writerfilter/source/rtftok/rtfcontrolwords.hxx

https://cgit.freedesktop.org/libreoffice/core/tree/writerfilter/source/rtftok/rtfcontrolwords.cxx

The control word is identified, but no action is taken based on its existence.

A unit test might also be failing based on \trhdr existing in...

https://cgit.freedesktop.org/libreoffice/core/tree/sw/qa/extras/rtfimport/data/tdf99498.rtf

Some of the svtools files also mention it.

Interestingly there's also this file:

https://cgit.freedesktop.org/libreoffice/core/tree/compilerplugins/clang/unusedenumconstants.writeonly.results

RTF_TRHDR is among the many RTF_ enums that are defined but "never actually used".

Currently LibreOffice also cannot /export/ an RTF document with a repeating header row. However, adding the missing \trhdr keyword manually does allow MS Word to read the file with that property set.
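
For testers who want to check whether a given RTF file actually marks a header row, the \trhdr control word can be counted with a small script. This is only a minimal sketch and not part of LibreOffice: the function name count_control_word and the simplified tokenizer are assumptions made for illustration, and it ignores \bin data and other corner cases of the RTF grammar.

```python
import re

# Simplified RTF tokenizer (an assumption for illustration, not the
# LibreOffice one). It matches either an escaped character (\\, \{, \})
# or a control word: letters followed by an optional signed number.
_TOKEN = re.compile(r"\\([\\{}])|\\([a-zA-Z]+)(-?\d+)?")


def count_control_word(rtf_text: str, word: str) -> int:
    """Count occurrences of the control word `word` (without backslash)."""
    count = 0
    for match in _TOKEN.finditer(rtf_text):
        # group(2) is only set when the token is a control word, so an
        # escaped backslash followed by plain text is not counted.
        if match.group(2) == word:
            count += 1
    return count


def count_header_rows(rtf_text: str) -> int:
    """Count rows flagged with \\trhdr (repeat as header row)."""
    return count_control_word(rtf_text, "trhdr")
```

Running count_header_rows on the original file and on a Writer re-export of the same file would be one way to check the export half of this report: the count should not drop to 0 after the round trip.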
Comment 8 Buovjaga 2018-07-07 16:07:04 UTC
The problem is already present in the oldest commit of the 43all bibisect repo.
Comment 9 QA Administrators 2019-07-08 02:48:22 UTC Comment hidden (obsolete)
Comment 10 LibreUser 2020-01-14 17:03:16 UTC Comment hidden (obsolete)
Comment 11 Buovjaga 2020-01-14 17:50:47 UTC Comment hidden (obsolete)
Comment 12 LibreUser 2020-01-14 18:25:38 UTC Comment hidden (obsolete)
Comment 13 Buovjaga 2020-01-14 18:37:23 UTC
(In reply to LibreUser from comment #12)
> First of all, I would like to say big Thank You to all developers who
> contributed to such amazing software. 
> 
> I would more than happy to contribute. Please advise on the process

Contact a certified company or individual: https://www.documentfoundation.org/gethelp/developers/
Comment 14 LibreUser 2020-01-14 18:47:29 UTC Comment hidden (obsolete)
Comment 15 LibreUser 2020-01-14 18:49:18 UTC Comment hidden (obsolete)
Comment 16 Buovjaga 2020-01-14 18:55:22 UTC Comment hidden (obsolete)
Comment 17 Aron Budea 2020-06-23 05:30:56 UTC
The recently reported bug 134002 is really similar, and I was tempted to close it as a duplicate. However, that one is already buggy in 3.3.0, so there must be some difference between its file and this one.
Comment 18 Timur 2020-11-09 13:18:54 UTC
Reproduced in 7.1+. For testers: the same file resaved in MS Office as DOC or DOCX opens correctly in LO.
Comment 19 Timur 2020-11-09 13:23:46 UTC
*** Bug 134002 has been marked as a duplicate of this bug. ***
Comment 20 QA Administrators 2022-11-10 04:02:29 UTC Comment hidden (obsolete)
Comment 21 Tracy Logan 2022-12-12 22:11:58 UTC
While I am not the original author, I confirm that this bug still exists in the current version of LO:

Version: 7.4.3.2 / LibreOffice Community
Build ID: 1048a8393ae2eeec98dff31b5c133c5f1d08b890
CPU threads: 8; OS: Mac OS X 12.6.1; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

This bug also exists in v7.4.0.3 (on Mac OS X 12.6.1) and v7.0.4.2 (on Ubuntu 20.04).

LO 3.3 will not run (an update to it is required for this OS), so I cannot confirm the earlier assertion (in #6) that this bug is a regression -- however, FWIW, a similar bug existed in Open Office (https://bz.apache.org/ooo/show_bug.cgi?id=55088); the description of that Open Office bug report is:


"The tag for table row header (\trhdr) is being ignored.
The row which is marked with the \trhdr tag should appear at the top of every
page on which the table containing the row appears.
It works in oo 1.1.5 and 1.1.4, but not in 2.0."

This is the exact behavior I'm seeing in LO, which occurs both in the GUI, and when called with --headless to convert RTF to PDF.

Additional observations (using 7.4.3.2/Mac):

When opening an RTF document with a multi-page table that contains the \trhdr tag, that tag is indeed ignored; Table Properties... shows the Repeat Heading checkbox is unchecked.  Checking that box does prefix the specified header row to each page of the table.  Saving that file in ODT format does preserve that setting, but saving in RTF does not (no \trhdr tag is present), and emits the warning that the document may not save correctly.  Exporting as PDF with that box checked likewise results in the prefixed header on every page.
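
For anyone reproducing the headless case, the conversion can be scripted. The following is a minimal sketch under some assumptions: the soffice binary is on PATH, and the helper names build_convert_command and convert_to_pdf are hypothetical. It only drives the existing --headless, --convert-to and --outdir options and does not work around the bug, so the resulting PDF will still lack the repeated header row.

```python
import subprocess
from pathlib import Path


def build_convert_command(rtf_path, out_dir, soffice="soffice"):
    """Build the argument list for a headless RTF-to-PDF conversion.

    `soffice` is assumed to be on PATH; pass a full path otherwise.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(rtf_path),
    ]


def convert_to_pdf(rtf_path, out_dir, soffice="soffice"):
    """Run the conversion and return the expected path of the PDF."""
    subprocess.run(build_convert_command(rtf_path, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(rtf_path).stem + ".pdf")
```

Calling convert_to_pdf("example.rtf", "out") would be expected to leave out/example.pdf behind, which can then be compared page by page against the MS Word export attached to this bug.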
Comment 22 raal 2023-05-31 17:24:35 UTC
*** Bug 153450 has been marked as a duplicate of this bug. ***