Bug 144563 - Fields referencing numbered lists now have "." in the generated text where they did not before
Status: REOPENED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.2.0.4 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Whiteboard: target:7.4.0 target:7.3.3
Keywords: bibisected, bisected, regression
Duplicates: 146948
Reported: 2021-09-17 04:05 UTC by Warren Young
Modified: 2023-08-24 12:27 UTC
Attachments
Screen capture showing the symptom (2.32 KB, image/png)
2021-09-17 21:02 UTC, Warren Young
DOCX example with misc cross-references (17.08 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2022-03-09 08:51 UTC, Vasily Melenchuk (CIB)
space before numbered paragraph reference (4.74 KB, image/png)
2022-03-26 02:16 UTC, Warren Young
space bug demo requested in comment #10 (11.54 KB, application/vnd.oasis.opendocument.text)
2022-03-27 23:34 UTC, Warren Young

Description Warren Young 2021-09-17 04:05:08 UTC
If you have a document with a numbered list and then say Insert → Field → More Fields..., then select Numbered Paragraphs from the Type box and Number from the "Insert reference to" box and click one of the list elements in the Selection box to the right, it used to put just the referenced number or letter into the document.

Some very recent release version changed this. I'm not sure exactly when, but I track the latest releases with the Homebrew cask, so LibreOffice gets silently upgraded on me soon after each release.

This represents a regression in longstanding behavior. It causes my carefully-formatted documents to be reflowed, since the field contents are now longer.

It's fine if this is an intended feature, since although I consider the change ugly, I can see that some people may want it. However, please offer an option to revert it to the prior behavior. Also, the behavior probably shouldn't be changed on existing documents.
Comment 1 Warren Young 2021-09-17 21:02:45 UTC
Created attachment 175097 [details]
Screen capture showing the symptom

Homebrew updated my installation to 7.2.1.2, and the symptom is still happening. The attached photo shows the unwanted dots in the gray "field" boxes.

The old (and intended) behavior is that this bit of text was rendered as "[8, 15, 16]", being a list of references to numbered points later in the document.
Comment 2 Michael Warner 2021-09-18 18:36:33 UTC
I haven't checked what the behavior was in previous versions, but in 7.2.0.4, it seems to make the formatting of the reference match the formatting of the number in the list. 

For example, if you go to "Format"->"Bullets and Numbering...", select the "Customize" tab, and then change the "Number" field to "1, 2, 3, ...", you should see a Separator section appear in the dialog with Before and After fields. If there is a period in the After field, that period will appear both in the list item, and in a reference to a list item. If there is no period there, there will be no period in the list item nor in the reference to the list item. 

This is in:

Version: 7.2.0.4 (x64) / LibreOffice Community
Build ID: 9a9c6381e3f7a62afc1329bd359cc48accb6435b
CPU threads: 6; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
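
The Before/After behavior described in this comment can be sketched in a few lines of Python. This is only an illustration of what was observed in 7.2.0.4, not LibreOffice code; the function names `list_label` and `reference_text` are hypothetical.

```python
def list_label(number, before="", after="."):
    """Build a list label from the Before and After separators (hypothetical helper)."""
    return f"{before}{number}{after}"


def reference_text(number, before="", after="."):
    """Observed 7.2.0.4 behavior: the reference reuses the full list label."""
    return list_label(number, before, after)


print(reference_text(8))            # "8." (a period in After shows up in the reference)
print(reference_text(8, after=""))  # "8"  (no period in After, none in the reference)
```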
Comment 3 Warren Young 2021-10-04 15:48:13 UTC
I've delayed replying hoping that there is work going on to revert this regression, but with 2 weeks of silence, I'm beginning to doubt that.

I'm hoping your suggestion that I change my numbered list style just to make my field references work properly is meant as a diagnostic test rather than a serious path going forward.

This change "fixes" something that wasn't broken.
Comment 4 Justin L 2022-03-08 07:27:33 UTC
When using a .DOC file, I bisected this to 7.1.0, 7.0.0.1, 6.4.5
commit 7e605bc3ff0cfea76be4683f0170d821fcae7203
Author: Vasily Melenchuk on Tue May 19 10:24:35 2020 +0300
    tdf#120394: doc import: use list format string
    
    Since introducion of list level format string there is
    no need in complex parsing of doc level string and convering
    it to prefix-number-suffix format. We can just replace there
    special chars by %n placeholders (used in docx and now in LO)
    and this should be enough.

When using an .ODT format, I bisected this to 7.3, 7.2
commit 9987b518fca1476bd0ce8c86bcf6ac7c81f7b580
Author: Vasily Melenchuk on Mon Jun 14 14:27:56 2021 +0300
    new ODF numbered list parameter loext:num-list-format
    
    Instead of style:num-prefix and style:num-suffix new list format
    is much more flexible for storing list multilevel numberings.
    Now it is possible to have not just prefix/suffix but any random
    separators between levels, arbitrary levels order, etc.
    
    Internal LO format for list format is changed: instead of placeholders
    like %1, %2, etc we right now use %1%, %2%... Reason: for ODT documents,
    having more than 9 levels there is ambiguity in "%10": it is "%1"
    followed by "0" suffix, or "%10"?
    
    Aux changes:
    * removed zero width space hack: since format string is always defined
      this hack is interfering with standard list numbers printing
      (see changes in ooxmlexport14.cxx, ww8export3.cxx tests)
    * changed cross-references values to lists: they are now including full
      list label string: previously this was bit self-contradictory (see
      changes in odfexport.cxx and check_cross_references.py tests)
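
The placeholder ambiguity mentioned in the second commit message can be illustrated with a small Python sketch. It is not the LibreOffice implementation; `expand_old`, `expand_new`, and the sample level labels are assumptions made up for the example.

```python
import re

# Ten hypothetical level labels, so that a level 10 exists.
LEVELS = list("abcdefghij")


def expand_old(fmt, levels):
    """Old style: %N placeholders, with the digits matched greedily."""
    return re.sub(r"%(\d+)", lambda m: levels[int(m.group(1)) - 1], fmt)


def expand_new(fmt, levels):
    """New style: %N% placeholders, where the closing percent sign ends the number."""
    return re.sub(r"%(\d+)%", lambda m: levels[int(m.group(1)) - 1], fmt)


print(expand_old("%10", LEVELS))   # "j": read as level 10; "level 1 then 0" cannot be expressed
print(expand_new("%1%0", LEVELS))  # "a0": level 1 followed by a literal "0"
print(expand_new("%10%", LEVELS))  # "j": level 10, unambiguously
```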
Comment 5 Vasily Melenchuk (CIB) 2022-03-09 08:51:31 UTC
Created attachment 178741 [details]
DOCX example with misc cross-references

Nothing is broken here: as far as I can see in my tests, cross-references now display the *full* paragraph numbering, which is what LO was missing previously. But MS Word has a special case: if the numbering ends with a dot, one (and only one) dot is removed. This does not happen in any other case and looks like special behavior for just a dot.

See the attached document, which I created to reach these conclusions. The current behavior looks okay to me; it just needs to be extended with a special case for the dot suffix. Should this also hold for non-MS documents?
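
The Word special case described in this comment (one and only one trailing dot removed, nothing else touched) could be sketched as follows. This is an illustration of the described behavior only, not the committed patch; `strip_final_dot` is a hypothetical name.

```python
def strip_final_dot(label):
    """Remove exactly one trailing dot from a list label, if there is one."""
    return label[:-1] if label.endswith(".") else label


print(strip_final_dot("1."))    # "1"
print(strip_final_dot("1.2."))  # "1.2"
print(strip_final_dot("1)"))    # "1)" (other suffixes are left alone)
```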
Comment 6 Commit Notification 2022-03-12 17:22:06 UTC
Vasily Melenchuk committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/8c94a17e408db8a0d27101ce07345fc640bef64d

tdf#144563: remove final dot in cross-references to paragraph

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Vasily Melenchuk (CIB) 2022-03-13 07:11:17 UTC
*** Bug 146948 has been marked as a duplicate of this bug. ***
Comment 8 Commit Notification 2022-03-15 10:37:11 UTC
Vasily Melenchuk committed a patch related to this issue.
It has been pushed to "libreoffice-7-3":

https://git.libreoffice.org/core/commit/1a85d29bbc467354a5bc2d02e672fcdbffe5586d

tdf#144563: remove final dot in cross-references to paragraph

It will be available in 7.3.3.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Warren Young 2022-03-26 02:16:13 UTC
Created attachment 179113 [details]
space before numbered paragraph reference

The dot is gone as of build ccad7357b, but there is now an unwanted space before references to numbered paragraphs. The attachment shows a reference to paragraph #10 in square brackets. Notice how much more space is before the "1" than after the "0".

This isn't an artifact of "1" being a narrow character: this document uses a proportional font, and the same problem occurs with paragraphs beginning with other digits.
Comment 10 Vasily Melenchuk (CIB) 2022-03-27 06:07:20 UTC
(In reply to Warren Young from comment #9)
> The dot is gone as of build ccad7357b, but there is now an unwanted space
> before references to numbered paragraphs. The attachment shows a reference
> to paragraph #10 in square brackets. Notice how much more space is before
> the "1" than after the "0".
> 
> This isn't an artifact of "1" being a narrow character: this document uses a
> proportional font, and the same problem occurs with paragraphs beginning
> with other digits.

Can you attach a testcase with this problem here? Or create a new bug with this problem. I was not able to repro this problem from scratch.
Comment 11 Warren Young 2022-03-27 23:34:44 UTC
Created attachment 179153 [details]
space bug demo requested in comment #10

Re #10: See attached demo.
Comment 12 Vasily Melenchuk (CIB) 2022-03-30 11:25:48 UTC
(In reply to Warren Young from comment #11)
> Created attachment 179153 [details]
> space bug demo requested in comment #10
> 
> Re #10: See attached demo.

This does not look like a bug to me: the list numbering does contain a space as a prefix (see "Before" in "Bullets and Numbering" for the given list), and this space is displayed both in the numbering itself and in reference fields.
Comment 13 Warren Young 2022-03-30 17:23:22 UTC
That's the same argument as was given for having the dot: because that's how the referenced list item rendered the number.

I want my references to be styled in the same font and style as the paragraph where they're inserted. Give me the raw value, and inherit the ambient styles.

The list is styled as it is for one reason, and the place where the reference is used is styled another way. If I change one to satisfy the other, all I've done is move the problem to the other side of the barrier: now the list is styled incorrectly.

In any case, this used to work properly, for years and years.
Comment 14 Vasily Melenchuk (CIB) 2022-04-06 09:43:12 UTC
Okay, now about ODT (previously it was DOCX): in the last sample we have:

<text:bookmark-ref text:reference-format="number"...>

ODF standard:
"number: displays the list label of the referenced item. The list position of the referenced item plus all of its superior list levels are its reference." (19.860 text:reference-format ODF standard 1.3).

The list label in this list is " 1." (with a prefix space). So I expect that in the ideal case the cross-reference should look exactly the same way.

But:
1. The final dot is removed to be uniform with MS Word, which does this unexpected magic. But is that okay for ODT? If not, we run into some interoperability issues.
2. I agree, the idea of having "bare numbering" without any prefixes/suffixes can be useful. Is that what text:reference-format="number" is intended for? Or do we need an extra option for text:reference-format?


Michael, what do you think about this topic?
Comment 15 Michael Stahl (allotropia) 2022-04-20 12:12:55 UTC
so i guess options are:

1) keep current behavior
2) revert to previous behavior
3) add compatibility flag
4) use another attribute value

i don't like 2) because the new behavior is closer to what Word does.

(we have just tested what Word does with the attached ODT and it doesn't properly support this reference, it imports as PAGEREF field which is wrong)

i guess i don't like to add another attribute value because there are already 3 different ones for "number", "number-all-superior", "number-no-superior" and this difference doesn't look super important (also, how many of these options should be shown in UI, how should user understand how they differ etc.).

but there is an argument to be made that existing documents should look as they used to, so i'm in favor of option 3: add a flag that is stored in ODT files in settings.xml and the field will expand depending on the flag.
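
Option 3 could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the flag name `legacy_references` is hypothetical (no actual settings.xml key is named in this thread), "previous behavior" is modeled as the bare number, and "current behavior" as the full label with one final dot removed.

```python
def expand_number_reference(before, number, after, legacy_references):
    """Expand a reference-format="number" field depending on a per-document flag.

    legacy_references is a hypothetical compatibility flag that would be read
    from the document settings.
    """
    if legacy_references:
        # Assumed previous behavior: only the bare number.
        return number
    # Assumed current behavior: full list label, minus one trailing dot.
    label = before + number + after
    return label[:-1] if label.endswith(".") else label


print(repr(expand_number_reference(" ", "10", ".", False)))  # ' 10'
print(repr(expand_number_reference(" ", "10", ".", True)))   # '10'
```

With such a flag set on old documents, they would keep their old layout, while new documents would get the Word-like expansion.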
Comment 16 Troy Rollo 2023-06-11 07:23:35 UTC
The original behaviour was discussed and agreed at <https://lists.freedesktop.org/archives/libreoffice/2011-April/010660.html> and was committed as a fix to bug 33960 - it was kept this way for years and some of the code still purports to implement this.

The logic behind this is as follows:

1. White space at the start or end of a numbering format is a formatting artifact and does not represent meaningful information that should be included in the field, especially if it is at the start or end of the field where it prevents the user from sensibly formatting the document.
2. If a numbering format includes only a suffix (and no prefix other than white space), then the suffix is a separator between the number and the paragraph, not part of the paragraph number. It should be omitted unless it is part of a qualified number where the next level has no prefix in the number format, in which case it is needed to separate the number from the number at the next level.

Microsoft Word does remove the dot in "1.(a)", that is, it comes out as "1(a)". This reflects the long-standing convention that in lists numbered in the form "1." "(a)" "(i)" "(A)" "(I)", the dot is omitted in cross-references. I have never seen any usage where a real world user wanted the dot kept in such a case. It is correct that Word only omits a trailing dot from the affected level in that case, but the discussion was that this was wrong. For example, if the top level format is "1/" or "1)", or "1:-" (at least the first two of which I have seen in the wild), the suffix should also be omitted.
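
One possible reading of the two rules above, as a Python sketch. This is not the gerrit patch; `bare_reference` and the `(prefix, number, suffix)` tuple layout are assumptions made for illustration.

```python
def bare_reference(levels):
    """Build a cross-reference from a list of (prefix, number, suffix) levels.

    Rule 1: white space around each level's prefix and suffix is dropped.
    Rule 2: a level with no prefix loses its suffix, unless the next level
            also has no prefix and needs the suffix as a separator.
    """
    parts = []
    for i, (prefix, number, suffix) in enumerate(levels):
        prefix, suffix = prefix.strip(), suffix.strip()
        if not prefix:
            next_has_no_prefix = (
                i + 1 < len(levels) and not levels[i + 1][0].strip()
            )
            if not next_has_no_prefix:
                suffix = ""
        parts.append(prefix + number + suffix)
    return "".join(parts)


print(bare_reference([("", "1", "."), ("(", "a", ")")]))  # "1(a)"
print(bare_reference([("", "1", "."), ("", "2", ".")]))   # "1.2"
print(bare_reference([(" ", "10", ".")]))                 # "10"
print(bare_reference([("", "1", ":-")]))                  # "1"
```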

I have committed a patch - gerrit 152850 <https://gerrit.libreoffice.org/c/core/+/152850>
Comment 17 Troy Rollo 2023-07-22 02:45:51 UTC
I have decided to discontinue contributing to LibreOffice, hence will not be doing any further work on this. The process for contributing to LibreOffice development has become a bureaucratic nightmare and it is just far too much work for casual contributions. I have better things to do with my time than try to jump through a never-ending series of hoops to get a simple patch accepted.

The patch is done and submitted but the bureaucracy is just far too much for me to continue.