Bug 148597 - Drop "Display" option in "Type" tab and add fields for "Caption No"/CN, "Caption Category"/CC, and "Caption Text"/CT in Entries tab for Table of Figures and Index of Tables (see comment 14)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: TableofContents-Indexes TableofContents-Indexes-Dialog
Reported: 2022-04-14 17:51 UTC by arjan@bureauvoorarcheologie.nl
Modified: 2023-04-01 22:57 UTC (History)
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Structure vs. paragraph style tab stop definition (27.48 KB, application/vnd.oasis.opendocument.text)
2022-07-02 08:37 UTC, ajlittoz
Ideas on captioning (63.09 KB, application/vnd.oasis.opendocument.text)
2023-03-10 10:26 UTC, ajlittoz

Description arjan@bureauvoorarcheologie.nl 2022-04-14 17:51:22 UTC
I would like to be able to automatically format the Table of Figures so that the numbers align and continuation lines line up with the tab stop after the figure number. This would make it possible to format the Table of Figures the way it can be done with the Table of Contents.

See for example screenshots:

https://ask.libreoffice.org/t/how-to-create-formatted-table-of-figures-using-tabs-before-and-after-index-number/76464?u=bva

This would improve the professional look of the document.
Comment 1 Heiko Tietze 2022-04-22 10:49:06 UTC
You align the content label (Figure) at left and the number right, followed by a left aligned caption. Such a layout can be done only per table or in a ToC, by right-aligning a tab stop. But apparently we have a bug here, see bug 32360 and bug 94661.

The actual request is going beyond the ToC. You could also expect the text of headings to align at the second line. This can be done per indentation (positive value before text and negative for first line). If you give enough space for large numbers this works well. And it should do the trick as well for ToC if you modify the "Contents x" PS properly.

My take: NAB (or duplicate). What do you think, Mike?
Comment 2 ajlittoz 2022-06-29 10:13:27 UTC
The difficulty comes from the way data needed for the Table of Figures is collected.

The caption is captured as a whole, i.e. no distinction is made between generated content (the label such as "Text", "Figure", …, the number range inserted with a field, and the final separator) and user-provided text (the "caption" itself). The whole caption ends up in an E descriptor.

In the case of a TOC, chapter numbering is captured separately from the heading. These bits of information end up in E# and E descriptors which can be arranged separately.

There are two Types to generate a table of figures from the TOC dialog: Table of Figures and Index of Tables.

BTW, why are there two items in the menu as they are strictly equivalent? Legacy of history?

The kind of table is selected by the Category menu which designates the number range.

From my naive point of view, we already have all elements to be able to discriminate components of the caption, all the more since Display menu offers to get rid of number or text if so desired.

User would gain more versatility if both data were made available in E# and E descriptors instead of the present composite E.

Of course, this has a huge compatibility impact on existing documents.

=============

bug 94661: I think it is a separate issue. See my comment there.

bug 32360: a "comfort" request? it can be done manually by manipulating the Structure line in the Entry tab. The only common factor is the mention of tab stops.

==============

The question of tab handling in TOC/Tables of … is a fundamental one. The Structure line introduces a bias but I can't tell if it is a simplification or an additional difficulty compared to paragraph style usage.

A paragraph style (PS) can define tab stops with various attributes (alignment, leading, …). The Structure line in TOC overrides the PS tab stops without the possibility of variants. The only user-friendly addition is the Align right check box for a quick reference to the right margin or indent (I remember however some "distortions" with non-zero indents and, most annoying, multi-column pages or sections).

Could it be possible to remove tab definition from the Entries tab to rely exclusively on PS tabs? I know this could upset many users, as the full TOC configuration would require looking at two dialogs (TOC and PS), and this would be a complication for most users who don't practice styling with ease.

I don't think it would be too difficult to implement. The generated entry contents (a paragraph) is passed to the formatting routine which then applies standard processing.
Comment 3 Heiko Tietze 2022-06-30 08:30:52 UTC
(In reply to ajlittoz from comment #2)
> In the case of a TOC, chapter numbering is captured separately from the
> heading. These bits of information end up in E# and E descriptors which can
> be arranged separately.

Good point!

> A paragraph style (PS) can define tab stops with various attributes
> (alignment, leading, …). The Structure line in TOC overrides the PS tab
> stops without the possibility of variants.

You mean changing the tab stop for the paragraph style has no effect?
Comment 4 Heiko Tietze 2022-06-30 12:49:12 UTC
The topic was on the agenda of the design meeting but didn't receive further input.

Andre's comment 2 brings it to the point: ToF lacks E vs E# differentiation.
Comment 5 Cor Nouws 2022-06-30 15:30:50 UTC
(In reply to Heiko Tietze from comment #4)
> The topic was on the agenda of the design meeting but didn't receive further
> input.

Missed the meeting :)

> Andre's comment 2 brings it to the point: ToF lacks E vs E#
> differentiation.

Yeah, good point to have that added by .. :)
Comment 6 ajlittoz 2022-07-02 08:37:48 UTC
Created attachment 181079 [details]
Structure vs. paragraph style tab stop definition

(In reply to Heiko Tietze from comment #3)
> You mean changing the tab stop for the paragraph style has no effect?

Yes. The attachment demonstrates the behaviour. Paragraph style tab stops are ignored. Only those defined in the structure line are active, overriding the PS definitions.

This is inconsistent. When you want to customise TOC appearance, you do it in the Contents n family, but it is difficult to explain to users that tab stops should be modified in the Structure line and that a TOC refresh must be triggered (instead of the immediate update you get when a page style is changed).

Also, I've seen questions on AskLO requesting tab right-alignment (e.g. so that the final dot in chapter number is vertically aligned over the whole TOC, no matter the number of figures in the number) which is not possible in the present scheme.
Comment 7 sdc.blanco 2023-02-23 15:22:36 UTC
Is the bug summary correct?  I think the question in the OP is about how to control tab stops in a Table of Figures.  Chapter No. is provided by Chapter Info.
Comment 8 ajlittoz 2023-02-23 17:48:05 UTC
(In reply to sdc.blanco from comment #7)
> Is the bug summary correct?  I think the question in the OP is about how to
> control tab stops in a Table of Figures.  Chapter No. is provided by Chapter
> Info.

Good point! Chapter info is available through CI descriptor.

What is requested is an E# *entry number* containing the label (Figure, Table, Drawing, …) and the number. And it looks like it is already available because we can vary what is inserted in the table with the Display menu:
- References: the complete paragraph
- Category & number: beginning of paragraph up to the number range plus the next character if it is not a space
- Caption Text: end of paragraph after number range and non-space character suffixed to it, run of spaces being ignored before first word or symbol
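
A minimal Python sketch of the three Display modes as described above. It assumes the caption paragraph has already been cut at the number range field into the text before the field, the field's rendered value, and the text after it. The `Caption` type and the mode names are hypothetical, not Writer's internals.

```python
from typing import NamedTuple


class Caption(NamedTuple):
    before: str  # text preceding the number range field
    number: str  # rendered value of the number range field
    after: str   # text following the field


def display(caption: Caption, mode: str) -> str:
    """Return what the Display menu would show for the given mode."""
    # A single character directly after the field stays with the number
    # unless it is a plain space (U+0020).
    first = caption.after[:1]
    suffix = first if first not in ("", " ") else ""
    if mode == "references":
        return caption.before + caption.number + caption.after
    if mode == "category_and_number":
        return caption.before + caption.number + suffix
    if mode == "caption_text":
        # Skip the suffix character, then ignore the run of spaces.
        return caption.after[len(suffix):].lstrip(" ")
    raise ValueError(f"unknown mode: {mode}")
```

With `Caption("Figure ", "1", ": My figure here")`, the three modes yield the whole paragraph, `Figure 1:`, and `My figure here` respectively.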

So E# could collect what is designated by Category and Number, E be restricted to Caption Text instead of present References.

From experiment, what is important is the presence of a number range in a paragraph. The field is the split position.

PS: I got funny but rather consistent results by using two different number ranges in a caption. I don't know if this unintentional feature may be useful to someone. It allows the same caption to appear in two different tables, even when the second number range is hidden to avoid strange dual-numbered captions. Funny, I said.
Comment 9 sdc.blanco 2023-02-24 00:19:00 UTC
(In reply to ajlittoz from comment #8)
> What is requested is an E# *entry number* containing the label (Figure,
> Table, Drawing, …) and the number. And it looks like it is already available
> because we can vary what is inserted in the table with the Display menu:
On the "Type" tab.

> So E# could collect what is designated by Category and Number, E be
> restricted to Caption Text instead of present References.
But now you are referring to the Entries tab, no?  

At present, if there is no "E" on the Entries tab, then the settings in Display (on the Type tab) have no effect (i.e., nothing is displayed).  In other words, at present "E" simply displays whatever is selected on the Type tab.

It seems preferable to collect all the formatting onto the Entries tab, for example with:

CT = Caption Text
C  = Category
C# = Category Number
(plus CI = Chapter Info)

Then E is not needed (and "Display" on "Type" is not needed).  

A predefined structure would probably have a common configuration: 
C C# CT  [e.g., Figure (C) 1 (C#): My figure here (CT) ]

and then users can edit/configure according to their needs.
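
To illustrate how such a structure line could drive the output, here is a small Python sketch. The token representation and `render_entry` are hypothetical. Only the descriptor names C, C# and CT come from the proposal above.

```python
def render_entry(structure: list[tuple[str, str]], caption: dict[str, str]) -> str:
    """Build one index entry from a structure line.

    Each token is ("descriptor", name) or ("text", literal); descriptors
    are looked up in the caption components, literals are copied as-is.
    """
    parts = []
    for kind, value in structure:
        if kind == "descriptor":
            parts.append(caption[value])
        elif kind == "text":
            parts.append(value)
        else:
            raise ValueError(f"unknown token kind: {kind}")
    return "".join(parts)


# Components of one caption, already separated (assumed input).
caption = {"C": "Figure", "C#": "1", "CT": "My figure here"}

# The predefined structure suggested above: C C# CT with literal separators.
predefined = [
    ("descriptor", "C"), ("text", " "),
    ("descriptor", "C#"), ("text", ": "),
    ("descriptor", "CT"),
]

# A user-edited structure that keeps only the number.
number_only = [("descriptor", "C#")]
```

Here `render_entry(predefined, caption)` gives `Figure 1: My figure here`, and `render_entry(number_only, caption)` gives `1`, without any setting on the Type tab.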

This description seems to be what is needed in the OP. 
And it would also be a solution to bug 94966.

(Alternatively, adding a "Number only" to the Display dropdown would also be a solution).

For now, these proposals are meant only to clarify what capabilities are desired in the UI. 

(Side comment: this seems "bad" to have structure design both on the Type tab and Entries tab -- hard to keep track.  And as you pointed out in bug 134781, it is rather pointless to have both Index of Tables and Index of Figures in the UI, given that each can be created from the other.  While the "from Objects" is a different story, providing a "predefined" version of what can be achieved in User-defined).
Comment 10 ajlittoz 2023-02-24 10:21:48 UTC
(In reply to sdc.blanco from comment #9)
> (In reply to ajlittoz from comment #8)
> It seems preferable to collect all the formatting onto the Entries tab, for
> example with:
> 
> CT = Caption Text
> C  = Category
> C# = Category Number
> (plus CI = Chapter Info)

For consistency with all the other table Entries tab (except Bibliography), I think it would be better to have:

E = caption text (where E is mnemonics for "entry")
E# = category + category number
(plus CI, of course)

Keeping the E ensures a "relative" compatibility with existing documents.

Your proposal introduces an improvement compared to mine: you separate the "identification" of the entry into two components C and C# which can be edited separately.

However, this immediately raises a problem. From my experiments, today the "components" of the caption paragraph are split at the number range field with a subtlety. If the field is immediately followed by a character other than U+0020 SPACE, this character is kept in the first component. We end up with everything from the beginning to the field or its immediately subsequent character as "Category & Number", and everything from the first non-U+0020 character after that to the end as "Caption Text".

This is not good in languages such as French where AutoCorrect inserts non breaking space before a colon frequently used to separate the numbering from the caption itself.

Transposing this to your (better) three components proposal, we could have:
- C or EC: everything from start to number range field, removing trailing spaces
- C# or E#: number range value
==> But what to do with the separator? Should it be included with E# or ignored, considering we can always add literals in the Entries structure line between two descriptors?
==> Also what is the definition of a separator (+)? To avoid the problem mentioned for French, I suggest that any run of non-U+0020 characters suffixed to the field be treated as the separator, accounting for possible multi-character separators.
- CT or E: everything after the field or separator, excluding leading spaces

(+) Since a caption can be manually generated (which I do frequently when my items are not enclosed in a frame), you can't rely on the Insert>Caption to get the separator. And my suggested separator recognition will fail for something like space-em dash-space, in fact any separator using a space after the number.
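
The suggested split can be sketched in Python as follows. This is only an illustration of the rule proposed above (separator = run of non-U+0020 characters directly after the field). `split_caption` and `Parts` are hypothetical names, and the input is assumed to be already cut at the number range field.

```python
from typing import NamedTuple


class Parts(NamedTuple):
    category: str   # C or EC
    number: str     # C# or E#
    separator: str  # run of non-space characters suffixed to the field
    text: str       # CT or E


def split_caption(before: str, number: str, after: str) -> Parts:
    """Split a caption into category, number, separator and text."""
    category = before.rstrip(" ")  # remove trailing spaces
    # The separator is every character up to the first U+0020 SPACE.
    end = 0
    while end < len(after) and after[end] != " ":
        end += 1
    separator = after[:end]
    text = after[end:].lstrip(" ")  # exclude leading spaces
    return Parts(category, number, separator, text)
```

For a French caption such as `split_caption("Figure ", "1", "\u00a0: Ma figure")`, the no-break space and the colon both land in the separator and the text is `Ma figure`. For `split_caption("Figure ", "1", " \u2014 My figure")`, the separator comes out empty and the dash stays at the start of the text, which is the failure case mentioned in (+).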
Comment 11 sdc.blanco 2023-02-24 13:10:15 UTC
(In reply to ajlittoz from comment #10)
> For consistency with all the other table Entries tab (except Bibliography),
Not sure why consistency should be important in this context. I would put higher priority on meaningful mnemonic labels for the abbreviations.

Meanwhile, in actual fact, the labels and abbreviations appear to be consistent. See attachment 185540 [details] for an overview.

> E# = category + category number
But this fails your consistency test. (-:

At present E# only appears for ToC, where it only provides a Heading number (not a category number) and no category (label). (in relation to E#, see bug 153561)

My proposal with the Cs, is to introduce additional widgets in the Entry tab, as part of the structure dialog, so that the user can control the Category number and category label and caption text independently (and to drop/move out that control from the "Type" tab).  This is completely separate from E# (which has to do with the Heading number that appears before a caption, not caption labels).

Have I misunderstood your point?

> However, this immediately raises a problem. From my experiments, today the
> "components" of the caption paragraph are split at the number range field
> with a subtlety. 
I have also encountered "strange" things with number range field (also with setting the number of levels, see bug 153710). Could not decide what was intended behavior and what was bug, in part because I can not systematically repeat particular strange effects. (I think there are also sometimes problems with "refresh" / "updating" the index.) -- so I have stopped exploring such issues.

> ==> But what to do with the separator? Should it be included with E# or ignored,
In the "real" E# (in ToC), there is a dropdown box, which allows both choices.
In my proposal, clicking on C# would provide a dialog with a dropdown box like E# (where you can choose separator or not).
Comment 12 ajlittoz 2023-02-25 10:00:57 UTC
(In reply to sdc.blanco from comment #11)
> (In reply to ajlittoz from comment #10)
> 
> > E# = category + category number
> But this fails your consistency test. (-:

I know. That's why I qualified your 3-descriptor proposal as "better".

> My proposal with the Cs, is to introduce additional widgets in the Entry
> tab, as part of the structure dialog, so that the user can control the
> Category number and category label and caption text independently (and to
> drop/move out that control from the "Type" tab).  This is completely
> separate from E# (which has to do with the Heading number that appears
> before a caption, not caption labels).
> 
> Have I misunderstood your point?
> 
Partially, I think. I fully agree that the menu in the "Type" tab is wrong from a UX point of view. It should be part of the structure line with adequate descriptors. I refrained from proposing more descriptors lest the idea be waved away as "too complex" or "breaks user experience", and tried to keep as much of the existing design as possible. Read further down.

I read bug 153561. If your (=TDF design team) intent is to "customize" the various descriptors in a more targeted meaning, then the "neutral" E (for _E_ntry I presume) can be changed as desired.

> > ==> But what to do with the separator? Should it be included with E# or ignored,
> In the "real" E# (in ToC), there is a dropdown box, which allows both
> choices.
> In my proposal, clicking on C# would provide a dialog with a dropdown box
> like E# (where you can choose separator or not).

In fact, the basic problem is with captioning. There is no agreed standard on caption structure. In Writer, a caption is defined as a paragraph containing a selected number range field. It is then implicitly assumed that what precedes the field is "Category" and what follows is "Caption text".

I think that beyond this simple understanding there exist many other structures possibly exhibiting:
- "decoration" around the number such as parentheses or square brackets (see "Separators" before and after in list styles, including chapter numbering)
- a separator between category and its number and caption text; this separator is not necessarily a colon, nor a single character
- order may be different from category-number-separator-text, which implies there may be 2 separators

Captioning is essentially a manual process (Insert>Caption which is equivalent to macro execution does not change the issue because it leaves only text without any metadata markup). Writer has then no way of identifying the components, apart from the field. The present state of affairs relies on an implicit convention that category is written before (in reading order) the field and caption text after.

A full control of "Table of xxx" formatting requires more contextual information than available today.

Then should caption paragraphs be identified by some special means? Special markup? Then modification to ODF with all the fuss. Heading paragraphs are recognised as such when they are attached to some outline level. But I don't see any solution which would not upset the average user. Acceptability is of major importance.
Comment 13 sdc.blanco 2023-02-25 15:11:50 UTC Comment hidden (off-topic)
Comment 14 sdc.blanco 2023-02-25 15:24:59 UTC
The following proposal resolves the issue raised in the OP.  Also resolves bug 94966.  @Heiko, should I add needsUXEval? or can this proposal be accepted?
(possibly an EasyHack)

For types “Table of Figures” and “Index of Tables” in "Insert Index" dialog,

1. In “Type” tab, drop “Display” control (including the dropdown box)

2. In “Entries tab”, add 3 widgets with following functions/abbreviations:

Widget name     Abbreviation    Function
Caption No.     CN              Display value of caption number (i.e., number range field)
Category        C               Display “category” of caption
Caption Text    CT              Display caption text only*

No additional options when clicking on these Structure widgets.

3. Drop the “Entry Text” widget (or modify it to be “Caption Text”)

*bug 127452 has additional request to display only first sentence of a caption text.
Comment 15 ajlittoz 2023-02-26 10:34:09 UTC
(In reply to sdc.blanco from comment #13)
> (In reply to ajlittoz from comment #12)
> Thanks for clarifications.
> 
> Some comments:
> > - "decoration" around the number such as parentheses or square brackets (see
> > "Separators" before and after in list styles, including chapter numbering)
> > - a separator between category and its number and caption text; this
> > separator is not necessarily a colon, nor a single character
> > - order may be different from category-number-separator-text, which implies
> > there may be 2 separators
> I believe all these variations can be addressed using the "Text"  fields in
> the Structure option on the Entries tab.  (i.e., do not need to add options
> to UI)

No. I was talking about the caption itself, the data source for the Table of Figures et al.
The Structure line is meant to define the entry in the Table of …, taking its input from the caption paragraph. The caption paragraph is presently "unstructured". The only thing we know about it is that it contains a number range field. This field splits the paragraph into three bits: "before" taken as Category, "field" for number and "after" taken as Caption text.

This could lead to weird formatting of the Table of … or perhaps I fall into "the devil lies in details" syndrome.

Note that presently the Table of … generator and the Insert>Caption "macros" are totally separate and don't share information. Insert>Caption allows changing the order of the caption components but there is no way to communicate this to the table generator. This makes the Display menu in the Type tab inconsistent. Only the References choice gives a usable result. Fortunately I bet 99.99% of the users never fiddle with this menu.

> > A full control of "Table of xxx" formatting requires more contextual
> > information than available today.
> If you think there is a need for more control, beyond what is discussed
> here, then better to open a new bug report with a specific request.

I'll try to summarise my thoughts in a LO Writer document I'll send you ASAP.

> > Then should caption paragraphs be identified by some special means? 
> At present they get their own PS.
> 

Wrong. You can caption with any PS, not only Caption and its derivatives (Drawing, Figure, …). The decision to consider a paragraph as a caption is the presence of a number range field with the name selected in the Type tab of the dialog from the so-called Category menu. You don't even need to have defined a "Category" item either with Insert>Caption or Tools>Options, LOWriter>AutoCaption since the process can be (and basically is) fully manual.
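
As a rough Python sketch of that selection rule: the data model below (`Paragraph`, `NumberRangeField`, `collect_captions`) is hypothetical and only mirrors the description above, where a paragraph counts as a caption when it holds a number range field whose name matches the selected Category, whatever its paragraph style.

```python
from dataclasses import dataclass, field


@dataclass
class NumberRangeField:
    name: str   # number range name, e.g. "Figure"
    value: int  # current value of the range at this position


@dataclass
class Paragraph:
    style: str  # paragraph style name; not used for detection
    text: str
    fields: list[NumberRangeField] = field(default_factory=list)


def collect_captions(paragraphs: list[Paragraph], category: str) -> list[Paragraph]:
    """Keep paragraphs that contain a number range field named `category`."""
    return [
        p for p in paragraphs
        if any(f.name == category for f in p.fields)
    ]


doc = [
    Paragraph("Figure", "Figure 1: Site plan", [NumberRangeField("Figure", 1)]),
    Paragraph("Text Body", "Figure 2: Trench A", [NumberRangeField("Figure", 2)]),
    Paragraph("Figure", "Styled like a caption, but no field"),
    Paragraph("Table", "Table 1: Finds", [NumberRangeField("Table", 1)]),
]
```

`collect_captions(doc, "Figure")` returns the first two paragraphs: the "Text Body" one is included because it carries the field, and the third is skipped despite its style.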

(In reply to sdc.blanco from comment #14)
> The following proposal resolves the issue raised in the OP.  Also resolves
> bug 94966.  @Heiko, should I add needsUXEval? or can this proposal be
> accepted?
> (possibly an EasyHack)
> 
[…]
> 
> 2. In “Entries tab”, add 3 widgets with following functions/abbreviations:
> 
> Widget name     Abbreviation    Function
> Caption No.     CN              Display value of caption number (i.e., number range field)
> Category        C               Display “category” of caption
> Caption Text    CT              Display caption text only*

Since you suggest Cx descriptor names, why don't you use CC for caption category (2-letter symbols everywhere)? For me, single-letter C would emphasise its importance over the others, some kind of primary data versus secondary. This may be psychological but is worth consideration.
Comment 16 ajlittoz 2023-02-26 10:43:32 UTC Comment hidden (off-topic)
Comment 17 sdc.blanco 2023-02-26 11:42:03 UTC
(In reply to ajlittoz from comment #15)
>  why don't you use CC for caption category 
Either way is fine with me.  

More generally, seems like we agree now on the enhancement request, so I will update the bug summary.
Comment 18 sdc.blanco 2023-02-26 11:58:47 UTC Comment hidden (off-topic)
Comment 19 Heiko Tietze 2023-02-27 07:43:34 UTC
(In reply to sdc.blanco from comment #14)
> @Heiko, should I add needsUXEval? or can this proposal be accepted?

Bug 153561 is on the agenda for the design meeting, along with bug 153712 and I added this ticket now too.
Comment 20 ajlittoz 2023-03-10 10:26:13 UTC
Created attachment 185881 [details]
Ideas on captioning

I have written down some ideas on how to enhance captioning.

The main rejection point is that they need a change in ODF XML markup so that needed data is available when generating indexes of tables, figures, …

Power users will have no difficulty with the proposal. To make it acceptable, adequate default "configurations" must be offered when opening the caption and "Table of Figures" dialogs. Keeping the same defaults as at present avoids disrupting "muscle memory" and habits.

Should any discussion about it be started in a separate bug report as an enhancement request?
Comment 21 Heiko Tietze 2023-03-15 09:48:04 UTC
(In reply to ajlittoz from comment #20)
> Created attachment 185881 [details]
> Ideas on captioning

Cool idea but not really about ToF rather the captioning. Bug 153248 about a mix of options also in regards to auto numbering is on my personal agenda. Commenting there.