Bug 163974 - (May be ODF-spec related) Nested Footnotes: LO reports `Read Error` but `ODFValidator` is quite happy
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 24.8.2.1 release
Hardware: All All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:25.2.0 target:24.8.4
Keywords:
Depends on:
Blocks:
 
Reported: 2024-11-21 04:44 UTC by Jambunathan K
Modified: 2024-11-24 07:27 UTC
CC List: 1 user

See Also:
Crash report or crash signature:


Attachments
lo-syntax-error-with-nested-footnote-definition.zip: Zip file that contains all artefacts for assessing this bug report (369.97 KB, application/zip)
2024-11-21 04:44 UTC, Jambunathan K
nested-footnote-definition.odt: Problematic ODT file (26.59 KB, application/vnd.oasis.opendocument.text)
2024-11-21 04:46 UTC, Jambunathan K
00-lo-format-error-for-nested-footnote-definition.png: `Read Error` issued by LO (11.84 KB, image/png)
2024-11-21 04:47 UTC, Jambunathan K
01-odf-validator-reports-no-errors-or-warnings.png: ODF validator reports no syntactic errors (97.67 KB, image/png)
2024-11-21 04:48 UTC, Jambunathan K
02-my-diagnosis-of-lo-format-error.png: My "diagnosis" of the problem (230.71 KB, image/png)
2024-11-21 04:49 UTC, Jambunathan K
nested-footnote-definition.pdf: My "expectation" of how the LO should handle the XML markup (16.83 KB, application/pdf)
2024-11-21 04:50 UTC, Jambunathan K
nested-footnote-definition-take-2.zip: Tells how LO may "import" nested footnote definitions without data loss (211.82 KB, application/zip)
2024-11-21 08:21 UTC, Jambunathan K

Description Jambunathan K 2024-11-21 04:44:49 UTC
Created attachment 197704 [details]
lo-syntax-error-with-nested-footnote-definition.zip: Zip file that contains all artefacts for assessing this bug report

(May be ODF-spec related) Nested Footnotes: LO reports `Read Error` but `ODFValidator` is quite happy



Table of Contents
_________________

1. Broader Question
.. 1. A special request
2. Attachments
3. Misc. attachments for added context


1 Broader Question
==================

  The broader questions are these:

  - What does OpenDocumentFormat say about "nested" footnotes, and how
    does LO handle them?  Specifically, can LO create the equivalent of
    what I see in `nested-footnote-definition.pdf`?  (This `pdf` file is
    created with `tex`.)

  - Given the problematic `nested-footnote-definition.odt` file here, is
    there a way LO can "jump over" problematic parts (and offer to
    "repair" the file, so that it provides at least a "deprecated"
    functionality)?


1.1 A special request
~~~~~~~~~~~~~~~~~~~~~

  I believe this "bug" is better reviewed by someone who is quite
  conversant with OASIS OpenDocument spec.  So, I am Cc-ing Regina
  Henschel <rb.henschel@t-online.de>.

  (Regina please excuse me if I am wrong in my judgment)


2 Attachments
=============

  nested-footnote-definition.odt
        This `ODT` file is created OUTSIDE of LO. Specifically, it is
        created with `Org`-to-`ODT` exporter of `Emacs`.[1] (For added
        context, `Org` is a plain text markup similar to `markdown`, and
        `Org` is very popular with Emacs users.)

  00-lo-format-error-for-nested-footnote-definition.png
        Error reported by LO when opening above `ODT` file.

        Note that the target line (= 21) and target column (= 55) of
        `content.xml` effectively point to the EOF (= end of file) and
        not to a specific "malformed" XML line.  In fact, the XML is NOT
        AT ALL malformed: it is well-formed, and it validates against
        ODF's `rng` and `rnc` schema files.

        This error is reported with following version of LO

        ,----
        | kjambunathan@debian-ng:~$ uname -a
        | Linux debian-ng 6.11.5-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.11.5-1 (2024-10-27) x86_64 GNU/Linux
        | 
        | kjambunathan@debian-ng:~$ dpkg -l | grep office | grep writer
        | ii  libreoffice-uiconfig-writer               4:24.8.2-2                           all          UI data ("config") for LibreOffice Writer
        | ii  libreoffice-writer                        4:24.8.2-2                           amd64        office productivity suite -- word processor
        | 
        | Version: 24.8.2.1 (X86_64) / LibreOffice Community
        | Build ID: 480(Build:1)
        | CPU threads: 4; OS: Linux 6.11; UI render: default; VCL: gtk3
        | Locale: en-IN (en_IN); UI: en-US
        | Debian package version: 4:24.8.2-2
        | Calc: threaded
        `----

  01-odf-validator-reports-no-errors-or-warnings.png
        ODF Validator reports no issues with the above `ODT` file.  This
        implies that ALL the component XML files, including `content.xml`,
        are well-formed and schema-valid.  IOW, LO reporting a `syntax
        error` is questionable.  (LO can claim a `semantic error`, but
        claiming a `syntax error` obscures what the problem is.)

  02-my-diagnosis-of-lo-format-error.png
        The screenshot shows the `Emacs` editor.  Focus on the right
        window which displays the below XML snippet.  Note the XML tree
        goes like

              `text:note`-> `text:note-body` -> `text:note` ->
              `text:note-body`

        The "nested-ness" of the "text:note-body"-s confuses LO.

        ,----
        | <text:p text:style-name="Text_20_body">Sunt sed ullamco amet, velit
        | nulla anim dolore officia reprehenderit occaecat adipiscing magna
        | elit.
        | <text:span text:style-name="OrgSuperscript">
        |   <text:note text:id="fn1"
        |              text:note-class="footnote">
        |     <text:note-citation>1</text:note-citation>
        |     <text:note-body>
        |       <text:p text:style-name="Footnote">Laboris nulla lorem ea tempor
        |       anim do sunt dolor occaecat voluptate aliqua commodo ut.
        |       <text:span text:style-name="OrgSuperscript">
        |         <text:note text:id="fn2"
        |                    text:note-class="footnote">
        |           <text:note-citation>2</text:note-citation>
        |           <text:note-body>
        |             <text:p text:style-name="Footnote">Ipsum deserunt duis aliqua
        |             laboris est ullamco veniam, minim. Voluptate proident, aute tempor
        |             ut ad fugiat laboris sit eiusmod nostrud duis proident, ex
        |             pariatur. Ad fugiat nostrud ex incididunt proident, minim do.
        |             Officia pariatur et enim cillum esse ad adipiscing eu labore velit
        |             esse laborum eu nisi.</text:p>
        |           </text:note-body>
        |         </text:note>
        |       </text:span></text:p>
        |     </text:note-body>
        |   </text:note>
        | </text:span></text:p>
        `----
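
For anyone who wants to check an `ODT` file for this construct without opening it in LO, here is a minimal sketch in Python.  It is not part of the attachments and not how LO or the ODF Validator work internally; the function names are hypothetical.  It assumes only that the package contains a `content.xml` and that notes use the standard ODF `text` namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF text namespace; the helper names below are illustrative only.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
NOTE = f"{{{TEXT_NS}}}note"
NOTE_ID = f"{{{TEXT_NS}}}id"


def find_nested_notes(content_xml: bytes) -> list[tuple[str, str]]:
    """Return (ancestor_id, nested_id) for every text:note inside another one.

    ET.fromstring raises ParseError if the XML is not well-formed, so a
    normal return also confirms well-formedness (not schema validity).
    """
    root = ET.fromstring(content_xml)
    pairs = []
    for outer in root.iter(NOTE):
        # Element.iter() yields the element itself first, then descendants.
        for inner in outer.iter(NOTE):
            if inner is not outer:
                pairs.append((outer.get(NOTE_ID, "?"), inner.get(NOTE_ID, "?")))
    return pairs


def find_nested_notes_in_odt(path_or_file) -> list[tuple[str, str]]:
    """Read content.xml out of an ODT package and scan it for nested notes."""
    with zipfile.ZipFile(path_or_file) as odt:
        return find_nested_notes(odt.read("content.xml"))
```

Calling `find_nested_notes_in_odt("nested-footnote-definition.odt")` should list the `fn1` / `fn2` pairing shown in the snippet above, along with any other nested notes in the file.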


3 Misc. attachments for added context
=====================================

  nested-footnote-definition.org 
        This is the plain-text version of
        `nested-footnote-definition.odt`.  This file is in Emacs'
        `Org-mode` markup format.

        This file is shown in the left window of the screenshot above.
        I have included this `org` file in the hope that it helps
        "unravel" the XML markup.

        Breadcrumb for use by the originator of the bug :: See
        <https://github.com/kjambunathan/org-mode-ox-odt/issues/281#issuecomment-2489935138>

  nested-footnote-definition.tex
        The `tex` equivalent of the above `org` file.

  nested-footnote-definition.pdf
        The output I desire from LO.



Footnotes
_________

[1] <https://github.com/kjambunathan/org-mode-ox-odt>
Comment 1 Jambunathan K 2024-11-21 04:46:50 UTC
Created attachment 197705 [details]
nested-footnote-definition.odt: Problematic ODT file
Comment 2 Jambunathan K 2024-11-21 04:47:42 UTC
Created attachment 197706 [details]
00-lo-format-error-for-nested-footnote-definition.png: `Read Error` issued by LO
Comment 3 Jambunathan K 2024-11-21 04:48:47 UTC
Created attachment 197707 [details]
01-odf-validator-reports-no-errors-or-warnings.png: ODF validator reports no syntactic errors
Comment 4 Jambunathan K 2024-11-21 04:49:45 UTC
Created attachment 197708 [details]
02-my-diagnosis-of-lo-format-error.png: My "diagnosis" of the problem
Comment 5 Jambunathan K 2024-11-21 04:50:37 UTC
Created attachment 197709 [details]
nested-footnote-definition.pdf: My "expectation" of how the LO should handle the XML markup
Comment 6 Jambunathan K 2024-11-21 04:53:24 UTC
A note to the reviewer 

The `lo-syntax-error-with-nested-footnote-definition.zip` contains ALL the artefacts required for you to make sense of this bug.  IOW, you are free to "ignore" the other attachments.

The attachments OTHER THAN `lo-syntax-error-with-nested-footnote-definition.zip` are there to give a bird's-eye view of the bug (when viewed in the browser).
Comment 7 Jambunathan K 2024-11-21 05:01:09 UTC
Bread crumb for the author of this bug, see https://github.com/kjambunathan/org-mode-ox-odt/issues/281#issuecomment-2490017713
Comment 8 Mike Kaganski 2024-11-21 05:41:15 UTC
(In reply to Jambunathan K from comment #0)
> 1 Broader Question
> ==================
> 
>   The broader questions are these;
> 
>   - what does OpenDocumentFormat say about "nested" footnotes and how
>     does LO handle it.  Specifically, can LO create the equivalent of
>     what I see in `nested-footnote-definition.pdf` (This `pdf` file is
>     created with `tex`)

Have you done your own research? The ODF spec is open, and without your own research, this looks like putting a load on others "just because".

https://docs.oasis-open.org/office/OpenDocument/v1.4/OpenDocument-v1.4-part3-schema.html#element-text_note-body

> Note: The schema allows for the inclusion of <text:note> elements as a
> descendant of a child of the <text:note-body> element. While this may be
> reasonable for note types, it is not reasonable for footnotes and endnotes.
> Conforming consumers need not support notes inside notes.

So while the syntax is formally OK, the semantics is the one that is explicitly mentioned as optionally unsupported.

Whether a repair should be offered is a reasonable question.
Comment 9 Mike Kaganski 2024-11-21 05:51:50 UTC
Or even, since the syntax is conforming, it should be silently accepted; the unsupported part should be dropped (no artificial rearrangement is needed if it's unsupported; if we decide to support it, then we should first decide what it means). It needs no warnings to the user, in the same way that we don't show warnings when loading unsupported markup that may have been generated by newer versions, etc.
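
To make the "drop the unsupported part" idea concrete, here is a minimal sketch in Python.  It is purely illustrative and is not the actual LibreOffice import code; the function name `drop_nested_notes` is hypothetical.  It assumes the standard ODF `text` namespace, removes every `text:note` that sits inside another `text:note`, and keeps the text that followed the removed note in its paragraph.

```python
import xml.etree.ElementTree as ET

# Standard ODF text namespace; drop_nested_notes is a hypothetical helper.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
NOTE = f"{{{TEXT_NS}}}note"


def drop_nested_notes(root: ET.Element) -> int:
    """Remove each text:note that has a text:note ancestor.

    Returns the number of removed subtrees, so a caller can either stay
    silent or tell the user that something was discarded.
    """
    dropped = 0

    def walk(elem: ET.Element, inside_note: bool) -> None:
        nonlocal dropped
        # Walk children last-to-first so deletions do not shift pending indices.
        for index in range(len(elem) - 1, -1, -1):
            child = elem[index]
            if child.tag == NOTE and inside_note:
                # ElementTree stores trailing text on the element itself,
                # so move it to the previous sibling (or the parent) first.
                if child.tail:
                    if index > 0:
                        prev = elem[index - 1]
                        prev.tail = (prev.tail or "") + child.tail
                    else:
                        elem.text = (elem.text or "") + child.tail
                del elem[index]
                dropped += 1
            else:
                walk(child, inside_note or child.tag == NOTE)

    walk(root, False)
    return dropped
```

In the reported file the inner note is wrapped in a `text:span`; with this sketch that span would simply be left empty.  The returned count is there only so that either policy, silent discard or a notice to the user, can be built on top.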
Comment 10 Jambunathan K 2024-11-21 06:24:36 UTC
> Have you done your own research? The ODF spec is open, and without your own
> research, this looks like putting a load on others "just because".

This is intentional ...

When I get a note from Regina (or someone higher up), I know that my bug report will NOT be moved by a "bot" to the NEEDINFO state and subsequently sent to the garbage bin for "NO ACTION BY BUG SUBMITTER".

This is not any worse than some QA folks moving my bugs to the NEEDINFO state (with a note to test this with a newer version of LO), and subsequently "CLOSING" my bug with "NO RESPONSE FROM FILER OF BUGS".

-----------------

Instead of shouting at me, you should thank me for a detailed report, so that an INFORMED person (like you) can act on it within a few minutes.  That is HOW DETAILED my bug report is.  I have been maintaining the Emacs ODT exporter since 2010 or so, and it is foolish to claim IDLENESS or IGNORANCE on my part.

------------

A "professional" response from the LO dev team would be to say 

   "LO doesn't support this construct"

or 

   "LO can mimic the TeX engine"

or 

   "ODF 1.2 or whatever doesn't allow nested footnotes.  Maybe it can be used as a use case for a future proposal to ODF 1000.x"


and OASIS cannot talk about what LO does (even though it is understood that much of what got into OASIS in the first place is through OO and LO efforts)
Comment 11 Mike Kaganski 2024-11-21 06:32:23 UTC
(In reply to Jambunathan K from comment #10)
> Instead of shouting at me

???
I asked you. Yes you generated a lot of information (which requires quite some time to parse, by the way) - thank you! But you never mentioned that you checked the ODF spec yourself - that's a homework *exactly* for a person who created such a detailed and knowledgeable report. And I asked, if you did.

But well, you reply emotionally - I am no more interested, bye.
Comment 12 Jambunathan K 2024-11-21 06:44:15 UTC
>  And I asked, if you did.

No.  Your response was quite colorful, too colorful for an old man like me.  Here is the colorful part of your response

    this looks like putting a load on others "just because".

> But well, you reply emotionally - I am no more interested, bye.

When your response is colorful, my response is going to be colorful too.


-----

> But well, you reply emotionally - I am no more interested, bye.

AFAIU, a professional or an old hand knows how to be "colorful" in his response without "showing the colors" explicitly.

Maybe you can try offering

   "English is not my first language" 

response as an excuse.

---------
Comment 13 Jambunathan K 2024-11-21 08:21:41 UTC
Created attachment 197710 [details]
nested-footnote-definition-take-2.zip:  Tells how LO may "import" nested footnote definitions without data loss
Comment 14 Jambunathan K 2024-11-21 08:36:27 UTC
(In reply to Mike Kaganski from comment #9)
> Or even, since the syntax is conforming, it should be silently accepted; the
> unsupported part should be dropped (no artificial rearrangement is needed
> if it's unsupported; if we decide to support it, then we should first decide
> what it means). It needs no warnings to the user, in the same way that we
> don't show warnings when loading unsupported markup that may have been
> generated by newer versions, etc.

Any data loss has to be reported to the user ...

See attachment for how I imagine nested footnotes may be "imported" (or even "implemented") https://bugs.documentfoundation.org/attachment.cgi?id=197710

I have not thought a lot about how the re-write may happen, so I am offering the attachment as "an initial offering" for further "brainstorming".

--------

Even though I don't use LaTeX much, I can confidently say it is NOT "abnormal" to expect chaining of footnote definitions in LaTeX documents.

(Much of the Emacs community are academics and they use LaTeX regularly. The Emacs / LaTeX community feels that it is OK to have "nested footnotes".)

There was a specific reason why I have flagged this issue for review (or at least a note) by someone who is active in OpenDocument-spec forums.  Not being able to capture "nested footnotes" is a handicap of the ODF format, maybe.
Comment 15 Mike Kaganski 2024-11-21 09:05:46 UTC
https://gerrit.libreoffice.org/c/core/+/176906
Comment 16 Mike Kaganski 2024-11-21 11:47:06 UTC
This report consisted of two major parts:

1. The failure on opening. It was expressed in the title ("LO reports `Read Error` but `ODFValidator` is quite happy"), as well as in comment 0 ("is there a way LO can "jump over" problematic parts").

This part was a bug. Since the syntax is formally correct, we must allow opening such files. As always with unsupported data, it is discarded (the same happens with any other unsupported element).

2. A request for defining a meaning for the nesting, and implementing a support for it (this implies following request to OASIS to amend the standard). Since a single Bugzilla ticket must cover only one problem, that second part needs an own ticket.

This bug is fixed now.
Comment 17 Commit Notification 2024-11-21 11:47:34 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/512ef23224987e3107e66241db3b42934e15a561

tdf#163974: ignore nested footnotes on ODF import

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Jambunathan K 2024-11-22 02:54:10 UTC
(In reply to Commit Notification from comment #17)
> Mike Kaganski committed a patch related to this issue.
> It has been pushed to "master":
> 
> https://git.libreoffice.org/core/commit/
> 512ef23224987e3107e66241db3b42934e15a561
> 
> tdf#163974: ignore nested footnotes on ODF import
> 
> It will be available in 25.2.0.
> 
> The patch should be included in the daily builds available at
> https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
> information about daily builds can be found at:
> https://wiki.documentfoundation.org/Testing_Daily_Builds
> 
> Affected users are encouraged to test the fix and report feedback.

Thanks for the quick fix.  You will hear from me in a week or two.
Comment 19 Commit Notification 2024-11-22 10:16:11 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/da0e4cdc4f6bf8e0e88a4dd0e00afdf68e59a883

tdf#163974: ignore nested footnotes on ODF import

It will be available in 24.8.4.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 20 Jambunathan K 2024-11-24 07:27:11 UTC
I have verified your fix, and I am able to import the ODT file that has a nested footnote definition.

I have created a followup bug on 

`Bug 164019 - Data loss while importing an `ODT` file with ~Nested Footnotes~`. See https://bugs.documentfoundation.org/show_bug.cgi?id=164019 

(In reply to Mike Kaganski from comment #16)
> This report consisted of two major parts:
> 
> 1. The failure on opening. It was expressed in the title ("LO reports `Read
> Error` but `ODFValidator` is quite happy"), as well as in comment 0 ("is
> there a way LO can "jump over" problematic parts").
> 
> This part was a bug. Since the syntax is formally correct, we must allow
> opening such files. As always with unsupported data, it is discarded (the
> same happens with any other unsupported element).
> 
> 2. A request for defining a meaning for the nesting, and implementing a
> support for it (this implies following request to OASIS to amend the
> standard). Since a single Bugzilla ticket must cover only one problem, that
> second part needs an own ticket.
> 
> This bug is fixed now.