Bug 95662 - XHTML-Export: Export to html produces wrong decimalseparator together with bullets
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.0.0.3 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard: target:7.2.0
Keywords: implementationError
Depends on:
Blocks: (X)HTML-Export
Reported: 2015-11-07 19:22 UTC by Robert Großkopf
Modified: 2023-06-10 06:29 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Odt-document for testing the export - look at the width between text and bullet. (9.43 KB, application/vnd.oasis.opendocument.text)
2015-11-07 19:22 UTC, Robert Großkopf
XHTML-file produced by LO while exporting - wrong decimal-separator for min-width (2.78 KB, text/html)
2015-11-07 19:23 UTC, Robert Großkopf
Original for export to xhtml (28.46 KB, application/vnd.oasis.opendocument.text)
2022-08-18 14:18 UTC, Robert Großkopf
Result export with LO 7.1.5.2 (8.22 KB, text/html)
2022-08-18 14:19 UTC, Robert Großkopf
Result export with LO 7.4.0.2 (8.40 KB, text/html)
2022-08-18 14:19 UTC, Robert Großkopf
After replacing :0, with :0. and :1, with :1. it looks well in LO 7.1.5.2 - but not in LO 7.4.0.2 (8.22 KB, text/html)
2022-08-18 14:21 UTC, Robert Großkopf
Test document with different bullets an different levels. (31.66 KB, application/vnd.oasis.opendocument.text)
2023-06-10 06:29 UTC, Robert Großkopf

Description Robert Großkopf 2015-11-07 19:22:33 UTC
Created attachment 120369 [details]
Odt-document for testing the export - look at the width between text and bullet.

Open the attached *.odt file.
It shows a little text and the start of a list with a bullet.
Export it with File → Export ... to an XHTML file.
Open the exported file in your browser.
The bullet is on the left, directly followed by the text.
Open the *.html file for editing.
The file contains "min-width:0,635cm;".
The comma is the wrong decimal separator for CSS, so this declaration cannot work.
Change it to "min-width:0.635cm;" and the list is displayed correctly.
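
To check an exported file for this problem without reading it by hand, a small script can list every CSS length that uses a comma as decimal separator. This is a minimal sketch and not part of LibreOffice; the function name `find_decimal_commas` and the list of units are assumptions made for illustration.

```python
import re

# Matches a CSS declaration such as "min-width:0,635cm" in which the number
# uses a comma instead of a dot. The unit list is an assumption; extend as needed.
DECIMAL_COMMA = re.compile(
    r"([a-z-]+)\s*:\s*(-?\d+),(\d+)(cm|mm|in|pt|pc|px|em|%)"
)

def find_decimal_commas(xhtml: str) -> list[str]:
    """Return every declaration in the text whose length uses a decimal comma."""
    return [m.group(0) for m in DECIMAL_COMMA.finditer(xhtml)]

# Sample line taken from the exported file described in this report.
sample = '<span style="display:block;float:left;min-width:0,635cm;">•</span>'
print(find_decimal_commas(sample))  # ['min-width:0,635cm']
```

An empty result means no comma-separated length was found for the listed units.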

This bug seems to appear mostly together with bullets. I tried changing the UI language from German to English: no effect. I tried changing the locale setting to English: no effect.
Comment 1 Robert Großkopf 2015-11-07 19:23:22 UTC
Created attachment 120370 [details]
XHTML-file produced by LO while exporting - wrong decimal-separator for min-width
Comment 2 A (Andy) 2015-11-07 20:48:02 UTC
For me not reproducible with LO 5.0.3.2, Win 8.1
Comment 3 Robert Großkopf 2015-11-08 08:07:21 UTC
(In reply to A (Andy) from comment #2)
> For me not reproducible with LO 5.0.3.2, Win 8.1

Maybe I need to describe the steps a little better:
1) Open the attached *.odt-file
2) Go to File → Export (not: File → Save As)
3) Select format XHTML
4) Save the file

This produces a file with the extension *.html (*.xhtml would be better).
The code looks something like this:
...
<span style="display:block;float:left;min-width:0,635cm;">•</span>One bullet. Decimalseparator for margin on left ...

The min-width is set incorrectly here.

With File → Save As the code is totally different. It produces
...
<li/>
<p style="margin-bottom: 0cm; line-height: 100%">One bullet. Decimalseparator for margin on left ...

No min-width is set here, and the formatting isn't the same. For exporting a book (like the Base Handbook) it is unusable.
Comment 4 A (Andy) 2015-11-08 08:17:57 UTC
Thanks for your reply.  But unfortunately not reproducible for me.

I get: "<span style="display:block;float:left;min-width:0.635cm;">•</span>One bullet. Decimalseparator for margin on left side should be 0.6<span class="T1">"

Maybe a Linux only issue?
Comment 5 Robert Großkopf 2015-11-08 08:40:59 UTC
I have reset the user profile here: same behavior. I switched to the LO version shipped by OpenSUSE: same behavior. I switched to LO 4.1.6 (I have many LO versions installed in parallel): same behavior.

My System: OpenSUSE 13.2 64bit rpm Linux.

It seems we need to find another tester on Linux.
Comment 6 Buovjaga 2015-11-11 12:40:11 UTC
I get
min-width:NaNcm;

Setting to NEW.

Ubuntu 15.10 64-bit 
Version: 5.0.2.2
Build ID: 00m0(Build:2)
Locale: en-US (en_US.UTF-8)
Comment 7 Robert Großkopf 2016-03-05 08:14:59 UTC
I have tested a little more. The bug first appears in LO 4.0.0.3. Up to LO 3.6.7.2 the generated code shows only '0' for min-width, so effectively no min-width is defined at all.
It seems this feature was added in LO 4.0. I will mark this as "regression", although that may be wrong because the feature didn't exist before.
Comment 8 Buovjaga 2016-03-05 14:41:04 UTC
(In reply to robert from comment #7)
> Seem this feature has been added with LO 4.0. I will set this one as
> "regression". Could be it is wrong to set this as "regression", because the
> feature doesn't exist before.

Let's change to implementationError
Comment 9 QA Administrators 2017-03-06 15:41:13 UTC Comment hidden (obsolete)
Comment 10 Robert Großkopf 2017-03-08 17:54:26 UTC
Bug still exists with LO 5.3.1.1, OpenSUSE 42.1 Leap, 64bit rpm Linux.
Comment 11 QA Administrators 2018-06-27 02:48:17 UTC Comment hidden (obsolete)
Comment 12 Robert Großkopf 2018-06-27 14:13:49 UTC
Bug still exists with LO 6.0.5.2, OpenSUSE 42.3 Leap, 64bit rpm Linux.
Comment 13 QA Administrators 2019-06-28 02:59:25 UTC Comment hidden (obsolete)
Comment 14 Robert Großkopf 2019-06-28 17:58:55 UTC
Bug still exists in LO 6.2.5.2 on OpenSUSE 15 64bit rpm Linux
Comment 15 Gerrit Großkopf 2020-10-19 19:01:21 UTC
Changing "$listLabelWidth" to "translate($listLabelWidth,',','.')" in the file filter/source/xslt/odf2xhtml/export/xhtml/body.xsl at lines 2045 and 2107 fixes this for me. I am preparing a pull request at the moment...
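
For readers unfamiliar with XPath: translate(s, ',', '.') replaces every comma in s with a dot. The Python sketch below mimics that one call on a sample value. It only illustrates the string operation and is not the XSLT code itself; the helper name `xpath_translate` is hypothetical.

```python
def xpath_translate(value: str, from_chars: str, to_chars: str) -> str:
    """Mimic XPath 1.0 translate(): map each character of from_chars to the
    character at the same position in to_chars; characters of from_chars
    without a counterpart in to_chars are removed."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in mapping:
            continue  # the first occurrence of a character wins, as in XPath
        mapping[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return value.translate(mapping)

print(xpath_translate("0,635cm", ",", "."))  # 0.635cm
```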
Comment 16 Gerrit Großkopf 2020-10-19 19:43:57 UTC
I made a commit at https://gerrit.libreoffice.org/c/core/+/104544. This Gerrit setup is new to me; I usually only use git, but overall I like the interface. Good thing I don't have to install Gerrit: the last time I tried that, it deleted my username when I uninstalled it again xD
Comment 17 Commit Notification 2020-12-16 08:26:39 UTC
gerrit committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9804cb2f195451811eff8924d7997a3cbd6d679d

tdf#95662 Convert , to . for the min-width in the Lists only

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Xisco Faulí 2020-12-16 09:49:15 UTC
Hi Robert,
Before and after your commit I get 'min-width:NaNcm'. Do you know why?
Comment 19 Gerrit Großkopf 2020-12-16 10:21:21 UTC
(In reply to Xisco Faulí from comment #18)
> Hi Robert,
> Before and after your commit I get 'min-width:NaNcm', do you know why ?

Hi Xisco,
First of all, I am not my dad; I am Robert's son, Gerrit. Secondly, the NaN (not a number) is probably a bug coming from deeper in the code. This fix just scratched the surface, and I have not looked deeply into where the numbers come from. I think it would be best to put that in a new bug report and provide a minimal example file to better understand where the faulty number could be coming from.
Greetings,
Gerrit
Comment 20 Xisco Faulí 2020-12-16 10:40:07 UTC
(In reply to Gerrit Großkopf from comment #19)
> (In reply to Xisco Faulí from comment #18)
> > Hi Robert,
> > Before and after your commit I get 'min-width:NaNcm', do you know why ?
> 
> Hi Xisco,
> First of all, I am not my dad, I am Roberts son, Gerrit. Secondly, the NaN
> (not a number) is propably a bug coming from deeper in the code, this fix
> just scratched the surface, I have not looked so deeply into where the
> numbers come from. I think it would be best to put that in a new Bugreport
> and provide a minimal example file to better understand where the faulty
> numbe could be coming from.
> Greetings,
> Gerrit

Hi Gerrit,
Sorry for the confusion, it won't happen again.
Could you please share the info from the About LibreOffice dialog?

Version: 7.2.0.0.alpha0+
Build ID: 8b3982681e98818388c09233960ad6eaacee205a
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

It's weird that Buovjaga (comment 6) and I get min-width:NaNcm while you get min-width:0,635cm.
Comment 21 Olivier Hallot 2020-12-16 11:43:10 UTC
(In reply to Buovjaga from comment #6)
> I get
> min-width:NaNcm;
> 
> Setting to NEW.
> 
> Ubuntu 15.10 64-bit 
> Version: 5.0.2.2
> Build ID: 00m0(Build:2)
> Locale: en-US (en_US.UTF-8)

If it helps, I get NaNcm for every "border-?-width" property, e.g.

border-bottom-width:NaNcm;
border-left-width:NaNcm;
border-top-width:NaNcm;

Version: 7.0.3.1
Build ID: d7547858d014d4cf69878db179d326fc3483e082
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: kf5
Locale: pt-BR (pt_BR.UTF-8); Interface: pt-BR
Calc: threaded

(This may explain why I don't get a bottom border in some paragraphs)
Comment 22 Buovjaga 2020-12-16 19:37:30 UTC
For the record, here without the patch/commit, I do *not* get NaN (Not a Number) with Locale: en-US (en_US.UTF-8). I get min-width:0.635cm; as expected.

Xisco: as you are able to repro the NaN thing, you could debug the XSL values a bit. I will explain what I did.

In the file filter/source/xslt/odf2xhtml/export/xhtml/body.xsl find the variable block

<xsl:variable name="listLevelTextIndent">

After the closing tag of the variable, insert

<xsl:message>
    listLevelTextIndent <xsl:copy-of select="-$listLevelTextIndent"/>
</xsl:message>

Then run

make filter

Now, when you export the ODT bug document to XHTML, it will print your debug message to your console!!

Further, you can experiment and use your imagination variously with the debug elements to try to find the root cause :)
Comment 23 Xisco Faulí 2021-03-31 13:55:02 UTC
Dear Gerrit Großkopf,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 24 Robert Großkopf 2021-03-31 14:30:43 UTC
Only part of this report has been fixed. "margin-left:1,27cm;" also appears in the output. Something must be going wrong deeper in the code that sometimes inserts the German decimal separator. It looks as if the value is not recognized as a number …
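
One way a value can stop being recognized as a number in an XSLT 1.0 stylesheet is the string-to-number conversion: XPath 1.0 number() yields NaN for any string that is not a plain dot-decimal number. Whether that is what happens in this filter is not confirmed in this report. The sketch below imitates the conversion in Python; `xpath_number` is a hypothetical helper, not a LibreOffice function.

```python
import math
import re

# XPath 1.0 Number syntax: digits with an optional dot-decimal part,
# optionally preceded by a minus sign and surrounded by whitespace.
XPATH_NUMBER = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")

def xpath_number(text: str) -> float:
    """Imitate XPath 1.0 number() for strings: NaN unless the whole string
    is a plain dot-decimal number."""
    if XPATH_NUMBER.fullmatch(text):
        return float(text)
    return math.nan

print(xpath_number("0.635"))  # 0.635
print(xpath_number("0,635"))  # nan
```

Under this assumption, a length that reaches an arithmetic expression as "0,635" would come out as NaN, while one that is only copied as a string would keep its comma.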
Comment 25 Stéphane Guillou (stragu) 2021-05-24 13:58:31 UTC
I now get "min-width:0cm" when exporting with LO 7.2 alpha1:

<span style="display:block;float:left;min-width:0cm;">•</span>

Version info:

Version: 7.2.0.0.alpha1+ / LibreOffice Community
Build ID: 4a9eef7849a75ba91806886ea9c96d114c8d56f9
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-05-22_06:45:25
Calc: threaded
Comment 26 Robert Großkopf 2022-08-15 16:17:12 UTC
Seems this is getting worse.

At the moment my only workaround is to export with LO 7.1.5.2, open the result in a text editor and replace
:0, with :0.
:1, with :1.
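
This manual search-and-replace can be scripted. The following is a minimal sketch of the workaround, assuming the only affected values are CSS lengths in the exported file; the function name `fix_decimal_commas` and the list of units are hypothetical.

```python
import re

# A comma between digits, directly after a colon and before a CSS unit,
# e.g. ":0,635cm" or ":1,27cm". The unit list is an assumption.
CSS_DECIMAL_COMMA = re.compile(r"(:\s*-?\d+),(\d+(?:cm|mm|in|pt|pc|px|em|%))")

def fix_decimal_commas(xhtml: str) -> str:
    """Replace the decimal comma in CSS lengths with a dot."""
    return CSS_DECIMAL_COMMA.sub(r"\1.\2", xhtml)

print(fix_decimal_commas("min-width:0,635cm;margin-left:1,27cm;"))
# min-width:0.635cm;margin-left:1.27cm;
```

To apply it to a file, read the export as UTF-8 text, pass it through `fix_decimal_commas` and write the result to a new file, keeping the original export for comparison.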

With LO 7.2.5.2 and newer versions I can't get any gap between the bullet and the following text. The bullet disappears into the text. It seems we have to revert to the old behavior…

I have had this trouble every time I updated the German Base Handbuch, so I didn't test it before.
Comment 27 Buovjaga 2022-08-18 13:25:05 UTC
(In reply to Robert Großkopf from comment #26)
> Seems this is getting worse.
> 
> At this moment I could only export with LO 7.1.5.2, open a text editor and
> remove 
> :0, with :0.
> :1, with :1.
> 
> With LO 7.2.5.2 and newer versions I can't get a tab between bullet and
> following text. Bullet will disappear inside the text. Seems we have to
> revert to the old behavior…
> 
> Have had this trouble every time I have updated the German Base Handbuch, so
> I didn't test it before.

I am not seeing any visual difference with the bullet.

The change is seen in the style, where the min-width for the bullet span used to be 0.635cm and is now 0cm.

I bisected the change with the help of the command line conversion:

instdir/program/soffice --convert-to "html:XHTML Writer File:UTF8" --outdir ~/libobugs ~/libobugs/decimalseparator_xhtml.odt
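
For repeated runs, for example while bisecting, the conversion command above can be wrapped in a small script. This is a sketch: the helper name `build_convert_command` is hypothetical, the paths are the ones from the command above, and the call that would actually run soffice is left commented out.

```python
import subprocess
from pathlib import Path

def build_convert_command(soffice: str, source: Path, outdir: Path) -> list[str]:
    """Build the argument list for the XHTML export of one document."""
    return [
        soffice,
        "--convert-to", "html:XHTML Writer File:UTF8",
        "--outdir", str(outdir),
        str(source),
    ]

cmd = build_convert_command(
    "instdir/program/soffice",
    Path.home() / "libobugs" / "decimalseparator_xhtml.odt",
    Path.home() / "libobugs",
)
print(cmd)
# subprocess.run(cmd, check=True)  # uncomment to perform the conversion
```

Passing the arguments as a list avoids shell quoting problems with the filter name, which contains spaces.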

The changing commit is
https://git.libreoffice.org/core/commit/9d3b39cf9fed9a305ac23d1ecaaafc8f7caaeeb0
HTML XSLT: Missing paragraph BORDER of stand-alone border paragraph (@style:join-border problem)

However, I don't understand why the XHTML filter exports the bullet in such a silly way:

<span style="display:block;float:left;min-width:0cm;">•</span>

It manually inserts a bullet character! The HTML filter does not do this silliness.
Comment 28 Robert Großkopf 2022-08-18 14:17:57 UTC
(In reply to Buovjaga from comment #27)
> > 
> > At this moment I could only export with LO 7.1.5.2, open a text editor and
> > remove 
> > :0, with :0.
> > :1, with :1.
> > 
> > With LO 7.2.5.2 and newer versions I can't get a tab between bullet and
> > following text. Bullet will disappear inside the text. Seems we have to
> > revert to the old behavior…
> > 
> > Have had this trouble every time I have updated the German Base Handbuch, so
> > I didn't test it before.
> 
> I am not seeing any visual difference with the bullet.

Try the newly attached documents: the original *.odt and the exports from LO 7.1.5.2 and LO 7.4.0.2. Both exports look the same. Now replace
:0, with :0.
:1, with :1.
No ":0," will be found in the document exported with LO 7.4.0.2, so that export doesn't change, but the export from LO 7.1.5.2 will then look right.
Comment 29 Robert Großkopf 2022-08-18 14:18:35 UTC
Created attachment 181858 [details]
Original for export to xhtml
Comment 30 Robert Großkopf 2022-08-18 14:19:12 UTC
Created attachment 181860 [details]
Result export with LO 7.1.5.2
Comment 31 Robert Großkopf 2022-08-18 14:19:50 UTC
Created attachment 181861 [details]
Result export with LO 7.4.0.2
Comment 32 Robert Großkopf 2022-08-18 14:21:09 UTC
Created attachment 181862 [details]
After replacing :0, with :0. and :1, with :1. it looks well in LO 7.1.5.2 - but not in LO 7.4.0.2
Comment 33 Svante Schubert 2023-06-09 12:49:29 UTC
In Robert's ODT, all XML correctly uses a dot '.' and not a comma ','. The ',' is therefore added later, and there are several possible culprits.

I would like to provide some guidance to narrow down the possibilities!

1. I have checked the latest XSLT filter and could not find any ',' being added.
By running his input document locally without LibreOffice (LO), the XSLT stylesheet could be ruled out (for the latest sources).
We (Michael Stahl and I) have a small test environment for the OASIS ODF TC that runs the ODF2HTML XSLT on Linux; see https://github.com/oasis-tcs/odf-tc - especially the README at: https://github.com/oasis-tcs/odf-tc/#odf2html-transformation-not-yet-automated-regression-tests


2. Be aware that LibreOffice does not pass the ODT package to the XSLT stylesheets, but a single flat XML file.
To check that flat file, add an XSLT filter that uses the identity (ident) transformation: https://github.com/oasis-tcs/odf-tc/blob/master/src/test/resources/odf1.4/tools/ident.xsl
as an import/export filter, for instance with the fodt suffix; with it you can simply save to the fodt format :-)
NOTE: We added some QA guidance on how to set up a stand-alone LO installation, its configuration, and pretty-printing of ODF XML files to the ODF TC documentation:
https://github.com/oasis-tcs/odf-tc/#odf-editing-tool (just use the office you like)

3. Most likely the XSLT processor adds a localized decimal separator.
Those who have the problem might want to install the XSLT 2.0 extension, which uses the Saxon processor instead (see the documentation above as well). Do not forget to tick the XSLT 2.0 check-box in the XSLT filter configuration.


Last but not least, I started writing this XSLT filter in my junior days in the early 2000s, when HTML 4.0 was not fully implemented by browsers. When I look at the ODF of the test document, which uses https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#property-text_list-level-position-and-space-mode
I realize that nowadays CSS text-indent might be used: https://caniuse.com/?search=text-indent
A helpful and efficient approach is to open the browser's inspect mode (context menu on the selected HTML, or press F12) and alter the HTML/CSS to see whether this works.
I am considering a hack-fest on the HTML XSLT at our next LibreOffice Conference in Bucharest to fix the given test document :-)
Sorry, I have close to no spare pro-bono time ahead (already quite booked) ;-)
Comment 34 Svante Schubert 2023-06-09 15:00:49 UTC
I would appreciate it if someone could enhance the test document to include multiple list levels and all possible features with different numbers for testing:
https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#property-text_list-level-position-and-space-mode

Created an external issue that might be mapped later to bugzilla:
https://github.com/oasis-tcs/odf-tc/pull/50
Comment 35 Robert Großkopf 2023-06-10 06:29:57 UTC
Created attachment 187815 [details]
Test document with different bullets an different levels.

The result for the test document is unusable with LO 7.4.7 because min-width is set to '0'; see https://bugs.documentfoundation.org/attachment.cgi?id=181862&action=edit

With LO 7.1.5.2 you can repair the first level, but the other levels appear like the first level: they don't get the additional indent they have in the *.odt file.