Bug 93716 - FILESAVE (HTML): Weak-directionality characters (e.g. hyphen, comma) excluded from RTL-direction spans
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 5.0.1.2 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on: 63927
Blocks: (X)HTML-Export RTL
Reported: 2015-08-27 15:31 UTC by Lior Kaplan
Modified: 2024-08-03 09:15 UTC

See Also:
Crash report or crash signature:


Attachments
testdoc, same as in #63927 (11.48 KB, application/vnd.oasis.opendocument.text)
2015-08-27 15:31 UTC, Lior Kaplan
Output of file export (3.06 KB, text/html)
2015-08-27 15:32 UTC, Lior Kaplan
Output of save as HTML (2.46 KB, text/html)
2015-08-27 15:35 UTC, Lior Kaplan
Output of File > Export > XHTML with Writer 24.2.4.2 (2.98 KB, text/html)
2024-07-21 16:12 UTC, Eyal Rozenberg
Output of File > Save As > HTML - with Writer 24.2.4.2 (2.46 KB, text/html)
2024-07-21 16:12 UTC, Eyal Rozenberg

Description Lior Kaplan 2015-08-27 15:31:47 UTC
Created attachment 118223
testdoc, same as in #63927

This is a follow-up to bug #63927: the problem still occurs when using Save As, although it is fixed for Export.
Comment 1 Lior Kaplan 2015-08-27 15:32:42 UTC
Created attachment 118224
Output of file export
Comment 2 Lior Kaplan 2015-08-27 15:35:02 UTC
Created attachment 118225
Output of save as HTML

Attached is the output of Export (with LibO 5.0.1), for comparison with the output of Save As HTML. The problem described in bug #63927 still occurs with Save As HTML.
Comment 3 QA Administrators 2016-09-20 10:29:06 UTC Comment hidden (obsolete)
Comment 4 Xisco Faulí 2017-06-08 09:05:48 UTC
You can't confirm your own bugs.
Could you please try to reproduce it with the latest version of LibreOffice
from https://www.libreoffice.org/download/libreoffice-fresh/ ?
I have set the bug's status to 'NEEDINFO'. Please change it back to
'UNCONFIRMED' if the bug is still present in the latest version.
Comment 5 Omer Zak 2017-11-09 10:50:05 UTC
Still happens in:

Version: 6.0.0.0.alpha1+
Build ID: 6070dec9ca9a15587a2aece81f9ae1ab5ac0f8c4
CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.utf8); Calc: group
(Build from 2017-Nov-05 00:00)

OS: Debian 64bit Stretch (Debian 9.2, with some backported packages)



Instructions:
After loading the sample file, perform:

1. File > Export...
  And select the format as XHTML (.html; .xhtml)

and

2. File > Save as...
  And select the format as HTML Document (Writer) (.html)

3. Compare the exported files' contents.
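
One rough way to do the comparison in step 3 is to count the <span> elements in each file, grouped by their lang attribute. The following is only a sketch using Python's standard html.parser module; the function names count_lang_spans and report are made up for illustration, and reading the files as UTF-8 is an assumption about how they were saved.

```python
from collections import Counter
from html.parser import HTMLParser


class LangSpanCounter(HTMLParser):
    """Count <span> start tags, grouped by their lang attribute."""

    def __init__(self):
        super().__init__()
        self.langs = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            # attrs is a list of (name, value) pairs; lang may be absent (None).
            self.langs[dict(attrs).get("lang")] += 1


def count_lang_spans(html_text):
    """Return a Counter mapping each lang value (or None) to its span count."""
    parser = LangSpanCounter()
    parser.feed(html_text)
    parser.close()
    return parser.langs


def report(paths):
    """Print the per-language span counts for each file (UTF-8 assumed)."""
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            print(path, dict(count_lang_spans(f.read())))
```

Calling report() with the paths of the two output files should make the difference visible: a paragraph split at every hyphen or comma shows up as many alternating en-US and he-IL spans.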
Comment 6 QA Administrators 2018-05-30 16:40:15 UTC Comment hidden (obsolete)
Comment 7 QA Administrators 2018-07-03 14:20:08 UTC Comment hidden (obsolete)
Comment 8 Lior Kaplan 2018-09-30 15:43:49 UTC
Still happens in:

Version: 6.1.1.2
Build ID: 1:6.1.1-1
CPU threads: 8; OS: Linux 4.16; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); Calc: group threaded
Comment 9 QA Administrators 2019-10-01 03:02:23 UTC Comment hidden (obsolete)
Comment 10 QA Administrators 2021-10-01 03:51:39 UTC Comment hidden (obsolete)
Comment 11 Eyal Rozenberg 2023-02-10 14:49:00 UTC
Bug still manifests with:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: ad387d5b984c6666906505d25685065f710ed55d
CPU threads: 4; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: fa-IR (en_IL); UI: en-US

With export, we get something like:

<span> "בית משפט" - לרבות בית דין לעבודה, בית דין דתי, ראש הוצאה לפועל לפי חוק ההוצאה לפועל, תשכ"ז-1967 (להלן - חוק ההוצאה לפועל), ולמעט בית דין צבאי כמשמעותו בחוק השיפוט הצבאי, תשט"ו– 1955; </span>

and with save, we get:

<span lang="en-US">	&quot;</span></font><span lang="he-IL">בית משפט</span><font face="Liberation Serif, serif"><span lang="en-US">&quot;
- </span></font><span lang="he-IL">לרבות בית דין לעבודה</span><font face="Liberation Serif, serif"><span lang="en-US">,
</span></font><span lang="he-IL">בית דין דתי</span><font face="Liberation Serif, serif"><span lang="en-US">,
</span></font><span lang="he-IL">ראש הוצאה לפועל לפי
חוק ההוצאה לפועל</span><font face="Liberation Serif, serif"><span lang="en-US">,
</span></font><span lang="he-IL">תשכ</span><font face="Liberation Serif, serif"><span lang="en-US">&quot;</span></font><span lang="he-IL">ז</span><font face="Liberation Serif, serif"><span lang="en-US">-1967
(</span></font><span lang="he-IL">להלן </span><font face="Liberation Serif, serif"><span lang="en-US">-
</span></font><span lang="he-IL">חוק ההוצאה לפועל</span><font face="Liberation Serif, serif"><span lang="en-US">),
</span></font><span lang="he-IL">ולמעט בית דין צבאי
כמשמעותו בחוק השיפוט הצבאי</span><font face="Liberation Serif, serif"><span lang="en-US">,
</span></font><span lang="he-IL">תשט</span><font face="Liberation Serif, serif"><span lang="en-US">&quot;</span></font><span lang="he-IL">ו–
</span><font face="Liberation Serif, serif"><span lang="en-US">1955; </span></font>

I wonder, though, if the problem is with the process of producing the output, or with the internal representation.
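
For reference, the characters that end up in the en-US spans above can be checked against their Unicode bidirectional classes. This is only a sketch using Python's unicodedata module; the grouping into strong, weak and neutral follows the Unicode Bidirectional Algorithm (UAX #9), and the helper name bidi_category is made up for illustration. It says nothing about how LibreOffice itself classifies these characters.

```python
import unicodedata

# Bidi class groups as defined by UAX #9.
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}


def bidi_category(ch):
    """Return (bidi class, group) for a single character."""
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG:
        group = "strong"
    elif cls in WEAK:
        group = "weak"
    elif cls in NEUTRAL:
        group = "neutral"
    else:
        group = "other"  # explicit formatting characters, unassigned, etc.
    return cls, group


if __name__ == "__main__":
    # Characters taken from the paragraph quoted above.
    for ch in ['"', "-", ",", "(", ";", "1", "\u2013", "ב"]:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), *bidi_category(ch))
```

None of the punctuation in the en-US spans has a strong direction: the hyphen-minus and comma are weak, and the quotation mark, parentheses and semicolon are neutral. In the quoted output, the en dash (also neutral) stays inside a he-IL span, while the hyphen-minus does not.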
Comment 12 Eyal Rozenberg 2024-07-21 16:12:08 UTC
Created attachment 195432
Output of File > Export > XHTML with Writer 24.2.4.2

Updating the outputs with a recent LO version (1 of 2)
Comment 13 Eyal Rozenberg 2024-07-21 16:12:54 UTC
Created attachment 195433
Output of File > Save As > HTML - with Writer 24.2.4.2

Updating the outputs with a recent LO version (2 of 2)
Comment 14 Eyal Rozenberg 2024-07-21 16:17:55 UTC
Actually, why do we even get those spans in the first place? LO itself shows the entire paragraph as Hebrew, there are no font changes within the paragraph, and there is no direct formatting (DF) of the typeface: it is all "Nachlieli CLM". So why do we need this markup?

&quot;
<font face="Nachlieli CLM">
<span lang="he-IL">בית משפט</span>
</font>
&quot;- 
<font face="Nachlieli CLM">
<span lang="he-IL">לרבות בית דין לעבודה</span>
</font>
, 


and so on and so on?
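
To illustrate the expectation, here is a minimal sketch of a run splitter in which weak and neutral characters inherit the direction of the nearest preceding strong character. It is not LibreOffice's code and not the full Unicode Bidirectional Algorithm; the function names and the default direction are assumptions made for the example. It also deals with direction only, not with the lang attribute seen in the output.

```python
import unicodedata


def strong_direction(ch):
    """Return 'rtl' or 'ltr' for strong characters, None for weak/neutral ones."""
    cls = unicodedata.bidirectional(ch)
    if cls in ("R", "AL"):
        return "rtl"
    if cls == "L":
        return "ltr"
    return None


def split_by_strong_neighbours(text, default="rtl"):
    """Split text into (direction, substring) runs.

    Weak and neutral characters take the direction of the previous strong
    character; leading ones take the first strong direction found in the
    text, or `default` if the text has no strong characters at all.
    """
    current = next((d for d in map(strong_direction, text) if d), default)
    runs = []
    for ch in text:
        current = strong_direction(ch) or current
        if runs and runs[-1][0] == current:
            # Same direction as the run in progress: extend it.
            runs[-1] = (current, runs[-1][1] + ch)
        else:
            runs.append((current, ch))
    return runs


if __name__ == "__main__":
    sample = '"בית משפט" - לרבות בית דין לעבודה, בית דין דתי'
    print(split_by_strong_neighbours(sample))
```

With such a rule, the start of the paragraph quoted above comes out as a single RTL run, with quotes, hyphen and commas included, rather than a sequence of alternating fragments.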