Bug 143586 - Ruby (rubi / furigana) html tags not created when saving as html
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Ruby (X)HTML-Export
Reported: 2021-07-28 14:26 UTC by anish.mistry
Modified: 2025-10-03 01:47 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Docx with ruby text (4.73 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-07-28 14:26 UTC, anish.mistry
Current resulting html (1.04 KB, text/html)
2021-07-28 14:27 UTC, anish.mistry

Description anish.mistry 2021-07-28 14:26:42 UTC
Created attachment 173918
Docx with ruby text

Version: 7.1.5.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 12; OS: FreeBSD 13.0; UI render: default; VCL: qt5+cairo
Locale: en-US (en_US.ISO8859-1); UI: en-US
Calc: threaded

When saving a document (docx or odt) with ruby text as HTML, that text is NOT converted correctly in the resulting HTML. The HTML should have the text appropriately tagged with <ruby>, <rt> and <rp> tags. The resulting HTML currently just shows everything on the same line, since it's not tagged correctly.
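
For reference, the kind of markup the export would be expected to produce can be sketched as below. This is only a minimal Python illustration of the target HTML, not LibreOffice code: the helper name `ruby_markup` and the sample text are assumptions made up for this example.

```python
from html import escape


def ruby_markup(base: str, annotation: str) -> str:
    """Wrap base text and its ruby annotation in <ruby>/<rp>/<rt> tags.

    Hypothetical helper for illustration only; not part of LibreOffice.
    """
    return (
        "<ruby>"
        f"{escape(base)}"
        "<rp>(</rp>"  # fallback parenthesis for browsers without ruby support
        f"<rt>{escape(annotation)}</rt>"
        "<rp>)</rp>"
        "</ruby>"
    )


print(ruby_markup("漢字", "かんじ"))
```

The <rp> elements only matter for browsers that cannot render ruby; there the annotation falls back to appearing in parentheses after the base text.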
Comment 1 anish.mistry 2021-07-28 14:27:22 UTC
Created attachment 173919
Current resulting html
Comment 2 Ming Hua 2021-07-29 06:02:49 UTC
Reproduced with 7.2.0 RC1 and attachment 173918 in comment #0:
Version: 7.2.0.1 (x64) / LibreOffice Community
Build ID: 32efc3b7f3a71cfa6a7fa3f6c208333df48656cc
CPU threads: 2; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded

For me, all the ruby text seems to be dropped when saved as HTML, instead of "show everything on the same line".  Attachment 173919 in comment #1 also shows the ruby text missing rather than on the same line as the base text.
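
One way to tell the two outcomes apart for a given export is to scan the saved HTML for ruby tags and for the annotation text. The following is a rough Python sketch using only the standard library; the names `RubyProbe` and `classify_export` and the three labels are assumptions for illustration, and the annotation string has to be supplied by whoever knows what the source document contains.

```python
from html.parser import HTMLParser


class RubyProbe(HTMLParser):
    """Collect the ruby-related tags and all text found in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.ruby_tags: set[str] = set()
        self.text_parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in ("ruby", "rt", "rp"):
            self.ruby_tags.add(tag)

    def handle_data(self, data):
        self.text_parts.append(data)


def classify_export(html_text: str, annotation: str) -> str:
    """Return 'tagged', 'inline' or 'dropped' for one known annotation string.

    Hypothetical helper: 'tagged' means <ruby> and <rt> are present,
    'inline' means the annotation survives only as plain text,
    'dropped' means the annotation text is missing altogether.
    """
    probe = RubyProbe()
    probe.feed(html_text)
    probe.close()
    if "ruby" in probe.ruby_tags and "rt" in probe.ruby_tags:
        return "tagged"
    if annotation in "".join(probe.text_parts):
        return "inline"
    return "dropped"
```

For example, `classify_export("<p>漢字</p>", "かんじ")` returns "dropped", which is the behavior described in this comment.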
Comment 3 QA Administrators 2023-07-30 03:16:14 UTC Comment hidden (obsolete)
Comment 4 Urmas 2025-05-09 08:45:06 UTC
Still here.
Comment 5 AdacAlatiorn 2025-10-03 01:47:08 UTC
Came across this bug in June: https://ask.libreoffice.org/t/librewriter-html-exporter-discards-furigana-characters-instead-of-exporting-as-ruby/123787

Looks like this is a very old bug and, given the time that has passed, it seems unlikely to be fixed.

The good news is that this seems like such a simple thing.

The bad news is that the HTML module source is huge (37 KLOC!), seems dated (hasn't kept pace with HTML standards?), gives results quite different from what the HTML CSS specifies, and doesn't support low-hanging fruit like converting <span class="XYZ"> into a very Libre-friendly Character Style (yet classes on <p> mostly work).

I'll try fixing both the ruby and the character style issues with what is there now, hoping the changes are minor.

But for future-proofing, it would be better to support modern HTML, and that may also reduce the size and complexity of managing HTML by relying on modern HTML libraries to do the heavy lifting.