Bug 132770 - Underline text using INS tag from HTML document do not appear
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.4.3.2 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: HTML-Import
Reported: 2020-05-06 13:15 UTC by Konstantin Kharlamov
Modified: 2023-05-11 14:54 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
maybe fix (2.74 KB, patch), 2020-05-10 14:45 UTC, Konstantin Kharlamov
HTML document to import (884 bytes, text/html), 2021-05-19 13:35 UTC, Stéphane Guillou (stragu)

Description Konstantin Kharlamov 2020-05-06 13:15:50 UTC
# Steps to reproduce (in terms of terminal commands)

    $ cat test.html
    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
    <head>
      <meta charset="utf-8" />
      <meta name="generator" content="pandoc" />
      <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
      <title>test2</title>
      <style>
        code{white-space: pre-wrap;}
        span.smallcaps{font-variant: small-caps;}
        span.underline{text-decoration: underline;}
        div.column{display: inline-block; vertical-align: top; width: 50%;}
        div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
        ul.task-list{list-style: none;}
      </style>
      <!--[if lt IE 9]>
        <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
      <![endif]-->
    </head>
    <body>
    <ins>
    Test
    </ins>
    </body>
    </html>
    $ lowriter test.html


## Expected

The word "Test" shown in lowriter is underlined (the same as what you see when you open test.html in a browser)

## Actual

"Test" is not underlined
Comment 1 Konstantin Kharlamov 2020-05-07 23:13:22 UTC
I have debugged this for a bit, so here is an update: it looks like LO Writer does have support for the <ins> tag. So what happens here seems to be an actual bug (I mean, missing support for a basic tag would be one too, but that could have been fixed much more easily).

So far I have figured out that the difference between parsing "italic" and parsing "underline" lies in the content of `pCFormat`, which is assigned as:

    SwCharFormat* pCFormat = m_pCSS1Parser->GetChrFormat( nToken, aClass );

Its field `pCFormat->m_aSet` has 3 members set in the case of italic:

        11 = {
          <SfxEnumItem<FontItalic>> = {
            <SfxEnumItemInterface> = {
              <SfxPoolItem> = {
                _vptr.SfxPoolItem = 0x7ff263673c70 <vtable for SvxPostureItem+16>,
                m_nRefCount = 2,
                m_nWhich = 11,
                m_nKind = SfxItemKind::NONE
              }, <No data fields>},
            members of SfxEnumItem<FontItalic>:
            m_nValue = ITALIC_NORMAL
          }, <No data fields>},
        …
        25 = {
          <SfxEnumItem<FontItalic>> = {
            <SfxEnumItemInterface> = {
              <SfxPoolItem> = {
                _vptr.SfxPoolItem = 0x7ff263673c70 <vtable for SvxPostureItem+16>,
                m_nRefCount = 2,
                m_nWhich = 25,
                m_nKind = SfxItemKind::NONE
              }, <No data fields>},
            members of SfxEnumItem<FontItalic>:
            m_nValue = ITALIC_NORMAL
          }, <No data fields>},
        26 = 0x0,
        …
        29 = 0x0,
        30 = {
          <SfxEnumItem<FontItalic>> = {
            <SfxEnumItemInterface> = {
              <SfxPoolItem> = {
                _vptr.SfxPoolItem = 0x7ff263673c70 <vtable for SvxPostureItem+16>,
                m_nRefCount = 2,
                m_nWhich = 30,
                m_nKind = SfxItemKind::NONE
              }, <No data fields>},
            members of SfxEnumItem<FontItalic>:
            m_nValue = ITALIC_NORMAL
          }, <No data fields>},

But in the case of underline it is all zeroes. I'm not sure yet what to make of it.

What doesn't help is that gdb nearly explodes while loading the office. It quickly grew to 2.2GB and hung for 5 minutes while printing the field, so I had to SIGKILL it in the end. I wonder whether this is caused by LibreOffice's pretty-printers or the like. Does gdb load them by default? I have to check that; I've never seen gdb behave like this before.
Comment 2 Konstantin Kharlamov 2020-05-10 12:24:45 UTC
Okay, so a couple of updates:

1. The gdb taking GBs of memory turned out to be a bug in gdb https://sourceware.org/bugzilla/show_bug.cgi?id=25965
2. You can make it work with one of the following changes:
    * replace `ins` with `u` tag
    * add a style for the `ins` tag to the HTML file (or to a separate CSS file), as follows:

          <style>
            ins {
                text-decoration: underline;
            }
          </style>

What basically happens is that the HTML parser queries the known styles by executing `pCFormat = m_pDoc->FindCharFormatByName( "ins" )`. It does not find one, so it creates one with default text properties.

I'll try to see whether there is an obvious solution, but I suspect the solution is to create a default "underlined" style for the `ins` tag and use it *only* when there is no override in CSS. Since I don't know the code or where to look for examples, I may have trouble implementing that.
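
To make the idea above concrete, here is a minimal, standalone C++17 sketch. It is not LibreOffice code and not the attached patch: `CharStyle`, `StyleTable`, `builtinDefault`, `lookupCurrent` and `lookupProposed` are hypothetical names that only model the lookup described here (find a style by tag name, otherwise create one), first as it appears to behave now and then with a built-in default that applies only when CSS has not already defined the style.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a character style (NOT LibreOffice's SwCharFormat).
struct CharStyle {
    std::optional<bool> underline;  // empty optional = property not set
};

// Hypothetical table of known styles, keyed by tag name.
// In this model, styles parsed from CSS are already stored here.
using StyleTable = std::map<std::string, CharStyle>;

// Assumed built-in defaults per tag; only `ins` is listed in this sketch.
inline std::optional<CharStyle> builtinDefault(const std::string& tag) {
    if (tag == "ins")
        return CharStyle{true};
    return std::nullopt;
}

// Models the behaviour described above: on a miss, an empty style is
// created, so `ins` ends up without underline.
inline CharStyle lookupCurrent(StyleTable& styles, const std::string& tag) {
    auto it = styles.find(tag);
    if (it != styles.end())
        return it->second;
    return styles.emplace(tag, CharStyle{}).first->second;
}

// Models the suggested behaviour: a style that already exists (for example
// from CSS) wins; only on a miss is the built-in default used.
inline CharStyle lookupProposed(StyleTable& styles, const std::string& tag) {
    auto it = styles.find(tag);
    if (it != styles.end())
        return it->second;
    CharStyle created = builtinDefault(tag).value_or(CharStyle{});
    return styles.emplace(tag, created).first->second;
}
```

With an empty table, `lookupCurrent(styles, "ins")` yields a style with no underline set, while `lookupProposed(styles, "ins")` yields an underlined one. If the table already holds an `ins` entry that disables underline, `lookupProposed` returns that entry unchanged, which is the "only when there is no override" part.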
Comment 3 Konstantin Kharlamov 2020-05-10 14:45:25 UTC
Created attachment 160603 [details]
maybe fix

Okay, so the attached patch may or may not help; I haven't been able to test it. If it works, it may cause CSS to stop working on INS; see https://bugs.documentfoundation.org/show_bug.cgi?id=132914 for details. That is a separate bug, so while at it, someone might want to check that out too.

Unfortunately I have to stop working on this for technical reasons, so if anybody wants to pick it up, go for it. I figured out that my configuration of 8GB RAM + HDD is too slow for developing LibreOffice. Attempts to debug, and especially to rebuild, LO push applications into swap and evict the FS cache. So most of the time I'm sitting in front of my laptop doing nothing and waiting for the freezes to go away. I think I'll get more productive results working on something else.
Comment 4 Xisco Faulí 2020-05-11 10:46:42 UTC
Hello Konstantin,
Could you please submit the patch to gerrit for review ? < https://wiki.documentfoundation.org/Development/gerrit >
Assigning it to you
Comment 5 Konstantin Kharlamov 2020-05-11 12:02:14 UTC
(In reply to Xisco Faulí from comment #4)
> Hello Konstantin,
> Could you please submit the patch to gerrit for review ? <
> https://wiki.documentfoundation.org/Development/gerrit >
> Assigning it to you

Thank you, but unfortunately as I mentioned in my previous comment I haven't been able to test it and I can't continue working on this for technical reasons. Though maybe I'll give it another try later. Perhaps if I reduce parallel builds down to, say, just one, it shouldn't take as much RAM.
Comment 6 Konstantin Kharlamov 2020-05-11 12:10:30 UTC
(In reply to Konstantin Kharlamov from comment #5)
> Thank you, but unfortunately as I mentioned in my previous comment I haven't
> been able to test it and I can't continue working on this for technical
> reasons. Though maybe I'll give it another try later. Perhaps if I reduce
> parallel builds down to, say, just one, it shouldn't take as much RAM.

Okay, I am sorry, but I'm unassigning myself. Perhaps I'll give it a try when I have an SSD. But right now, on an HDD, even with `make -j1` my whole system just locks up while building.
Comment 7 Konstantin Kharlamov 2020-05-11 14:14:02 UTC
(In reply to Konstantin Kharlamov from comment #6)
> Okay, I am sorry, but I'm unassigning myself. Perhaps I'll give it a try
> when I have an SSD. But right now, on an HDD, even with `make -j1` my whole
> system just locks up while building.

For the record: I made another attempt, probably the last one. I managed to overcome the slowness of the HDD by running the build as follows:

    ionice -c3 make -j1

This indeed helps with IO pressure. But later on the build (even with one thread) takes up more than 4GB of RAM, evicts the FS cache from RAM, pushes applications onto the swap partition, and everything locks up. Since rebuilding is supposed to happen pretty often while developing, I don't think it makes sense to tolerate that. Suggestions are welcome, though.
Comment 8 Stéphane Guillou (stragu) 2021-05-19 13:35:31 UTC
Created attachment 172175 [details]
HTML document to import

Including an example file (which contains the exact HTML code in the bug description).

Reproducible in LO 7.2 alpha1+: text is not underlined.

Version: 7.2.0.0.alpha1+ / LibreOffice Community
Build ID: b1c0734ffe0f395757b6e0cea7830d820231afeb
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-05-18_03:16:20
Calc: threaded