Bug 157073 - Simple HTML elements for FORMATTING (<SUP>) are ignored on Paste
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.1.0.4 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: HTML-Paste
Reported: 2023-09-03 17:55 UTC by Dom [:ator]
Modified: 2023-09-09 05:20 UTC

See Also:
Crash report or crash signature:


Description Dom [:ator] 2023-09-03 17:55:28 UTC
Users should be able to copy HTML straight from web documents and, when pasting into Calc, have formatting preserved as far as reasonably possible. For example, <SUP>, which formats text as superscript, is a very common and recognisable tag. Writer honours it on paste; Calc does not when the HTML comes from a browser.

I am using the command `xclip -o -selection clipboard -t text/html` to view the actual contents of the clipboard, using the text/html media type.

Strangely, <SUP> is respected when copying from Writer into Calc, but not when copying from a standard browser such as Firefox. Even more confusing, the HTML that Calc puts on the clipboard contains no <SUP> at all, yet pasting it back into Calc or Writer still produces superscript: *invisible superscript markings are respected*. I don't see how this is possible from the text/html data alone, so I assume either the clipboard holds something else that this xclip invocation does not extract, or the LibreOffice runtime shares clipboard data internally.
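
One way to probe the hidden-data hypothesis would be to list every target the clipboard owner offers, not only text/html. The following is a minimal sketch, assuming xclip is installed and an X session is running; the helper names (`xclip_command`, `read_clipboard`, `list_targets`) are hypothetical. `TARGETS` is the standard X11 selection target that returns the list of offered formats.

```python
import subprocess


def xclip_command(target="text/html", selection="clipboard"):
    """Build the argument list for the xclip call used in this report."""
    return ["xclip", "-o", "-selection", selection, "-t", target]


def read_clipboard(target="text/html"):
    """Return the clipboard contents for one target as text.

    Requires xclip and a running X session; raises CalledProcessError
    if the clipboard owner does not offer the requested target.
    """
    result = subprocess.run(xclip_command(target), capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def list_targets():
    """Return the names of all formats the current clipboard owner offers."""
    return read_clipboard("TARGETS").split()
```

Calling `list_targets()` right after copying from Calc, and then `read_clipboard(name)` for any target other than text/html, would show whether the superscript information travels in another format.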

# Results

The ^ notation is used to denote superscript.

## Copying a^2 from Firefox

xclip:

> <meta http-equiv="content-type" content="text/html; charset=utf-8">a<sup>2</sup>

Calc: a2
Writer: a^2
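
To make the expected mapping concrete, here is a minimal sketch that turns an HTML fragment into the ^ notation used in this report. It uses only Python's standard `html.parser`; the class and function names are hypothetical, and this is not how LibreOffice's import filter works, only an illustration of the result Calc is expected to produce.

```python
from html.parser import HTMLParser


class CaretRenderer(HTMLParser):
    """Render text, prefixing each run found inside <sup> with '^'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.sup_depth = 0  # > 0 while inside one or more <sup> elements
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names, so <SUP> and <sup> both match.
        if tag == "sup":
            self.sup_depth += 1

    def handle_endtag(self, tag):
        if tag == "sup" and self.sup_depth > 0:
            self.sup_depth -= 1

    def handle_data(self, data):
        self.parts.append("^" + data if self.sup_depth else data)


def to_caret(html):
    renderer = CaretRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)


firefox = '<meta http-equiv="content-type" content="text/html; charset=utf-8">a<sup>2</sup>'
print(to_caret(firefox))  # prints "a^2"
```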

## Copying a cell containing a^2 from Calc

xclip:

> <!DOCTYPE html>
> <html><head>
>  <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
>  <title></title>
>  <meta name="generator" content="LibreOffice 7.6.0.3 (Linux)"/>
>  <style type="text/css">
>   body,div,table,thead,tbody,tfoot,tr,th,td,p { font-family:"Liberation Sans"; font-size:x-small }
>   a.comment-indicator:hover + comment { background:#ffd; position:absolute; display:block; border:1px solid black; padding:0.5em;  } 
>   a.comment-indicator { background:red; display:inline-block; border:1px solid black; width:0.5em; height:0.5em;  } 
>   comment { display:none;  } 
>  </style>
> </head><body>
>  <table cellspacing="0" border="0">
>   <colgroup width="85"></colgroup>
>    <tr>
>     <td height="17" align="left">a2</td>
>    </tr>
>  </table>
> </body></html>

Calc: a^2
Writer: a^2 [although 2 is rendered in the "Lohit Devanagari" typeface]

## Copying a^2 from Writer

xclip:

> <!DOCTYPE html>
> <html>
> <head>
>  <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
>  <title></title>
>  <meta name="generator" content="LibreOffice 7.6.0.3 (Linux)"/>
>  <style type="text/css">
>   @page { size: 21cm 29.7cm; margin: 2cm }
>   p { line-height: 115%; margin-bottom: 0.25cm; background: transparent }
>  </style>
> </head>
> <body lang="en-GB" link="#000080" vlink="#800000" dir="ltr">
>  <p>a<sup>2</sup></p>
> </body></html>

Calc: a^2 [although the typeface is now blank]
Writer: a^2
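
The three payloads can also be compared mechanically. The sketch below counts `<sup>` elements in abbreviated copies of the fragments quoted above (only the lines that carry the text are kept); the names `SupCounter`, `count_sup` and `PAYLOADS` are hypothetical.

```python
from html.parser import HTMLParser


class SupCounter(HTMLParser):
    """Count opening <sup> tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "sup":
            self.count += 1


def count_sup(html):
    counter = SupCounter()
    counter.feed(html)
    counter.close()
    return counter.count


# Abbreviated from the xclip output in this report.
PAYLOADS = {
    "Firefox": '<meta http-equiv="content-type" content="text/html; charset=utf-8">a<sup>2</sup>',
    "Calc": '<td height="17" align="left">a2</td>',
    "Writer": "<p>a<sup>2</sup></p>",
}

summary = {source: count_sup(html) for source, html in PAYLOADS.items()}
print(summary)
```

Only the Calc payload lacks the tag, which is consistent with the observation that its superscript must reach the paste target by some route other than text/html.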

# Expected results

At the very least, <SUP> should be respected in Calc. Characters contained within <SUP> tags should *always* be rendered in superscript, regardless of the particulars of the rest of the clipboard.
Comment 1 raal 2023-09-09 05:20:11 UTC
Test page: https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_sup

Reproducible with Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: d8dbf35c48698e49c527d740853ce4edc4f1afa9
CPU threads: 4; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded

Also reproducible with Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)