Bug 103664 - FILEOPEN: DOCX: Wingdings symbols are imported as rectangles
Summary: FILEOPEN: DOCX: Wingdings symbols are imported as rectangles
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 5.3.0.0.alpha1+ (earliest affected)
Hardware: All Linux (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard: target:5.3.0 target:5.2.4
Keywords:
Depends on:
Blocks: Font-Substitution
Reported: 2016-11-02 21:05 UTC by Tamás Zolnai
Modified: 2017-11-04 10:46 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Four wingdings symbols (11.18 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-11-02 21:05 UTC, Tamás Zolnai
5.3.0.0.alpha1+ (1.87 KB, image/png)
2016-11-02 21:22 UTC, Xisco Faulí
Expected display of symbols (first symbol is a blank symbol) (58.21 KB, image/png)
2016-11-02 21:37 UTC, Tamás Zolnai
with master > 20161102 (35.18 KB, image/png)
2016-11-03 05:08 UTC, V Stuart Foote

Description Tamás Zolnai 2016-11-02 21:05:25 UTC
Description:
When I open the test document with some Wingdings symbols, Writer shows only rectangles.
I see a similar issue here, which was fixed:
https://bugs.documentfoundation.org/show_bug.cgi?id=91594
So it might be a regression after that fix.

Steps to Reproduce:
1. Open attached DOCX file in Writer


Actual Results:  
Four rectangles are displayed in the first line.

Expected Results:
Wingdings symbols should be mapped to the symbol font that LO uses, and the symbols should be displayed similarly to the original ones.


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36
Comment 1 Tamás Zolnai 2016-11-02 21:05:56 UTC
Created attachment 128453
Four wingdings symbols
Comment 2 Xisco Faulí 2016-11-02 21:22:44 UTC
Created attachment 128454
5.3.0.0.alpha1+

What I see in 

Version: 5.3.0.0.alpha1+
Build ID: 1b0aa768f2c5da65074a6eacfed5f61a121fb13d
CPU Threads: 4; OS Version: Linux 4.2; UI Render: default; VCL: gtk3; Layout Engine: old; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group
Comment 3 Tamás Zolnai 2016-11-02 21:37:24 UTC
Created attachment 128455
Expected display of symbols (first symbol is a blank symbol)
Comment 4 Tamás Zolnai 2016-11-02 21:40:26 UTC
(In reply to Zolnai Tamás from comment #3)
> Created attachment 128455
> Expected display of symbols (first symbol is a blank symbol)

I hacked the code locally to see what it looks like when the Wingdings -> OpenSymbol mapping is done, but it would be good to know when the code became broken, in order to find a clean way to fix this problem.
Comment 5 V Stuart Foote 2016-11-03 05:08:43 UTC
Created attachment 128461
with master > 20161102

Resolved with new HarfBuzz layout engine and work on bug 71603

Version: 5.3.0.0.alpha1+
Build ID: 5d39c2013374727b1c8f147b8b99d54402a7ff02
CPU Threads: 8; OS Version: Windows 6.2; UI Render: GL; Layout Engine: new; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-02_01:01:09
Locale: en-US (en_US); Calc: CL
Comment 6 Tamás Zolnai 2016-11-03 19:17:26 UTC
(In reply to V Stuart Foote from comment #5)
> Created attachment 128461
> with master > 20161102
> 
> Resolved with new HarfBuzz layout engine and work on bug 71603
> 
> Version: 5.3.0.0.alpha1+
> Build ID: 5d39c2013374727b1c8f147b8b99d54402a7ff02
> CPU Threads: 8; OS Version: Windows 6.2; UI Render: GL; Layout Engine: new; 
> TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-02_01:01:09
> Locale: en-US (en_US); Calc: CL

I still see the problem on master. The bug I reported is a Linux-only problem. Bug 71603 seems to be a Windows-specific one, so it is not related.
Comment 7 V Stuart Foote 2016-11-03 23:44:39 UTC
@Khaled, * 

So, IIUC HarfBuzz [1][2] should now have no issue rendering a MS Symbol CMAP and layout glyphs from the PUA of each for the prevalent symbol fonts: Wingdings, Webdings, Wingdings2, Wingdings3

In fact with the fallback patched and HarfBuzz common layout enabled on Windows, I now see just the Unicode PUA mapping per glyph, the Symbol CMAP based values are no longer duplicated on the Special Character dialog. Doing an <alt>+X Unicode toggle has interesting results =).

Of course, the user still must have the font(s) installed, or the font must be embedded in the document (not the case for this sample). Otherwise they get an unknown glyph: whatever the fallback font happens to have at that PUA code point.

Is that what is happening on the Linux side? Does HarfBuzz or fontconfig handle this on the Linux builds?

As a sidebar, there is an alternative approach to provide support for some subset of these four fonts. It looks like a bit of a hack, but can we map the glyphs for each font from their PUA sequence to a corresponding Unicode 7.0+ equivalent? If done in a standard way, would it allow fallback replacement from an installed font when the MS symbol fonts are not present[3][4]?

I assume that to do that mapping we would need to dump the CMAP for each of the symbol fonts and statically assign a replacement Unicode 7.0 equivalent. But would it be worth doing?

=-ref-=
[1] https://github.com/behdad/harfbuzz/issues/236
[2] https://github.com/behdad/harfbuzz/commit/34f9aa582c3a03b578c7eae3d2e8860a0bd5cb00

and for any effort to map the symbol fonts to Unicode 7+
[3] http://unicode.org/~asmus/web-wing-ding-ext.pdf
[4] http://www.unicode.org/L2/L2011/11052r-wingding.pdf
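
The static mapping idea in this comment could be sketched roughly as follows. This is only an illustration in Python, not LibreOffice code: the table name `SYMBOL_TO_UNICODE`, the helper names, and the few table entries are assumptions made for the example, and a real table would have to be generated from the documents referenced above. The sketch assumes that symbol-encoded fonts expose character code `0xNN` at PUA code point `U+F0NN`.

```python
# Hypothetical per-font tables: symbol character code -> standard Unicode code point.
# The entries are illustrative only and should be verified against the
# Unicode documents referenced in this comment before any real use.
SYMBOL_TO_UNICODE = {
    "Wingdings": {
        0x22: 0x2702,  # scissors
        0x4A: 0x263A,  # smiling face
        0x4C: 0x2639,  # frowning face
    },
}

PUA_SYMBOL_BASE = 0xF000  # assumed offset used by symbol-encoded fonts


def map_symbol_char(font_name, ch):
    """Return a Unicode replacement for ch, or ch itself if no mapping is known."""
    table = SYMBOL_TO_UNICODE.get(font_name)
    if table is None:
        return ch
    cp = ord(ch)
    # Strip the PUA offset when the character sits in U+F020..U+F0FF.
    code = cp - PUA_SYMBOL_BASE if 0xF020 <= cp <= 0xF0FF else cp
    mapped = table.get(code)
    return chr(mapped) if mapped is not None else ch


def map_symbol_text(font_name, text):
    """Apply map_symbol_char to every character of text."""
    return "".join(map_symbol_char(font_name, ch) for ch in text)
```

Characters without a table entry are passed through unchanged, so an installed or embedded symbol font would still render them as before.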
Comment 8 ⁨خالد حسني⁩ 2016-11-04 01:39:46 UTC
The document uses Private Use Area characters, so without the exact fonts (Wingdings) installed you will get garbage. If I install the fonts on Linux, I get the expected symbols on both master and 5.2.

So as far as text layout is concerned, this is not a bug.

Whether Wingdings should be treated differently in case of font fallback and some magic mapping be applied or not is not my area of expertise.
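
To check whether a piece of text relies on Private Use Area characters as described here, a small Python sketch like the following could be used. The function names are made up for this example; the only library call is the standard `unicodedata.category`, which reports `Co` for private-use code points.

```python
import unicodedata


def is_private_use(ch):
    """True if ch is a Private Use code point (Unicode general category 'Co')."""
    return unicodedata.category(ch) == "Co"


def private_use_chars(text):
    """List the private-use code points in text, formatted as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]
```

Any code point reported by `private_use_chars` has no standard meaning, so its appearance depends entirely on the font that ends up rendering it.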
Comment 9 Yousuf Philips (jay) (retired) 2016-11-04 14:37:16 UTC
According to what Khaled said, I would also assume this isn't our bug, as fallback fonts don't always work, especially when symbol fonts are used.
Comment 10 V Stuart Foote 2016-11-04 16:19:02 UTC
@Zolnai, *

Khaled is out (comment 8), but were you thinking of maybe mapping the standard MS symbol fonts to their Unicode substitutes?  Actually not a bad idea to handle fallback of these symbol fonts if we can do it reliably.

We're now bundling Noto Emoji; while Symbola has pretty complete coverage through Unicode 9.0, or I guess we could extend the OpenSymbol coverage for our needs.  

Think the final glyphs and Unicode equivalents at 7.0 are here.

http://www.unicode.org/L2/L2012/12368-n4384.pdf

Allocation of the "Wingdings ID" notation for glyphs from each of the four symbol fonts is:
Webdings (w-0033..w-0255),
Wingdings (w-1033..w-1255),
Wingdings2 (w-2033..w-2255),
Wingdings3 (w-3033..w-3255).
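
A per-font table keyed by this notation would first need to split an ID into font and character code. The Python sketch below shows one way to do that; it assumes that the first digit selects the font as listed above and that the remaining three digits are the decimal character code (33 to 255). Both the function name and that reading of the notation are assumptions for illustration.

```python
import re

# Font selected by the first digit of the ID, following the list above.
FONT_BY_DIGIT = {
    "0": "Webdings",
    "1": "Wingdings",
    "2": "Wingdings2",
    "3": "Wingdings3",
}

_ID_PATTERN = re.compile(r"^w-([0-3])(\d{3})$")


def parse_wingdings_id(glyph_id):
    """Split an ID such as 'w-1074' into (font name, decimal character code)."""
    match = _ID_PATTERN.match(glyph_id)
    if match is None:
        raise ValueError(f"not a Wingdings ID: {glyph_id!r}")
    code = int(match.group(2))
    if not 33 <= code <= 255:
        raise ValueError(f"character code out of range: {code}")
    return FONT_BY_DIGIT[match.group(1)], code
```

For example, `parse_wingdings_id("w-1074")` yields the Wingdings font with character code 74 (0x4A).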

Imagine we would have to provide a way to toggle a user pref between using the symbol font if actually installed on system (or if embedded in document), or of choosing to use the Unicode mapping and fallback. We would need to handle each symbol font in its own table.

@Mike K., any opinion?
Comment 11 Commit Notification 2016-11-05 10:39:04 UTC
Tamás Zolnai committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5ef66db91e87ef84724be22977acf4c9c472ad6b

tdf#103664: FILEOPEN: DOCX: Wingdings symbols are imported as rectangles

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2016-11-05 20:30:44 UTC
Tamás Zolnai committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=99f2663d8752db9779b72215d79597f8538e061f&h=libreoffice-5-2

tdf#103664: FILEOPEN: DOCX: Wingdings symbols are imported as rectangles

It will be available in 5.2.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 ⁨خالد حسني⁩ 2016-11-16 18:40:00 UTC
Is this really fixed? I still see rectangles here.
Comment 14 Tamás Zolnai 2016-11-16 18:49:16 UTC
(In reply to Khaled Hosny from comment #13)
> Is this really fixed? I still see rectangles here.

With which test document?
The test document I attached contains Wingdings symbols inserted as special characters in Word. My fix affects only this kind of symbol, not cases where, for example, Wingdings symbols are typed as plain text in the Wingdings font or where bullets use Wingdings symbols. So maybe we should rename this bug to cover this specific case.
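
To tell which case a given document falls into, one could look for symbol elements in its main XML part. The Python sketch below assumes that Word stores symbols inserted as special characters as `w:sym` elements with `w:font` and `w:char` attributes, as defined in the WordprocessingML schema; it does not describe what the fix itself does, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_symbols(document_xml):
    """Return (font, code point) pairs for every w:sym element in the XML string."""
    root = ET.fromstring(document_xml)
    symbols = []
    for sym in root.iter(f"{{{W_NS}}}sym"):
        font = sym.get(f"{{{W_NS}}}font")
        char = sym.get(f"{{{W_NS}}}char")
        if char is not None:
            # w:char holds the character code as hexadecimal digits.
            symbols.append((font, int(char, 16)))
    return symbols
```

Text typed directly in the Wingdings font would not show up in this list, since it is stored as ordinary run text rather than as symbol elements.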
Comment 15 ⁨خالد حسني⁩ 2016-11-16 19:43:09 UTC
I’m checking with the attached document.
Comment 16 Tamás Zolnai 2016-11-17 18:28:32 UTC
(In reply to Khaled Hosny from comment #15)
> I’m checking with the attached document.

Hmm, well, on my computer it works, but feel free to reopen the bug. Some more information about your configuration would be useful.