Bug 85006 - Linux-rpm_deb-x86@45-TDF-dbg asserts on mismatching font names
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 4.4.0.0.alpha0+ Master
Hardware: Other All
Importance: medium critical
Assignee: Caolán McNamara
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Fonts
Reported: 2014-10-14 15:18 UTC by Yousuf Philips (jay) (retired)
Modified: 2017-10-22 20:03 UTC (History)

See Also:
Crash report or crash signature:


Attachments
gdb backtrace (14.26 KB, text/plain)
2014-10-14 18:41 UTC, Yousuf Philips (jay) (retired)
output from fc-list (19.95 KB, text/plain)
2014-10-16 12:01 UTC, Yousuf Philips (jay) (retired)
output from fc-cache (3.74 KB, text/plain)
2014-10-16 12:06 UTC, Yousuf Philips (jay) (retired)

Description Yousuf Philips (jay) (retired) 2014-10-14 15:18:39 UTC
I've tried master~2014-09-30_23.32.31, master_dbg~2014-10-13_05.16.52 and master_dbg~2014-10-14_00.05.30 and they all crash the same with the following errors.

soffice.bin: /home/buildslave/source/libo-core/vcl/source/font/PhysicalFontFamily.cxx:297: void PhysicalFontFamily::UpdateCloneFontList(PhysicalFontCollection&, bool, bool) const: Assertion `pClonedFace->GetFamilyName().replaceAll("-", "").trim() == GetFamilyName().replaceAll("-", "").trim()' failed.
Application Error
Comment 1 Yousuf Philips (jay) (retired) 2014-10-14 15:25:35 UTC
This is a regression as the last build i downloaded in july worked fine. :)
Comment 2 Caolán McNamara 2014-10-14 15:33:00 UTC
it'll be an issue with a specific pair of fonts, and we'd need to know what those font names are. if you run it under gdb and then

print GetFamilyName() and print pClonedFace->GetFamilyName() when it crashes that should hopefully work.
Comment 3 Yousuf Philips (jay) (retired) 2014-10-14 17:13:56 UTC
Ran it under gdb and exited to the console, so executing print GetFamilyName() and print pClonedFace->GetFamilyName() simply resulted in "bash: syntax error near unexpected token `('".
Comment 4 Michael Meeks 2014-10-14 17:38:46 UTC
Jay: sure - you'd need to run it under gdb with a build with debugging symbols =) do you have such a thing ?
I guess Chris & Mike also touched font related goodness recently ...
Comment 5 Yousuf Philips (jay) (retired) 2014-10-14 18:41:24 UTC
Created attachment 107834 [details]
gdb backtrace

Michael: the version i downloaded is a dbg version or do you mean another dbg version that actually runs? Do i need to execute it differently from the commandline than the standard backtrace?
Comment 6 Yousuf Philips (jay) (retired) 2014-10-16 12:01:05 UTC
Created attachment 107926 [details]
output from fc-list

Well i thought it might be one of the custom fonts i installed in ~/.fonts but the problem still persisted, so here is the output from fc-list of the fonts on the system.
Comment 7 Yousuf Philips (jay) (retired) 2014-10-16 12:06:55 UTC
Created attachment 107927 [details]
output from fc-cache

And here is output from 'fc-cache -fv' after i put the fonts in ~/.fonts back.
Comment 8 Matthew Francis 2014-10-16 12:34:29 UTC
After a little remotely assisted debugging, we came up with the following:

(gdb) x/sh GetFamilyName()->pData->buffer
0xafe9874c:     u"FZSongS-Extended"

(gdb) x/sh pClonedFace->GetFamilyName()->pData->buffer
0xafe94e58:     u"FZSongS-Extended(SIP)"
Comment 9 Matthew Francis 2014-10-16 17:03:58 UTC
This is triggered by the fonts from "WPS Office" (-> http://wps-community.org/download.html )

With the wps-office and wps-office-fonts packages installed, I can now reproduce this.

-> NEW
Comment 10 Matthew Francis 2014-10-16 17:13:17 UTC
It seems my earlier (remote) reconstruction was backwards - it should be

GetFamilyName() is "FZSongS-Extended(SIP)"

pClonedFace->GetFamilyName() is "FZSongS-Extended"
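
As a minimal sketch of why these two names trip the assertion, the comparison can be modelled with std::string. The helper names normalizeFamilyName and familyNamesMatch are hypothetical, and the trimming only approximates OUString::trim(); this is not the code in PhysicalFontFamily.cxx.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for OUString::replaceAll("-", "").trim():
// drops every '-' and trims leading/trailing spaces.
std::string normalizeFamilyName(const std::string& name)
{
    std::string out;
    for (char c : name)
        if (c != '-')
            out += c;
    const auto first = out.find_first_not_of(' ');
    if (first == std::string::npos)
        return std::string();
    const auto last = out.find_last_not_of(' ');
    return out.substr(first, last - first + 1);
}

// Same shape as the comparison inside the failing assertion.
bool familyNamesMatch(const std::string& clonedFace, const std::string& family)
{
    return normalizeFamilyName(clonedFace) == normalizeFamilyName(family);
}

// The two names from this report differ even after normalization,
// because the "(SIP)" suffix survives the '-' removal and the trim.
void demoMismatch()
{
    assert(!familyNamesMatch("FZSongS-Extended", "FZSongS-Extended(SIP)"));
}
```

Removing hyphens and trimming cannot reconcile the two names, so the assertion fires whenever a face named "FZSongS-Extended" ends up in the family named "FZSongS-Extended(SIP)".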
Comment 11 Matthew Francis 2014-10-16 18:00:27 UTC
OUString GetEnglishSearchFontName( const OUString& rInName ) in unotools/source/misc/fontdefs.cxx is unprepared for a font name that contains brackets - it expects a bracketed part to have some meaning to do with scripts

As a result, in PhysicalFontCollection::Add(...), it looks like we end up getting the PhysicalFontFamily of "FZSongS-Extended(SIP)" and "FZSongS-Extended" mixed up
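
The mix-up can be illustrated with a toy model. This is a sketch under loud assumptions: toySearchName, ToyFamily and addFace are hypothetical, the real GetEnglishSearchFontName does more than cut at a bracket, and the insertion order below is chosen only to match the names seen in gdb.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical simplification of a search-name normalization:
// everything from the first '(' onwards is discarded.
std::string toySearchName(const std::string& name)
{
    return name.substr(0, name.find('('));
}

// Toy family: named after the first face added under its search key.
struct ToyFamily
{
    std::string familyName;
    std::vector<std::string> faces;
};

// Files a face under its search name; distinct fonts whose search
// names collide end up in the same family.
void addFace(std::map<std::string, ToyFamily>& collection,
             const std::string& faceFamilyName)
{
    ToyFamily& family = collection[toySearchName(faceFamilyName)];
    if (family.faces.empty())
        family.familyName = faceFamilyName;
    family.faces.push_back(faceFamilyName);
}

// Both fonts from this report collapse into a single family.
void demoCollision()
{
    std::map<std::string, ToyFamily> collection;
    addFace(collection, "FZSongS-Extended(SIP)");
    addFace(collection, "FZSongS-Extended");
    assert(collection.size() == 1);
}
```

In this model the family is called "FZSongS-Extended(SIP)" but also holds a face called "FZSongS-Extended", which is the pairing the assertion rejects.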
Comment 12 Matthew Francis 2014-10-18 14:05:12 UTC
FZSongS-Extended(SIP) contains characters for the Supplementary Ideographic Plane (SIP) (specifically "CJK Unified Ideographs Extension B U+20000-U+2A6D6")

Amusingly, WPS/Kingsoft Writer which supplies the font seems unable to actually make use of it.


LibreOffice is capable of using this font in theory, but the assumption that font names do not contain brackets appears to be widespread, and such fonts appear to be rare, so it may be economical just to filter out fonts with such names unless and until someone complains, in order to eliminate the crash
Comment 13 Chris Sherlock 2014-10-20 07:26:42 UTC
Why does GetEnglishSearchFontName care about the script name in brackets? Does anyone understand the semantics around this?
Comment 14 Matthew Francis 2014-10-20 07:52:13 UTC
Caolan commented on IRC earlier that
"I have a vague memory that Word exports font names to rtf with the language or encoding in the font name, e.g. "Times New Roman (Russian)" and "Times New Roman (Baltic)" or some such"
Comment 15 Chris Sherlock 2014-10-20 08:43:50 UTC
Matthew, I believe this is a community sponsored package for Kingsoft's office suite... it's not the official font package?
Comment 16 Chris Sherlock 2014-10-20 08:45:47 UTC
And I can confirm what Caolan says - a small snippet (reformatted to make reading easier) from a Word 2010 RTF document I created:

{\fonttbl
	{\f0\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\f1\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}
	{\f34\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria Math;}
	{\f36\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria;}
	{\f37\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}
	{\f216\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0606020202030204}Arial Narrow;}
	{\flomajor\f31500\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\fdbmajor\f31501\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\fhimajor\f31502\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria;}
	{\fbimajor\f31503\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\flominor\f31504\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\fdbminor\f31505\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\fhiminor\f31506\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}
	{\fbiminor\f31507\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
	{\f346\fbidi \froman\fcharset238\fprq2 Times New Roman CE;}
	{\f347\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr;}
	{\f349\fbidi \froman\fcharset161\fprq2 Times New Roman Greek;}
	{\f350\fbidi \froman\fcharset162\fprq2 Times New Roman Tur;}
	{\f351\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}
	{\f352\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}
	{\f353\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic;}
	{\f354\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese);}
Comment 17 Caolán McNamara 2014-10-20 10:03:15 UTC
There is also lcl_stripCharSetFromName in vcl, which knows about the exact script names that MSOffice uses. We should combine that into unotools/source/misc/fontdefs.cxx's GetEnglishSearchFontName which has that looser concept.
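
A rough sketch of the stricter approach, stripping only an exact, known suffix instead of anything in brackets. The function name stripKnownCharsetSuffix is hypothetical and the suffix list is simply the set visible in the Word 2010 RTF font table from comment 16; the list actually used by lcl_stripCharSetFromName is not shown in this report and may differ.

```cpp
#include <cassert>
#include <string>

// Removes a trailing charset/script suffix only if it is one of the
// exact strings below (taken from the RTF sample in comment 16, which
// is an assumption about what a real list would contain).
std::string stripKnownCharsetSuffix(const std::string& name)
{
    static const char* const kSuffixes[] = {
        " CE", " Cyr", " Greek", " Tur", " Baltic",
        " (Hebrew)", " (Arabic)", " (Vietnamese)"
    };
    for (const char* s : kSuffixes)
    {
        const std::string suffix(s);
        // Require a non-empty base name in front of the suffix.
        if (name.size() > suffix.size()
            && name.compare(name.size() - suffix.size(),
                            suffix.size(), suffix) == 0)
        {
            return name.substr(0, name.size() - suffix.size());
        }
    }
    return name; // unknown bracketed parts such as "(SIP)" are kept
}

// A Word-style suffix is stripped; the WPS font name is left alone.
void demoStrip()
{
    assert(stripKnownCharsetSuffix("Times New Roman (Hebrew)") == "Times New Roman");
    assert(stripKnownCharsetSuffix("FZSongS-Extended(SIP)") == "FZSongS-Extended(SIP)");
}
```

With an exact list, "FZSongS-Extended(SIP)" and "FZSongS-Extended" keep distinct names and would no longer be folded together by the stripping step.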
Comment 18 Caolán McNamara 2014-10-20 15:07:11 UTC
honestly, all this GetEnglishSearchFontName foo is super dubious. If there are other problems here we probably should just stop "removing" stuff from the names.
Comment 19 Khaled Hosny 2014-10-20 17:12:42 UTC
(In reply to Caolán McNamara from comment #18)
> honestly, all this GetEnglishSearchFontName foo is super dubious. If there
> are other problems here we probably should just stop "removing" stuff from
> the names.

Can’t agree less.
Comment 20 Khaled Hosny 2014-10-20 17:18:03 UTC
I meant “Can’t agree more” of course!
Comment 21 Yousuf Philips (jay) (retired) 2014-10-20 18:34:59 UTC
So was this fixed in some patch?
Comment 23 Chris Sherlock 2014-10-21 09:54:19 UTC
Oh, and for good measure, the unit test was updated :-)

http://cgit.freedesktop.org/libreoffice/core/commit/unotools?id=e12ba2eddc827e39444f5efe6107d8afe1f7aaff