Bug 130501 - Unicode URL interpreted as file link (PDF export)
Summary: Unicode URL interpreted as file link (PDF export)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 6.4.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Stephan Bergmann
URL:
Whiteboard: target:7.0.0 target:6.4.2 target:6.3.6
Keywords: filter:pdf
Depends on:
Blocks: PDF-Export
Reported: 2020-02-07 08:32 UTC by Andrew
Modified: 2020-12-12 16:49 UTC

See Also:
Crash report or crash signature:


Attachments
ODT sample (8.56 KB, application/vnd.oasis.opendocument.text)
2020-02-07 08:32 UTC, Andrew
PDF sample (41.09 KB, application/pdf)
2020-02-07 08:32 UTC, Andrew

Description Andrew 2020-02-07 08:32:03 UTC
Created attachment 157722
ODT sample

When I export a file to PDF, a URL containing Unicode characters is interpreted as a file URL.

See: https://bugs.documentfoundation.org/show_bug.cgi?id=125041
Comment 1 Andrew 2020-02-07 08:32:28 UTC
Created attachment 157723
PDF sample
Comment 2 Roman Kuznetsov 2020-02-07 08:46:59 UTC
Confirmed in

Version: 7.0.0.0.alpha0+ (x64)
Build ID: 69ccba90135f4dfc22d4cb823e10cf4794ddaa04
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: default; VCL: win;
Locale: ru-RU (ru_RU); UI-Language: ru-RU
Calc: threaded
Comment 3 Stephan Bergmann 2020-02-07 09:08:01 UTC
There appears to be an issue with internationalized domain name top level domains (IDN TLDs) somewhere in the code:

The IDN URL <http://транссеть.рф> (with an IDN TLD) is exported as a URL with a last segment of "http:%2F%2Fxn--80akxkhacg4g.xn--p1a%D1%84" (i.e., some code apparently decoded IDN label "транссеть" to "xn--80akxkhacg4g" and "рф" to "xn--p1a%D1%84", though the latter should have been "xn--p1ai"?!?, and then URL-encoded "//" as "%2F%2F" as it apparently decided for some reason that all of the input should become a single segment in the resulting URL).

On the other hand, testing with current master on Linux at least, <http://транссеть.ru> (with a "plain" TLD) is exported as <http://xn--80akxkhacg4g.ru/> (i.e., with the IDN label "транссеть" decoded to "xn--80akxkhacg4g", which should be fine).
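
For reference, the expected ASCII forms of the two hosts discussed above can be reproduced with a minimal Python sketch. It assumes Python's built-in "idna" codec (IDNA 2003) as a stand-in for whatever IDNA library the export code uses; the helper name to_ascii_host is hypothetical and not part of LibreOffice.

```python
def to_ascii_host(host: str) -> str:
    """Convert every label of an internationalized host name to ASCII.

    Assumption: Python's built-in "idna" codec is used purely for
    illustration; it converts the host label by label (split on dots).
    """
    return host.encode("idna").decode("ascii")


# Host with an IDN TLD: both labels get an "xn--" form.
print(to_ascii_host("транссеть.рф"))   # xn--80akxkhacg4g.xn--p1ai

# Host with a plain ASCII TLD: only the first label changes.
print(to_ascii_host("транссеть.ru"))   # xn--80akxkhacg4g.ru
```

Under these assumptions, a correct export of the first URL would carry the host "xn--80akxkhacg4g.xn--p1ai".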
Comment 4 Stephan Bergmann 2020-02-07 14:33:32 UTC
(In reply to Stephan Bergmann from comment #3)
> On the other hand, testing with current master on Linux at least,
> <http://транссеть.ru> (with a "plain" TLD) is exported as
> <http://xn--80akxkhacg4g.ru/> (i.e., with the IDN label "транссеть" decoded
> to "xn--80akxkhacg4g", which should be fine).

(See <https://gerrit.libreoffice.org/plugins/gitiles/core/+/a346dfccd7e342d776dd59eb3ed128557e22a1bf%5E!> "tdf#70833: IDNA support when exporing hyperlinks to PDF" for why IDN URLs are decoded to ASCII when exporting to PDF.)
Comment 5 muso 2020-02-07 17:37:40 UTC
I have the same problem with

Version: 6.4.0.3 (x64)
Build ID: b0a288ab3d2d4774cb44b62f04d5d28733ac6df8
CPU threads: 12; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: CL
Comment 6 Commit Notification 2020-02-07 20:27:49 UTC
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/4c0394461af4d6bcba059161113abffbb484efe8

tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Roman Kuznetsov 2020-02-08 19:37:54 UTC
(In reply to Commit Notification from comment #6)
> Stephan Bergmann committed a patch related to this issue.
> It has been pushed to "master":
> 
> https://git.libreoffice.org/core/commit/4c0394461af4d6bcba059161113abffbb484efe8
> 
> tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost
> 
> It will be available in 7.0.0.

Stephan, do you plan to backport it to 6.4?
Comment 8 Stephan Bergmann 2020-02-10 08:13:34 UTC
(In reply to Roman Kuznetsov from comment #7)
> Stephan, do you plan to backport it to 6.4?

yes, <https://gerrit.libreoffice.org/c/core/+/88295> "tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost"
Comment 9 Commit Notification 2020-02-10 12:53:42 UTC
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/289d2e5c86ffa99bc6a8c6c51f630d629afcd954

tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost

It will be available in 6.4.2.

Comment 10 Andrew 2020-02-11 04:00:02 UTC
(In reply to Stephan Bergmann from comment #8)
> (In reply to Roman Kuznetsov from comment #7)
> > Stephan, do you plan to backport it to 6.4?
> 
> yes, <https://gerrit.libreoffice.org/c/core/+/88295> "tdf#130501: Fix
> off-by-one error in URIHelper::resolveIdnaHost"

Thank you for the fix!
Can you please backport it to 6.3?
Comment 11 Stephan Bergmann 2020-02-11 08:31:29 UTC
(In reply to Andrew from comment #10)
> Can you please backport it to 6.3?

Do you have a pressing need for that?  My understanding is that exporting URLs with an IDN TLD (see comment 3) never worked (and that the fix for issue 70833 by accident only made it work for non-TLD IDN labels), so this issue would classify as a non-high-severity non-regression.  Therefore, I'm a bit reluctant about backporting, to avoid unexpected regressions.
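
The off-by-one named in the fix's commit message could explain both observations in comment 3. The following Python sketch is a hypothetical model, not the actual URIHelper::resolveIdnaHost code: it assumes the end of the host was computed one character short, so the last character was left out of the IDNA conversion and percent-encoded afterwards. The function name resolve_host and the off_by_one flag are illustrative only.

```python
from urllib.parse import quote


def resolve_host(host: str, off_by_one: bool) -> str:
    """Hypothetical model of converting a host to ASCII.

    Assumption: with off_by_one=True the host is cut one character
    short before IDNA conversion, and the leftover character is
    percent-encoded as UTF-8 and appended.
    """
    end = len(host) - 1 if off_by_one else len(host)
    converted = host[:end].encode("idna").decode("ascii")
    leftover = quote(host[end:])  # empty string when nothing is left over
    return converted + leftover


# IDN TLD: the dropped "ф" surfaces as "%D1%84", as seen in comment 3.
print(resolve_host("транссеть.рф", off_by_one=True))
# xn--80akxkhacg4g.xn--p1a%D1%84

# ASCII TLD: the dropped "u" is reattached unchanged, so the bug is hidden.
print(resolve_host("транссеть.ru", off_by_one=True))
# xn--80akxkhacg4g.ru

# Without the off-by-one, the IDN TLD converts as expected.
print(resolve_host("транссеть.рф", off_by_one=False))
# xn--80akxkhacg4g.xn--p1ai
```

Under this model, a URL with an ASCII TLD comes out right only because its last character needs no conversion, which would match the "by accident" behavior described above.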
Comment 12 Andrew 2020-02-11 12:49:11 UTC
(In reply to Stephan Bergmann from comment #11)
> (In reply to Andrew from comment #10)
> > Can you please backport it to 6.3?
> 
> Do you have a pressing need for that?  My understanding is that exporting
> URLs with an IDN TLD (see comment 3) never worked (and that the fix for issue
> 70833 by accident only made it work for non-TLD IDN labels), so this issue
> would classify as a non-high-severity non-regression.  Therefore, I'm a bit
> reluctant about backporting, to avoid unexpected regressions.

Yes, we have a need, because we usually use the Still version.
Fixes like this make life easier for users.
Comment 13 Stephan Bergmann 2020-02-11 12:55:14 UTC
Shrug; let's take the risk then, seeing that PDF export is the only code using URIHelper::resolveIdnaHost.  <https://gerrit.libreoffice.org/c/core/+/88401> "tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost"
Comment 14 Andrew 2020-02-11 13:00:08 UTC
(In reply to Stephan Bergmann from comment #13)
> Shrug; let's take the risk then, seeing that PDF export is the only code
> using URIHelper::resolveIdnaHost. 
> <https://gerrit.libreoffice.org/c/core/+/88401> "tdf#130501: Fix off-by-one
> error in URIHelper::resolveIdnaHost"

Thank you!
Comment 15 Commit Notification 2020-02-11 23:24:59 UTC
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/commit/6ce74c39d15a9e55cc2ad846e18e001509e95392

tdf#130501: Fix off-by-one error in URIHelper::resolveIdnaHost

It will be available in 6.3.6.
