Bug 125041 - Unicode URL is interpreted as a file link
Summary: Unicode URL is interpreted as a file link
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 6.1.5.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Tünde Tóth
URL:
Whiteboard: target:6.4.0,
Keywords: bibisected, bisected, regression
Duplicates: 121875 126591
Depends on:
Blocks:
 
Reported: 2019-04-30 10:40 UTC by Andrew
Modified: 2020-02-11 12:32 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
screenshot (42.95 KB, image/png)
2019-04-30 11:37 UTC, Andrew
Example ODT file (8.56 KB, application/vnd.oasis.opendocument.text)
2020-02-06 12:42 UTC, Andrew
Example PDF file (41.09 KB, application/pdf)
2020-02-06 12:42 UTC, Andrew

Description Andrew 2019-04-30 10:40:21 UTC
Example:

http://транссеть.рф - interprets as file link
http://xn--80akxkhacg4g.xn--p1ai - interprets as correct url

How should it work?

http://транссеть.рф - must be interpreted as a URL or automatically converted to http://xn--80akxkhacg4g.xn--p1ai
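
For reference, the conversion requested above can be sketched with the Punycode algorithm from RFC 3492. This is a minimal illustration, not LibreOffice code: the names encodeLabel and toAsciiHost are hypothetical, the host is assumed to be already lowercase, and the mapping and normalization steps that a full IDNA implementation performs are skipped.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Punycode parameters from RFC 3492, section 5.
static const std::uint32_t kBase = 36, kTMin = 1, kTMax = 26, kSkew = 38,
                           kDamp = 700, kInitialBias = 72, kInitialN = 128;

// Map a digit value 0..35 to 'a'..'z', '0'..'9'.
static char digit(std::uint32_t d)
{
    return d < 26 ? static_cast<char>('a' + d) : static_cast<char>('0' + (d - 26));
}

// Bias adaptation function from RFC 3492, section 6.1.
static std::uint32_t adapt(std::uint32_t delta, std::uint32_t numPoints, bool first)
{
    delta = first ? delta / kDamp : delta / 2;
    delta += delta / numPoints;
    std::uint32_t k = 0;
    while (delta > ((kBase - kTMin) * kTMax) / 2)
    {
        delta /= kBase - kTMin;
        k += kBase;
    }
    return k + (kBase - kTMin + 1) * delta / (delta + kSkew);
}

// Hypothetical helper: encode one host label. Pure-ASCII labels are
// returned unchanged; anything else gets the "xn--" prefix.
static std::string encodeLabel(const std::u32string& label)
{
    std::string out;
    for (char32_t c : label)
        if (c < 0x80)
            out += static_cast<char>(c);
    const std::uint32_t basic = static_cast<std::uint32_t>(out.size());
    if (basic == label.size())
        return out;
    if (basic > 0)
        out += '-';

    std::uint32_t n = kInitialN, delta = 0, bias = kInitialBias, handled = basic;
    while (handled < label.size())
    {
        // Find the smallest code point not yet handled (>= n).
        std::uint32_t m = UINT32_MAX;
        for (char32_t c : label)
            if (c >= n)
                m = std::min<std::uint32_t>(m, c);
        delta += (m - n) * (handled + 1);
        n = m;
        for (char32_t c : label)
        {
            if (c < n)
                ++delta;
            if (c == n)
            {
                // Emit delta as a generalized variable-length integer.
                std::uint32_t q = delta;
                for (std::uint32_t k = kBase;; k += kBase)
                {
                    const std::uint32_t t =
                        k <= bias ? kTMin : (k >= bias + kTMax ? kTMax : k - bias);
                    if (q < t)
                        break;
                    out += digit(t + (q - t) % (kBase - t));
                    q = (q - t) / (kBase - t);
                }
                out += digit(q);
                bias = adapt(delta, handled + 1, handled == basic);
                delta = 0;
                ++handled;
            }
        }
        ++delta;
        ++n;
    }
    return "xn--" + out;
}

// Hypothetical helper: encode every dot-separated label of a host name.
static std::string toAsciiHost(const std::u32string& host)
{
    std::string result;
    std::size_t start = 0;
    for (;;)
    {
        const std::size_t dot = host.find(U'.', start);
        const std::size_t len =
            dot == std::u32string::npos ? std::u32string::npos : dot - start;
        result += encodeLabel(host.substr(start, len));
        if (dot == std::u32string::npos)
            break;
        result += '.';
        start = dot + 1;
    }
    return result;
}
```

Called with the host from this report, toAsciiHost(U"транссеть.рф") yields xn--80akxkhacg4g.xn--p1ai, the ASCII form quoted in the description.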
Comment 1 Andrew 2019-04-30 11:37:20 UTC
Created attachment 151091
screenshot
Comment 2 raal 2019-05-02 15:46:15 UTC
Confirmed with Version: 6.3.0.0.alpha0+
Build ID: 83abdf803a023067ebc207fd82dde987df233754
CPU threads: 4; OS: Windows 6.1; UI render: default; VCL: win;
Comment 3 Xisco Faulí 2019-05-23 11:34:34 UTC
Not reproduced in

Version: 4.3.0.0.alpha1+
Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e
Comment 4 raal 2019-05-23 12:59:04 UTC
This seems to have begun at the below commit.
Adding Cc: to Szymon Kłos; Could you possibly take a look at this one? Thanks

d16391a6b80d56e09f87703a7a9c76bcfae17529 is the first bad commit
commit d16391a6b80d56e09f87703a7a9c76bcfae17529
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Tue Dec 12 22:34:35 2017 -0800

    source sha:4b9e237850efe36f7e35d65e14d6953f1e1f3a45

author: Szymon Kłos <szymon.klos@collabora.com> 2017-11-14 19:29:33 +0100
committer: Szymon Kłos <szymon.klos@collabora.com> 2017-11-19 18:16:55 +0100
commit: 4b9e237850efe36f7e35d65e14d6953f1e1f3a45
tree: 46281518c46207cce740c84c6bd68156aeb58d4f
parent: f9d1de6c7b135e36ae23095029d6dbaa044d881e
tdf#86087 Open relative links in Writer
Comment 5 Xisco Faulí 2019-08-01 10:09:21 UTC
I believe the problem is that https://opengrok.libreoffice.org/xref/core/sw/source/uibase/wrtsh/wrtsh2.cxx?r=5cb34f1c#503 returns aURL as a NotValid protocol because it contains Cyrillic characters...
Comment 6 Xisco Faulí 2019-08-01 10:43:41 UTC
*** Bug 121875 has been marked as a duplicate of this bug. ***
Comment 7 NISZ LibreOffice Team 2019-08-01 13:23:21 UTC
*** Bug 126591 has been marked as a duplicate of this bug. ***
Comment 8 Xisco Faulí 2019-08-01 15:56:41 UTC
Doing something like this

--- a/sw/source/uibase/wrtsh/wrtsh2.cxx
+++ b/sw/source/uibase/wrtsh/wrtsh2.cxx
@@ -488,6 +488,16 @@ bool SwWrtShell::ClickToINetGrf( const Point& rDocPt, LoadUrlFlags nFilter )
     return bRet;
 }
 
+
+static bool isAscii( const OUString& rStr )
+{
+    sal_Int32 nLen = rStr.getLength();
+    for( sal_Int32 i = 0; i < nLen; i++ )
+        if( rStr[i] > 127 )
+            return false;
+    return true;
+}
+
 void LoadURL( SwViewShell& rVSh, const OUString& rURL, LoadUrlFlags nFilter,
               const OUString& rTargetFrameName )
 {
@@ -501,7 +511,7 @@ void LoadURL( SwViewShell& rVSh, const OUString& rURL, LoadUrlFlags nFilter,
 
     OUString sFileURL = rURL;
     INetURLObject aURL( sFileURL );
-    if( aURL.GetProtocol() == INetProtocol::NotValid && !sFileURL.startsWith("#") )
+    if( isAscii(sFileURL) && aURL.GetProtocol() == INetProtocol::NotValid && !sFileURL.startsWith("#") )

could fix the issue. However, the real fix should be made in INetURLObject, so that it accepts non-ASCII characters in the host as part of a valid URL.
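
The effect of the guard in the diff can be tried outside LibreOffice with a small stand-alone sketch. It assumes std::u16string in place of OUString and passes the result of the protocol check in as a plain bool; treatAsRelativeFile is a hypothetical name used only for illustration.

```cpp
#include <string>

// Stand-in for the isAscii() helper from the diff, using std::u16string
// instead of OUString so the sketch compiles without LibreOffice headers.
static bool isAscii(const std::u16string& str)
{
    for (char16_t c : str)
        if (c > 127)
            return false;
    return true;
}

// Hypothetical helper mirroring the patched condition: fall back to
// relative-file handling only when the protocol was not recognised,
// the string is pure ASCII, and it is not a "#" target.
static bool treatAsRelativeFile(const std::u16string& url, bool protocolValid)
{
    const bool startsWithHash = !url.empty() && url[0] == u'#';
    return isAscii(url) && !protocolValid && !startsWithHash;
}
```

With this guard, a string such as http://транссеть.рф is no longer routed into the relative-file branch even when the protocol check reports it as not valid, while a plain ASCII relative path still is.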

@Stephan, I thought you might be interested in this issue...
Comment 9 Stephan Bergmann 2019-08-02 07:46:50 UTC
(In reply to Andrew from comment #0)
> http://транссеть.рф - interprets as file link

What do you mean with "interprets as"?  Please be more specific:  What exactly do you do, and what exactly is the (unexpected/erroneous) outcome you observe?
Comment 10 Xisco Faulí 2019-08-02 08:53:53 UTC
(In reply to Stephan Bergmann from comment #9)
> (In reply to Andrew from comment #0)
> > http://транссеть.рф - interprets as file link
> 
> What do you mean with "interprets as"?  Please be more specific:  What
> exactly do you do, and what exactly is the (unexpected/erroneous) outcome
> you observe?

To reproduce the issue:
1. Open Writer
2. Add a hyperlink to 'http://транссеть.рф'
3. Click on it

-> gio: file:///http:%2F%2F%D1%82%D1%80%D0%B0%D0%BD%D1%81%D1%81%D0%B5%D1%82%D1%8C.%D1%80%D1%84: Operation not supported
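
The path in that error message is what one gets when the whole URL is treated as a single file name: its UTF-8 bytes are percent-encoded and appended to file:///. A minimal sketch that reproduces the string follows. The function names are hypothetical, and the set of characters left unescaped is an assumption chosen to match the observed output rather than taken from the LibreOffice sources.

```cpp
#include <string>

// Percent-encode every byte except ASCII letters, digits and "-._~:".
// Assumption: this unescaped set is picked to match the error message
// above; it is not copied from INetURLObject.
static std::string percentEncode(const std::string& utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : utf8)
    {
        const bool keep = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                          || (b >= '0' && b <= '9') || b == '-' || b == '.'
                          || b == '_' || b == '~' || b == ':';
        if (keep)
        {
            out += static_cast<char>(b);
        }
        else
        {
            out += '%';
            out += hex[b >> 4];
            out += hex[b & 0x0F];
        }
    }
    return out;
}

// Hypothetical model of the faulty fallback: the complete URL ends up
// as one path segment of a file URL.
static std::string asFileUrl(const std::string& utf8Url)
{
    return "file:///" + percentEncode(utf8Url);
}
```

Each Cyrillic letter occupies two bytes in UTF-8, which is why every letter of the host shows up as a pair of %XX escapes, and the two slashes after http: become %2F%2F.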
Comment 11 Xisco Faulí 2019-08-02 08:54:29 UTC
BTW, patch in gerrit: https://gerrit.libreoffice.org/#/c/76804/
Comment 12 Commit Notification 2019-08-15 09:37:15 UTC
Tünde Tóth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/bb9bad31b9e9f741fed91b2a4b3043814cb07f13%5E%21

tdf#125041 fix hyperlinks to IDN websites

It will be available in 6.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 muso 2019-09-08 22:16:22 UTC
Can this please be backported to LO 6.3.x?
Comment 14 Xisco Faulí 2019-10-28 10:40:19 UTC
(In reply to muso from comment #13)
> Can this please be backported to LO 6.3.x?

I'll let Tünde Tóth decide about it... I don't like that there's no unit test for it...
Comment 15 Xisco Faulí 2019-11-19 12:33:21 UTC
Verified in

Version: 6.4.0.0.beta1+
Build ID: 1987c98926a85a483a32ea78e460e563a6ea4705
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded
Comment 16 Andrew 2020-02-06 12:41:24 UTC
When I export this example to PDF, URL is wrong.
Comment 17 Andrew 2020-02-06 12:42:09 UTC
Created attachment 157689
Example ODT file
Comment 18 Andrew 2020-02-06 12:42:32 UTC
Created attachment 157690
Example PDF file
Comment 19 NISZ LibreOffice Team 2020-02-07 08:22:52 UTC
(In reply to Andrew from comment #16)
> When I export this example to PDF, URL is wrong.

Could you please open a new report about this problem?

The original report was about a problem in the editor; opening/saving in various file formats is a separate matter.

Setting status back to RESOLVED.