Bug 45084 - FILEOPEN: Selected file type is not honored for HTML files
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 3.5.0 Beta0 (earliest affected)
Hardware: All, OS: All
Importance: medium major
Assignee: Kohei Yoshida
URL:
Whiteboard: target:3.6.0 target:3.5.8
Keywords: regression
Depends on:
Blocks: mab3.5
 
Reported: 2012-01-22 06:33 UTC by famo
Modified: 2012-10-23 15:40 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Sample HTML File with a small Table (333 bytes, text/html)
2012-01-22 06:33 UTC, famo

Description famo 2012-01-22 06:33:15 UTC
Created attachment 55959
Sample HTML File with a small Table

HTML files can no longer be opened in Calc; they open in Writer/Web instead.

How to reproduce:
1. Download attached sample HTML document.
2. Start Calc and choose File - Open.
2.a Under File type, choose "HTML Document (OpenOffice.org Calc) (*.html;*.htm)".
2.b Select the document and click OK.

Actual: the document opens in Writer/Web.
Expected: the document should open in Calc.


This worked as expected in prior versions (OOo 3, LO 3.4.5).
Comment 1 Julien Nabet 2012-01-22 06:44:38 UTC
I reproduced this problem on master and on the 3.5 branch.

(PC Debian x86-64)
Comment 2 Pedro 2012-01-22 06:54:37 UTC
This is not a Windows-specific problem. Changed OS to All.

I'm not sure this is a bug. If the file is HTML it makes sense that it is opened in the HTML editor.

Unless a rule is set up so that an HTML file containing only a table opens in Calc... Does that make sense?

You can still open the HTML file as a spreadsheet by opening Calc and choosing Insert, Sheet from file and selecting the HTML file you know only contains a table.
Comment 3 famo 2012-01-22 07:16:34 UTC
@Pedro:
> Unless a rule is setup that if an HTML file only contains a table, then open in
> Calc... Does that make sense?
Partly, please see point 2.a in OP:
2.a under File type choose "HTML Document (OpenOffice.org Calc) (*.html;*.htm)"

This option (File type) is specifically there to open the file in Calc. Notice also that this bug is a regression and it worked as described before.

It /could/ be that this is a new expected behavior in 3.5, but I highly doubt that, because the File Type option ("HTML Document (OpenOffice.org Calc) (*.html;*.htm)") wouldn't make any sense then.

> You can still open the HTML file as a spreadsheet by opening Calc and choosing
> Insert, Sheet from file and selecting the HTML file you know only contains a
> table.
Yes, that would be a good workaround.
Comment 4 famo 2012-01-22 08:37:34 UTC
Reducing importance to major.
Comment 5 famo 2012-01-22 13:29:55 UTC
@Cor Nouws
Please take a look.
Comment 6 Cor Nouws 2012-01-22 14:09:23 UTC
confirmed - was fine in 3.4.5
thanks for testing & reporting!
Comment 7 Kohei Yoshida 2012-01-23 14:33:46 UTC
Technically this is not a regression, since we never properly handled this at the filter type detection level.  The fact that it seems to have worked in 3.4.x is purely due to luck.  E.g. this never worked for me even in 3.3 or 3.4.

Internally, the algorithm that picks the app to open has no idea from which app the user is opening a given file.  In HTML's case, the candidates are Writer/Web and Calc.  Whichever app happens to be stored first in boost::unordered_map gets picked.  And since the container doesn't guarantee the order, which app gets picked when there are multiple candidate apps is anyone's guess.
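
A minimal sketch of the failure mode described in this comment, not the actual framework code. The function name `pick_first_match` and the app/type labels are hypothetical, and two plain Python lists stand in for two possible iteration orders of the unordered container. It only illustrates that taking the first matching entry makes the result depend on storage order.

```python
# Hypothetical illustration; none of these names come from the LibreOffice code base.

def pick_first_match(candidates, file_type):
    """Return the app of the first candidate that handles file_type.

    `candidates` is an iterable of (app, handled_type) pairs. Its order
    stands in for the unspecified iteration order of an unordered map.
    """
    for app, handled_type in candidates:
        if handled_type == file_type:
            return app
    return None


# The same two candidates, visited in two different orders.
order_a = [("writer/web", "html"), ("calc", "html")]
order_b = [("calc", "html"), ("writer/web", "html")]

result_a = pick_first_match(order_a, "html")
result_b = pick_first_match(order_b, "html")
print(result_a, result_b)  # prints "writer/web calc"
```

Nothing in the selection itself refers to the app the user started from, so the outcome is decided entirely by which entry happens to come first.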

The current workaround is to select "HTML Document (OpenOffice.org Calc)" or "Web Page Query (OpenOffice.org Calc)" (hmm OOo is still used there...) as the file type when opening the HTML file from File - Open.  Or, do as Pedro suggested.   Yes, this is a UI bug and we need to fix this some day, but it's not a regression.

Removing the regression keyword.
Comment 8 Cor Nouws 2012-01-23 15:12:57 UTC
Hi Kohei,

thanks for the explanation...

(In reply to comment #7)
> The current workaround is to select "HTML Document (OpenOffice.org Calc)" or
> "Web Page Query (OpenOffice.org Calc)" (hmm OOo is still used there...) as the
> file type when opening the HTML file from File - Open.  

That is what Famo wrote in his point 2 and what I tested and found *not* to work

> Removing the regression keyword.

Reverting that, sorry.
Comment 9 famo 2012-01-25 04:06:01 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > The current workaround is to select "HTML Document (OpenOffice.org Calc)" or
> > "Web Page Query (OpenOffice.org Calc)" (hmm OOo is still used there...) as the
> > file type when opening the HTML file from File - Open.  
> 
> That is what Famo wrote in his point 2 and what I tested and found *not* to
> work
> 
Yes, exactly. This is what the bug is all about.


Currently there is only one workaround: the one Pedro described.
Comment 10 Kohei Yoshida 2012-01-27 08:30:40 UTC
I'll take this.
Comment 11 Kohei Yoshida 2012-01-27 11:26:53 UTC
This is not calc specific since the code is in the shared framework. Changing the component accordingly.
Comment 12 Cor Nouws 2012-02-07 03:14:57 UTC
Hi Eike,
Is this something that makes bells ring at your side (no alarm bells ;-) ) ?
thanks - Cor
Comment 13 Cor Nouws 2012-02-07 03:17:42 UTC
(maybe better ask mstahl ?)
Comment 14 dE 2012-04-15 21:07:02 UTC
Rarely used feature.

Do the devs think its priority should be reduced?
Comment 15 Cor Nouws 2012-04-16 12:38:30 UTC
(In reply to comment #14)
> Rarely used feature.
> 
> Do the devs think its priority should be reduced?

It is assigned, the devs are paying attention to it, and it actually blocks people from opening certain files in LibreOffice. Indeed annoying.
Comment 16 Petr Mladek 2012-04-17 07:13:07 UTC
Hmm, if you read comment #7, it worked as expected only in LO-3.4 by luck. It did not work before and it is broken in LO-3.5 again. A proper solution might even need a rework of the file type detection code. I agree with dE that it is a less-used feature and it does not belong to the most annoying bugs.
Comment 17 Petr Mladek 2012-04-17 09:36:11 UTC
Ah, I have spoken with Cor on IRC. I should have read the bug more carefully. It actually worked in earlier LO versions. It might also affect quite a few users. So, let's put it back in MAB.

I am sorry for the mess. My main intention was to encourage dE to work for us.
Comment 18 Roman Eisele 2012-05-04 07:17:37 UTC
According to comment #11, this issue is not Calc-specific, therefore changed Summary accordingly to prevent misunderstandings.
Comment 19 Kohei Yoshida 2012-05-22 11:39:45 UTC
This will be fixed in 3.6.0.

http://cgit.freedesktop.org/libreoffice/core/commit/?id=552bebe6fa27fa58d07d87283a4b24e6052ab3d4

Why was this not fixed sooner?  Refer to my blog post.

http://kohei.us/2012/05/21/what-goes-on-when-loading-a-file/

I don't know who designed this piece of code, but it's horrendously complex and very easy to break.  It took me a few days to even start to understand the code, let alone come up with a "right" fix.

Now, I really hope that's indeed the right fix (i.e. one that won't cause a regression), but to the best of my ability that's the best and most sensible thing we can do there.

The real mystery is how this code ever worked before.  The handling of pre-selected filter hadn't changed for ages.
Comment 20 Cor Nouws 2012-05-24 07:01:04 UTC
Hi Kohei,

(In reply to comment #19)
> This will be fixed in 3.6.0.


Ah wow!

> http://kohei.us/2012/05/21/what-goes-on-when-loading-a-file/

I saw that a few days ago. So indeed, a tough piece to dive into.

> Now, I really hope that's indeed the right fix (i.e. that won't cause
> regression), but to the best of my ability that's the best and most sensible
> thing we can do there.

We'll ping people to do the tests :-)

> The real mystery is how this code ever worked before.  The handling of
> pre-selected filter hadn't changed for ages.

If you can't see any change, who can? May have been some change at a different level in the code?
Comment 21 Cor Nouws 2012-05-24 07:03:33 UTC
(In reply to comment #20)
> 
> We'll ping people to do the tests :-)

Hey, I just saw that the master build I have,
    version 3.6.0alpha0+ Build ID: 1bb9a60
does contain the patch.

Indeed works. Super!
Comment 22 Not Assigned 2012-10-23 15:40:01 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-3-5":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dc9b4a29f1928fabcd4094942d32bc5985091838&g=libreoffice-3-5

rhbz#868953 fdo#45084 When the caller specifies filter type, stick to it
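
The rule stated in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the referenced commit: `choose_filter`, its parameters, and the filter labels are hypothetical, and the sorted fallback is just one way to make the no-preselection case deterministic.

```python
# Hypothetical illustration; not the code from the referenced commit.

def choose_filter(detected_candidates, preselected=None):
    """Pick a filter label for a file.

    detected_candidates: labels of filters that detection considers possible.
    preselected: label explicitly chosen by the caller (for example via the
                 File type list in the File - Open dialog), or None.
    """
    if preselected is not None:
        # The caller asked for a specific filter: stick to it instead of
        # letting detection pick among the candidates.
        return preselected
    if not detected_candidates:
        return None
    # No explicit choice: fall back to detection. Sorting is an assumption
    # made here so the result does not depend on container order.
    return sorted(detected_candidates)[0]


candidates = {"writer_web_html", "calc_html"}
with_choice = choose_filter(candidates, preselected="calc_html")
without_choice = choose_filter(candidates)
```

With this shape, the file type selected in the dialog decides the outcome whenever it is present, and container order no longer matters.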


It will be available in LibreOffice 3.5.8.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.