Bug 38511 - Command line --convert-to txt doesn't work: Failure modes
Summary: Command line --convert-to txt doesn't work: Failure modes
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: PreBibisect (earliest affected)
Hardware: All Linux (All)
Importance: medium normal
Assignee: David Tardon
URL:
Whiteboard: target:4.4.0
Keywords: preBibisect
Depends on:
Blocks:
 
Reported: 2011-06-20 18:33 UTC by gobnat
Modified: 2015-12-17 06:50 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Description gobnat 2011-06-20 18:33:20 UTC
Can't get --convert-to txt working. 

Have tried:

*** Bare txt
>/opt/libreoffice3.4/program/soffice --headless -convert-to txt dt20.doc 

== Silent Failure:

Produces no error, but the output isn't text. It seems to be a broken PDF of HTML: only the first page is converted, the file format seems to be PDF, but the formatting of the page seems to be HTML.


*** txt:writer_Text
>/opt/libreoffice3.4/program/soffice --headless --convert-to txt:writer_Text dt20.doc 

== Conversion Failure (no file written) - apparently can't find export filter? 

convert /data-current/programming/python/dgf2/conversions/dt20.doc -> /data-current/programming/python/dgf2/conversions/dt20.txt using writer_Text
Error: Please reverify input parameters...


*** txt:writer_Text_encoded


>/opt/libreoffice3.4/program/soffice --headless --convert-to txt:writer_Text_encoded dt20.doc 

== Conversion Failure (no file written) - apparently can't find export filter? 

convert /data-current/programming/python/dgf2/conversions/dt20.doc -> /data-current/programming/python/dgf2/conversions/dt20.txt using writer_Text_encoded
Error: Please reverify input parameters...
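
Because the bare `txt` case fails without any error message, a caller has to inspect the output file itself. The following is a minimal Python sketch of such a check, not part of the original report. It assumes that soffice writes the result into the working directory under the source's base name (as the log lines above suggest), and the helper names `classify_output` and `convert_and_check` are hypothetical.

```python
import subprocess
from pathlib import Path


def classify_output(head: bytes) -> str:
    """Guess what a converted file really contains from its first bytes."""
    if not head:
        return "empty"
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"PK\x03\x04"):
        # ZIP container; OpenDocument files are packaged this way.
        return "zip"
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "html"
    if b"\x00" in head:
        # Crude heuristic: also flags UTF-16 encoded text as binary.
        return "binary"
    return "text"


def convert_and_check(soffice: str, source: Path, target_spec: str = "txt") -> str:
    """Run the conversion, then report what the output file actually is."""
    source = source.resolve()
    # Assumption: without an explicit output directory, the result lands
    # in the working directory, which is set to the source's folder here.
    subprocess.run(
        [soffice, "--headless", "--convert-to", target_spec, str(source)],
        cwd=source.parent,
        check=False,
    )
    extension = target_spec.split(":", 1)[0]
    output = source.with_suffix("." + extension)
    if not output.exists():
        return "missing"
    with output.open("rb") as handle:
        return classify_output(handle.read(512))
```

A result other than "text" from `convert_and_check("/opt/libreoffice3.4/program/soffice", Path("dt20.doc"))` would indicate the silent failure described above; "missing" corresponds to the cases where no file is written.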
Comment 1 Yifan Jiang 2011-06-22 02:20:39 UTC
Platform: SLED 11 sp1 i586
build info: Libreoffice 3.4 release

Reproduced.

Hi Cedric,

Is this an easy hack?:) Thanks!
Comment 2 gobnat 2011-07-10 05:17:56 UTC
I assume that there is similar behavior for filters other than pdf (which seems to work).
Comment 3 Björn Michaelsen 2011-12-23 12:21:42 UTC
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
Comment 4 sasha.libreoffice 2012-03-01 02:03:57 UTC
Please verify: is this still reproducible in the latest version of LibreOffice?
Comment 5 gobnat 2012-03-01 02:48:14 UTC
Yes. In fact, it's worse.  All the previous examples now become silent failures and no output is produced.
Comment 6 gobnat 2012-03-01 02:49:56 UTC
Just to confirm - I am using a freshly downloaded v3.5.0 atm
Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735
Comment 7 sasha.libreoffice 2012-03-01 04:13:43 UTC
Thanks for the additional experiments.
Reproduced in 3.5.0 on 64-bit Fedora.
Only this command was tested:
/opt/libreoffice3.5/program/soffice --headless -convert-to txt dt20.doc
Comment 8 Trever L. Adams 2012-10-20 17:40:42 UTC
I am seeing this in Fedora 17 with LibreOffice version 3.5.7.2-2 (their package version).

I only see the silent failure when a window is already open: if any windows are open, it fails silently. With no windows open, it fails with .ods and .odp (bug #56231), but works with .odt.

I haven't yet tried to duplicate it with multiple sessions of --headless --convert.... to see whether it fails. My situation requires that it works with --headless --convert no matter how many instances are running. I would love to see this fixed.
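
One way to experiment with several simultaneous conversions is to give each run its own user profile through the `-env:UserInstallation` bootstrap variable. The Python sketch below only builds such a command line. Whether a separate profile avoids the failure seen with an open window is an assumption that this thread does not confirm, and `build_isolated_command` is a hypothetical helper name.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def build_isolated_command(
    soffice: str, source: Path, target_spec: str, profile_dir: Path
) -> List[str]:
    """Build a soffice command line that uses a private user profile."""
    # as_uri() needs an absolute path, hence resolve().
    profile_uri = profile_dir.resolve().as_uri()
    return [
        soffice,
        "-env:UserInstallation=" + profile_uri,
        "--headless",
        "--convert-to",
        target_spec,
        str(source),
    ]


def convert_isolated(soffice: str, source: Path, target_spec: str = "txt") -> int:
    """Run one conversion in a throwaway profile and return the exit code."""
    with tempfile.TemporaryDirectory() as profile:
        command = build_isolated_command(soffice, source, target_spec, Path(profile))
        return subprocess.run(command, check=False).returncode
```

Each call to `convert_isolated` creates and discards its own profile directory, so parallel runs do not share one. The exit code alone is not a reliable success signal here, given the silent failures reported above.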
Comment 9 Riccardo Magliocchetti 2013-06-06 20:22:42 UTC
Can reproduce on libreoffice-4-1 branch:

$ /usr/local/lib/libreoffice/program/soffice --convert-to txt test_tmp.doc 
convert /home/rm/test_tmp.doc -> /home/rm/test_tmp.txt using 
                                                       ^^^^^^ looks like txt does not match a filter
$ file test_tmp.txt 
test_tmp.txt: OpenDocument Text

$ /usr/local/lib/libreoffice/program/soffice --convert-to txt:writer_Text test_tmp.doc 
convert /home/rm/test_tmp.doc -> /home/rm/test_tmp.txt using writer_Text
Overwriting: /home/rm/test_tmp.txt
Error: Please reverify input parameters...
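
The empty filter name after "using" in the first log line is the visible symptom that no filter matched. As a rough sketch, a wrapper script could parse that line and flag the case. This assumes the log format stays exactly as shown in this comment; `parse_convert_line` is a hypothetical helper, not something soffice provides.

```python
import re
from typing import NamedTuple, Optional


class ConvertLine(NamedTuple):
    source: str
    target: str
    filter_name: str  # empty string when no filter was matched


# Matches lines such as:
#   convert /home/rm/test_tmp.doc -> /home/rm/test_tmp.txt using writer_Text
_CONVERT_RE = re.compile(r"^convert (.+?) -> (.+) using\s*(\S*)\s*$")


def parse_convert_line(line: str) -> Optional[ConvertLine]:
    """Parse one 'convert ... -> ... using ...' line, or return None."""
    match = _CONVERT_RE.match(line)
    if match is None:
        return None
    return ConvertLine(match.group(1), match.group(2), match.group(3))


def filter_was_matched(line: str) -> bool:
    """True only if the line is a convert line naming a non-empty filter."""
    parsed = parse_convert_line(line)
    return parsed is not None and parsed.filter_name != ""
```

With the two log lines above, `filter_was_matched` returns False for the bare `txt` run and True for the `txt:writer_Text` run. A named filter does not guarantee success, as the second run shows.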
Comment 10 Marcos Souza 2013-07-09 04:03:21 UTC
I investigated this problem a little:

The function SfxFilterMatcher::GuessFilterIgnoringContent calls the function:

xDetection->queryTypeByURL( rMedium.GetURLObject().GetMainURL( INetURLObject::NO_DECODE ) ); 

to get the type of the file that will be exported. But this function returns writer_T602_Document.

So the function GetFilter4EA is called to verify whether the type has a filter... but in this case, writer_T602_Document doesn't seem to be a valid filter.

What can I do now? I believe we're close to solving this issue, but I need help!
Comment 11 Marcos Souza 2014-01-14 02:11:04 UTC
We can't figure out the problem by using bibisect, because this bug appears before the oldest build.
Comment 12 Cédric Bosdonnat 2014-01-20 09:00:28 UTC Comment hidden (noise)
Comment 13 Alexandre Vicenzi 2014-02-05 19:02:42 UTC
I will try to fix this Bug.

Does anyone have any idea where I can start?

Marcos found something and I will take a look, but he asked for help too.
Comment 14 sasha.libreoffice 2014-02-06 05:05:36 UTC
Thanks for your interest in this bug.
IMHO you should start here: irc://chat.freenode.net/libreoffice-dev
Comment 15 Marcos Souza 2014-02-11 21:29:24 UTC
(In reply to comment #14)
> Thanks for your interest in this bug.
> IMHO you should start here: irc://chat.freenode.net/libreoffice-dev

We're asking here because we don't have time to be on IRC.

So by posting here, maybe some hacker could help to solve this issue.
Comment 16 sasha.libreoffice 2014-02-12 09:56:07 UTC
Sorry, but hackers usually do not visit this bug report. You may only find them in chat.
Comment 17 Maxim Monastirsky 2014-06-25 09:50:23 UTC
(In reply to comment #9)
> $ /usr/local/lib/libreoffice/program/soffice --convert-to txt:writer_Text
This command is wrong. The filter is called "Text", not "writer_Text". And it works for me when using the correct name.

(In reply to comment #10)
> xDetection->queryTypeByURL( rMedium.GetURLObject().GetMainURL(
> INetURLObject::NO_DECODE ) ); 
> 
> to get the type of the file that will be exported. But, this function
> returns writer_T602_Document.
When several file types register the same extension, TypeDetection::queryTypeByURL is guaranteed to return the one that has the "Preferred" flag set to true. In this case we have two types that use the "txt" extension: generic_Text, and writer_T602_Document. generic_Text has Preferred=false [1], and writer_T602_Document has Preferred=true [2]. That's the reason it returns writer_T602_Document.

> What can I do now?
Setting Preferred=true for generic_Text solves this bug for me (not sure about the side effects it may have). But it's just a workaround hiding the real problem: TypeDetection::queryTypeByURL is not suited for searching for an export filter, because it may return types (like writer_T602_Document) that don't have an export filter at all.

[1] http://opengrok.libreoffice.org/xref/core/filter/source/config/fragments/types/generic_Text.xcu#23
[2] http://opengrok.libreoffice.org/xref/core/filter/source/config/fragments/types/writer_T602_Document.xcu#23
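
The selection logic described in this comment can be illustrated with a small Python model. It is a simplified sketch, not LibreOffice code: the data classes, the function names and the import-only filter name `t602_import_only` are hypothetical, and `find_export_filter` is only one possible export-aware lookup, not the fix that was eventually applied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FileType:
    name: str
    extensions: Tuple[str, ...]
    preferred: bool


@dataclass(frozen=True)
class Filter:
    name: str
    type_name: str
    can_export: bool


# Preferred flags as described in this comment.
TYPES: List[FileType] = [
    FileType("generic_Text", ("txt",), preferred=False),
    FileType("writer_T602_Document", ("txt",), preferred=True),
]

FILTERS: List[Filter] = [
    Filter("Text", "generic_Text", can_export=True),
    # Hypothetical name for an import-only filter.
    Filter("t602_import_only", "writer_T602_Document", can_export=False),
]


def candidates_for(extension: str) -> List[FileType]:
    """All types registered for an extension, preferred ones first."""
    matches = [t for t in TYPES if extension in t.extensions]
    return sorted(matches, key=lambda t: not t.preferred)


def query_type_by_extension(extension: str) -> Optional[FileType]:
    """Model of the current behavior: the preferred type always wins."""
    matches = candidates_for(extension)
    return matches[0] if matches else None


def export_filter_for_type(type_name: str) -> Optional[Filter]:
    """Return an export-capable filter for the type, if one exists."""
    for candidate in FILTERS:
        if candidate.type_name == type_name and candidate.can_export:
            return candidate
    return None


def find_export_filter(extension: str) -> Optional[Filter]:
    """Export-aware lookup: skip types that cannot be exported."""
    for file_type in candidates_for(extension):
        found = export_filter_for_type(file_type.name)
        if found is not None:
            return found
    return None
```

In this model, `query_type_by_extension("txt")` yields writer_T602_Document, for which `export_filter_for_type` finds nothing, which mirrors the empty filter name in comment 9. `find_export_filter("txt")` falls through to generic_Text and returns the "Text" filter without changing any Preferred flag.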
Comment 18 Maxim Monastirsky 2014-08-05 18:40:31 UTC
*** Bug 82196 has been marked as a duplicate of this bug. ***
Comment 19 Robinson Tryon (qubit) 2015-12-17 06:50:13 UTC
Migrating Whiteboard tags to Keywords: (prebibisect)
[NinjaEdit]