I am working on a system where I need to extract text from every document type supported by LibreOffice. The exported text needs to be UTF-8, which I understand is what LibreOffice normally exports.
The problem is that running --convert-to txt:Text on Impress or Calc files, whether native or imported, produces an error.
For Calc, at least for my needs, CSV output with a space as the separator instead of a comma would be perfect.
For Impress, just dump the text out using a newline or other whitespace character between text objects and slides, otherwise just straight UTF-8 text.
Thank you for considering this feature.
I am confirming this enhancement request. Status set to NEW. Version changed to Inherited From OOo as this feature has never been available. Tested under Crunchbang 11 x86_64 running:
- v22.214.171.124 OOO330m19 Build: 401
- v126.96.36.199 Build ID: 70feb7d99726f064edab4605a8ab840c50ec57a
(In reply to comment #0)
> The problem is that if you try to --convert-to txt:Text with Impress or Calc
> native or imported files, an error is given.
For Calc this command is wrong because the CSV filter is called "Text - txt - csv (StarCalc)", not "Text". If you use the csv extension you don't even need to specify the filter name, so --convert-to csv should be enough.
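For anyone scripting the Calc case, here is a minimal Python sketch of how the command line could be assembled. The helper names (build_csv_command, convert_to_csv) are made up for illustration, and the 32,34,76 option string (space as field separator, double quote as text delimiter, UTF-8 as character set) is my reading of the CSV filter options, so please verify it against the filter documentation before relying on it.

```python
import subprocess

# Name of the Calc CSV export filter, as quoted in this report.
CSV_FILTER = "Text - txt - csv (StarCalc)"


def build_csv_command(src, outdir, separator=" ", quote='"', charset=76):
    """Build the soffice argument list for a CSV export.

    Assumption: the filter options are "field separator, text delimiter,
    character set", given as ASCII codes, with 76 meaning UTF-8.
    """
    options = f"{ord(separator)},{ord(quote)},{charset}"
    return [
        "soffice",
        "--headless",
        "--convert-to",
        f"csv:{CSV_FILTER}:{options}",
        "--outdir",
        outdir,
        src,
    ]


def convert_to_csv(src, outdir):
    """Run the conversion; raises CalledProcessError if soffice fails."""
    subprocess.run(build_csv_command(src, outdir), check=True)
```

If the default comma separator is acceptable, plain --convert-to csv should do, and the filter name and options can be dropped.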
For Impress it won't work because Impress doesn't have a plain text export filter, so let's focus on this.
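Until such a filter exists, a possible workaround for native .odp files is to read the text straight out of the package. The following is only a sketch under stated assumptions, not how an export filter would be implemented: odp_to_text is a hypothetical helper, it assumes the slide text lives in text:p elements under each draw:page in content.xml, and it will also pick up speaker notes stored inside the page. Imported formats would first have to be converted to .odp.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF namespaces for drawing pages (slides) and text paragraphs.
DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odp_to_text(path):
    """Return the text of an .odp file: one paragraph per line,
    with a blank line between slides."""
    with zipfile.ZipFile(path) as odp:
        root = ET.fromstring(odp.read("content.xml"))

    slides = []
    for page in root.iter(f"{{{DRAW}}}page"):
        # itertext() flattens nested spans inside each paragraph.
        paragraphs = ("".join(p.itertext()) for p in page.iter(f"{{{TEXT}}}p"))
        slides.append("\n".join(t for t in paragraphs if t.strip()))
    return "\n\n".join(slides)
```

The function returns a Python string, so the caller decides the encoding, for example open("slides.txt", "w", encoding="utf-8").write(odp_to_text("slides.odp")).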