Description:
Hello, I am working on a service that converts documents to pdf or html in order to display them in a browser. I am trying to update the libreoffice version from 7.2 to 7.3 and I am noticing now that it can no longer convert powerpoint documents to html.

The command I have been running is:
    soffice --headless --convert-to html test.pptx

I tried updating my command to specify a filter to use, but that did not seem to work:
    soffice --headless --convert-to html:"impress_html_Export" test.pptx

When I run the first command there is an error message saying that there is no export filter. When I run the second command I get an error that I should verify my input parameters.

Steps to Reproduce:
1. Try to convert .pptx document to .html

Actual Results:
Fails to convert

Expected Results:
Converts document to .html

Reproducible: Always

User Profile Reset: No

Additional Info:
No other information
Repro using 7.3.0.3, but not 7.2.0.4. But 7.3.0.3 works using 'htm' instead of 'html'.
--convert-to html > no export filter: repro, as reporter
--convert-to html:"impress_html_Export" > OK for me, different from reporter
--convert-to htm > OK, as Mike found, using filter: impress_html_Export

I'm not sure about "regression", because conversion of attachment 178918 to html would lose the footer with 7.2 and give different errors for XSL Vendor: 'libxslt'. In all, I'd say minor.

Also, "no export filter, aborting" should be an error, but the exit status of the conversion is 0.
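Since the exit status stays 0 even when no export filter is found, a calling service cannot rely on it alone. A minimal sketch of a workaround, assuming the service is free to check for the output file itself: the helper names (`build_command`, `expected_output`, `convert`) are hypothetical, and only the `soffice` options `--headless`, `--convert-to` and `--outdir` are taken from the real command line.

```python
import subprocess
from pathlib import Path


def build_command(src, target="html", filter_name=None, outdir="."):
    """Build the soffice argument list; naming the filter avoids the short-form lookup."""
    spec = f"{target}:{filter_name}" if filter_name else target
    return ["soffice", "--headless", "--convert-to", spec,
            "--outdir", str(outdir), str(src)]


def expected_output(src, target="html", outdir="."):
    """Path soffice is expected to write: input base name plus the target extension."""
    return Path(outdir) / f"{Path(src).stem}.{target}"


def convert(src, target="html", filter_name=None, outdir="."):
    """Run the conversion and fail loudly if no output file appears."""
    out = expected_output(src, target, outdir)
    # Remove a stale result so an old file cannot mask a failed conversion.
    out.unlink(missing_ok=True)
    # check=True only catches non-zero exits; the report says this case exits with 0.
    subprocess.run(build_command(src, target, filter_name, outdir), check=True)
    if not out.is_file():
        raise RuntimeError(f"soffice produced no output for {src}")
    return out
```

For example, `convert("test.pptx", "html", "impress_html_Export")` would run the explicit-filter form reported as working in this comment and raise if `test.html` is missing afterwards.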
(In reply to Timur from comment #2)
> I'm not sure about "regression"

Marking it that way makes it possible to *bibisect* it to the changing commit, and get an idea what the change was about - which helps clarify if it's the intended change or not. Hence, I used the keyword, and consider it proper for the said reason :-D - not claiming more than that.
7.3 commit 36ce32072658c6ffca75b200f116ddfc11cab138
Date: Tue Jun 22 08:50:00 2021 +0200
source 990b2cb056788f7f412656a303456d90c003cf83
pre 949658028e722e5d2657b503eb20e16e41dbd8cf

author Noel Grandin <noel@peralex.com>
committer Noel Grandin <noel.grandin@collabora.co.uk>
commit 990b2cb056788f7f412656a303456d90c003cf83

    simplify and improve Wildcard

    it is faster to just process OUString data, rather than perform
    expensive conversion to OString and back again.

Hi Noel, please see this.
The report is that PPTX --convert-to html now gives "no export filter", which is true. But my example shows that it wasn't reliable even before; html:"impress_html_Export" gives a better result, both before and now. Also, --convert-to htm behaves differently and properly; how come?
There are other bugs with HTML conversion, and my general conclusion is that the app or filter should be specified explicitly; the short form is not reliable.
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/50add7c97e75d604287218f49c9283aab052fdf0 tdf#148253: fix matching algorithm It will be available in 7.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue. It has been pushed to "libreoffice-7-3": https://git.libreoffice.org/core/commit/2143fa31b9035c7c2cf302ccd3907d0853132e8f tdf#148253: fix matching algorithm It will be available in 7.3.3. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike, until I'm able to test this, I have 3 things to ask you ( I think you are following via Tags, as myself ): 1. How come that html and htm are different? 2. Can you see bug 148275 and confirm or comment? 3. Can you see the mail I sent directly to your Hotmail address? Thanks.
(In reply to Timur from comment #7)
> 1. How come that html and htm are different?

We have two HTML-related export filters: "impress_html_Export" and "XHTML Impress File". The latter refers to the "XHTML_File" type, which has its extension list defined as "html,xhtml". The former refers to the "graphic_HTML" type, and its extensions are "html,htm". Note that both can handle "html", and both have the "html" extension first.

It seems that *for some reason* (I don't know which, but that is irrelevant here - *some* of them must be picked anyway, one or the other), in the *normal* case, the latter one got picked when you didn't explicitly define the filter. But the found commit regressed so that it couldn't find *any* non-last extension in the list - so it didn't see *both* filters handling HTML. But it could find the *last* extension in the list - so it found the graphic_HTML type, and hence the impress_html_Export filter.

> 2. Can you see bug 148275 and confirm or comment?

Yes.

> 3. Can you see the mail I sent directly to your Hotmail address?

Unfortunately, no (I also checked the spam folder)...
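The lookup behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual LibreOffice code: the type and filter names and their extension lists come from this comment, while the dictionaries and the functions `match_any`, `match_last_only` and `find_filters` are hypothetical.

```python
# Type name -> extension list, as quoted in the comment above.
TYPES = {
    "XHTML_File": ["html", "xhtml"],
    "graphic_HTML": ["html", "htm"],
}

# Export filter name -> type it refers to.
FILTERS = {
    "XHTML Impress File": "XHTML_File",
    "impress_html_Export": "graphic_HTML",
}


def match_any(ext, extensions):
    """Intended behaviour: an extension matches anywhere in the list."""
    return ext in extensions


def match_last_only(ext, extensions):
    """Model of the regression: only the last extension in the list is found."""
    return bool(extensions) and extensions[-1] == ext


def find_filters(ext, matcher):
    """Return the filters whose type accepts the given extension."""
    return [name for name, type_name in FILTERS.items()
            if matcher(ext, TYPES[type_name])]


print(find_filters("html", match_any))        # both filters are candidates
print(find_filters("html", match_last_only))  # empty: "no export filter"
print(find_filters("htm", match_last_only))   # only impress_html_Export
```

In this model "html" is never last in either list, so the regressed matcher finds nothing for it, while "htm" is last for graphic_HTML and still resolves to impress_html_Export.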