When I export a Writer document as Text, paragraphs that are hidden in the PDF export are shown in the TXT file.

Example template: https://ask.libreoffice.org/upfiles/15929228703192809.odt
Example python3 code: https://pastebin.com/i7u7X9gB

PDF contains:
Simple paragraphs with variable abc
Hidden paragraph if abc
Simple paragraph with variable 3.65
Hidden paragraph if greater 5
Numeric user field 123,458.67

but the TXT file contains:
Simple paragraphs with variable abc
Hidden paragraph if abc
Hidden paragrap if not abc
Simple paragraph with variable 3.65
Hidden paragraph if less 5
Hidden paragraph if greater 5
Numeric user field 123,458.67

To generate the TXT file, replace the last script line with:
model.storeToURL("file:///tmp/output/test1.txt", [PropertyValue("FilterName", -1, "Text", 0)])
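To make the discrepancy explicit, the PDF and TXT listings in this report can be compared line by line. The following is only a small sketch: the split into lines is assumed from the pasted output, and `extra_lines` is a hypothetical helper, not part of the linked script.

```python
# Lines as reported for the two exports (line breaks are an assumption).
pdf_lines = [
    "Simple paragraphs with variable abc",
    "Hidden paragraph if abc",
    "Simple paragraph with variable 3.65",
    "Hidden paragraph if greater 5",
    "Numeric user field 123,458.67",
]
txt_lines = [
    "Simple paragraphs with variable abc",
    "Hidden paragraph if abc",
    "Hidden paragrap if not abc",
    "Simple paragraph with variable 3.65",
    "Hidden paragraph if less 5",
    "Hidden paragraph if greater 5",
    "Numeric user field 123,458.67",
]


def extra_lines(reference, candidate):
    """Return the lines of candidate that do not occur in reference, in order."""
    seen = set(reference)
    return [line for line in candidate if line not in seen]


if __name__ == "__main__":
    # Print the paragraphs that appear only in the TXT export.
    for line in extra_lines(pdf_lines, txt_lines):
        print(line)
```

Under that assumed line split, the two paragraphs that show up only in the TXT export are the ones whose hide condition holds.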
What PDF is meant? Have you made a typo/thinko, or have you forgotten to provide a link/attachment?
Created attachment 166361 [details] PDF generated by script
Created attachment 166362 [details] Text file generated by script
(In reply to Mike Kaganski from comment #1)
> What PDF is meant? Have you made a typo/thinko, or have you forgotten to
> provide a link/attachment?

I've attached both the PDF and the text file.
Anyway, all the complexity with Python scripts and code modifications is not needed to see the problem (why would people not try doing simple things in the GUI when filing bugs, to simplify the reproduction steps?). Open the document in Writer, and save as Text.

Repro with
Version: 7.0.2.2 (x64)
Build ID: 8349ace3c3162073abd90d81fd06dcfb6b36b994
CPU threads: 12; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL
I suppose the sw/qa/python/var_fields.py test could check the content of the exported text to cover this issue.
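Such a check could be sketched in plain Python roughly as follows. This is only an illustration: `find_leaked_hidden` and the sample strings are hypothetical and are not part of sw/qa/python/var_fields.py. A real test would first export the document and read the resulting file back.

```python
def find_leaked_hidden(exported_text, hidden_paragraphs):
    """Return the hidden paragraphs that still occur as lines in the export."""
    exported = {line.strip() for line in exported_text.splitlines()}
    return [p for p in hidden_paragraphs if p in exported]


if __name__ == "__main__":
    # Hypothetical exported content and the paragraphs expected to be hidden.
    sample = "Simple paragraph with variable 3.65\nHidden paragraph if less 5\n"
    leaked = find_leaked_hidden(sample, ["Hidden paragraph if less 5"])
    print("leaked hidden paragraphs:", leaked)
```

A test built on this idea would assert that the returned list is empty whenever hidden content is expected to be left out of the export.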
I still think it's a bug, but OTOH there might be a consideration/distinction related to difference between export-only PDF, and editable document format like TXT: the former should keep visual representation, while the latter is expected to keep as much information as possible ... keeping all the text, and removing all text's unsupported properties. Miklos, Michael: what do you think: Is it better to keep text and drop its "hidden" attribute, or to keep the "hidden" attribute value by dropping the text in Text filter?
Whatever you do unconditionally, somebody will be upset. So I guess keeping the status quo makes sense, so at least the ones who are happy already are not disturbed. You could add an option for this to make everyone happy, but then you have the cost of one more option. :-)
Could FilterOptions be used to enable new behavior?
(In reply to Oleg Shchelykalnov from comment #9)

Yes, but which filter should that be? We have a "Text" filter without any options, and we have a "Text (encoded)" filter with encoding-related settings. Personally, I'd love to see the two filters become one, with only one entry in file dialogs, and with the new setting added to the current encoding-related dialog (and to the FilterOptions). But... Anyway, shouldn't the setting (if needed) for now be added to the already configurable "Text (encoded)" filter?
I've tried to look at it, but I cannot find where the sources of the "Text (encoded)" filter are located.
I found the SwAsciiOptions and SwASCWriter classes, which are used here. Should SwAsciiOptions accept a sixth option that controls whether hidden text is included, with "yes" as the default?
(In reply to Oleg Shchelykalnov from comment #12)
> Should SwAsciiOptions accept sixth option to include or not hidden text and
> set it yes by default?

Yes, that's needed if you want this implemented, of course. I don't see a problem with this, as long as you keep backward compatibility (so by default it should behave as before).
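The backward-compatibility requirement can be illustrated with a small Python model of a comma-separated options string that gains an optional sixth token. This is a sketch under assumptions, not the C++ SwAsciiOptions code: the function name `parse_options`, the "true"/"false" spelling of the sixth token, and the sample option strings are all hypothetical.

```python
def parse_options(options):
    """Split a comma-separated options string into (first five tokens, include_hidden).

    Assumption: the sixth token, if present and non-empty, is "true" or "false".
    When it is missing, hidden text is included, which matches the old behaviour.
    """
    tokens = options.split(",") if options else []
    include_hidden = True  # default keeps existing callers unaffected
    if len(tokens) >= 6 and tokens[5].strip():
        include_hidden = tokens[5].strip().lower() != "false"
    return tokens[:5], include_hidden


if __name__ == "__main__":
    # Old-style string with five tokens: hidden text is still included.
    print(parse_options("UTF8,LF,,,"))
    # Hypothetical new-style string: the sixth token switches hidden text off.
    print(parse_options("UTF8,LF,,,,false"))
```

Treating a missing or empty sixth token as "include" means that existing option strings with five tokens keep producing the same output as before.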
I've prepared unit tests for it, and now I'm stuck on how hidden paragraphs and hidden text are represented in the LibreOffice code. New option and unit test for it: https://gerrit.libreoffice.org/c/core/+/105625 I also found out that HTML export includes hidden paragraphs as well.
Handling hidden paragraphs turned out to be simple. I managed to do it in https://gerrit.libreoffice.org/c/core/+/105631 but for some reason gerrit shows a merge conflict, although the branch was rebased against master. I also tried to cover hidden text in the test, but it looks like it is broken in a different way: it is not included in the output file when the hidden condition is true, either way.
Oleg Shchelykalnov committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/aafe21d8765158d223dd359e6737b64ed1b34549 tdf#137469 Add option to disable hidden text in text filter It will be available in 7.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I've rebased and fixed the other changesets that address this issue.
Oleg Shchelykalnov committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/c96b61f86ef3f4cdc34f84043fed2724b6d9732b tdf#137469 Prepare tests for encoded text filter It will be available in 7.2.0.
Oleg Shchelykalnov committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/b5e07b1339f73841664b28c65639f1638bd7edf4 tdf#137469 Implement and test excluding hidden text in text filter It will be available in 7.2.0.
This broke the fix for bug 119800. Now attachment 144783 [details] doesn't show the first frame, again.