Description:
When opening an RTF document that uses plain typewriter font text with layout, multiple spaces are replaced by a series of "\u8198\'3f \u8198\'3f ", with the result that justification is lost.
If the document is stripped back to plain text (FILE.txt) and opened in LO, it imports correctly, with spaces and justification preserved, though needing page and font size modification if the lines are long.

Steps to Reproduce:
1. Open any RTF document using typewriter font and spaces to justify columns

Actual Results:
Justification is lost, and "spaces" are clearly not all of the same length

Expected Results:
Justification should be preserved, and spaces kept as normal space characters

Reproducible: Always

User Profile Reset: No

Additional Info:
Save it as some other FILE_2.rtf, and the changes will be clear

Version: 7.1.5.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Ubuntu package version: 1:7.1.5~rc2-0ubuntu0.20.04.1~lo1
Calc: threaded
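For anyone checking a re-saved file (such as FILE_2.rtf) for this symptom, here is a minimal Python sketch. It assumes the substitution appears exactly as quoted in the description, i.e. `\u8198` followed by the fallback `\'3f`. The decimal value 8198 is U+2006 (SIX-PER-EM SPACE), which would explain why the "spaces" are not all the same width. The function name and the sample string are illustrative only.

```python
import re
import unicodedata

# \u8198 is the RTF decimal escape for U+2006; \'3f is the "?" fallback
# character that follows it, exactly as quoted in the report.
SUBSTITUTED_SPACE = re.compile(r"\\u8198\\'3f")


def count_substituted_spaces(rtf_text: str) -> int:
    """Count occurrences of the substituted space escape in raw RTF text."""
    return len(SUBSTITUTED_SPACE.findall(rtf_text))


# Hypothetical fragment modelled on the string quoted in the description.
sample = r"A\u8198\'3f \u8198\'3f B"

print(unicodedata.name(chr(8198)))
print(count_substituted_spaces(sample))
```

Running the counter on the original source file and on the re-saved file should show whether the escapes were introduced on import and then written out on save.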
Created attachment 174514 [details] A source rtf ready to be opened in LO
Created attachment 174515 [details] The RTF saved (as) after opening SRC.rtf
The problem is in FILEOPEN, not SAVE. This can be seen in the uneven motion of the cursor within multiple spaces (two short moves with the left/right arrow keys, then one longer one). Isolated space characters are not affected. Unnecessary fonts and styles are introduced, but that is irrelevant to the current issue.
OTOH, if I introduce into the source RTF a line after the fonttbl
{\*\generator LibreOffice/7.1.5.2$Linux_X86_64 LibreOffice_project/10$Build-2}
the RTF opens as it should. So there is a "cheat" solution!
I'm not an expert, but as far as I can see, attachment 174514 [details] is not an RTF file. I can't open it directly from the browser, so I have to save it first, and in this case the suggested file format is "text". When I save attachment 174515 [details], the suggested file format is "Rich Text Format".
Please check => NEEDINFO
It opens in my browser (Pale Moon, a Firefox derivative), but as a text file, showing all the RTF directives. How should a browser open an RTF file? - I don't know!
But it *is* an RTF file, albeit with minimal directives, and LO opens it as such.
(In reply to Bernard Moreton from comment #5)
> It opens in my browser (Pale Moon, FireFox derivative), but as a text file,
> showing all the RTF directives. How should a browser open an RTF file? - I
> don't know!
> But it *is* an rtf file, albeit with minimal directives, and LO opens it as
> such.

How did you create that RTF file?
The uploaded example file is a pared-down extract from a much longer PDF report file, with most of the actual text replaced character-for-character, for obvious discretionary reasons.
The PDF was reduced to text using
pdftotext -layout $src # $src being the PDF file
A standard RTF header block is then written, with the mandatory
{\rtf1\ansi
followed by a brief FONTTBL, COLORTBL (probably redundant), and a single style in the STYLESHEET.
I now follow that with the
{\*\generator LibreOffice/7.1.5.2$Linux_X86_64 LibreOffice_project/10$Build-2}
to stop the unwanted behaviour of appending the strange characters in multi-space strings.
Then come the lines defining the paper size, margins, and orientation for the document and the section (the latter again probably redundant), and finally the "\pard\plain \s7" to start the body of the text.
The text is then copied from the text file, adding a "\line" at each line-end.
And finally the RTF ending is added, "}"

I'd upload the BASH executable, but the source RTF already uploaded shows the process more clearly than the BASH script could do!

I've been using this sort of method for many years for reporting from 4GL, whether simply to LO (and OOo before that), or using LO to create a PDF from the command line, though in 4GL reporting most of the formatting is done by defining tabs. When processing pre-formatted text, however, especially from the output of PDFTOTEXT, multiple spaces are unavoidable; but they should *never* have strange characters added to them, as the LO FILEOPEN for RTF obviously does.
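The wrapping process described in the comment above can be sketched in Python roughly as follows. This is not the reporter's BASH script, which was not uploaded. The font name, style name, paper size, and margin values are placeholder assumptions (A4 landscape in twips), the COLORTBL is omitted, and `escape_rtf` and `text_to_rtf` are hypothetical helper names.

```python
def escape_rtf(line: str) -> str:
    """Escape RTF special characters and non-ASCII text in one line."""
    out = []
    for ch in line:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal per UTF-16 code unit;
            # "?" is the fallback for readers without Unicode support.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                code = int.from_bytes(units[i:i + 2], "little")
                if code > 32767:
                    code -= 65536
                out.append(f"\\u{code}?")
    return "".join(out)


def text_to_rtf(text, generator="anystring"):
    """Wrap pre-formatted text (e.g. pdftotext -layout output) in minimal RTF."""
    header = [
        r"{\rtf1\ansi",
        r"{\fonttbl{\f0\fmodern\fprq1 Courier New;}}",   # assumed font
        r"{\stylesheet{\s7\f0\fs20 Preformatted;}}",     # assumed style name
    ]
    if generator is not None:
        # The workaround from this report: any \generator destination.
        header.append(r"{\*\generator " + generator + "}")
    # Assumed page setup: A4 landscape with narrow margins, in twips.
    header.append(r"\paperw16838\paperh11906\landscape"
                  r"\margl567\margr567\margt567\margb567")
    header.append(r"\pard\plain \s7\f0\fs20")
    # \line goes at each line end, so leading spaces on the next line survive.
    body = [escape_rtf(line) + r"\line" for line in text.splitlines()]
    return "\n".join(header + body + ["}"]) + "\n"


if __name__ == "__main__":
    print(text_to_rtf("Item      Qty     Price\nWidget      2      9.50"))
```

Passing `generator=None` produces a file without the `\generator` group, which should make it easy to compare the two import behaviours side by side.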
Bibisected with linux-64-6.4 to
https://git.libreoffice.org/core/commit/24b04db5a63b57a74e58a7616091437ad68548ac
tdf#123703 RTF import: fix length of space character sequence

Version: 7.4.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: b6266207b55a7633dc82b02142215757512adfb7
CPU threads: 2; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: fi-FI (fi_FI); UI: en-US
Calc: threaded Jumbo
Created attachment 191750 [details] Spaces correctly rendered by Wordpad Wordpad rendering https://bugs.documentfoundation.org/attachment.cgi?id=174514
Created attachment 191751 [details] OpenOffice rendering spaces correctly OpenOffice rendering https://bugs.documentfoundation.org/attachment.cgi?id=174514
Created attachment 191761 [details] Wrong rendering by Libre Office https://bugs.documentfoundation.org/attachment.cgi?id=174514 rendered by Libre Office Writer 7.6 on Linux.
Created attachment 191766 [details] WordPad-inspired example demonstrating the consecutive space issue.
Line 3 is blank; replace it with {\*\generator anystring} and the spaces render correctly.
Created attachment 191767 [details] Montage showing rendering with/without \generator control word
Montage showing the rendering of https://bugs.documentfoundation.org/attachment.cgi?id=191766 with and without the \generator control word. Regions of interest are highlighted in red.
I was about to open a bug report on what I now believe to be the same bug/issue as this.

Working on RTF output for the Pygments project, I see consecutive spaces rendered with variable width, or maybe even replaced with a number of other characters (in Libre Office Writer 7.6.x on Linux and Windows 10). Both my own minimal example [1] and the originally attached RTF example [2] render correctly in WordPad [3] on Windows and in Apache OpenOffice [4] (Windows+Linux).

An interesting observation I made was that adding the following line to the RTF files makes the spaces render correctly in Libre Office.
{\*\generator anystring}

The `{\*\generator ...}` line appears in RTF files produced by WordPad. I can't, however, find the \generator control word described anywhere in the RTF specification. `\*` instructs readers to ignore the control word (a destination) if they do not implement it. WordPad appears to use it to declare which version of Microsoft Rich Edit was used to generate the file (e.g. `{\*\generator Riched20 10.0.18362}`).

[5] shows how replacing the blank line 3 in [1] with `{\*\generator anystring}` causes Libre Office to render the spaces at equal width. [5] also shows a selection of the first 4 characters in Libre Office, e.g. ` 2 ` (space 2 space space), which Libre Office detects as a selection of 6 characters.

As I understand Comment 8 by Buovjaga, a commit has been identified which introduced the bug. That commit is dated August 2019, so I picked a version from 2018 (6.1.6.3), and that version is free of this bug/issue. It renders the spaces correctly.

[1] https://bugs.documentfoundation.org/attachment.cgi?id=191766
[2] https://bugs.documentfoundation.org/attachment.cgi?id=174514
[3] https://bugs.documentfoundation.org/attachment.cgi?id=191750
[4] https://bugs.documentfoundation.org/attachment.cgi?id=191751
[5] https://bugs.documentfoundation.org/attachment.cgi?id=191767
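As a stopgap for existing files, the `\generator` workaround reported above could be applied with a small Python sketch like the one below. It assumes the file contains a `{\fonttbl ...}` group and places the new group directly after it, as an earlier comment describes; `end_of_group` and `add_generator` are hypothetical helper names, and this only sidesteps the import behaviour rather than fixing it.

```python
def end_of_group(rtf: str, start: int) -> int:
    """Return the index just past the group that opens at rtf[start]."""
    depth = 0
    i = start
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            i += 2  # skip the escaped character, e.g. \{ or \}
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced RTF group")


def add_generator(rtf: str, name: str = "anystring") -> str:
    """Insert a {\\*\\generator ...} group after the font table if missing."""
    if "{\\*\\generator" in rtf:
        return rtf  # already present, leave the file untouched
    start = rtf.find("{\\fonttbl")
    if start == -1:
        raise ValueError("no \\fonttbl group found")
    pos = end_of_group(rtf, start)
    return rtf[:pos] + "{\\*\\generator " + name + "}" + rtf[pos:]


if __name__ == "__main__":
    src = "{\\rtf1\\ansi{\\fonttbl{\\f0\\fmodern Courier;}}\n\\pard a   b\\par}"
    print(add_generator(src))
```

The function is idempotent, so running it twice over the same file leaves a single `\generator` group.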
Created attachment 191772 [details] Correct and incorrect rendering in Libre Office (with/without \*\generator or LO Writer before/after version 6.3)
Helpful findings by caolanm and vmiklos in #libreoffice-dev https://git.libreoffice.org/core/+/24f17f0336badfbba276c1e6713a89b4f9bb7cb8%5E%21 https://git.libreoffice.org/core/+/cd7241e3d2892c2a115265f842f464d017d7c7e1%5E%21 https://git.libreoffice.org/core/+/33d966ecc1f9fc44016cdeeed15dbaf6bda68eda%5E%21
*** This bug has been marked as a duplicate of bug 135079 ***