Bug 106041 - Specific characters at the beginning of a paragraph
Status: RESOLVED INSUFFICIENTDATA
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: unspecified (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-02-16 10:35 UTC by Butch
Modified: 2018-05-30 16:48 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Demo odt including description and macros (19.95 KB, application/vnd.oasis.opendocument.text)
2017-02-16 10:35 UTC, Butch

Description Butch 2017-02-16 10:35:24 UTC
Created attachment 131268
Demo odt including description and macros

Characters such as „ or – at the beginning of a paragraph are handled incorrectly.
The macros included in my Demo.odt demonstrate this. As an example, they insert the format marks <small> … </small> around text formatted with a 10 pt font size.
The result shows that the characters „ and – are no longer included in the correct formatting!

Could this be related to https://bugs.documentfoundation.org/show_bug.cgi?id=103308 ?
Comment 1 V Stuart Foote 2017-02-16 15:44:28 UTC
Cannot confirm within the GUI. In the Find & Replace (Ctrl+H) dialog, under "Other options", I enabled "Regular expressions", used the "Text Format (search)" dialog to set the font size, and used "Text Format (replace)" to replace the selection with tags. Each ".*" regular expression then selects the whole paragraph *including* the leading characters, one find for each paragraph/line.

The replace string then places the full match ("&") between the added text tags.

Within the GUI the find/replace is not affected.
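
As a rough illustration of the replace described above, here is a minimal Python sketch of the same idea on plain text. It is not the Writer implementation and it ignores the font size filter entirely; the function name `wrap_paragraphs` and its parameters are hypothetical. Python's `m.group(0)` plays the role that "&" plays in the Writer replace field.

```python
import re


def wrap_paragraphs(text, open_tag="<small>", close_tag="</small>"):
    """Wrap every non-empty line (treated as a paragraph) in the given tags."""
    # ".+" is used instead of ".*" so that empty lines are left untouched.
    # m.group(0) is the whole match, including any leading „ or – character.
    return re.sub(
        r"(?m)^.+$",
        lambda m: f"{open_tag}{m.group(0)}{close_tag}",
        text,
    )


sample = (
    "„This is a 10 pt direct format paragraph.“\n"
    "–This is a 10 pt direct format paragraph."
)
print(wrap_paragraphs(sample))
```

On plain text like this, the leading „ and – are simply part of each match; any split can only come from how the paragraph is divided into formatted portions, which this sketch does not model.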

Unfortunately I don't know enough about running the select/find/replace in your macros to judge if that syntax is correct. But it does function correctly in the UI.

You might verify that you have no autocorrect options set for those characters but I don't think that is the issue here.  Bug 103308 was invalid, a simple misunderstanding of the autocorrect defaults.


=-testing-=
On Windows 10 Pro 64-bit en-US with
Version: 5.4.0.0.alpha0+
Build ID: 6de3688cc6bd52ce08ff8a4327e59dbbc8a5c7d4
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-02-15_23:33:50
Locale: en-US (en_US); Calc: CL

also
Version: 5.3.0.3 (x64)
Build ID: 7074905676c47b82bbcfbea1aeefc84afe1c50e1
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; Layout Engine: new; 
Locale: en-US (en_US); Calc: group
Comment 2 Butch 2017-02-16 16:41:01 UTC
1) Yes, the autocorrect setting has no influence.

2) I reproduced the same within the GUI, as you described. Did you notice that the GUI treats these characters differently too?
I see the following result:
<small>~This is a 10 pt direct format paragraph.~</small>
<small>„</small><small>This is a 10 pt direct format paragraph.“</small>
<small>–</small><small>This is a 10 pt direct format paragraph.</small>
<small>°This is a 10 pt direct format paragraph.°</small>
„ and – have separate tags!
This would mean that there could be an issue with these characters, which manifests itself when using macros.
Comment 3 V Stuart Foote 2017-02-16 16:46:46 UTC
Not for Windows 10 and the en-US locale--the full strings are picked up with each regex find. Of course the highlighting had to be removed so the only difference from default format is the size of the font.

But given that, if you are seeing a find selection of ".*" picking up those glyphs individually--that could be a localization issue.

What Windows build and locale are you working with?
Comment 4 Butch 2017-02-16 16:56:23 UTC
You are right.
I am using Windows 10 / German (10.0.14393 Build 14393).
And .* is picking up these two characters individually!

Do you have an idea for a workaround?
Comment 5 V Stuart Foote 2017-02-16 17:16:34 UTC
Looking a bit further, this seems to be a formatting issue with your sample text, because it is not cleanly entered new text.

If you open the ODF archive for the document, the two problem glyphs "„" and "–" are both split from their paragraph text and styled T2, with DDE links and text:bookmark-start elements, while the "This is a" string is styled T1, with DDE link text:bookmark-end elements, and is also split from the paragraph text.

The other paragraphs/lines are simple text.
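
For anyone who wants to repeat that inspection without unzipping by hand, the following is a minimal Python sketch using only the standard library. It assumes the usual ODF layout (a `content.xml` member in the archive, and the standard ODF text namespace); the function name `list_text_portions` is hypothetical, and it only reports direct `text:span` children of each paragraph, not bookmarks or nested spans.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF namespace for text:p, text:span and text:style-name.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def list_text_portions(odt_file):
    """Return one list per paragraph with (style_name, text) for each direct text:span."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    paragraphs = []
    for para in root.iter(f"{{{TEXT_NS}}}p"):
        spans = [
            (span.get(f"{{{TEXT_NS}}}style-name"), "".join(span.itertext()))
            for span in para.findall(f"{{{TEXT_NS}}}span")
        ]
        paragraphs.append(spans)
    return paragraphs
```

A paragraph that is simple text yields an empty list, while a paragraph split as described above yields separate entries such as a T2 portion for the leading glyph followed by a T1 portion.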

Your .* find will then treat the T2 and T1 styled elements as different regex matches--that is expected.  Did you check typing in the strings cleanly as new text in a document?

So the question for your process is where the DDE linkages are generated *and styled*: external to LibreOffice? To adjust the scripting for the output, do you need the DDE linkage? Can the style be adjusted after generation but before parsing?
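
One way to picture "adjusting before parsing" is to coalesce adjacent text portions that share the same font size before wrapping them in tags. The Python sketch below works on a hypothetical data model, a list of `(text, font_size)` tuples per paragraph, and not on the LibreOffice API; the function name `tag_runs` and its parameters are assumptions made for illustration only.

```python
from itertools import groupby


def tag_runs(runs, size=10.0, open_tag="<small>", close_tag="</small>"):
    """Tag text by font size after merging adjacent runs of equal size.

    `runs` is a list of (text, font_size) tuples for one paragraph.
    """
    out = []
    # groupby only merges neighbours, so the order of the text is preserved.
    for font_size, group in groupby(runs, key=lambda run: run[1]):
        text = "".join(part for part, _ in group)
        out.append(f"{open_tag}{text}{close_tag}" if font_size == size else text)
    return "".join(out)


split_paragraph = [
    ("„", 10.0),
    ("This is a", 10.0),
    (" 10 pt direct format paragraph.“", 10.0),
]
print(tag_runs(split_paragraph))
```

With this approach, a paragraph that is stored as several portions of the same size still ends up inside a single pair of tags.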

But it seems this is NOT A BUG.
Comment 6 Butch 2017-02-16 19:12:11 UTC
The problem seems to be that in LO Writer it is nearly impossible to enter text "cleanly". Every backspace deletion while typing, every character inserted into an already written word or line, etc. results in interrupted styling, as you described.

In the case of my demo text, the German quotation mark and dash were inserted by autocorrect (for ", -- and space). This is the normal setting for German. And it results in interrupted styling!

The only "way" I found to enter such a text "cleanly": Copy the text from Writer into a Unicode txt file, and copy it from there back to the Writer document! In this case there is no interrupted styling.

(However, even in this case the macro Demo2 produces the same incorrect result. This may be related to the process of copying into a new document, as used in the macro. Unfortunately, I am unable to find information in the LO documentation on applicable values for SelectedFormat... )

(BTW, you have closed Bug 103308. But there was a comment by me which reported a strange behavior of the autocorrect option!)
Comment 7 V Stuart Foote 2017-02-16 21:52:05 UTC
Back to unconfirmed.

@Regina-- any thoughts on use in a de-DE locale?
Comment 8 Buovjaga 2017-02-28 08:57:40 UTC
Butch: so you tried -- in a clean document? I have autocorrecting of -- to long dash as well and it does not get its own XML element when I open content.xml.
I don't get why German would be different.

Win 7 Pro 64-bit Version: 5.4.0.0.alpha0+
Build ID: eb7b03b052ffe8c2c577b2349987653db6c53f76
CPU threads: 4; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2017-02-26_22:34:18
Locale: fi-FI (fi_FI); Calc: CL
Comment 9 Xisco Faulí 2017-11-01 22:56:49 UTC
(In reply to Buovjaga from comment #8)
> Butch: so you tried -- in a clean document? I have autocorrecting of -- to
> long dash as well and it does not get its own XML element when I open
> content.xml.
> I don't get why German would be different.
> 
> Win 7 Pro 64-bit Version: 5.4.0.0.alpha0+
> Build ID: eb7b03b052ffe8c2c577b2349987653db6c53f76
> CPU threads: 4; OS: Windows 6.1; UI render: default; 
> TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2017-02-26_22:34:18
> Locale: fi-FI (fi_FI); Calc: CL

Dear Reporter,
Could you please answer the question above?
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the question is answered.
Comment 10 QA Administrators 2018-05-02 15:47:35 UTC Comment hidden (obsolete)
Comment 11 QA Administrators 2018-05-30 16:48:46 UTC
Dear Bug Submitter,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INSUFFICIENTDATA due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

Warm Regards,
QA Team

MassPing-NeedInfo-20180530