Bug 94437 - Editing: entries from a deleted alphabetic index reappear when a new index is inserted
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.0.1.2 release
Hardware: x86 (IA32) All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: TableofContents-Indexes
Reported: 2015-09-22 09:41 UTC by Bernard Moreton
Modified: 2023-01-27 12:50 UTC

See Also:
Crash report or crash signature:


Attachments
chapter (ODT) awaiting fresh indexing (45.25 KB, application/vnd.oasis.opendocument.text)
2015-09-22 17:59 UTC, Bernard Moreton
Concordance file for test alphabetic indexing (63 bytes, text/plain)
2015-09-22 18:01 UTC, Bernard Moreton

Description Bernard Moreton 2015-09-22 09:41:24 UTC
When inserting a fresh alphabetic index in a file where an index was inserted and removed previously, the entries from that old index reappear.  This seems to be caused by a SAVE of the file after the old index was deleted.  The reappearance happens even when the new index has nothing in common with the old one. 

This is effectively data corruption, so a major problem.
Comment 1 Buovjaga 2015-09-22 14:55:18 UTC
Could not reproduce, but it may be that I'm not doing it exactly as you did.

1. New document, add a few headings
2. Insert table of contents
3. Delete toc
4. Save
5. Insert alphabetical index

Are you saying the index entries appear in the alphabetical index? For me it stays completely empty.

Win 7 Pro 64-bit, Version: 5.0.1.2 (32-bit)
Build ID: 81898c9f5c0d43f3473ba111d7b351050be20261
Locale: fi-FI (fi_FI)
Comment 2 Bernard Moreton 2015-09-22 17:59:57 UTC
Created attachment 118943 [details]
chapter (ODT) awaiting fresh indexing
Comment 3 Bernard Moreton 2015-09-22 18:01:47 UTC
Created attachment 118944 [details]
Concordance file for test alphabetic indexing
Comment 4 Bernard Moreton 2015-09-22 18:10:56 UTC
Nothing to do with ToC, I'm afraid.

The sample ODT file has had an index created, then deleted, then saved, possibly more than once.

To reproduce:
Check the sample concordance file,
open the ODT document,
check there is no existing visible alphabetic index in it,
go to EOF (not essential, but ...),
Insert/Index, select type=alphabetic index, opt to use Concordance file,
and select the one submitted above, completely different from anything I used in previous aborted attempts.

One new relevant entry should be created, but lots of entries from the previous attempts, mostly duplicated, appear as well.

At least, that's what happens on my system. It *might* be due to something in my environment rather than in the submitted file, but that seems unlikely: I've checked by indexing fresh copies, and those attempts have not triggered the problem.
Comment 5 Buovjaga 2015-10-01 11:14:02 UTC
Reproduced with the attached files.

Win 7 Pro 64-bit, Version: 5.0.2.2 (x64)
Build ID: 37b43f919e4de5eeaca9b9755ed688758a8251fe
Locale: fi-FI (fi_FI)
Comment 6 Oliver Specht (CIB) 2015-11-06 13:30:32 UTC
The bugdoc contains all the index entries already. You can see the small grey squares in the document, e.g. in the first sub-heading 
"(i)  The catechumenate and Lent" 

Automatically generated index entries should be marked as 'auto generated'. All automatically generated entries are deleted when the index is updated.
Until the document is saved and reloaded they are in fact marked, but the mark is not stored in the ODT file, because no corresponding attribute is defined in the ODF standard.

see http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng

The element "text-alphabetical-index-mark-attrs"
needs an additional entry like 
<optional>
 <attribute name="text:isAuto">
  <ref name="boolean"/>
 </attribute>
</optional>
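
As a rough illustration of why such a flag matters, the Python sketch below splits index marks into automatic and manual ones. It assumes the proposed text:isAuto attribute exists, which it does not in ODF 1.2. The function name partition_marks and the sample fragment (including the "Lent" mark) are made up for illustration.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
MARK = f"{{{TEXT_NS}}}alphabetical-index-mark"
STRING_VALUE = f"{{{TEXT_NS}}}string-value"
# Hypothetical attribute taken from the proposal above; not part of ODF 1.2.
IS_AUTO = f"{{{TEXT_NS}}}isAuto"


def partition_marks(xml_text):
    """Return (auto, manual) lists holding the string-value of each index mark."""
    root = ET.fromstring(xml_text)
    auto, manual = [], []
    for mark in root.iter(MARK):
        value = mark.get(STRING_VALUE, "")
        if mark.get(IS_AUTO) == "true":
            auto.append(value)
        else:
            manual.append(value)
    return auto, manual


# Illustrative fragment only; the real bugdoc carries no isAuto attribute.
SAMPLE = f"""<text:p xmlns:text="{TEXT_NS}">
  <text:alphabetical-index-mark text:string-value="Justin" text:isAuto="true"/>
  <text:alphabetical-index-mark text:string-value="Lent"/>
</text:p>"""

if __name__ == "__main__":
    auto, manual = partition_marks(SAMPLE)
    print("auto:", auto)
    print("manual:", manual)
```

With the flag stored in the file, an index update could drop only the entries in the first list and leave the second list alone.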
Comment 7 Svante Schubert 2015-11-09 19:57:36 UTC
To rephrase the issue:
The attached ODT already contains existing index fields, like:

<text:alphabetical-index-mark text:string-value="Justin" text:key1="Justin"/>
<text:span text:style-name="T4">Justin, </text:span>

When someone then adds a new concordance file (like the second attachment), the application neither shows or warns about the existing index fields nor offers an option to remove them.

Oliver's assumption was that the existing index is an automatically generated one, which should be marked as auto generated so that it can be removed when exchanged by the user.

Two questions:
1) What makes Oliver think the index is auto generated?
2) Why should an existing auto generated index be removed, but an existing user index be kept?


Wouldn't showing the existing indexes, with the choice of removal, be a more flexible solution in general?
Comment 8 Svante Schubert 2015-11-09 20:13:00 UTC
Note that the new index file is being merged; only "miles" will be added.

What I see as an issue is that for each of the existing index entries TWO lines are always added, even if the entry exists only once, for instance:
"Sacramentarium Triplex"

In the content.xml:
<text:alphabetical-index-mark text:string-value="Sacramentarium Triplex" text:key1="Sacramentarium Triplex"/>
<text:span text:style-name="T5">Sacramentarium Triplex</text:span>

To me at least, this is much more annoying than the merge behavior and the non-obvious way of deleting existing indices.
Comment 9 Oliver Specht (CIB) 2015-11-10 07:44:02 UTC
(In reply to Svante Schubert from comment #7)
> 
> Oliver's assumption was that the existing index is an automated generated
> one, which should be marked as auto generated so can be removed when
> exchanged by user.
It is not the index that is auto generated, but the index entries!
> 
> Two question:
> 1) What makes Oliver think the index is auto generated?
The attached concordance file makes me think that.
> 2) Why should an existing auto generated index be removed, but an existing
> user index be kept? 
As above: Not the index but the entries are removed when a new index is created. That's the idea behind the concordance file.
> 
> 
> Wouldn't in general showing existing indexes and the choice of removal a more
> flexible solution?
The idea of using a concordance file is to make sure that commonly used technical, scientific, or similar terms are marked automatically, so that you don't need to do the work manually for each document you create. While the document is edited, some of the related text might be removed and other text added, so it makes sense to create the automatic index entries again. To do that you need to know which index entries were created automatically; otherwise you would also delete manually created entries.

To make a long story short: The standard is missing a feature that has been introduced in Writer before it became open source.
Comment 10 Oliver Specht (CIB) 2015-11-10 08:08:29 UTC
(In reply to Svante Schubert from comment #8)
> What I see as an issue is that for each of the existing index entries always
> TWO lines are being added even if the entry only exist ones, for instance:
> "Sacramentarium Triplex"
No issue at all.
The index generates a line for the key and one for the entry. They just happen to show the same string because the entry has a 1st key with the same content as the index entry.
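
A toy model of that behaviour in Python, not Writer's actual code: each mark contributes a line for its first key and, beneath it, a line for the entry itself. The function name index_lines and the two-space indentation are assumptions made for illustration.

```python
def index_lines(marks):
    """Build index lines from (key1, entry) pairs.

    Each distinct key gets one heading line; each entry is listed,
    indented, beneath its key. Entries without a key are listed unindented.
    """
    by_key = {}
    for key1, entry in marks:
        by_key.setdefault(key1, []).append(entry)

    lines = []
    for key1 in sorted(by_key, key=lambda k: (k or "").lower()):
        if key1:
            lines.append(key1)  # the line generated for the key
        for entry in sorted(set(by_key[key1])):
            # the line generated for the entry itself
            lines.append(("  " if key1 else "") + entry)
    return lines


if __name__ == "__main__":
    marks = [("Sacramentarium Triplex", "Sacramentarium Triplex")]
    for line in index_lines(marks):
        print(line)
```

A mark whose key1 equals its string-value therefore yields two lines showing the same text, while a mark without a key yields one.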
Comment 11 Bernard Moreton 2015-11-10 10:11:56 UTC
Tuppence worth again, mostly to show I'm listening to the experts.  The original indexing was done with an earlier form of the concordance file, with too many keys (a misunderstanding of the online Help).  Other than that, both indexings were done the same way.  And the concordance file is not (in this case) a list of standard technical stuff - though it is being applied to all chapters - but based on the author's original indexing, done (I think) in XyWrite III+ .

I'm glad there is consensus that the problem is real.  And it is really confusing to a hack like me who gets stuck in quicksand!  I'm *so* glad of the indexing facility in Writer; it'd be nice if it were just a bit more friendly ...

Thank you all!
Comment 12 QA Administrators 2017-01-03 19:37:12 UTC Comment hidden (obsolete)
Comment 13 Bernard Moreton 2017-01-04 09:59:46 UTC
The bug is still present in LO 5.2.3.2 on Ubuntu 16.04 LTS

If anything, it is now slightly worse, because although the test document opens at its end, the first attempt at inserting the alphabetic index resulted in it being inserted halfway through the document.  I had to Undo, then positively place the cursor at the end of the document, then Insert the alphabetic index again.

But in both attempts, the previously-deleted index entries re-appeared, together with the one new expected entry from the sample concordance file.
Comment 14 QA Administrators 2018-01-05 03:40:56 UTC Comment hidden (obsolete)
Comment 15 Bernard Moreton 2018-01-12 14:06:55 UTC
The problem still exists in LO 5.4.4.2 on Ubuntu 16.04 LTS, at least when using the test documents supplied.

As before, the first attempt to insert the index inserted it on p.18, not at the end, on p.21, though the cursor had been at the end of the document already.

I still have not seen any "small grey squares" in the Writer document; perhaps they're in the XML?  Anyway, I still seem to be stuck with the original index entries, even though they were generated from that earlier concordance file format.

I'd really like to get shot of them!
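
One way to check whether the old marks are in the XML is to read content.xml straight out of the ODT, which is a zip archive. The Python sketch below is only an assumption about how one might do that; the function names are invented, and it looks only at point marks of the form quoted in comment 7.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def index_marks(content_xml):
    """Return (string-value, key1) for each text:alphabetical-index-mark element."""
    root = ET.fromstring(content_xml)
    tag = f"{{{TEXT_NS}}}alphabetical-index-mark"
    return [
        (m.get(f"{{{TEXT_NS}}}string-value"), m.get(f"{{{TEXT_NS}}}key1"))
        for m in root.iter(tag)
    ]


def index_marks_in_odt(path):
    """Read content.xml from an ODT file and list its index marks."""
    with zipfile.ZipFile(path) as odt:
        return index_marks(odt.read("content.xml"))


# Usage: python list_marks.py chapter.odt   (script and file names are placeholders)
if __name__ == "__main__" and len(sys.argv) > 1:
    for value, key1 in index_marks_in_odt(sys.argv[1]):
        print(f"{value!r} (key1={key1!r})")
```

If the list is non-empty for a document with no visible index, the entries from the earlier indexing are still stored in the file.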
Comment 16 Bernard Moreton 2018-01-13 16:45:37 UTC
Those "small grey squares" appear when Field Shading is turned on.  

The field shading for the earlier indexing (? no definition for whole words ?) appears before the relevant text, where that for the sample concordance file (set for 'Whole words') covers the indexed word.

And I've found a way to remove the old entries:
1. Save As RTF format
2. Close the original file (header shows RTF, but field shading still apparent)
3. Open the new RTF file, then re-save as ODT.
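
A different route, which nobody in this thread has tested, would be to strip the marks from content.xml directly instead of round-tripping through RTF. The Python sketch below is a minimal version of that idea under stated assumptions: the function names are invented, the marks use the usual "text:" prefix as in the attached file, and ALL point index marks are removed, including any created by hand. Work on a copy of the document.

```python
import re
import zipfile

# Matches self-closing point marks such as
# <text:alphabetical-index-mark text:string-value="Justin" text:key1="Justin"/>
# The \s after the name keeps -start/-end range marks from matching.
MARK_RE = re.compile(rb"<text:alphabetical-index-mark\s[^>]*/>")


def strip_index_marks(content_xml):
    """Return (new content.xml bytes, number of marks removed)."""
    return MARK_RE.subn(b"", content_xml)


def strip_marks_from_odt(src, dst):
    """Copy the ODT at src to dst, removing index marks from content.xml.

    src and dst may be file names or binary file-like objects.
    Returns the number of marks removed.
    """
    removed = 0
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        # infolist() keeps the archive order, so "mimetype" stays first.
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "content.xml":
                data, removed = strip_index_marks(data)
            # Keep each entry's original compression (mimetype is stored).
            zout.writestr(item, data, compress_type=item.compress_type)
    return removed
```

Calling strip_marks_from_odt("chapter.odt", "chapter-clean.odt") (both names are placeholders) would write a cleaned copy and return how many marks were dropped; any existing index in the document would still need updating in Writer afterwards.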
Comment 17 QA Administrators 2019-01-14 03:51:48 UTC Comment hidden (obsolete)
Comment 18 Bernard Moreton 2019-01-14 11:55:36 UTC
There has been change, but the situation is now more muddled than before.
Using the same ODT and concordance file as before, 
Insert / Table of Contents and Index
now gives 3 options (Index entry, Bibliography entry, Table of Contents Index and Bibliography).  Using the last (the first two don't seem helpful), then selecting Alphabetic Index as the type, there's been a change of layout, with no place to show the concordance file name.

If I right click on File, then I sometimes get three options (Open, New, Edit), but mostly only the first two.  
- If I can select Edit, then the intended (last-used?) concordance file is displayed, with its name.  
- If New is selected, then a file selection page opens, but shows only directories, not my list of *txt concordance files, even after navigating back (!) to the current working directory.  
- If Open is selected, then the index is recreated from the (?last-used) concordance file, without any indication of its identity;  and as before, entries are duplicated.

Oh dear ...


Help About:
Version: 6.1.4.2
Build ID: 1:6.1.4-0ubuntu0.18.04.1~lo1
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: en-GB (en_GB.UTF-8); Calc: group threaded
Comment 19 QA Administrators 2021-01-26 05:12:52 UTC Comment hidden (obsolete)
Comment 20 Bernard Moreton 2021-01-26 15:50:42 UTC
There has been no change since my comment 18, two years ago.
Where FILE is called for, the display does not list the *.txt files (possible concordance files) in the chosen directory, so none can be selected.
However, on proceeding without selecting a concordance file, the previous index reappears.

Version: 7.0.1.2
Build ID: 00(Build:2)
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Ubuntu package version: 1:7.0.1_rc2-0ubuntu0.18.04.1
Calc: threaded
Comment 21 QA Administrators 2023-01-27 03:25:32 UTC Comment hidden (obsolete)
Comment 22 Bernard Moreton 2023-01-27 12:50:16 UTC
I note that there *has* been a change - possibly before my comment last year - in that concordance files seem now to be *.sdi rather than *.txt.

On my local system I copied the concordance file to have the suffix ".sdi", and it now appears in the directory listing;  but it does not seem possible to review the whole file, only to amend line by line (?).  THIS IS NOT HELPFUL!

However, whether deliberately choosing the concordance file or by ignoring the option to choose and (presumably) accepting the previous usage, the same result is obtained, and the index previously deleted re-appears.

Version: 7.4.4.2 / LibreOffice Community
Build ID: 40(Build:2)
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Ubuntu package version: 1:7.4.4-0ubuntu0.22.04.1~lo1
Calc: threaded