Bug 147806 - Dummy bookmarks generated when importing .doc files
Summary: Dummy bookmarks generated when importing .doc files
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): 3.5.0 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:doc
Depends on:
Blocks: Bookmarks
Reported: 2022-03-06 16:52 UTC by Eyal Rozenberg
Modified: 2022-12-22 20:51 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Document exhibiting the bug (61.21 KB, application/vnd.oasis.opendocument.text)
2022-03-21 15:18 UTC, Eyal Rozenberg
Older version in .doc format (1.57 MB, application/msword)
2022-12-21 22:11 UTC, Eyal Rozenberg

Description Eyal Rozenberg 2022-03-06 16:52:45 UTC
When importing a .doc document, it seems multiple bookmarks are generated, named _RefXXXXXXXX, where XXXXXXXX is a long number (8 or 9 digits).

I'm assuming these bookmarks have to do with the targets of references in the original document - but I'm not even sure.

Anyway, this doesn't seem right. References are references, bookmarks are bookmarks, and they should not be mixed up. 

Also, the original references are, more often than not, targeting numbered items/paragraphs, headings, actual bookmarks present in the word document, or other similar targets. In those cases, I don't see how there's any excuse to create artificial bookmarks for the reference targets (and multiple duplicate ones to boot).

Seeing this with:
Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: fb9270b238cba4f36e595c5d7f4d85f6f3f18e1c
CPU threads: 4; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: en-IL (en_IL); UI: en-US

... but actually I had imported the .doc file with an earlier nightly of 7.4.0.0 from several weeks ago.
Comment 1 Telesto 2022-03-06 22:28:30 UTC
Please add an example file illustrating the behaviour
Comment 2 Eyal Rozenberg 2022-03-21 15:18:24 UTC
Created attachment 179010 [details]
Document exhibiting the bug

As per @telesto's request, I'm attaching a document with most of its contents removed. The remaining bookmarks (originally there were dozens and dozens) seem to correspond to cross-references and possibly their targets.
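
For anyone who wants to check an imported document without clicking through the Navigator, here is a rough sketch that lists the `_Ref`-style bookmarks in a saved .odt and counts how often each name occurs. It assumes the bookmarks are stored as ordinary `text:bookmark` / `text:bookmark-start` elements in `content.xml`. The function names are made up for illustration, and the pattern accepts any number of digits after `_Ref` rather than only the 8 or 9 reported above.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF "text" namespace, used for bookmark elements and their name attribute.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Name pattern from the report: "_Ref" followed by digits.
# Assumption: any digit count is accepted, not just 8 or 9.
REF_NAME = re.compile(r"^_Ref\d+$")


def ref_bookmarks_in_content(content_xml: bytes) -> Counter:
    """Count _Ref-style bookmark names in the bytes of a content.xml."""
    root = ET.fromstring(content_xml)
    names = Counter()
    # Point bookmarks and range starts both carry the bookmark name;
    # bookmark-end is skipped so a range is not counted twice.
    for tag in ("bookmark", "bookmark-start"):
        for el in root.iter(f"{{{TEXT_NS}}}{tag}"):
            name = el.get(f"{{{TEXT_NS}}}name", "")
            if REF_NAME.match(name):
                names[name] += 1
    return names


def ref_bookmarks_in_odt(path: str) -> Counter:
    """Open an .odt (a zip archive) and scan its content.xml."""
    with zipfile.ZipFile(path) as odt:
        return ref_bookmarks_in_content(odt.read("content.xml"))
```

Calling `ref_bookmarks_in_odt("some-file.odt")` on a local copy of a document returns a `Counter`; any name with a count above 1 would be one of the duplicates mentioned in the description.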
Comment 3 Buovjaga 2022-12-21 11:05:00 UTC
(In reply to Eyal Rozenberg from comment #2)
> Created attachment 179010 [details]
> Document exhibiting the bug
> 
> As per @telesto's request, I'm attaching a document with most of its
> contents removed. The remaining bookmarks (originally there were dozens and
> dozens) seem to correspond to cross-references and possibly their targets.

The attachment is an .odt file. Can you attach the original .doc? I get it that you might need to sanitise it in MS Office.
Comment 4 Eyal Rozenberg 2022-12-21 22:11:45 UTC
Created attachment 184302 [details]
Older version in .doc format

So, this is probably not the exact source of the document I've already attached, but it's close to it. And when we open it in LO, we see these refs named with arbitrary numbers.
Comment 5 QA Administrators 2022-12-22 03:36:29 UTC Comment hidden (obsolete)
Comment 6 Buovjaga 2022-12-22 07:25:07 UTC
Confirmed already in 3.5.0. I opened the file in MSO 365 (converted to new format), but I don't know how to see the references. There are no indicators even in the document text that they would exist.