Bug 89178 - Mail merge does not change the merge fields in files (when saved as individual documents) to plain text
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:24.2.0 target:7.6.3
Keywords:
Duplicates: 100375
Depends on: 67207
Blocks: Mail-Merge
Reported: 2015-02-06 16:52 UTC by Valdir Barbosa
Modified: 2023-10-27 09:32 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Numbers of processes (50.40 KB, application/vnd.oasis.opendocument.text)
2015-02-06 16:52 UTC, Valdir Barbosa
mailmerge ex. (36.62 KB, application/zip)
2015-02-08 16:26 UTC, Valdir Barbosa
mailmege (11.54 KB, application/vnd.oasis.opendocument.text)
2015-07-13 16:39 UTC, Valdir Barbosa
This test will generate a file as single Document and as individual Documents in mail merge (176.77 KB, image/png)
2017-07-28 17:58 UTC, Valdir Barbosa

Description Valdir Barbosa 2015-02-06 16:52:08 UTC
Created attachment 113181 [details]
Numbers of processes

The files generated by Mail Merge do not allow editing of the merged fields. The contents of each field appear, but they remain mail merge fields and cannot be changed.
Comment 1 Cor Nouws 2015-02-06 22:32:00 UTC Comment hidden (obsolete)
Comment 2 Valdir Barbosa 2015-02-06 22:55:40 UTC Comment hidden (obsolete)
Comment 3 Cor Nouws 2015-02-07 20:41:44 UTC Comment hidden (obsolete)
Comment 4 Valdir Barbosa 2015-02-08 16:26:26 UTC
Created attachment 113232 [details]
mailmerge ex.
Comment 5 Valdir Barbosa 2015-02-08 16:27:11 UTC
Hello Cor,
Make a simple mail merge with name, address and city: insert the fields in the main file and print to a single file or to individual files.
Open a generated file and try to change the merged content. The content appears as a field and cannot be changed unless you delete the field.
Compare the .doc file with the .odt file. The content of the .odt file should be the same as that of the .doc file.
Comment 6 Cor Nouws 2015-02-08 20:01:23 UTC Comment hidden (obsolete)
Comment 7 Valdir Barbosa 2015-07-13 16:39:53 UTC
Created attachment 117212 [details]
mailmege
Comment 8 Valdir Barbosa 2015-07-13 16:40:46 UTC
Testing mail merge

After printing the mail merge, this is the result:

Individual files in .ODT


NAME: Ana Maria Braga

DEPTO: Pró-Reitora de Graduação

FUNCTION: Assessor

CIVIL STATUS: Solteira

The fields cannot be edited.

You need to print the mail merge in .DOC format to be able to edit them.

Individual files in .DOC

NAME: Ana Maria Braga

DEPTO: Pró-Reitora de Graduação

FUNCTION: Assessor

CIVIL STATUS: Solteira
Comment 9 Buovjaga 2015-09-01 16:11:54 UTC
(In reply to Valdir Barbosa from comment #4)
> Created attachment 113232 [details]
> mailmerge ex.

Could not reproduce with FORM.odt using Database.ods as address source.
No fields appear in the merged document.

Are you using LibreOffice 5.0? Please test with it, if not.

Win 7 Pro 64-bit, Version: 5.0.1.2 (32-bit)
Build ID: 81898c9f5c0d43f3473ba111d7b351050be20261
Locale: fi-FI (fi_FI)
Comment 10 Jan-Marek Glogowski 2015-09-15 20:55:20 UTC
Actually this bug mixes up multiple problems. Beware I'm writing this from memory without testing, so feel free to prove me wrong :-)

1. As long as you _don't_ mail merge the result into a single file, the DB form fields will be preserved in the resulting _ODT_ documents! Another exception is _PDF_, where ConvertFieldsToText is enforced.

AFAIK this has never been different. Internally it's a speed optimization, because you don't need to copy the document, as currently "ConvertFieldsToText" can't be undone. So we just change the field content and save as an individual document. For others this might not be a bug, but a feature - who knows...

So from the users POV this is inconsistent, as MM produces files with and without form fields, depending on the output.

And these form fields aren't editable, as the content is from a DB. This is intentional too!

BTW - we're talking internal representation here. When saving we face the 2nd problem.

2. Microsoft Word formats have no equivalent to LO DB form fields used for LO mail merge.

We're just talking about the form field type for MM here! I don't know how MS does their mail merge, but obviously differently. So when saving to DOC, the current exporter - at some point - converts these fields to text.

I remember a bug, where someone complained he couldn't use a DOC for MM. Same reason.
And I was told there is nothing we can do about it.
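
The two points above can be summed up as a small decision table. The following is a minimal sketch, not LibreOffice code: `MergeOutput` and `fields_preserved` are hypothetical names, and the rules are only those stated in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MergeOutput:
    """Hypothetical description of a mail merge target (not a LibreOffice type)."""
    single_file: bool   # True: all records are merged into one document
    file_format: str    # "odt", "pdf" or "doc"


def fields_preserved(output: MergeOutput) -> bool:
    """Return True if the saved result still contains DB fields.

    Encodes only the rules described in the comment:
    - merging into a single file converts the fields to text,
    - PDF enforces ConvertFieldsToText,
    - DOC has no equivalent field type, so the exporter writes text,
    - individual ODT files keep the fields.
    """
    if output.single_file:
        return False
    return output.file_format == "odt"


for fmt in ("odt", "pdf", "doc"):
    target = MergeOutput(single_file=False, file_format=fmt)
    print(fmt, fields_preserved(target))
```

Only the "individual ODT" row returns True, which is the inconsistency this report is about.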

==========

So please validate my information and test them ;-)
And probably close this bug.
Comment 11 Alex Thurgood 2016-06-16 12:37:41 UTC
Jan-Marek: spot on! You hit the nail on the head :-)

I confirmed a duplicate report of the same behaviour this morning, where I noted this inconsistency in behaviour between single file mailmerge to ODT and multiple file mailmerge to ODT.

Confirming, as I reproduced this behaviour this morning in response to another bug report (now to find it and mark it as DUP).
Comment 12 Alex Thurgood 2016-06-16 12:39:41 UTC
*** Bug 100375 has been marked as a duplicate of this bug. ***
Comment 13 Cor Nouws 2016-06-16 16:57:38 UTC
(In reply to Alex Thurgood from comment #11)

> Confirming, as I reproduced this behaviour this morning in response to
> another bug report (now to find it and mark it as DUP).

So if I understand it correctly, the issue is that when saved as individual documents, the fields remain fields. Hmm :)
Changing summary accordingly then.
Thanks for clarifying this!
Comment 14 Alex Thurgood 2016-06-17 06:53:13 UTC
(In reply to Cor Nouws from comment #13)


> So if I understand it correctly, the issue is that when saved as individual
> documents, the fields remain fields.. Hmm :)
> Changing summary accordingly then.
> Thanks for clarifying this!

Yes, thanks Cor, that is exactly it.
Comment 15 Cor Nouws 2016-06-17 09:07:11 UTC Comment hidden (obsolete)
Comment 16 Valdir Barbosa 2016-06-17 10:17:14 UTC Comment hidden (obsolete)
Comment 17 Jan-Marek Glogowski 2017-04-07 09:54:23 UTC
Since there were some private mails regarding this bug, I'll add an additional comment.

The problem of this "fix" is not the implementation, MM wise. That would be rather trivial, in regards to the MM code (just extending two if conditionals is my guess, to switch the behavior).

But my concern is the speed of the whole MM process. We spent a lot of time (and money) to speed-optimize MM, so the previous hours-long work is now down to minutes for us. While this is probably insignificant for < 500 documents, we use MM to create 10,000+ documents.

And now it becomes a problem: replacing fields with text additionally forces us to create an internal copy for every document, as form=>text can't be undone, instead of just updating the fields and saving the document. But even if someone implemented the undo, it would still be additional work costing time, though probably less significant.

I don't see why the current behavior is a bug. Forms are not printed as fields and even removed when generating PDF. I don't see a need to sacrifice speed for consistency here.

The original bug claimed it is a change in LO 4.0, Cor changed it into "inherited from OOo", so from my POV it should be an enhancement and not a regression. I didn't check older versions, when I wrote comment #10, that's why I wrote: please prove me wrong (and set the regression tag).

= Implementation suggestion =

So IMHO, in the end a fix would be to create a new option, like "Force replacement of fields with text", with a sensible description for the users regarding speed and probably even extend the UNO interface for MM.

I can supply some code pointers, if needed.
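
The trade-off and the suggested option can be illustrated with a toy model. This is a sketch under stated assumptions, not the Writer implementation: the document is a plain Python list, `Field`, `render`, `mail_merge` and the `force_text` flag are hypothetical names, and the "copy" is a `deepcopy` standing in for an internal document copy.

```python
import copy
from dataclasses import dataclass


@dataclass
class Field:
    """Toy stand-in for a database field inside a document."""
    column: str
    value: str = ""


def render(nodes):
    """Serialize a node list; fields are written as {column=value}."""
    return "".join(
        f"{{{n.column}={n.value}}}" if isinstance(n, Field) else n
        for n in nodes
    )


def mail_merge(template, records, force_text=False):
    """Return (saved documents, number of internal copies made)."""
    outputs, copies = [], 0
    for record in records:
        # Fast path: update the field values of the one working document.
        for node in template:
            if isinstance(node, Field):
                node.value = record[node.column]
        if force_text:
            # Slow path: field => text cannot be undone, so flatten a copy
            # and leave the working document's fields intact.
            doc = copy.deepcopy(template)
            copies += 1
            doc = [n.value if isinstance(n, Field) else n for n in doc]
        else:
            doc = template
        outputs.append(render(doc))
    return outputs, copies
```

With `force_text=False` no copy is made and the saved text still carries the field markers; with `force_text=True` one copy per record is made, which is the extra cost described above.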
Comment 18 Valdir Barbosa 2017-07-28 17:58:06 UTC
Created attachment 134947 [details]
This test will generate a file as single Document and as individual Documents in mail merge

This test will generate a file as single Document and as individual Documents.
Comment 19 Mike Kaganski 2018-05-26 14:09:03 UTC
(In reply to Jan-Marek Glogowski from comment #17)

I suppose that having an export option that would allow to convert fields to their text *on saving*, in filter, would allow to get the desired result without sacrificing the speed. A filter option that would convert *some* (visible) fields, keeping bookmarks and the like?
Comment 20 Jan-Marek Glogowski 2018-07-03 10:18:09 UTC
(In reply to Mike Kaganski from comment #19)
> (In reply to Jan-Marek Glogowski from comment #17)
> 
> I suppose that having an export option that would allow to convert fields to
> their text *on saving*, in filter, would allow to get the desired result
> without sacrificing the speed. A filter option that would convert *some*
> (visible) fields, keeping bookmarks and the like?

Then you would have to implement this feature in all filters, instead of the internal document model. I guess your suggestion would work, but it's a lot of work.

Eventually implementing an undo for "fields => text" would be better. That would be nice to have generally.

Another idea would be a "node only" internal document copy for mail merge, which shares all the meta data, like defaults, styles etc. Currently SwDoc::CreateCopy creates an individual document with new styles, which has to update all references in the nodes and also re-validates all the document data. It's currently more or less the same as a "copy and paste" between documents.
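
The difference between a full copy and a "node only" copy can be sketched as follows. This is only an illustration of the idea: `Document`, `full_copy` and `node_only_copy` are hypothetical names and do not correspond to SwDoc or any other Writer class.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy document: content nodes plus shared metadata such as styles."""
    nodes: list
    styles: dict = field(default_factory=dict)


def full_copy(doc):
    """Copy everything, including the styles (a complete new document)."""
    return copy.deepcopy(doc)


def node_only_copy(doc):
    """Copy only the nodes; the styles dict is shared with the original."""
    return Document(nodes=copy.deepcopy(doc.nodes), styles=doc.styles)


src = Document(nodes=["a", "b"], styles={"Default": {"font": "Liberation Serif"}})
full = full_copy(src)
light = node_only_copy(src)
print(full.styles is src.styles, light.styles is src.styles)
```

In the sketch the shared `styles` object is what would save the work of duplicating and re-validating metadata for every record.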
Comment 21 Cor Nouws 2019-06-07 14:01:09 UTC
If bug 80786 would be implemented, that could make life here a bit easier maybe.
Comment 22 Ben Fleming 2021-01-11 15:48:23 UTC
There have been several bugs logged on here regarding this issue, or what users perceive it as, e.g. 54703 and duplicates of that bug which have been closed.

The issue is that for some reason, when saving to individual documents, the outputted documents do not get "flattened"; they are just copies of the original, still connected to the data source, that happen to have the relevant record selected.  This is not the behaviour if the whole merge is to one document, in which case it is all "flattened" correctly.

The users who are saying that hidden paragraphs are not working are probably seeing this behaviour, along with having "Show hidden paragraphs" ticked in their Writer settings.

For me the gotcha is that I noticed the behaviour because I have a BASIC macro that produces invoices based on a spreadsheet and/or a base database.  When I perform this merge through the UI, the individual documents saved are "flattened" as expected.  But doing it through UNO (BASIC), I get the version with the fields still intact.

Jan-Marek Glogowski above mentioned speed and the belief that this incorrect behaviour actually sped things up in his use case.  Given that the front-end seems to have been fixed at some point to behave correctly, could there not be a setting in the com.sun.star.text.MailMerge service with a boolean to perform the flattening if required?  Personally I consider it a significant bug but I bet others consider it a feature.
Comment 23 Jan-Marek Glogowski 2021-01-11 19:51:01 UTC
(In reply to Ben Fleming from comment #22)
> For me the gotcha is that I noticed the behaviour because I have a BASIC
> macro that produces invoices based on a spreadsheet and/or a base database. 
> When I perform this merge through the UI, the individual documents saved are
> "flattened" as expected.  But doing it through UNO (BASIC), I get the
> version with the fields still intact.

There is a "hidden secret" in the MM wizard code. One step of it allows you to edit the merged documents before "finalizing" them. But for this step the wizard actually generates a single document (AKA flattened) and MM just adds some bookmarks to mark the start / end of each single document. And just as the last step it splits the document into single ones, if a user prefers this.

I think the MM triggered via "File -> Print" can omit that step, so would be faster with larger amounts of documents. Obviously the (BASIC) UNO API too.

This is all very non-obvious from the users POV. AFAIK you can MM via "File -> Print", you can use "View -> Data Sources" and "Tools -> MM wizard". And all the generators in "File -> New -> Labels / Business cards" use MM in the end too, as the underlying mechanism. The generating code is always the MM UNO API with a descriptor to pass options into the code.

I didn't look at the MM and its UI code for a very long time, so that might have changed, but from your observation it hasn't.
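
The wizard flow described above (merge everything into one flattened document, remember where each record starts and ends, split only at the end) can be modelled like this. It is a rough sketch with hypothetical names; plain index ranges stand in for the bookmarks, and `str.format` placeholders stand in for the database fields.

```python
def merge_to_single(template_lines, records):
    """Flatten all records into one list of lines and record each range."""
    merged, ranges = [], []
    for record in records:
        start = len(merged)
        merged.extend(line.format(**record) for line in template_lines)
        ranges.append((start, len(merged)))  # start/end of this record's part
    return merged, ranges


def split_documents(merged, ranges):
    """Last step: cut the single document into one document per record."""
    return [merged[start:end] for start, end in ranges]


template_lines = ["NAME: {name}", "FUNCTION: {function}"]
records = [
    {"name": "Ana", "function": "Assessor"},
    {"name": "Bo", "function": "Clerk"},
]
merged, ranges = merge_to_single(template_lines, records)
for document in split_documents(merged, ranges):
    print(document)
```

Because the text is already flattened before the split, every resulting document contains plain text rather than fields, which matches the difference observed between the wizard and the UNO path.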
Comment 24 Ben Fleming 2021-01-12 14:26:45 UTC
(In reply to Jan-Marek Glogowski from comment #23)
Thank you very much for your reply, it was really interesting to have the insight from someone who knows the code.

So if I understand you correctly, the MM process merely produces one doc with bookmarks and then it splits that document into individual files.

Is there any way to replicate this behaviour through the (BASIC) UNO API?

When I generate a MM to a single document myself manually, I cannot see any bookmarks.

> This is all very non-obvious from the users POV.

This is the problem really, I suppose at least the front-end is behaving (at least in my testing) exactly as expected, although there are a lot of bugs open to suggest it doesn't for everyone?
Comment 25 Jan-Marek Glogowski 2021-01-12 16:36:25 UTC
(In reply to Ben Fleming from comment #24)
> (In reply to Jan-Marek Glogowski from comment #23)
> Thank you very much for your reply, it was really interesting to have the
> insight from someone who knows the code.
> 
> So if I understand you correctly, the MM process merely produces one doc
> with bookmarks and then it splits that document into individual files.

No. My description is just the MM wizard behavior and describes, why you see differences between the wizard and UNO / your BASIC macro. I just had to check the source code. LO MM wizard generates a document shell and a list of bookmarks, which point into the document (see SwMailMergeConfigItem and SwDocMergeInfo). I suggest to use https://opengrok.libreoffice.org/ to display the C++ definitions.

There is also a README in https://opengrok.libreoffice.org/xref/core/sw/source/uibase/dbui/. It describes the MM behavior based on the members of SwMergeDescriptor, which describes how to process the input. That descriptor also has more comments on its members. Most members will directly match to UNO from what I remember.

> Is there any way to replicate this behaviour through the (BASIC) UNO API?
>
> When I generate a MM to a single document myself manually, I cannot see any
> bookmarks.

I think so. You need to do a MM of type MailMergeType.SHELL, not a document MailMergeType.FILE. That will generate a LO internal document shell. And it fills a list of SwDocMergeInfo with the UNO bookmarks into the merged document. So the document itself doesn't contain the bookmarks. 

I know WollMux uses it to implement its MM (if you know Java see https://github.com/WollMux/WollMux/blob/WollMux_18.2/core/src/main/java/de/muenchen/allg/itd51/wollmux/mailmerge/print/OOoBasedMailMerge.java), so that should work with BASIC too.
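
A minimal Python sketch of such a SHELL merge is shown below, as an untested illustration rather than a verified recipe. The helper names `shell_merge_settings` and `run_shell_merge` are hypothetical. The property names are those of the com.sun.star.text.MailMerge service; that `execute` hands back the document shell for `MailMergeType.SHELL` is an assumption taken from the description above, and `run_shell_merge` only works inside a Python that ships the `uno` module and has access to a running office context.

```python
# Properties whose values are UNO constants, resolved by name at run time.
CONSTANT_PROPERTIES = ("CommandType", "OutputType")


def shell_merge_settings(data_source, table, template_url):
    """Build the property set for a merge of type MailMergeType.SHELL."""
    return {
        "DataSourceName": data_source,   # registered data source name
        "CommandType": "com.sun.star.sdb.CommandType.TABLE",
        "Command": table,                # table name inside the data source
        "DocumentURL": template_url,     # file:// URL of the template
        "OutputType": "com.sun.star.text.MailMergeType.SHELL",
    }


def run_shell_merge(settings):
    """Apply the settings to a MailMerge service and execute it.

    Assumption: for SHELL output the return value is the merged
    in-memory document; this has not been verified here.
    """
    import uno  # only available in LibreOffice's Python

    ctx = uno.getComponentContext()
    mail_merge = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.text.MailMerge", ctx
    )
    for name, value in settings.items():
        if name in CONSTANT_PROPERTIES:
            value = uno.getConstantByName(value)
        mail_merge.setPropertyValue(name, value)
    return mail_merge.execute(())
```

Keeping the settings in a plain dictionary makes it easy to compare them with an existing BASIC macro, where the same property names would be assigned on the service object.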

> > This is all very non-obvious from the users POV.
> 
> This is the problem really, I suppose at least the front-end is behaving (at
> least in my testing) exactly as expected, although there are a lot of bugs
> open to suggest it doesn't for everyone?

See https://bugs.documentfoundation.org/page.cgi?id=weekly-bug-summary.html: 12102 open bugs. And LO has probably 30-40 core developers, half of them mainly working on LO Online (that is just a guess).

And a few bugs will be for mail merge. Problem is, most features don't have any description of "expected behavior". One person's bug might literally be another person's feature, and that is not just an XKCD joke (https://xkcd.com/1172/). If you have millions of users, there will be enough existing workflows, which expect some behavior, even if it's actually a bug. I broke stuff this way a few times.
Comment 26 Cor Nouws 2022-10-17 12:48:37 UTC
(In reply to Cor Nouws from comment #21)
> If bug 80786 would be implemented, that could make life here a bit easier
> maybe.
Now bug 45946.
Comment 27 Commit Notification 2023-10-21 19:49:36 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5555562856a8bca0869a04147fbc05a1eece9193

Related: tdf#89178 Add an option to avoid converting some fields into text

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 28 Commit Notification 2023-10-27 09:32:00 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/ef29791179ca6cfcb5c82a65fefcbe29e039831d

Related: tdf#89178 Add an option to avoid converting some fields into text

It will be available in 7.6.3.
