Bug 113002 - LibreOfficeCalc: Pivot Table recurring data field name will added a simple quotation marks before number
Summary: LibreOfficeCalc: Pivot Table recurring data field name will added a simple qu...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.5.0 target:7.4.0.0.beta2
Keywords:
: 128693 135827 (view as bug list)
Depends on:
Blocks: Pivot-Table
  Show dependency treegraph
 
Reported: 2017-10-08 17:13 UTC by camisso
Modified: 2022-06-21 12:11 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments
Example for the pivot table with recurring names (11.11 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-10-10 17:18 UTC, camisso
Details
Example for mismatch, if newest are sorted and number as row, sorting is wrong for two columns (16.85 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-10-10 21:47 UTC, camisso
Details

Note You need to log in before you can comment on or make changes to this bug.
Description camisso 2017-10-08 17:13:49 UTC
Hello,
e.g. for a given table:
Date	Number
12.01.2008	1
15.01.2008	2
17.01.2008	1
28.01.2008	1
31.01.2008	17
14.02.2008	1
18.02.2008	1
28.02.2008	2
29.02.2008	1
09.03.2008	1
14.03.2008	1
31.03.2008	4
15.04.2008	1
17.04.2008	1

If I make a pivot table of this data, grouped by year and month, the year will added a simple quotation marks before the number. 

If I have already numbers, e.g. year, and I check at data field options recurring names there will be a mismatch, the original numbers are numbers and the recurring names are strings. No correct sorting is possible and Conditional Formatting will work only for the numbers, not the recurring names.

Thank you in advance for your help.
Comment 1 m_a_riosv 2017-10-10 08:51:30 UTC
(In reply to camisso from comment #0)
>
> If I have already numbers, e.g. year, and I check at data field options
> recurring names there will be a mismatch, the original numbers are numbers
> and the recurring names are strings. No correct sorting is possible and
> Conditional Formatting will work only for the numbers, not the recurring
> names.

Please attach a sample file.
Comment 2 camisso 2017-10-10 17:18:03 UTC
Created attachment 136898 [details]
Example for the pivot table with recurring names
Comment 3 m_a_riosv 2017-10-10 18:37:44 UTC
Yes the year has a quotation mark, but if I'm not wrong dates a showed as text on the pivot table, e.g. GETPIVOTDATA() function search for text to find the data.

Maybe some light on this bug report https://bugs.documentfoundation.org/show_bug.cgi?id=35247
Comment 4 m_a_riosv 2017-10-10 18:42:10 UTC
I think the sort on the PT it's done fine. And for CF should not be difficult get it working, please if you are not able, ask the question on https://ask.libreoffice.org/en/questions/
Comment 5 camisso 2017-10-10 21:47:11 UTC
Created attachment 136900 [details]
Example for mismatch, if newest are sorted and number as row, sorting is wrong for two columns
Comment 6 m_a_riosv 2017-10-10 23:34:59 UTC
After refresh sort it's fine for me. To avoid lost the CF give it a range larger than PT rows because format of PT cells is reset with refresh.
Comment 7 camisso 2017-10-11 07:32:55 UTC
@m.a.riosv
Did you look at the end of the table? There are, even after refesh, the numbers of the Year, above are year numbers as string.
Comment 8 m_a_riosv 2017-10-11 08:11:20 UTC
Ok, I see that the first time a year appears on the list it's as number without the quotation mark while the rest have it.
Comment 9 QA Administrators 2018-10-12 03:05:02 UTC Comment hidden (obsolete)
Comment 10 camisso 2018-10-13 21:28:21 UTC
The bug is still present.
Version: 6.1.2.1 (x64)
Comment 11 QA Administrators 2019-10-14 02:27:16 UTC Comment hidden (obsolete)
Comment 12 Jean-Marc Le Peuvédic 2020-10-11 21:35:26 UTC
This bug is present in version 7no linux.

Version: 7.0.1.2
Build ID: 88a63c56098013eb4038e11ebe7c8c0daab09aa8
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: fr-FR (fr_FR.UTF-8); Langue IHM : fr-FR
Calc: threaded

The first entry in the pivot table is correct, the following entries have a quote.
Comment 13 Milston B. 2020-10-28 16:54:24 UTC
Same behaviour in Windows 10



     Version: 7.0.1.2 (x86)
     Build ID: 7cbcfc562f6eb6708b5ff7d7397325de9e764452
     CPU threads: 4; OS: Windows 10.0 Build 17763; UI render: Skia/Raster; VCL: win
     Locale: de-DE (de_DE); UI: de-DE
Comment 14 Robert Lacroix 2020-12-27 06:28:18 UTC
Bug is present.

Version: 6.1.5.2
Build ID: 1:6.1.5-3+deb10u6
CPU threads: 12; OS: Linux 4.19; UI render: default; VCL: x11; 
Locale: en-CA (en_CA.utf8); Calc: group threaded

This happens with any numeric value used as a row field with the field option "Repeat item labels" enabled. Original labels are numeric and repeated labels are converted to text.

The problem is not merely cosmetic. Using the labels of this field in the PT as the SearchCriterion of a VLOOKUP formula will have different results depending on whether the particular row's item label is numeric or text.

One workaround is to convert this column in the source data to a text value. One way to do this for computed values is to append an empty string onto the computation. For example:  =date(1995,7,28)&""  Then all the item labels will be text for that field.

This is an unsatisfactory workaround if you expect the field to have numeric values, like when applying conditional formatting or when using the labels in a numeric computation. For instance when cell A3 contains a text value  '1234  and cell D3 contains a formula  =A3>1235  the value of cell D3 is TRUE.
Comment 15 Robert Lacroix 2020-12-27 11:50:28 UTC
e: #(In reply to Robert Lacroix from comment #14)
 
> One workaround is to convert this column in the source data to a text value.
> One way to do this for computed values is to append an empty string onto the
> computation. For example:  =date(1995,7,28)&""  Then all the item labels
> will be text for that field.
> 
> This is an unsatisfactory workaround if you expect the field to have numeric
> values, like when applying conditional formatting or when using the labels
> in a numeric computation. For instance when cell A3 contains a text value 
> '1234  and cell D3 contains a formula  =A3>1235  the value of cell D3 is
> TRUE.

Actually, this workaround is barely even useful - it works only for numbers whose integer part is fixed-length, like a bar code. Integers of varying length converted to text strings will sort ascending lexically so that  "1000" comes before "104". The workaround sorts the text representation of the date/time values correctly but the resulting string of digits is practically worthless because you can't interpret it as a date without converting it back to a number.
Comment 16 Robert Lacroix 2020-12-27 14:26:29 UTC
One more comment about the repeated item label values for dates. The text strings produced for repeated item labels follow the formatting of the source column. So dates in the source formatted as "MMM YYYY" produce 8-character values for text item labels, while dates in the source formatted as "YYYY-MM-DD" produce 10-character values for text item labels. Meanwhile the value of the numeric item labels in the PT retain full date precision regardless of the formatting of the source column or formatting of the pivot table cells.

It bears pointing out that under all circumstances, the VALUE of an item label should reflect the value of the source column (whether it's an original item label or a repeated item label). The FORMAT of the item label should reflect, primarily, the format applied to cells in the pivot table, and if otherwise cleared of formatting, secondarily to the format of the source column. IT IS A SERIOUS BUG to use the formatted appearance of a source column for the value of an item label.

TLDR; stop here.

To belabour the point, let's look at an example that would compute COST OF GOODS SOLD in a business' quarterly income statement, where the fiscal year end is July 31. (NB: real accounting software should be used instead of LO-Calc, but we are talking about a hypothetical example; CAN and SHOULD are irrelevant).

The source data is a list of sales transactions. Among the data columns are:
"tr_date" the transaction date of an invoice
"barcode" of each item sold on an invoice
"quantity" of each item sold on an invoice

A new source column "qe_date" is added which computes the last date of a fiscal quarter (it's pretty ugly).

=EOMONTH(tr_date,(INT(MOD((MONTH(tr_date)+4),12)/3)+1)*3+7-MONTH(tr_date))

A pivot table is used to tabulate the total number of each item sold by barcode in every fiscal quarter. The PT fields are
"qe_date_field" the result of selecting "qe_date" as a row field
"barcode_field" the result of selecting "barcode" as a row field
"sum(quantity)" the data field

In the leftmost row postion, the "Repeat item labels" option must be turned on for "qe_date_field" since we want to use that field to look up the quarterly average price of inventory in another table for computing COST OF GOODS SOLD for the quarter. The price lookup could be done in the source table, but let's follow through this example because there might be only 160 lookups in the PT compared to doing this in the source table which might be 2000 lookups.

For item labels of "qe_date_field" which are numeric, a simple cell reference suffices to extract the relevant quarter-ending date.

=qe_date_field

But for text items labels it's a different story. We must use DATEVALUE() to get the date. But DATEVALUE produces an error for numeric fields, we must test "qe_date_field" thusly:

=IF(ISNUMBER(qe_date_field),qe_date_field,DATEVALUE(qe_date_field))

This works fine when "YYYY-MM-DD" is the date format of the "qe_date" source column. All PT item labels, including repeated labels, show the correct quarter-ending date.

But say the source column uses a natural date format for quarter-ending. Using the date format "MMM YYYY", DATEVALUE(qe_date_field) produces the first day of the month instead of the last, so the cell reference must be

=IF(ISNUMBER(qe_date_field),qe_date_field,EOMONTH(DATEVALUE(qe_date_field)))

--or--

=IF(ISNUMBER(qe_date_field),qe_date_field,DATEVALUE(qe_date_field)+30))


Long story short, the cell reference needed to work around this bug is both complex and fragile (being subject to the whims of source column formatting).
Comment 17 Robert Lacroix 2020-12-27 14:39:02 UTC
(In reply to Robert Lacroix from comment #16)
correction:
> --or--
> 
> =IF(ISNUMBER(qe_date_field),qe_date_field,DATEVALUE(qe_date_field)+30))
> 
> 
> Long story short, the cell reference needed to work around this bug is both
> complex and fragile (being subject to the whims of source column formatting).

Actually this alternate formula doesn't work at all for quarter ending April 30.
See? Use accounting software.
Comment 18 m_a_riosv 2020-12-28 08:12:06 UTC
*** Bug 128693 has been marked as a duplicate of this bug. ***
Comment 19 Robert Lacroix 2020-12-29 21:51:14 UTC
*** Bug 135827 has been marked as a duplicate of this bug. ***
Comment 20 Andreas Heinisch 2022-06-15 15:42:49 UTC
Proposed patch still does not fix:
If I make a pivot table of this data, grouped by year and month, the year will added a simple quotation marks before the number. 

Proposed patch fixes:
https://gerrit.libreoffice.org/c/core/+/135915
If I have already numbers, e.g. year, and I check at data field options recurring names there will be a mismatch, the original numbers are numbers and the recurring names are strings. No correct sorting is possible and Conditional Formatting will work only for the numbers, not the recurring names.
Comment 21 Commit Notification 2022-06-19 17:04:03 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/30529311ffbbe193765aff0ff353d3409fce3a3e

tdf#113002 - Add numeric flag to recurring data fields

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 22 Andreas Heinisch 2022-06-19 17:06:41 UTC
The other issue is on purpose and no idea, if we should fix it.
Comment 23 Commit Notification 2022-06-21 12:11:29 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/b013342bf7b16cb2393bd7e020707ff58f4d88de

tdf#113002 - Add numeric flag to recurring data fields

It will be available in 7.4.0.0.beta2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.