Bug 143183 - Format Basic function needs better description of its 'format' argument
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Blocks: HelpGaps-NewFeatures
Reported: 2021-07-04 15:59 UTC by Mike Kaganski
Modified: 2021-07-08 05:54 UTC (History)

Description Mike Kaganski 2021-07-04 15:59:05 UTC
In bug 114418, a link to Number format codes [1] was added to the Format function help page [2].

The next step should be to clarify the textual description of the 'format' argument of the function, specifically:

1. Remove the incomplete information from [2] that duplicates part of [1], and instead add another reference to [1], with a clarification that only the en-US format string elements are accepted in a Basic format string;
2. Extend the "Predefined format" section, mentioning all the accepted special Basic-only predefined formats, namely:

  * "<" (lowercase)
  * ">" (uppercase)
  * "c" (same as "General Date")
  * "n" (minute (1-2-digit))
  * "nn" (minute (2-digit))
  * "w" (weekday)
  * "y" (day of year)
  * "General Date" (short date format, optionally with "H:MM:SS AM/PM")
  * "General Number" (0.############)
  * "Currency" (depends on current locale settings)
  * "Fixed" (0.00)
  * "Standard" (@0.00)
  * "Percent" (0.00%)
  * "Scientific" (#.00E+00)
  * "Yes/No" (Yes or No, localized)
  * "True/False" (True or False, localized)
  * "On/Off" (On or Off, localized)

[1] https://help.libreoffice.org/7.2/en-US/text/shared/01/05020301.html?&DbPAR=BASIC
[2] https://help.libreoffice.org/7.2/en-US/text/sbasic/shared/03120301.html?DbPAR=BASIC
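
A minimal sketch for checking the predefined numeric names listed above, assuming it is run from the LibreOffice Basic IDE. The sub name ProbeNumericFormats and the sample value are illustrative only and are not part of LibreOffice. No output is asserted, since several of these formats depend on the locale.

```basic
Sub ProbeNumericFormats
    ' Hypothetical helper: collects what Format returns for each
    ' predefined numeric name from the list in the description.
    Dim aNames As Variant
    Dim sReport As String
    Dim i As Integer
    aNames = Array("General Number", "Currency", "Fixed", "Standard", _
                   "Percent", "Scientific", "Yes/No", "True/False", "On/Off")
    For i = LBound(aNames) To UBound(aNames)
        ' 1234.5678 is an arbitrary sample value
        sReport = sReport & aNames(i) & " => " & _
                  Format(1234.5678, aNames(i)) & Chr(10)
    Next i
    MsgBox sReport
End Sub
```

Comparing the message box against the patterns in parentheses above is one way to verify the list before it goes into the help page.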
Comment 1 Mike Kaganski 2021-07-04 16:36:01 UTC
There are more special format strings in Basic:

  * "Long Date" ("System long date format" (in fact, depends on LO locale))
  * "Medium Date" ("DD-MMM-YY")
  * "Short Date" ("System short date format" (in fact, depends on LO locale))
  * "Long Time" ("H:MM:SS AM/PM")
  * "Medium Time" (HH:MM AM/PM)
  * "Short Time" (HH:MM)
  * "ddddd" (same as "Short Date")
  * "dddddd" (same as "Long Date")
  * "ttttt" (H:MM:SS AM/PM)

(see getFormatInfo() and pFormatInfoTable)
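
The date and time names can be probed the same way. This is only a sketch: the sub name ProbeDateFormats and the sample date are arbitrary choices, and the output is not asserted because it depends on the LibreOffice locale.

```basic
Sub ProbeDateFormats
    ' Hypothetical helper: shows what Format returns for the named
    ' date/time formats and for the aliases the report equates with them.
    Dim aNames As Variant
    Dim dSample As Date
    Dim sReport As String
    Dim i As Integer
    ' Arbitrary sample: 2021-07-04 15:59:05
    dSample = DateSerial(2021, 7, 4) + TimeSerial(15, 59, 5)
    aNames = Array("General Date", "Long Date", "Medium Date", "Short Date", _
                   "Long Time", "Medium Time", "Short Time", _
                   "c", "ddddd", "dddddd", "ttttt")
    For i = LBound(aNames) To UBound(aNames)
        sReport = sReport & aNames(i) & " => " & _
                  Format(dSample, aNames(i)) & Chr(10)
    Next i
    MsgBox sReport
End Sub
```

If the lists are accurate, the "c" line should match "General Date", "ddddd" should match "Short Date", and "dddddd" should match "Long Date".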
Comment 2 Mike Kaganski 2021-07-04 21:09:53 UTC
OMG, how convoluted is it all.
There is special processing in Basic for text input.

* "!" in the first position of the format string means "take the first character only (and ignore the rest of both the input string and the format string)":

> Format("FooBar", "!!abc") => "F" (everything after initial "!" is discarded)

* "\" in the first position of the format string means "starting from this character, output one character of the input string for each character of the format string, until another "\" is found, then stop":

> Format("FooBar", "\.!?\abc") => "FooBa" (5 character output: one for each "\",
>                                       ".", "!", "?", and "\").

* Everything else in the format string just means "output the input string verbatim".
This effectively disables normal format string handling for strings that can't be converted to a number, so e.g. this doesn't produce the expected result:

> Format("BAZ", "foo@bar") => "BAZ", not the expected "fooBAZbar", nor
>                             "fooBbazAR", as VBA gives

IIUC, VBA doesn't have such a strange thing. Where does this come from? Even https://wiki.openoffice.org/w/images/c/c1/BasicGuide_OOo3.2.0.pdf doesn't mention those strange things. Should we just drop that special processing?

See printfmtstr in basic/source/sbx/sbxscan.cxx
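
The three rules above can be modelled in a few lines of Basic. This is a hypothetical sketch of the behaviour as described in this comment, not a port of printfmtstr. The names SketchTextFormat and CompareWithFormat are made up, and the handling of a missing closing "\" is an assumption.

```basic
Function SketchTextFormat(sInput As String, sFormat As String) As String
    ' Hypothetical model of the described text handling; it covers only
    ' input strings that cannot be converted to a number.
    Dim nClose As Long
    If Left(sFormat, 1) = "!" Then
        ' Rule 1: first character of the input only
        SketchTextFormat = Left(sInput, 1)
    ElseIf Left(sFormat, 1) = "\" Then
        ' Rule 2: one input character per format character, up to and
        ' including the closing backslash
        nClose = InStr(2, sFormat, "\")
        ' Assumption: without a closing backslash, use the whole format
        If nClose = 0 Then nClose = Len(sFormat)
        SketchTextFormat = Left(sInput, nClose)
    Else
        ' Rule 3: input string verbatim
        SketchTextFormat = sInput
    End If
End Function

Sub CompareWithFormat
    ' Shows the sketch result next to the real Format result
    MsgBox SketchTextFormat("FooBar", "!!abc") & " / " & Format("FooBar", "!!abc")
    MsgBox SketchTextFormat("FooBar", "\.!?\abc") & " / " & Format("FooBar", "\.!?\abc")
    MsgBox SketchTextFormat("BAZ", "foo@bar") & " / " & Format("BAZ", "foo@bar")
End Sub
```

For the three examples in this comment the sketch yields "F", "FooBa" and "BAZ". Any difference from the right-hand value in the message boxes would point to a rule that the description above does not capture.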
Comment 3 Olivier Hallot 2021-07-07 22:10:14 UTC
The printfmtstr extra codes for strings may return unexpected values if not carefully coded in Basic

msgbox Format("FooBar", "!!abc") '=> "F" (everything after initial "!" is discarded)
msgbox Format("123.456", "!") '=> returns !
msgbox Format("123.456", "!!abc") '=> returns !!abc
msgbox Format("123.abc456", "!!abc") '=> returns 1

So, if the string content is a valid number, the '!' format string returns the format string itself;
otherwise, it returns the first character of the input string.


Also the "&" format code seems to return the format string itself.
msgbox Format("ADCDEFGHIJKLMNOPQRSTUVWXYZ1234567890", "&abc")
msgbox Format(123.456, "&abc")
msgbox Format("123.456", "&abc")

all return &abc.
Comment 4 Mike Kaganski 2021-07-08 05:54:12 UTC
(In reply to Olivier Hallot from comment #3)
> Also the "&" format code seems to return the format string itself.
> msgbox Format("ADCDEFGHIJKLMNOPQRSTUVWXYZ1234567890", "&abc")
> msgbox Format(123.456, "&abc")
> msgbox Format("123.456", "&abc")
> 
> all return &abc.

No, the first one returns "ADCDEFGHIJKLMNOPQRSTUVWXYZ1234567890". In printfmtstr, "&" is handled the same way as everything else (the "default" clause in the switch). The only difference is that it also changes the position of pFmt, which is only used for the return value of the function ("the number of characters used from the format"), and that value is in fact unused in the only place where the function is called. Hence I didn't mention "&" in comment 2.

When the string can be converted to a number, it is handled as a number, using different rules.

In general, I suppose we need to drop this "printfmtstr" idiocy, as I suggested in tdf#143193.