Bug 159583 - No fallback to C locale with unsupported locale
Summary: No fallback to C locale with unsupported locale
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 7.3.7.2 release (earliest affected)
Hardware: All Linux (All)
Importance: medium minor
Assignee: Not Assigned
Reported: 2024-02-05 23:14 UTC by Tagwerk
Modified: 2024-02-22 14:25 UTC


Attachments
Test file with accent (4.66 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2024-02-05 23:14 UTC, Tagwerk

Description Tagwerk 2024-02-05 23:14:20 UTC
Created attachment 192415
Test file with accent

SUMMARY:

    LibreOffice gives a warning:

        (soffice:1159): Gtk-WARNING **: 17:18:05.556: Locale not supported by C library.
            Using the fallback 'C' locale.

    on some systems when en_SE.UTF-8 is selected for "Time" (which should
    give the ISO date format YYYY-MM-DD).

    However, not every system gives this warning. On some there is no
    fallback to the 'C' locale, and the output can be corrupt.

STEPS TO REPRODUCE:

    I'm only able to reproduce this on an "old" system that has been upgraded
    over the years. Following the same steps on a newly built system does not
    lead to the failure.

    Set up a "vanilla" locale and check it with locale (rather than localectl).

    Run "libreoffice --cat testdoc.docx | od -c | more".

    Adjust the locale to include en_SE.UTF-8, either by setting the LC_TIME
    environment variable or by choosing en_SE.UTF-8 in the KDE regional
    settings, and check again with locale.

    Repeat "libreoffice --cat testdoc.docx | od -c | more".
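
    As far as I understand it, the Gtk warning quoted in the summary is
    printed when the C library's setlocale() rejects the requested locale.
    A minimal sketch of a comparable check from Python, using only the
    standard "locale" module. The helper name is_locale_supported is
    hypothetical and not part of LibreOffice, and it probes one category
    (LC_TIME by default) rather than the whole environment:

```python
import locale


def is_locale_supported(name: str, category: int = locale.LC_TIME) -> bool:
    """Return True if the C library accepts `name` for `category`."""
    # Calling setlocale() without a locale argument only queries the current value.
    saved = locale.setlocale(category)
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        # The C library has no usable definition for this locale name.
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(category, saved)
```

    Calling is_locale_supported("en_SE.UTF-8") on both the "old" and the
    newly built system might show whether they disagree about the locale
    being installed.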

OBSERVED RESULTS:

    "libreoffice --cat ..." reads the given file and writes it to STDOUT as
    text (which should be UTF-8 encoded)

    If you are fortunate, you get:

        $ libreoffice --cat testdoc.docx | od -c
        0000000 357 273 277   A       T   e   s   t       F   i   l   e   :
        0000020   D   i   f   f   i   c   u   l   t 303 251  \n  \n

    or, if you are not so fortunate:

        $ libreoffice --cat testdoc.docx | od -c
        0000000   A       T   e   s   t       F   i   l   e   :       D   i   f
        0000020   f   i   c   u   l   t 351  \n  \n

    The latter occurs with this locale configuration:

        $ locale
        locale: Cannot set LC_ALL to default locale: No such file or directory
        LANG=en_US.UTF-8
        LANGUAGE=
        LC_CTYPE="en_US.UTF-8"
        LC_NUMERIC=en_US.UTF-8
        LC_TIME=en_SE.UTF-8
        LC_COLLATE="en_US.UTF-8"
        LC_MONETARY=en_US.UTF-8
        LC_MESSAGES="en_US.UTF-8"
        LC_PAPER=en_US.UTF-8
        LC_NAME=en_US.UTF-8
        LC_ADDRESS=en_US.UTF-8
        LC_TELEPHONE=en_US.UTF-8
        LC_MEASUREMENT=en_US.UTF-8
        LC_IDENTIFICATION=en_US.UTF-8
        LC_ALL=
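
    To make the difference between the two dumps explicit, here is a small
    sketch that rebuilds both byte sequences from the octal values shown by
    "od -c" and inspects them. The function name describe is hypothetical.
    Decoding the second sequence as Latin-1 is an assumption: byte 351 (0xE9)
    is "é" in ISO-8859-1, but the report does not establish which single-byte
    encoding was actually used.

```python
# Byte values copied from the two `od -c` dumps (octal escapes).
good = b"\357\273\277A Test File: Difficult\303\251\n\n"
bad = b"A Test File: Difficult\351\n\n"

UTF8_BOM = b"\xef\xbb\xbf"  # octal 357 273 277


def describe(data: bytes) -> str:
    """Summarise whether `data` starts with a BOM and decodes as UTF-8."""
    bom = "BOM" if data.startswith(UTF8_BOM) else "no BOM"
    try:
        data.decode("utf-8")
        validity = "valid UTF-8"
    except UnicodeDecodeError:
        validity = "not valid UTF-8"
    return f"{bom}, {validity}"


print(describe(good))
print(describe(bad))
# Assumption: interpret the stray byte 0xE9 as ISO-8859-1 (Latin-1).
print(repr(bad.decode("latin-1")))
```

    The first dump carries a UTF-8 byte order mark and encodes "é" as the
    two bytes 303 251, while the second has no BOM and a single byte 351,
    which is not a valid UTF-8 sequence on its own.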
    
    You can test whether the output is proper UTF-8 by piping it into "iconv":

        $ libreoffice --cat testdoc.docx | iconv -f UTF-8 -t UTF-8
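
    The same check can be scripted so that a caller fails loudly instead of
    passing corrupt text along. This is only a sketch of a possible
    workaround, not something LibreOffice provides; the function name
    run_and_require_utf8 is hypothetical, and it assumes the command writes
    its text to STDOUT:

```python
import subprocess
import sys


def run_and_require_utf8(cmd: list[str]) -> str:
    """Run `cmd` and return its stdout as text.

    Raises UnicodeDecodeError if the output is not valid UTF-8, and
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    # "utf-8-sig" strips a leading BOM if present, otherwise behaves like "utf-8".
    return result.stdout.decode("utf-8-sig")
```

    For example, run_and_require_utf8(["libreoffice", "--cat", "testdoc.docx"])
    would return the text on a system that produces the first dump and raise
    UnicodeDecodeError on one that produces the second.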

WISHED FOR RESULTS:

    This issue likely depends on the update history of the underlying
    system. However, the text conversion should *fail* rather than output
    "corrupt" UTF-8.

    It may be that this extra bulletproofing has been done in later releases.
    It would be nice to have confirmation.

VERSION:

    LibreOffice 7.3.7.2 30(Build2)
    in Neon (on Ubuntu 22.04 LTS)
Comment 1 Buovjaga 2024-02-22 14:25:48 UTC
(In reply to Tagwerk from comment #0)
>     It may be that this extra bulletproofing has been done in later releases,
>     it would be nice to have confirmation

You can always quickly check with an AppImage: https://www.libreoffice.org/download/appimage/