Bug 162716 - TextInputStream.ReadLine() will not always strip line-ending characters (i.e. \r or \n)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: BASIC
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: difficultyBeginner, easyHack, skillCpp
Depends on:
Blocks: Macro-StarBasic
Reported: 2024-08-30 19:33 UTC by rehierl
Modified: 2024-09-01 03:15 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
sample code to showcase the issue with TextInputStream.ReadLine() (1.53 KB, text/plain)
2024-08-30 19:35 UTC, rehierl
Description rehierl 2024-08-30 19:33:56 UTC
Description:
Assume a ReadLine(input As String) routine that creates a UNO Pipe, a TextOutputStream that writes input to the pipe, and a TextInputStream whose ReadLine() method reads the first line from that pipe.

According to https://api.libreoffice.org/docs/idl/ref/interfacecom_1_1sun_1_1star_1_1io_1_1XTextInputStream.html#a723d8ab1b4f3966a0e95043c20d8fa71, the ReadLine() function should always right-trim all line-ending characters (\r and/or \n) from the string it returns. However, that is not always the case:

input = "abc" & chr(13) 'ends in \r
ret = ReadLine(input) 'ret has length 4 - ERROR

input = "abc" & chr(10) 'ends in \n
ret = ReadLine(input) 'ret has length 4 - ERROR

In both cases (i.e. a single line-ending character at the very end) the line-ending character will be returned as the last character of the result.

input = "abc" & chr(13) & "def" 'no issue - ok
input = "abc" & chr(10) & "def" 'no issue - ok
input = "abc" & chr(13) & chr(10) 'no issue - ok
input = "abc" & chr(13) & chr(10) & "def" 'no issue - ok

In all other cases I have tried, there doesn't seem to be an issue.

The workaround is obvious: Don't forget to right-trim the returned result.


Steps to Reproduce:
As described above and by the sample code in the attachment.

Actual Results:
The returned string of TextInputStream.ReadLine() may include line ending characters.

Expected Results:
Line-ending characters should always be stripped from the returned string, as the API documentation suggests.


Reproducible: Always


User Profile Reset: No

Additional Info:
Tested the pre-installed version of LibreOffice:

Version: 7.3.7.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: de-DE (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.7-0ubuntu0.22.04.6
Calc: threaded

Tested on the current "fresh" version of LibreOffice as an AppImage,
just to confirm that this is still an issue:

Version: 24.8.0.3 (X86_64) / LibreOffice Community
Build ID: 0bdf1299c94fe897b119f97f3c613e9dca6be583
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: de-DE (en_US.UTF-8); UI: en-US
Calc: threaded

Same on Windows in a virtual machine:

Version: 24.2.5.2 (X86_64) / LibreOffice Community
Build ID: bffef4ea93e59bebbeaf7f431bb02b1a39ee8a59
CPU threads: 2; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: de-DE
Calc: threaded
Comment 1 rehierl 2024-08-30 19:35:38 UTC
Created attachment 196120
sample code to showcase the issue with TextInputStream.ReadLine()
Comment 2 Roman Kuznetsov 2024-08-31 09:01:37 UTC
Mike, I'm not sure what the behavior should be here, any opinion?
Comment 3 Mike Kaganski 2024-08-31 09:34:07 UTC
Code pointer: OTextInputStream::implReadString in https://opengrok.libreoffice.org/xref/core/io/source/TextInputStream/TextInputStream.cxx?r=36703975#160. Line 191 there is where the loop breaks in the EOF case, before the stripping code starting at line 198 has a chance to do its job.
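
For anyone picking this up as an easyHack, here is a rough, self-contained C++ sketch of the control flow described above. It is not the LibreOffice source. The function name readLineModel, its parameters and the fixEof switch are all hypothetical and exist only for illustration. The sketch assumes that a line end is confirmed only when the character after the first \r or \n is inspected, so hitting the end of input right after a single line-ending character leaves that character in the result.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified model of the reported behavior; NOT the actual
// OTextInputStream::implReadString code. Reads one line from buf, starting at
// pos, and advances pos. With fixEof == false the reported bug is reproduced.
std::string readLineModel(const std::string& buf, std::size_t& pos, bool fixEof)
{
    const std::size_t start = pos;
    std::size_t copyLen = 0;
    bool found = false;       // a complete line end has been confirmed
    bool sawLineEnd = false;  // a first '\r' or '\n' was read, not yet confirmed
    char firstLineEnd = 0;

    while (!found)
    {
        // End of input: the loop is left before a pending line end is confirmed.
        if (pos == buf.size())
            break;

        const char c = buf[pos++];
        if (sawLineEnd)
        {
            found = true;
            copyLen = pos - start - 2;  // text before the first line-end character
            const bool isLineEnd = (c == '\r' || c == '\n');
            // Put c back unless it completes a "\r\n" or "\n\r" pair.
            if (!isLineEnd || c == firstLineEnd)
                --pos;
        }
        else if (c == '\r' || c == '\n')
        {
            sawLineEnd = true;
            firstLineEnd = c;
        }
    }

    if (!found)
    {
        // Nothing confirmed: return everything that was read.
        copyLen = pos - start;
        // Assumed fix: drop a pending line-end character at end of input.
        if (fixEof && sawLineEnd)
            --copyLen;
    }
    return buf.substr(start, copyLen);
}
```

With fixEof set to false, the model returns a string of length 4 for "abc\r" and "abc\n" and "abc" for the four other inputs from the description. With fixEof set to true, all six inputs yield "abc". Whether the real fix belongs at the EOF break or in the code after the loop is for whoever takes the bug to decide.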