Bug 159204 - Import of a text .srt file skips characters between ellipses
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.6.4.1 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Format-Filters
Reported: 2024-01-15 21:08 UTC by Eltomito
Modified: 2024-12-14 18:13 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
File to recreate the bug. (137 bytes, text/plain)
2024-01-15 21:10 UTC, Eltomito

Description Eltomito 2024-01-15 21:08:13 UTC
Description:
When I open a .srt file (which is really just plain text) with LO, the result seems normal except that all text after the first (and every odd) ellipsis in the file is dropped (not imported at all) until the next ellipsis is encountered.


Steps to Reproduce:
1. Create a plain text file with the following content and save it as "something.srt":
---- FILE CONTENT ----
This is imported…

THIS IS NOT IMPORTED

Neither is this…

And this is imported again.

What about this… BLAH BLAH… Is it there?
---- END OF FILE CONTENT ----
2. Open the file with LibreOffice Writer
3. Behold the text with the parts between the ellipses missing (it's the text in bold, for convenience)
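
For testers who prefer to generate the file rather than copy and paste it, here is a minimal sketch that writes the content from step 1 to disk. It assumes UTF-8 encoding and uses the single-character ellipsis (U+2026), not three periods. The function name `write_repro_file` is made up for this illustration.

```python
from pathlib import Path

# Same text as in step 1; "\u2026" is the single-character ellipsis (U+2026).
CONTENT = (
    "This is imported\u2026\n"
    "\n"
    "THIS IS NOT IMPORTED\n"
    "\n"
    "Neither is this\u2026\n"
    "\n"
    "And this is imported again.\n"
    "\n"
    "What about this\u2026 BLAH BLAH\u2026 Is it there?\n"
)


def write_repro_file(path="something.srt"):
    """Write the test content as UTF-8 (an assumption) and return the path."""
    target = Path(path)
    target.write_text(CONTENT, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_repro_file())
```

Opening the resulting file from a file manager with LibreOffice Writer should then correspond to step 2.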


Actual Results:
The parts between adjacent pairs of ellipses are missing.

Expected Results:
The parts between adjacent pairs of ellipses should NOT be missing.
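
To make "parts between adjacent pairs of ellipses" concrete, the following sketch models the symptom as reported. It is only a model of the description, not LibreOffice's import code, and the helper name `split_kept_and_dropped` is hypothetical. The report does not say what happens to the ellipsis characters themselves or to text after an unpaired final ellipsis, so the sketch only separates the segments and treats a trailing unpaired segment as dropped (an assumption).

```python
ELLIPSIS = "\u2026"  # U+2026, the single-character ellipsis


def split_kept_and_dropped(text):
    """Split text at every ellipsis and sort the segments into two lists.

    Segments following the 1st, 3rd, 5th, ... ellipsis are the ones the
    report describes as missing; all other segments are kept.
    """
    segments = text.split(ELLIPSIS)
    kept = segments[0::2]      # before the 1st ellipsis, after the 2nd, ...
    dropped = segments[1::2]   # after the 1st ellipsis, after the 3rd, ...
    return kept, dropped


if __name__ == "__main__":
    sample = "What about this\u2026 BLAH BLAH\u2026 Is it there?"
    kept, dropped = split_kept_and_dropped(sample)
    print("kept:", kept)
    print("dropped:", dropped)
```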


Reproducible: Always


User Profile Reset: Yes

Additional Info:
Version: 7.6.4.1 (X86_64) / LibreOffice Community
Build ID: e19e193f88cd6c0525a17fb7a176ed8e6a3e2aa1
CPU threads: 12; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Eltomito 2024-01-15 21:10:05 UTC
Created attachment 191965
File to recreate the bug.

This file can be used to demonstrate the bug as described in the original bug report.
Comment 2 Eltomito 2024-01-15 21:10:50 UTC
Find the something.srt file attached for testing.
Comment 3 m_a_riosv 2024-01-15 21:39:18 UTC
You need to select the filter:
In the box to the right of the file name, select 'Text (*.txt)'.
In the file name field, enter: *.srt
Select the file.
Comment 4 Eltomito 2024-01-15 22:52:49 UTC
(In reply to m_a_riosv from comment #3)
> You need to select the filter:
> In the box to the right of the file name, select 'Text (*.txt)'.
> In the file name field, enter: *.srt
> Select the file.

Maybe that works, but if you just right-click the file in a file manager and open it with LO, then LO automatically does something wrong, and that's what this bug report is about.
Comment 5 m_a_riosv 2024-01-16 10:53:04 UTC
Reproduced
Version: 7.6.4.1 (X86_64) / LibreOffice Community
Build ID: e19e193f88cd6c0525a17fb7a176ed8e6a3e2aa1
CPU threads: 16; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded

But that file doesn't look like an .srt file, so the extension doesn't matter here.
Comment 6 Eltomito 2024-01-16 11:04:04 UTC
Okay, let me explain. I first discovered this ellipsis problem on valid .srt files (with timecodes and subtitle numbers and all) and then I investigated further and found out that this bug shows when:

1) The extension of the plain text file is not .txt and LO doesn't recognize the extension. The file can be called something.whatever (tried and confirmed), but if you call it something.txt, the bug doesn't happen.

2) There are ellipsis characters in the file (U+2026).
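
The two conditions above can be sketched as a quick pre-check on a file. This is only an approximation: "LO doesn't understand the extension" is reduced here to "the extension is not .txt", and the file is assumed to be UTF-8. The function name `may_trigger_bug` is made up for this illustration.

```python
from pathlib import Path

ELLIPSIS = "\u2026"  # U+2026, as named in condition 2


def may_trigger_bug(path):
    """Return True if the file matches both reported conditions.

    Condition 1 is approximated as "extension is not .txt" (an assumption);
    condition 2 is the presence of at least one U+2026 character.
    """
    p = Path(path)
    has_txt_extension = p.suffix.lower() == ".txt"
    text = p.read_text(encoding="utf-8")  # UTF-8 is an assumption
    return (not has_txt_extension) and ELLIPSIS in text
```

For example, `may_trigger_bug("something.srt")` would flag the attached test file, while the same content saved as `something.txt` would not be flagged.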

The bug matters to me, because it's useful for me to send valid .srt files as .docx or .odt to proofreaders who have no subtitling software and want to read the text in a regular document. I want to keep the srt timecodes there, because when the proofreader is done with it, I can import the content of the corrected .docx straight back into the subtitling software.

The easiest way to save a .srt as a .docx is to right-click the .srt file, open it in LO and save it as .docx. But this bug messes up this workflow.
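
Until the import is fixed, one possible stopgap for this workflow, based only on the observation above that a .txt extension does not trigger the bug, is to open a .txt copy of the subtitle file instead. The sketch below is a workaround idea, not a fix, and the function name `copy_as_txt` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_as_txt(srt_path):
    """Copy a subtitle file next to itself with a .txt extension.

    Refuses to overwrite an existing .txt file and returns the new path.
    """
    src = Path(srt_path)
    dst = src.with_suffix(".txt")
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    print(copy_as_txt("something.srt"))
```

The copy can then be opened in LO and saved as .docx or .odt, leaving the original .srt untouched.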