Bug 159575 - Add --page (or similar) command line argument for opening and converting files/text extraction, and search argument for opening
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.5.9.2 release
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Reported: 2024-02-05 16:32 UTC by Darren Li
Modified: 2024-03-08 03:15 UTC

Description Darren Li 2024-02-05 16:32:28 UTC
Description:
This request is primarily about allowing better integration of Writer with
search tools like Recoll, ripgrep-all, etc. I admit this is likely a niche use
case.

Right now, there are the following issues with the command line interface (to
the best of my knowledge):
- Text extraction via --text (or the equivalent --convert-to) does not emit
  separators between pages, so an indexer has no way to collect page numbers
  during indexing.
    - This is possible for PDFs, as pdftotext from poppler-utils inserts
      separators between pages and also has page range arguments, allowing a
      page-by-page text dump.
    - I find the second option (explicit page number command line arguments) to
      be clearer about what behaviour to expect. For instance, I don't know what
      happens if the separator sequence appears in the PDF content: does
      pdftotext escape the sequence?
- No way to ask Writer to jump to a specific page when opening a document file.
    - This is possible for PDFs, as most PDF viewers accept an argument for the
      page to open at.
    - Passing a page number when launching a PDF viewer is the primary way for
      search tools to make the user experience as smooth as possible.
- No way to ask Writer to begin a search.
    - This is possible for PDFs, as most PDF viewers accept a search argument at
      launch.
    - This goes a step further than the above, but some search tools (namely
      Recoll) try to jump to the search result (or as close as possible) by
      also picking a word to search for, in addition to picking a page to go to.
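
For reference, the pdftotext behaviour mentioned in the first item can be sketched in Python. This is only a rough sketch, assuming pdftotext from poppler-utils is on the PATH; the helper names (split_pages, pdftotext_page_cmd, extract_page) are made up for illustration. As far as I know, pdftotext marks page breaks with a form feed character and takes -f / -l for the first and last page to convert.

```python
import subprocess

FORM_FEED = "\f"  # pdftotext writes a form feed at each page break


def split_pages(text):
    """Split a pdftotext-style dump into a list of per-page strings."""
    pages = text.split(FORM_FEED)
    # A form feed after the last page leaves an empty trailing entry; drop it.
    if pages and pages[-1] == "":
        pages.pop()
    return pages


def pdftotext_page_cmd(pdf_path, page):
    """Build the command that dumps exactly one page to stdout ("-")."""
    return ["pdftotext", "-f", str(page), "-l", str(page), pdf_path, "-"]


def extract_page(pdf_path, page):
    """Run pdftotext for a single page and return its text."""
    result = subprocess.run(
        pdftotext_page_cmd(pdf_path, page),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling extract_page once per page sidesteps the question of a separator appearing inside the content, at the cost of one process launch per page; split_pages is the cheaper single-pass alternative.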

Suggested additions:
- Page number arguments for start and end page range for text extraction (or
  conversion in general).
- Page number argument to open to at launch.
- Argument to specify a string to search for at launch. This argument would be
  processed after the page number argument above.
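
To illustrate what an indexer could do once per-page text is available, here is a minimal Python sketch. It assumes the pages have already been extracted into a list of strings (which, to my knowledge, Writer's command line cannot provide today); build_page_index and first_hit are hypothetical names, not part of any existing tool.

```python
import re
from collections import defaultdict


def build_page_index(pages):
    """Map each lowercased word to the sorted list of 1-based pages containing it."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def first_hit(index, term):
    """Return the first page containing term, or None if the term is not indexed."""
    hits = index.get(term.lower())
    return hits[0] if hits else None
```

A search tool would then pass the page returned by first_hit, together with the search term, to whatever "open at page" and "search for" arguments the viewer offers.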

Steps to Reproduce:
N/A

Actual Results:
N/A

Expected Results:
N/A


Reproducible: Always


User Profile Reset: No

Additional Info:
N/A
Comment 1 Dieter 2024-03-07 18:15:35 UTC
Darren, thank you for your ideas (sorry, but I don't understand it, because, as you've written, it's a niche use case). Please make sure that the problem still exists in the current version, LO 24.2.

Not sure if it should be the aim of LO to fully interact with a lot of other programs like the search tools you've mentioned. What about the AltSearch extension? Does it fit your needs?
Comment 2 Darren Li 2024-03-07 22:55:52 UTC
> Please make sure that the problem still exists in the current version, LO 24.2.

As far as I know, there is no way to specify page numbers via command line in LO 24.2 either.

> Not sure if it should be the aim of LO to fully interact with a lot of other programs like the search tools you've mentioned.

Yep, very fair; that's understandable.

> What about the AltSearch extension? Does it fit your needs?

Unfortunately no, but writing an extension is an idea I had not thought about, so I might actually explore that path. Thanks!