Description:
I converted a 1.5 MB PDF file to DOCX from the command line and it worked without any problem: it took around 3 s and I was satisfied with the result. I then wanted to test the application with a larger file of about 20 MB. I waited for about 10 minutes and the conversion had not finished, and the terminal crashed when I tried to terminate the process.
Actual Results:
After waiting about 10 minutes with nothing happening, I closed the terminal, and the application crashed.
Expected Results:
A large PDF is converted to DOCX within a reasonable time.
Reproducible: Always
User Profile Reset: No
Additional Info:
I have Windows 11 Pro, 16 GB RAM, 13th Gen Intel(R) Core(TM) i7-13620H 2.40 GHz.
I don't know whether this is related to memory, or why the file was not converted within a reasonable time. I tried to convert this dummy file for testing: https://examplefile.com/document/pdf/20-mb-pdf#google_vignette
I also don't know whether I can change anything under options->advanced->... to help, or what the reason for the crash is.
Created attachment 199335
This is the file that I was trying to convert to DOCX.
This is the command that I ran:
"C:\\Program Files\\LibreOffice\\program\\soffice.exe" --headless --convert-to docx --infilter="writer_pdf_import" "C:\\Users\\user\\Downloads\\20mb.pdf" --outdir "C:\\Users\\user\\Desktop\\file-convertor\\uploads"
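If the conversion is driven from a script, the command above could be wrapped with a timeout so that a stuck conversion is stopped instead of blocking the terminal. This is only a sketch: the function names and the 600 s default are made up for illustration, and only the soffice options already shown in the command are used. On Windows the launcher may start a separate worker process that survives the timeout, so this may not be sufficient on its own.

```python
import subprocess


def build_convert_command(soffice, pdf, outdir):
    """Build the argv for a headless PDF -> DOCX conversion.

    Uses only the options from the command in this report.
    """
    return [
        str(soffice),
        "--headless",
        "--convert-to", "docx",
        "--infilter=writer_pdf_import",
        str(pdf),
        "--outdir", str(outdir),
    ]


def convert_with_timeout(soffice, pdf, outdir, timeout_s=600):
    """Return True if soffice exited in time, False if the timeout was hit.

    timeout_s is an arbitrary illustrative default, not a recommended value.
    subprocess.run() kills the process it started when the timeout expires.
    """
    try:
        subprocess.run(
            build_convert_command(soffice, pdf, outdir),
            check=True,
            timeout=timeout_s,
        )
        return True
    except subprocess.TimeoutExpired:
        return False
```

Calling `convert_with_timeout(r"C:\Program Files\LibreOffice\program\soffice.exe", r"C:\Users\user\Downloads\20mb.pdf", r"C:\Users\user\Desktop\file-convertor\uploads")` would then return False after 10 minutes instead of hanging.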
I tested with other applications, e.g. GIMP and Word, and they also take a long time. It is a long PDF with a lot of tables.
Here is some more information, from inside C:\Users\user\AppData\Roaming\LibreOffice\4\crash:
ProductName=LibreOffice
Version=25.2.0.3
BuildID=e1cf4a87eb02d755bce1a01209907ea5ddc8f069
URL=https://crashreport.libreoffice.org/submit/
UseSkia=true
Language=en-US
CPUModelName=13th Gen Intel(R) Core(TM) i7-13620H
CPUFlags=sse3 pclmulqdq monitor ssse3 fma cpmxch16b sse41 sse42 movbe popcnt aes xsave osxsave avx f16c rdrand msr cx8 sep cmov clfsh mmx fxsr sse sse2 ht fsgsbase bmi1 avx2 bmi2 erms invpcid rdseed adx sha lahf abm syscall rdtscp
MemoryTotal=16462712 kB
ShutDown=true
I shrank it down to 1000 pages with:
qpdf --empty --pages ./20mb.pdf 1-1000 -- ./20mb_1000p.pdf
Conversion time on my machine:
real 4m59,879s
user 5m2,320s
sys 0m0,538s
Arch Linux 64-bit
Version: 26.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 38121c6f0208f9db0a6d69e33efc7d1eec0aae31
CPU threads: 8; OS: Linux 6.17; UI render: default; VCL: gtk3
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: CL threaded
Built on 30 October 2025
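To see how the conversion time grows with page count, the same qpdf invocation could be generated for several sizes. A rough sketch, where the helper names and the `<stem>_<N>p.pdf` output naming are assumptions modelled on the command above:

```python
from pathlib import Path


def qpdf_slice_command(src, pages, dst):
    """Build the qpdf argv that copies pages 1..pages of src into dst."""
    return ["qpdf", "--empty", "--pages", str(src), f"1-{pages}", "--", str(dst)]


def slice_plan(src, page_counts):
    """Return one qpdf command per requested page count.

    Output files are named <stem>_<N>p.pdf next to the source file.
    """
    src = Path(src)
    plan = []
    for n in page_counts:
        dst = src.with_name(f"{src.stem}_{n}p.pdf")
        plan.append(qpdf_slice_command(src, n, dst))
    return plan
```

Each command in `slice_plan("20mb.pdf", [250, 500, 1000])` can be passed to `subprocess.run(cmd, check=True)`, and the resulting files timed one by one.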
Created attachment 203740
Perf flamegraph
I recorded a perf trace for a version cut down to the first 500 pages. A lot of the work is being done in lcl_GetUniqueFlyName().
Arch Linux 64-bit
Version: 26.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: d0d81540a7f1e4b2c0b7a305f9f64c518edec8c3
CPU threads: 8; OS: Linux 6.17; UI render: default; VCL: gtk3
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: CL threaded
Built on 5 November 2025
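For illustration only, here is one general way a unique-name helper can become a hotspot when a document contains thousands of frames. This is not LibreOffice's code and I have not checked that lcl_GetUniqueFlyName() behaves this way. The sketch assumes a helper that rescans every existing name on each call, and the "Frame" prefix and all function names are hypothetical.

```python
def unique_name_scan(existing, prefix="Frame"):
    """Rescan every existing name on each call: O(n) per new name."""
    used = set()
    for name in existing:
        suffix = name[len(prefix):]
        if name.startswith(prefix) and suffix.isdigit():
            used.add(int(suffix))
    n = 1
    while n in used:  # find the first unused number
        n += 1
    return f"{prefix}{n}"


class NameCounter:
    """Remember the next free number: O(1) per new name."""

    def __init__(self, prefix="Frame"):
        self.prefix = prefix
        self.next_number = 1

    def next_name(self):
        name = f"{self.prefix}{self.next_number}"
        self.next_number += 1
        return name


def build_names_scan(count):
    """Create `count` names with the rescanning helper: O(count^2) overall."""
    names = []
    for _ in range(count):
        names.append(unique_name_scan(names))
    return names


def build_names_counter(count):
    """Create `count` names with the counter: O(count) overall."""
    counter = NameCounter()
    return [counter.next_name() for _ in range(count)]
```

Both variants produce the same names, but the rescanning one does work proportional to the square of the number of objects, which is the sort of cost that would only become visible on a very long document.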
This has actually gotten much better recently.
In the oldest commit of the 7.6 Linux bibisect repo, with the 500-page version I get:
real 3m38,753s
user 3m35,717s
sys 0m0,745s
In the master commit of the 25.2 Linux bibisect repo, I get:
real 1m37,381s
user 1m38,233s
sys 0m0,343s
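To put a number on the improvement, the two `real` values above can be converted to seconds and compared. A small sketch; the parser is an illustrative helper and only handles the `XmY,ZZZs` form shown here (with a comma or a dot as the decimal separator):

```python
import re


def parse_time(value):
    """Convert a `time` value such as '3m38,753s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


old = parse_time("3m38,753s")  # oldest commit of the 7.6 bibisect repo
new = parse_time("1m37,381s")  # master commit of the 25.2 bibisect repo
speedup = old / new
print(f"{old:.3f} s -> {new:.3f} s, {speedup:.2f}x faster")
```

That is 218.753 s against 97.381 s, so the 500-page file converts about 2.25 times faster in the 25.2 repo than in the oldest 7.6 commit.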