Bug 154628 - XML Form Document: Sending data with GET fires very often
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 7.5.2.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.6.0
Keywords:
Depends on:
Blocks: XML_Form
Reported: 2023-04-05 17:34 UTC by Robert Großkopf
Modified: 2023-05-17 19:50 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Zip package contains a form, a *.php-file and two files for data (17.79 KB, application/zip)
2023-05-15 18:54 UTC, Robert Großkopf
Description Robert Großkopf 2023-04-05 17:34:02 UTC
I know this can't be reproduced without a running webserver. So I don't know how to offer a good example.

I have created an XML form document.
Set the submission method to "GET".
Connected it to a PHP file on my local web server that receives the data from the XML form document.

When I send data, the server sees REQUEST_METHOD "HEAD" once and REQUEST_METHOD "GET" 3 times. So the data is saved 4 times with LO 7.4 and LO 7.5. As a workaround I had to set the PHP script to save only when REQUEST_METHOD is "HEAD".
Comment 1 Julien Nabet 2023-05-15 18:07:13 UTC
Would it be possible for you to attach a minimal PHP file so we can try to reproduce this? (Of course, it will still require Apache or another web server.)
Comment 2 Robert Großkopf 2023-05-15 18:54:18 UTC
Created attachment 187304
Zip package contains a form, a *.php-file and two files for data

The form has several buttons for sending data. Only the button
Daten > Server > Get
is needed.
Submission → Server_Get → Action must be changed to point to your (local) web server.

The PHP file must be placed in a path served by the (local) web server.
Next to this file, create a subfolder called /data. It should be writable by all (or by wwwrun).
The two files "Daten_einfach_formatiert.xml" and "Daten_einfach_formatiert.xsl" should be copied into this folder. The *.xml file should be writable by all.

After sending data to the *.php file, have a look at the apache2 log file: there is more than one server request. You can react to only one of these requests by testing
IF ($_SERVER['REQUEST_METHOD'] == "HEAD")
because this method appears only once. But why do all the other requests appear?
Have a look at the *.xml file. Every time you send new data there will be about 4 new entries.
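The workaround above can be sketched outside PHP as follows. This is only an illustration in Python, not the script from the attachment: the function name handle_request, the store_on parameter, and the in-memory records list are hypothetical, and the request sequence (one HEAD followed by three GETs) is the one reported in the description.

```python
from urllib.parse import parse_qsl


def handle_request(method, query_string, records, store_on="GET"):
    """Append one record only when the request method matches store_on.

    Hypothetical stand-in for the check on $_SERVER['REQUEST_METHOD'].
    """
    if method.upper() != store_on:
        return False
    # parse_qsl decodes "+" to a space and skips the empty field
    # produced by the trailing "&" in the query string.
    records.append(dict(parse_qsl(query_string, keep_blank_values=True)))
    return True


records = []
# Sequence reported for one click: HEAD once, GET three times.
for method in ["HEAD", "GET", "GET", "GET"]:
    handle_request(method, "Vorname=Rob&Nachname=van+Eden&", records, store_on="HEAD")

print(len(records))
print(records[0])
```

With store_on="HEAD" only one of the four requests adds a record, which is the effect of the IF line above. It hides the duplicates on the server side and does not explain why they are sent.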
Comment 3 Julien Nabet 2023-05-15 19:40:07 UTC
Thank you Robert for the zip file.

I gave it a try, and when clicking on Daten > Server > Get, I got this line 3 times:
<Name><Vorname>Rob</Vorname><Nachname>van Eden</Nachname></Name>^M

and this in logs:
127.0.0.1 - - [15/May/2023:21:34:07 +0200] "HEAD /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 128 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
127.0.0.1 - - [15/May/2023:21:34:07 +0200] "GET /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 178 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
127.0.0.1 - - [15/May/2023:21:34:07 +0200] "GET /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 178 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"

(I created a local link for Apache2 named "robert", which is more straightforward than opening the odt file to change the path :-)).


Just once, I got 4 requests:

127.0.0.1 - - [15/May/2023:21:32:18 +0200] "OPTIONS /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 178 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
127.0.0.1 - - [15/May/2023:21:32:18 +0200] "HEAD /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 128 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
127.0.0.1 - - [15/May/2023:21:32:18 +0200] "GET /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 178 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
127.0.0.1 - - [15/May/2023:21:32:18 +0200] "GET /robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden& HTTP/1.1" 200 178 "-" "LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"
but I don't know how to reproduce this last case.
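To compare runs, the requests per method can be counted straight from the access log. A minimal sketch in Python, assuming log lines shaped like the ones above; the name count_methods and the regular expression are illustrative, and the sample lines are the three from the first log excerpt.

```python
import re
from collections import Counter

# Matches the quoted request part, e.g. "GET /path?query HTTP/1.1".
REQUEST_RE = re.compile(r'"([A-Z]+) \S+ HTTP/[\d.]+"')


def count_methods(log_lines):
    """Count requests per HTTP method in access-log lines."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


AGENT = '"LibreOffice 7.6.0.0 denylistedbackend/8.0.1 NSS/3.87.1"'
PATH = "/robert/xmldata_get.php?Vorname=Rob&Nachname=van+Eden&"
sample = [
    f'127.0.0.1 - - [15/May/2023:21:34:07 +0200] "HEAD {PATH} HTTP/1.1" 200 128 "-" {AGENT}',
    f'127.0.0.1 - - [15/May/2023:21:34:07 +0200] "GET {PATH} HTTP/1.1" 200 178 "-" {AGENT}',
    f'127.0.0.1 - - [15/May/2023:21:34:07 +0200] "GET {PATH} HTTP/1.1" 200 178 "-" {AGENT}',
]

counts = count_methods(sample)
for method, number in sorted(counts.items()):
    print(method, number)
```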

Anyway, what would be the goal to reach here:
- send HEAD once, send GET once and add one record in xml file?
- just send GET once and add one record in xml file?
- other?
Comment 4 Robert Großkopf 2023-05-15 19:59:10 UTC
(In reply to Julien Nabet from comment #3)
> 
> Anyway, what would be the goal to reach here:
> - send HEAD once, send GET once and add one record in xml file?
> - just send GET once and add one record in xml file?
> - other?

I don't know much about these commands, but with the other connections I have used (PHP-Apache-MySQL/MariaDB) the GET command should be sent only once. At the moment I get 3 or 4 new entries. I have seen that HEAD is sent only once, so I changed the *.php file to read the content from the HEAD request. But I don't know if this is the right behavior.

Maybe this helps:
https://www.rfc-editor.org/rfc/rfc2616#section-9.3
It seems it would help to use only HEAD.

I will set this one to NEW (comment 3)
Comment 5 Julien Nabet 2023-05-16 19:37:20 UTC
Here are 2 backtraces retrieved from 1 click:
#0  http_dav_ucp::Content::open(com::sun::star::ucb::OpenCommandArgument3 const&, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&)
    (this=0x561e53e48450, rArg=..., xEnv=uno::Reference to (CCommandEnvironmentHelper *) 0x561e548cce88) at ucb/source/ucp/webdav-curl/webdavcontent.cxx:2201
#1  0x00007f6d984d4ad2 in http_dav_ucp::Content::execute(com::sun::star::ucb::Command const&, int, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&)
    (this=0x561e53e48450, aCommand=..., Environment=uno::Reference to (CCommandEnvironmentHelper *) 0x561e548cce88) at ucb/source/ucp/webdav-curl/webdavcontent.cxx:539
#2  0x00007f6d984e726d in non-virtual thunk to http_dav_ucp::Content::execute(com::sun::star::ucb::Command const&, int, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&) ()
    at /home/julien/lo/libreoffice/instdir/program/../program/libucpdav1.so
#3  0x00007f6dd9d51947 in ucbhelper::Content_Impl::executeCommand(com::sun::star::ucb::Command const&) (this=0x561e53be4150, rCommand=...) at ucbhelper/source/client/content.cxx:1264
#4  0x00007f6dd9d5480b in ucbhelper::Content::openStream(com::sun::star::uno::Reference<com::sun::star::io::XOutputStream> const&)
    (this=0x7ffe1b880810, rStream=uno::Reference to (io_stm::(anonymous namespace)::OPipeImpl *) 0x561e546391c8) at ucbhelper/source/client/content.cxx:825
#5  0x00007f6d9b6674e1 in CSubmissionGet::submit(com::sun::star::uno::Reference<com::sun::star::task::XInteractionHandler> const&)
    (this=0x561e4fc22880, aInteractionHandler=uno::Reference to (svxform::FormController *) 0x561e53d71388) at forms/source/xforms/submission/submission_get.cxx:84

+

#0  http_dav_ucp::Content::open(com::sun::star::ucb::OpenCommandArgument3 const&, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&)
    (this=0x561e53e48450, rArg=..., xEnv=uno::Reference to (CCommandEnvironmentHelper *) 0x561e548cce88) at ucb/source/ucp/webdav-curl/webdavcontent.cxx:2267
#1  0x00007f6d984d4ad2 in http_dav_ucp::Content::execute(com::sun::star::ucb::Command const&, int, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&)
    (this=0x561e53e48450, aCommand=..., Environment=uno::Reference to (CCommandEnvironmentHelper *) 0x561e548cce88) at ucb/source/ucp/webdav-curl/webdavcontent.cxx:539
#2  0x00007f6d984e726d in non-virtual thunk to http_dav_ucp::Content::execute(com::sun::star::ucb::Command const&, int, com::sun::star::uno::Reference<com::sun::star::ucb::XCommandEnvironment> const&) ()
    at /home/julien/lo/libreoffice/instdir/program/../program/libucpdav1.so
#3  0x00007f6dd9d51947 in ucbhelper::Content_Impl::executeCommand(com::sun::star::ucb::Command const&) (this=0x561e53be4150, rCommand=...) at ucbhelper/source/client/content.cxx:1264
#4  0x00007f6dd9d53860 in ucbhelper::Content::openStream() (this=0x7ffe1b880810) at ucbhelper/source/client/content.cxx:709
#5  0x00007f6d9b667593 in CSubmissionGet::submit(com::sun::star::uno::Reference<com::sun::star::task::XInteractionHandler> const&)
    (this=0x561e4fc22880, aInteractionHandler=uno::Reference to (svxform::FormController *) 0x561e53d71388) at forms/source/xforms/submission/submission_get.cxx:88

In both backtraces, frame 0 calls a GET function.

The 2 calls come from forms/source/xforms/submission/submission_get.cxx,
lines 84 and 88:
     83         css::uno::Reference< XOutputStream > aPipe( css::io::Pipe::create(m_xContext), UNO_QUERY_THROW );
     84         if (!aContent.openStream(aPipe))
     85             return UNKNOWN_ERROR;
     86         // get reply
     87         try {
     88             m_aResultStream = aContent.openStream();
     89         } catch (const Exception&) {
     90             OSL_FAIL("Cannot open reply stream from content");
     91         }

So perhaps we may just avoid the first call and use this patch:

diff --git a/forms/source/xforms/submission/submission_get.cxx b/forms/source/xforms/submission/submission_get.cxx
index ae630b504b0c..1ddcd529ef1a 100644
--- a/forms/source/xforms/submission/submission_get.cxx
+++ b/forms/source/xforms/submission/submission_get.cxx
@@ -24,7 +24,6 @@
 #include <rtl/strbuf.hxx>
 #include <osl/diagnose.h>
 #include <ucbhelper/content.hxx>
-#include <com/sun/star/io/Pipe.hpp>
 #include <com/sun/star/task/InteractionHandler.hpp>
 #include <comphelper/diagnose_ex.hxx>
 
@@ -80,9 +79,6 @@ CSubmission::SubmissionResult CSubmissionGet::submit(const css::uno::Reference<
         }
         OUString aQueryURL = OStringToOUString(aUTF8QueryURL, RTL_TEXTENCODING_UTF8);
         ucbhelper::Content aContent(aQueryURL, aEnvironment, m_xContext);
-        css::uno::Reference< XOutputStream > aPipe( css::io::Pipe::create(m_xContext), UNO_QUERY_THROW );
-        if (!aContent.openStream(aPipe))
-            return UNKNOWN_ERROR;
         // get reply
         try {
             m_aResultStream = aContent.openStream();

Michael: does it seem reasonable?

Robert: if the patch is ok, it would allow you to just use:
"IF ($_SERVER['REQUEST_METHOD'] == "GET")" in xmldata_get.php
Without this line, you'll have:
- 1 line for OPTIONS (which is called when clicking on the macro at least once +  after some delay)
- 1 line for HEAD (which is called for each click as you already noticed)
so 2 (HEAD+GET) or 3 lines (OPTIONS+HEAD+GET) at each click.
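The effect of the patch on the number of GET requests can be modelled with a toy example. This is a deliberately simplified sketch in Python, not LibreOffice code: FakeContent and both submit functions are hypothetical, and each open_stream call is assumed to cause exactly one GET, leaving out the OPTIONS and HEAD requests mentioned above.

```python
class FakeContent:
    """Hypothetical stand-in for ucbhelper::Content that records requests."""

    def __init__(self):
        self.requests = []

    def open_stream(self, sink=None):
        # Assumption: every openStream variant ends in one GET.
        self.requests.append("GET")
        reply = b"reply"
        if sink is not None:
            sink.extend(reply)
            return True
        return reply


def submit_before_patch(content):
    # Models line 84: openStream(aPipe), whose data is never read again.
    if not content.open_stream(sink=bytearray()):
        return None
    # Models line 88: openStream() that provides the result stream.
    return content.open_stream()


def submit_after_patch(content):
    # Only the call that provides the result stream is left.
    return content.open_stream()


before = FakeContent()
submit_before_patch(before)
after = FakeContent()
submit_after_patch(after)
print(before.requests, after.requests)
```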
Comment 6 Julien Nabet 2023-05-16 19:39:30 UTC
I've submitted this patch on gerrit:
https://gerrit.libreoffice.org/c/core/+/151852

it may be more practical to discuss about it.
Comment 7 Robert Großkopf 2023-05-17 06:18:19 UTC
(In reply to Julien Nabet from comment #5)

> 
> Robert: if the patch is ok, it would allow you to just use:
> "IF ($_SERVER['REQUEST_METHOD'] == "GET")" in xmldata_get.php
> Without this line, you'll have:
> - 1 line for OPTIONS (which is called when clicking on the macro at least
> once +  after some delay)
> - 1 line for HEAD (which is called for each click as you already noticed)
> so 2 (HEAD+GET) or 3 lines (OPTIONS+HEAD+GET) at each click.

So I will get GET fired 3 times (OPTIONS+HEAD+GET). Why does LibreOffice send OPTIONS and HEAD? Normal behavior would be: Only send one thing once. Only send GET, if GET is chosen, nothing else. 

OK, this is only a little form, but traffic to the server will be an argument not to use this function.
Comment 8 Julien Nabet 2023-05-17 07:18:25 UTC
(In reply to Robert Großkopf from comment #7)
> ...
> So I will get GET fired 3 times (OPTIONS+HEAD+GET). Why does LibreOffice
> send OPTIONS and HEAD? Normal behavior would be: Only send one thing once.
> Only send GET, if GET is chosen, nothing else. 

About OPTIONS, there seems to be a temporary cache mechanism: see OptsCacheLifeNotFound, which defines a cache lifetime of 15 or 30 seconds (https://opengrok.libreoffice.org/xref/core/officecfg/registry/schema/org/openoffice/Inet.xcs?r=16072fc3#175).
For the rest, I don't know if the forms module should call OPTIONS, HEAD and GET manually or if there should be some refactoring in the ucbhelper part, for example to tell the openStream command to call only the GET method.
I must admit I'm a bit stuck here, since this requires understanding the bigger picture.
E.g. perhaps one needs 1 call to OPTIONS and HEAD per session and then only GET calls, but then what defines the "session": the time the form odt is open? Something else? What should happen if the form odt stays open but the Apache server is restarted, or if the PHP file is changed? I'd expect OPTIONS and HEAD to be called again.
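The idea of a time-limited OPTIONS cache can be sketched like this. It is a model of the behavior described in this thread, not the actual implementation: the class OptionsCache, the function methods_for_click, the 30 second lifetime, and the localhost URL are assumptions for illustration, and the per-click method lists are the ones quoted in comment 5.

```python
import time


class OptionsCache:
    """Hypothetical cache that forgets an entry after lifetime_seconds."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._entries = {}  # url -> (expiry time, cached value)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # entry is stale, drop it
            return None
        return value

    def put(self, url, value):
        self._entries[url] = (self._clock() + self._lifetime, value)


def methods_for_click(cache, url):
    """Return the request methods one click would cause in this model."""
    if cache.get(url) is None:
        cache.put(url, "options-result")
        return ["OPTIONS", "HEAD", "GET"]
    return ["HEAD", "GET"]


now = [0.0]  # fake clock so the example does not have to sleep
cache = OptionsCache(30, clock=lambda: now[0])
url = "http://localhost/robert/xmldata_get.php"

first = methods_for_click(cache, url)
now[0] = 10.0
second = methods_for_click(cache, url)
now[0] = 31.0
third = methods_for_click(cache, url)
print(first, second, third)
```

In this model a server restart or a changed PHP file is only noticed once the cached entry has expired, which is the kind of "session" question raised above.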
Comment 9 Michael Stahl (allotropia) 2023-05-17 09:45:55 UTC
the HTTP/WebDAV UCP always starts by determining the capabilities of the server; it does this by sending 3 requests OPTIONS, HEAD, and GET with 0 bytes - it stops when a response arrives with no error and the info that the UCP needs.

there are funny servers that for example respond to HEAD with 404 and then to GET on the exact same URL with 200.

then there's some sort of cache that stores the capabilities for the lifetime of the soffice.bin process; not sure how well that works anyway.

the reason for this is that the UCP can't know that all the caller is interested in is making a single GET request; when you open a file from WebDAV the GET would eventually be followed by PUT and a lock needs to be acquired (if the server supports that).
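a rough sketch of that probing order, in Python. it only models the description above and is not the UCP code: the function probe, the (status, has_info) return shape, and the fake server are made up for illustration.

```python
def probe(send, url, methods=("OPTIONS", "HEAD", "GET")):
    """Try each method in order; stop at the first usable response.

    send is a hypothetical callable returning (status, has_info), where
    has_info says whether the response carried what the caller needs.
    """
    attempted = []
    for method in methods:
        attempted.append(method)
        status, has_info = send(method, url)
        if status < 400 and has_info:
            return method, attempted
    return None, attempted


def funny_server(method, url):
    """Fake server: HEAD gives 404 while GET on the same URL gives 200."""
    if method == "OPTIONS":
        return 200, False  # no error, but not the info that is needed
    if method == "HEAD":
        return 404, False
    return 200, True


answered_by, attempted = probe(funny_server, "http://localhost/robert/xmldata_get.php")
print(answered_by, attempted)
```

against this fake server all three requests are sent before the probe is satisfied, which matches the OPTIONS, HEAD, GET sequence seen in the access log.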
Comment 10 Commit Notification 2023-05-17 13:36:34 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2811ffb4a5f6629101e851d0d57c9816404464ab

tdf#154628: XML Form Document: Sending data with GET fires very often

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 Julien Nabet 2023-05-17 19:50:57 UTC
Robert: I must admit I can't do more here, so I'm unassigning myself.
If this requires creating some kind of new API in ucbhelper that performs just a GET and still returns a result stream, I won't be able to do it. CSubmissionGet::submit calls:
m_aResultStream = aContent.openStream();
and m_aResultStream is used afterwards.

I'll let you decide whether to set this to FIXED or leave it open (I can understand either).