Bug 57232 - Escaped HTML on www.libreoffice.org front page
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: WWW
Version (earliest affected): unspecified
Hardware: All All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsDevEval, topicWeb
Depends on:
Blocks:
 
Reported: 2012-11-17 18:26 UTC by Robinson Tryon (qubit)
Modified: 2015-12-16 05:35 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Screenshot of www.libreoffice.org, section "News at TDF blog", 2012-11-22 (111.25 KB, image/png)
2012-11-22 06:58 UTC, Roman Eisele

Description Robinson Tryon (qubit) 2012-11-17 18:26:24 UTC
The LO homepage (https://www.libreoffice.org/) has a section near the bottom of the page entitled "News at TDF Blog". This section appears to include excerpts from TDF Blog entries.

The entries are showing up on the libreoffice.org page with escaped HTML tags and character entities. For example:

---
The following is an open letter The Document Foundation has sent to the City of Freiburg, Germany, as a statement regarding the current discussion about Freiburg&#8217;s IT strategy. The letter in its original format is available at <a href="http://wiki.documentfoundation.org/File" target="_blank">wiki.documentfoundation.org/File</a>:OffenerBriefFreiburg.pdf At the same time, The Document
---

and

---
2012-11-07

Berlin &#38; Barcelona, November 7, 2012 &#8211; The Document Foundation announces the first group 
---

I assume that something in the blog-to-feed import chain is applying some kind of html-escaping filter (perhaps for security reasons).

Either the HTML should be stripped entirely, or the HTML should be preserved in its original state so that it can be rendered as it appears on the TDF Blog.
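
To illustrate the two options, here is a minimal Python sketch. It is not the site's actual import code, which is not shown in this report; the `strip_html` helper and the shortened sample `excerpt` are assumptions made for illustration only. It shows how escaping an already-encoded excerpt a second time produces the visible markup, and how the excerpt could instead be reduced to plain text.

```python
import html
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes only and drops every tag (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &#8217; for us.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment: str) -> str:
    """Option 1: strip the HTML entirely and keep decoded plain text."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


# Shortened sample modelled on the excerpt quoted above.
excerpt = (
    'Freiburg&#8217;s IT strategy is at '
    '<a href="http://wiki.documentfoundation.org/File" '
    'target="_blank">wiki.documentfoundation.org/File</a>'
)

# Presumed cause: content that is already HTML gets escaped once more,
# so the browser displays the tags and entities as literal text.
double_escaped = html.escape(excerpt)

# Option 1: plain text only.
plain = strip_html(excerpt)

# Option 2: undo the extra escaping so the original markup is preserved.
restored = html.unescape(double_escaped)
```

Option 2 only makes sense if the feed content is trusted or sanitized elsewhere; stripping to plain text is the more conservative choice.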
Comment 1 Roman Eisele 2012-11-22 06:58:17 UTC
Thank you for your bug report!

REPRODUCIBLE easily by browsing the TDF site; I will attach a screenshot showing the current state, including the escaped HTML tags cited by Qubit.

This bug is more important than one might think at first; some people (including me ;-) may regard such escaped (visible) HTML on a website as a sign that the company or organization which owns that website is unprofessional ...
Comment 2 Roman Eisele 2012-11-22 06:58:47 UTC
Created attachment 70411
Screenshot of www.libreoffice.org, section "News at TDF blog", 2012-11-22
Comment 3 Robinson Tryon (qubit) 2012-11-29 17:31:48 UTC
Hi, could I please get a status update on this bug?

I really like LO, and I really want to promote LO, but it's hard for me to bill LO as a serious piece of software and promote it to business professionals when the front page doesn't reflect the high quality of the software product.

We're coming up on 2 weeks that the front page has been looking sub-par.

For comparison, here's the front page for MS-Office:
https://office.microsoft.com/en-us/
Comment 4 Roman Eisele 2012-11-29 18:09:50 UTC
Hi Rainer,

do you know (I don’t, sorry) who exactly is responsible for the technical side of the LibO website and can fix this bug? Please CC him/her about this bug. Qubit is right that this little bug should be fixed fast, because it makes the LibO website look unprofessional, or even embarrassing ...

Thank you very much!
Comment 5 Christos Strubulis 2012-12-04 14:32:35 UTC
Hello to all,
I am actually new here. But I took a glance at it and I downloaded the page to correct it. The <a> tags were somehow corrupted. I think you need something like the following html code... Is that right? I do not know how to commit, etc.

Thanks.

<p>The following is an open letter The Document Foundation has sent to the City of Freiburg, Germany, as a statement regarding the current discussion about Freiburg&amp;#8217;s IT strategy. The letter in its original format is available at <a href=&quot;http://wiki.documentfoundation.org/File&quot; target=&quot;_blank&quot;&gt;wiki.documentfoundation.org/File&lt;>OffenerBriefFreiburg.pdf</a> At the same time, The Document Foundation has signed the open letter of the Open Source Business [...]</p>
Comment 6 Roman Eisele 2012-12-05 09:49:13 UTC
@ Christos:

Welcome and thank you for your attempt to help!

> I am actually new here. But I took a glance at it and I downloaded the page
> to correct it. The <a> tags were somehow corrupted. I think you need
> something like the following html code...

You are completely right that this would fix the problem for now. But I fear that the issue is more difficult.

This part of the LO homepage, “News at TDF Blog”, is generated automatically by some script (or whatever it may be called ;-) from the contents of
  http://blog.documentfoundation.org/
IMHO the script, while abbreviating the articles from the blog to short summaries, “damages” some HTML tags and character entities by converting the HTML markup into literal text ...

So what is necessary is to fix that script, and I fear this is a bit more difficult.
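
As a rough sketch of what a safer summarizing step could look like (the real script is not shown here, so the function name `summarize`, the `limit` parameter, and the regex-based tag removal are all assumptions), the idea is to reduce the post to plain text first and only then shorten it, so that no tag or entity is ever cut in half:

```python
import html
import re


def summarize(post_html: str, limit: int = 80) -> str:
    """Hypothetical summarizer: strip markup first, truncate afterwards."""
    # Crude tag removal; a real HTML parser would be more robust.
    text = re.sub(r"<[^>]*>", "", post_html)
    # Decode entities such as &#38; and &#8211; into real characters.
    text = html.unescape(text)
    # Collapse runs of whitespace left behind by removed tags.
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    # Cut at the last word boundary inside the limit.
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut + " [...]"


post = (
    "<p>Berlin &#38; Barcelona, November 7, 2012 &#8211; "
    "The Document Foundation announces the first group</p>"
)
summary = summarize(post, limit=40)

# Escape exactly once, at the moment the text is written into the page.
page_fragment = "<p>" + html.escape(summary) + "</p>"
```

The point of the ordering is that escaping happens a single time, on plain text, at output; the summary itself never contains markup that a later filter could mangle.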
Comment 7 Christos Strubulis 2012-12-05 16:00:49 UTC
Well, if the script is in JavaScript, I could take a look! :)
Comment 8 Roman Eisele 2012-12-06 07:43:30 UTC
@ Florian Effenberger:

Hello Florian,

this little bug about the front/home page of www.libreoffice.org/ is embarrassing, because it makes the whole page look unprofessional.
And that is bad ...

Do you know who is responsible for the website, especially for the script which generates the contents of the “News at TDF Blog” section, and therefore should be told about this bug? I did not succeed in finding any information about that on the website (which is, BTW, probably yet another bug ;-).

Thank you very much for your help!
Comment 9 Florian Effenberger 2013-07-01 18:51:15 UTC
Cloph, can you have a look at this one?
Comment 10 Florian Effenberger 2013-08-05 15:04:13 UTC
Twitter has changed/closed its API, Cloph just told me, which is why it is not working. We need to see if there is a workaround, otherwise removing the Twitter column is an option until someone can provide code that works.

For the blog posts, all the errors so far were the result of badly formatted blog posts, I hear.
Comment 11 Christian Lohmaier 2013-08-05 16:39:12 UTC
fixed.
Comment 12 Robinson Tryon (qubit) 2015-12-16 05:35:48 UTC
Migrating Whiteboard tags to Keywords: (ProposedEasyHack ->  needsDevEval, TopicWeb)
[NinjaEdit]