Antville Project

rss output

please help, i need some advice from an expert on how to finally do the rss output right. i think i made quite a lot of moves regarding the encoding of description values (from no encoding via &lt;-like encoding to CDATA sections).

now there's muttering about the CDATA again, which leads me back to &lt;-like encoding or even way back to the beginning because "use of encoding within description elements considered harmful"...

so please, somebody who admires his own expertise in rss questions might state a last word now – and please not for the sake of any specific feedreader having troubles with rss – but for the sake of the rss specs and a format we still can use in the near future:

which way to do it right? (ie. if/how to encode html tags, if/how to encode special characters etc.)

ps. CDATA has to go definitely.

hns, November 7, 2002 at 7:06:35 PM CET

Re: rss output

The question is not the encoding; the question is whether we want to strip HTML tags for RSS output. I think we should. If you write out PCDATA you do have to encode certain characters as entities. It's not a problem; that article you link doesn't make any sense at all to me.

tobi, November 8, 2002 at 9:49:22 AM CET

Re: Re: rss output

that article you link doesn't make any sense at all to me

could you be a bit more specific with your criticism?

hns, November 8, 2002 at 11:12:01 AM CET

Re: Re: rss output

could you be a bit more specific with your criticism?

Sure. If you have something like a "<" character in XML text, you have 3 possibilities:

  1. It is part of an XML element
  2. You encode it as &lt;
  3. You embed it in an unparsed CDATA section

The first solution is not a solution at all unless we want to start messing around with XML namespaces like the article suggests. How many RSS readers will convert namespaced XML elements to HTML tags? Not many, I think. On the other hand, &lt;, &gt;, &amp;, &quot; and &apos; are predefined XML entities that are always valid and should always be decoded to the characters they represent. I'm totally perplexed by the claim that most XML APIs won't resolve those entities, because both DOM and SAX sure do (I think I should know). Also, I find it curious that the author claims that most RSS readers don't use a proper XML parser. Well, if that's the case, then it surely is not our job to conceal the holes in that software's pseudo-parsers.
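
To illustrate the point about DOM and SAX, here is a minimal sketch using Python's standard library parsers (not Antville or Helma code). The sample description strings and the helper names dom_text and sax_text are made up for this example. It suggests that a conforming parser hands the application the same text whether the markup was written with predefined entities or inside a CDATA section:

```python
import xml.dom.minidom
import xml.sax

# Two hypothetical ways of writing the same description value.
ESCAPED = "<description>&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;</description>"
CDATA = "<description><![CDATA[<b>Tom & Jerry</b>]]></description>"


def dom_text(xml_source):
    """Return the character data of the root element as seen through DOM."""
    doc = xml.dom.minidom.parseString(xml_source)
    # Text and CDATASection nodes both expose their content as .data.
    return "".join(node.data for node in doc.documentElement.childNodes)


class TextCollector(xml.sax.ContentHandler):
    """SAX handler that gathers all character data it is given."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def sax_text(xml_source):
    """Return the character data of the document as seen through SAX."""
    handler = TextCollector()
    xml.sax.parseString(xml_source.encode("utf-8"), handler)
    return "".join(handler.chunks)


for source in (ESCAPED, CDATA):
    print(dom_text(source), "|", sax_text(source))
```

Both lines of output should show the decoded HTML fragment, so the choice between entities and CDATA is invisible to any reader that uses a real XML parser.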

Again: First we have to decide if and where to allow HTML tags (for example I'd say stripping HTML tags from titles but leaving them in the descriptions would be a sane decision). If we do have HTML tag characters, the way to encode them is definitely as predefined character entities in PCDATA.
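
A rough sketch of that policy, again in Python rather than in Antville's own code: tags are stripped from the title, left in the description, and whatever remains is written as PCDATA with the predefined entities. The names strip_tags and rss_item, and the bare item layout, are assumptions made for illustration only:

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class _TagStripper(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html_text):
    """Return html_text with all tags removed."""
    stripper = _TagStripper()
    stripper.feed(html_text)
    stripper.close()
    return "".join(stripper.chunks)


def rss_item(title_html, description_html):
    """Build a bare <item>: plain-text title, HTML kept in the description."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    title = escape(strip_tags(title_html))
    description = escape(description_html)
    return (
        "<item><title>" + title + "</title>"
        "<description>" + description + "</description></item>"
    )


print(rss_item("<b>Hello</b> world", "<p>1 < 2</p>"))
```

A reader that parses the feed properly gets the original HTML back from the description and can then decide for itself whether to render or strip it.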

tobi, November 8, 2002 at 11:52:12 AM CET

Re: Re: rss output

thanks for the clarifying words, hannes. i'm +1 for encoding html and +1 for completely removing html from the title.

i also would like to see the descriptions being without any markup – at least in rss readers. but maybe then i have to make the right choice of software (ie. a reader that strips the tags before displaying the feed). what do you think?

to complete this rss task, i also would like to discuss how long the descriptions should be. i am really against putting the whole story in there, instead i'd prefer a short teaser (maybe a little bit longer than in the history e.g. on this site). however, we could leave it up to the user's choice.
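
one possible shape for such a teaser, as a small python sketch. the function name teaser and the 200 character default are made up here, not taken from antville, and the limit is a parameter so it could be left to the user's choice:

```python
def teaser(text, max_chars=200, ellipsis="..."):
    """Shorten plain text to at most max_chars characters plus an ellipsis."""
    # Collapse runs of whitespace so the limit counts visible characters.
    normalized = " ".join(text.split())
    if len(normalized) <= max_chars:
        return normalized
    cut = normalized[:max_chars]
    # If the cut landed inside a word, back up to the previous space.
    if normalized[max_chars] != " " and " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip() + ellipsis


print(teaser("one two three four", max_chars=9))
```

this assumes the tags have already been stripped, since cutting inside markup would leave unbalanced html in the description.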

last but not least: in the current version of antville the new content model can lead to rss feeds that are completely useless, ie. when the site's stories aren't using the title and/or text properties anymore (which the rss skin relies on).

so the question is: fully enable the rss skin in the skin manager or create a meta editor to assign a story content part to the title and description fields of the rss output (a suggestion by robert, btw)...?
