Antville Project

Antville mangles valid tags

JohnWalsh discovered a problem: apparently, Antville converts all angle brackets that are not part of a tag to &lt; and &gt;, but it doesn't recognize all tags correctly and thus destroys valid HTML code.
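
To make the failure mode concrete, here is a minimal sketch of this kind of escaping. It is not the actual Antville or Helma code; the class name `TagEscapeSketch`, the method `escape`, and the regular expression are assumptions made for illustration only. Anything that looks like a tag is kept only if its name is in the set of known tags; everything else has its angle brackets escaped.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from the Antville or Helma source.
class TagEscapeSketch {

    // Something that looks like a tag: <name ...> or </name>
    private static final Pattern TAG_LIKE =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>");

    // Keeps tags whose name is in allTags, escapes all other angle brackets.
    static String escape(String text, Set<String> allTags) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG_LIKE.matcher(text);
        int pos = 0;
        while (m.find()) {
            // Text between tag-like matches is never markup, so escape it.
            out.append(escapeBrackets(text.substring(pos, m.start())));
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (allTags.contains(name)) {
                out.append(m.group());                 // known tag: keep as is
            } else {
                out.append(escapeBrackets(m.group())); // unknown tag: mangled
            }
            pos = m.end();
        }
        out.append(escapeBrackets(text.substring(pos)));
        return out.toString();
    }

    private static String escapeBrackets(String s) {
        return s.replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With a tag set that contains `b` but not `dfn`, the input `<b>x</b> <dfn>y</dfn>` keeps the `b` element but turns the perfectly valid `dfn` element into escaped text, which is the behaviour reported here.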


 
nex, December 21, 2002 at 4:50:04 PM CET

TagsTractor

As I said on help.antville, I've made something that extracts all defined tags from a DTD. This way, we can collect all existing (X)HTML tags for a start, so the bug reported above goes away. In the long run, it would be more elegant to have a separate list of tags for each DTD and use each one where applicable.
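
The TagsTractor source is not shown in this thread, so the following is only a rough sketch of how element names could be pulled out of a DTD; the class name `DtdTagSketch` and the method `extractTags` are made up for illustration. It scans for `<!ELEMENT ...>` declarations, handles both a single name and a group such as `(SUB|SUP)`, and skips parameter entity references such as `%fontstyle;`, which would otherwise end up in the result as strings that are not tags. It does not expand entities and does not ignore declarations inside DTD comments.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual TagsTractor code.
class DtdTagSketch {

    // Captures the name part of "<!ELEMENT name ..." or "<!ELEMENT (a|b) ...".
    private static final Pattern ELEMENT_DECL =
        Pattern.compile("<!ELEMENT\\s+(\\([^)]*\\)|[^\\s>]+)");

    // A plain element name; rejects parameter entity references like %phrase;
    private static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // Returns all element names declared in the DTD text, lower-cased and sorted.
    static Set<String> extractTags(String dtd) {
        Set<String> tags = new TreeSet<>();
        Matcher decl = ELEMENT_DECL.matcher(dtd);
        while (decl.find()) {
            String names = decl.group(1);
            if (names.startsWith("(")) {
                // Strip the parentheses of a name group.
                names = names.substring(1, names.length() - 1);
            }
            for (String candidate : names.split("\\|")) {
                String name = candidate.trim();
                if (NAME.matcher(name).matches()) {
                    tags.add(name.toLowerCase(Locale.ROOT));
                }
            }
        }
        return tags;
    }
}
```

Lower-casing the names lets the upper-case declarations of the HTML 4 DTD and the lower-case ones of the XHTML DTDs end up in the same set.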


 
hns, December 21, 2002 at 6:55:03 PM CET

Re: TagsTractor

So what are the results of your program? What tags are missing?


 
nex, December 23, 2002 at 4:20:55 AM CET

Re: TagsTractor

I haven't collected all available DTDs and run them through the program yet, as I was rather sickish and busyish over the weekend and just did as much as I had promised to do ;-)

First rough results showed that for the transitional and strict XHTML 1.0 DTDs (which define the same tags as HTML 4), the tags JohnWalsh reported were indeed missing, and there were also some strings in the list that aren't really tags.

I'll post more comprehensive information later today, good night... (I can't believe it's over 0400 already...)


 
hns, December 23, 2002 at 4:24:49 AM CET

Re: TagsTractor

time flies, good night.

 
nex, December 23, 2002 at 2:14:40 PM CET

Good morning hns; I even managed to sleep for about two hours tonight!

I have analyzed some DTDs now; the result is: HTML 4 transitional defines all there is. XHTML 1.0 defines the same tags in lower case, except for two (who knows which two?), and XHTML 1.1 defines the same again, broken up into individual modules (e.g. Text Module, Hypertext Module, Table Module, ...), but no additional tags. The total is 91, which is slightly fewer than in the existing list in the Helma source, which apparently includes tags from older HTML versions. I can add these, but do we really want to allow <blink>? (By the way, please at least delete the deprecated IE-only <marquee>, this is disgusting.)

I'm so proud of my little program, so here's my interaction with it:

For batch mode invoke: java tagstractor.Extract
 

Entering interactive mode.

input file name: xhtml1.0-transitional.dtd

output file name: tags.txt
Found a total of 89 tags so far.

Process another file (y/_n_)? : y

input file name: xhtml1.0-strict.dtd
Found a total of 89 tags so far.

Process another file (y/_n_)? : y

input file name: html4-transitional.dtd
Found a total of 91 tags so far.

Here's a list of what's missing from the Helma source:

allTags.add("acronym");
allTags.add("bdo");
allTags.add("dfn");
allTags.add("label");
allTags.add("legend");
allTags.add("noscript");

You can just add that to the source, I just noticed it isn't in strictly alphabetical order anyway :-) As you can see, my problem solution overkill really paid off, I found 2 (in words: TWO) tags JohnWalsh didn't find :-)


