hns,
April 24, 2002 at 10:24:53 AM CEST
Update on /referrers performance I just played around with the antville database and noticed that it looks like there are no indices at all on the ACCESS table! (Except for the primary key, that is.) For example, simply querying for Access data for one weblog takes 10 seconds: mysql> select count(*) from ACCESS where WEBLOG_ID=1; +----------+ | count(*) | +----------+ | 1614 | +----------+ 1 row in set (10.25 sec) whereas a similar query takes 0.1 seconds on the TEXT table, which has an index on WEBLOG_ID: mysql> select count(*) from TEXT where WEBLOG_ID=1; +----------+ | count(*) | +----------+ | 69 | +----------+ 1 row in set (0.01 sec) This suspicion is also consistent with the antville_indexes.sql file which doesn't have any entries for the ACCESS table. I propose we add indices to the relevant columns of the ACCESS table. If that doesn't help enough, we can still delete older records, but I think if we are able to keep all the access history in DB, that would be even better. PS: for another cool Ausbaustufe of Referrer-Logs have a look at www.disenchanted.com (via LNGR). This is a per-page referrer tracker that really implements bidirectional links. Interestingly, it seems to also have to solved a performance problem because the referrers page is broken down into smaller portions. I think it wouldn't be too hard to implement this for antville if we added the target URL, or maybe better the target story id, to the ACCESS table and a listReferrers macro to the story prototype... just thinking out loud... PPS: I just noticed that the table definition for ACCESS has a lot of mediumtext columns: #---------------------------- # Table structure for ACCESS #---------------------------- create table ACCESS ( ID bigint(20) not null auto_increment, WEBLOG_ID bigint(20), REFERRER mediumtext, IP mediumtext, URL mediumtext, PATH mediumtext, `ACTION` mediumtext, BROWSER mediumtext, `DATE` datetime, unique ID (ID) ); Is this really a good thing performance- and space-wise? wouldn't it be better to have varchar(256) and varchar(512) in most places?
tobi,
April 24, 2002 at 11:04:31 AM CEST
index? ok! thanks hannes for your explorations. obviously i did not care about indexing the access table because of lack of knowledge... robert gave me some advice now and i would add indexes by issueing the following sql statements: create index IDX_WEBLOG_ID on ACCESS (WEBLOG_ID); create index IDX_DATE on ACCESS (DATE); create index IDX_REFERRER on ACCESS (REFERRER(30)); regarding your question about mediumtext columns i guess the answer must be no and so i would change the ones for IP, ACTION, PATH and BROWSER to varchar columns, too. URL and REFERRER i would like to leave with a bigger size because urls can easily become longer than 255 chars. i don't think varchar(512) is valid sql, that's why it should ad ausbaustufe: sounds and looks interesting but i have to take some time and a closer look to understand what it's all about. thanks for the link, anyway. update: ok, it's done. the columns' storage sizes are reduced and the indexes are created.
tobi,
April 24, 2002 at 12:51:17 PM CEST
new query results here's the result of the same query above, now with indexes: mysql> select count() from ACCESS where WEBLOG_ID = 1; +----------+ | count() | +----------+ | 1614 | +----------+ 1 row in set (0.01 sec)however, i am not sure if the online performance became so much more improved...?
hns,
April 24, 2002 at 1:14:37 PM CEST
online performance hm, not really, I'd say. I guess the index only helps if you access the column with "=" operator, not with "like"... It would be interesting to know if it helped on the DATE clause ... If so, maybe we could filter the intra-weblog links out in JavaScript, by just not displaying them...
hns,
April 24, 2002 at 1:31:22 PM CEST
maybe we just need to wait a little bit I guess we just need to wait until mysql has finished building the index (which is quite a bit of work after all): The referrer page on eleph.antville.org and others I have tested are already as fast as they should be. Cool! Let's hope that the page here on this weblog also catches up!
tobi,
April 24, 2002 at 2:07:33 PM CEST
is the index built in a background thread? i thought it would be finished when mysql returns the prompt...
hns,
April 24, 2002 at 2:21:22 PM CEST
I guess that would take too long for a table with 450.000 records. Anyway, also the referrers page of this weblog is also speedy now, I guess we can lean back and leave the table full (until we run out of disk space, that is ;-)
chris,
April 24, 2002 at 2:31:15 PM CEST
oh! mysql actually can use indices with LIKE queries under certain circumstances. cool.
tobi,
April 24, 2002 at 2:33:43 PM CEST
yes it's true and amazing! i am glad that we can stick to the way referrers are recorded and queried. thanks again, hannes, it feels good to lean back this way. :) as my gift back to you i will add the story id to the ACCESS table and implement a first version of such a listReferrers macro to the story prototype. i think i got the idea and it's pretty wicked! ad chris: that's nice, because just for a test i deleted the leading wildcard character in the like clause! thanks for the confirmation!
hns,
April 24, 2002 at 2:47:12 PM CEST
cool! that's the stuff that really keeps me going @ Hop development. thanks.
hns,
April 24, 2002 at 6:02:07 PM CEST
Mindbomb alert ;-) +1 for putting this in the standard story skin, maybe with the modification to not display anything at all if there are no backlinks to the story (i.e. not even the "Backlinks" ... the dotted line above could be there in all cases, as far as I'm concerned). Chris makes a good point about the necessity of a spam and bug filter, but I think we should care about that as soon as it becomes a problem. PS: what ie referrer bug? PPS: why is the standard referrers page not availabe? PPPS: do story backlinks have a last-24-h constraint? I think it might make sense not to have one in this case.
kris,
April 24, 2002 at 6:10:41 PM CEST
ie referrer bug sometimes the internet explorer sends the last visited page as referrer, for example when the user enters the url directly or uses bookmarks. orf, microsoft and white house are apparently the homepage of three antville vistors. (creepy, innit?)
hns,
April 24, 2002 at 7:42:22 PM CEST
Just FYI I turnded weblog referrer page on again - I couldn't think of any reason why it should stay disabled. Sorry if that was wrong, but the message said "hold your breath", and if I'd done that I'd be dead by now ;-)
tobi,
April 25, 2002 at 12:14:54 AM CEST
thank you that was totally ok. i simply forgot to turn it on... ad 24h limit: yes, that will be removed for the backlinks. ad display nothing if there is nothing to display: +1 (not even the dotted line, i'd say)
tomp,
April 25, 2002 at 10:37:38 AM CEST
/referrers performance i'm not sure if it's still a problem, but if you don't want to use the the like operator ...and REFERRER not like \"" + this.href() + "%\" ... you could check if its a 'local' referrer in function logAccess() prior to inserting in the ACCESS table and mark those pages with a flag in a seperate column.
tobi,
April 25, 2002 at 10:41:07 AM CEST
that's right but i have difficulties in making a decision of which way is better: decreasing the performance of every request for a story or living with a not-so-good performance when getting the referrers... ... comment
|
The Antville Server Fund has been a great success. Thanks to everybody who contributed!
online for 8550 Days
last updated: 1/4/11, 10:22 AM Youre not logged in ... Login
... home
... topics ... galleries ... Home
... Tags
... Galleries
... about antville ... download ... macros.antville.org ... help.antville.org ... translate antville! ... antville home
i understand your concerns however,
i hardly can think of a solution. certainly, if the...
by tobi (7/29/03, 9:47 AM)
Found several more similar sites
listed This is getting to be quite a concern to...
by cobalt123 (7/27/03, 7:56 PM)
Second Post Alert on Referrer
bug livecatz I put this into "help" and now here:...
by cobalt123 (7/26/03, 7:14 PM)
well it's not easy to
find from here, anyway. think we should include a link,...
by tobi (7/24/03, 11:25 AM)
clock not that it's particularly
earthshattering but the antclock is running slow by about 15...
by kohlehydrat (7/23/03, 8:25 PM)
How to log skin names
I accessed to console?? Hi, I would like to know...
by winson (7/23/03, 4:12 PM)
|