Understanding the behaviour of a Web site's users is very important.
I designed the structures that hold data on user behaviour, and wrote the code that populates them. The raw data feed is the click-stream: the URL hits made by users. Many interesting facets of user behaviour can be extracted from this raw data, including inferring a user's 'sessions' and heuristically assigning unauthenticated hits to users, based on the pattern of authenticated hits from particular IP addresses. I managed one person in this work.
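As a rough illustration of these two inferences (not the code actually used on the site), the sketch below is written in Python. The `Hit` record layout, the 30-minute inactivity timeout, and the rule of assigning an anonymous hit to the member who most recently authenticated from the same address are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# Assumed inactivity gap that ends a session; the real threshold is not stated.
SESSION_TIMEOUT = timedelta(minutes=30)


@dataclass
class Hit:
    """One click-stream record (hypothetical layout)."""
    when: datetime
    host_addr: str
    url: str
    member_id: Optional[int] = None  # None for an unauthenticated hit


def assign_members(hits: List[Hit]) -> List[Hit]:
    """Attribute anonymous hits to the member last seen authenticating
    from the same host address (an assumed heuristic)."""
    last_member: Dict[str, int] = {}  # host_addr -> most recent authenticated member
    result = []
    for hit in sorted(hits, key=lambda h: h.when):
        if hit.member_id is not None:
            last_member[hit.host_addr] = hit.member_id
            result.append(hit)
        else:
            # Stays None if no member has authenticated from this address yet.
            guess = last_member.get(hit.host_addr)
            result.append(Hit(hit.when, hit.host_addr, hit.url, guess))
    return result


def infer_sessions(hits: List[Hit]) -> Dict[int, List[List[Hit]]]:
    """Group each member's hits into sessions split by SESSION_TIMEOUT."""
    sessions: Dict[int, List[List[Hit]]] = {}
    for hit in sorted(hits, key=lambda h: h.when):
        if hit.member_id is None:
            continue  # still unattributed, so it belongs to no session
        member_sessions = sessions.setdefault(hit.member_id, [])
        if member_sessions and hit.when - member_sessions[-1][-1].when <= SESSION_TIMEOUT:
            member_sessions[-1].append(hit)  # continues the current session
        else:
            member_sessions.append([hit])  # gap too long, or first hit seen
    return sessions
```

Running `assign_members` before `infer_sessions` lets attributed anonymous hits count towards a member's sessions. A real heuristic would probably also weigh how many different members share an address, which this sketch ignores.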
In addition, I constructed models of user behaviour in terms of a user's 'lifetime' on the site, and illustrated these models with tabular reports and graphs.
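One way to picture such a model is sketched below in Python. It assumes that a member's lifetime is the number of days between the first and last recorded access, and it tabulates how many members fall into each lifetime bucket. That definition, the one-week bucket width, and the function names are assumptions made for illustration; the models actually built are not described here.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, List, Tuple


def lifetimes(accesses: Iterable[Tuple[int, date]]) -> Dict[int, int]:
    """Map member_id -> days between that member's first and last access."""
    first: Dict[int, date] = {}
    last: Dict[int, date] = {}
    for member_id, day in accesses:
        if member_id not in first or day < first[member_id]:
            first[member_id] = day
        if member_id not in last or day > last[member_id]:
            last[member_id] = day
    return {m: (last[m] - first[m]).days for m in first}


def lifetime_table(accesses: Iterable[Tuple[int, date]],
                   bucket_days: int = 7) -> List[Tuple[int, int]]:
    """Count members per lifetime bucket (bucket 0 = under one bucket width)."""
    counts = Counter(days // bucket_days for days in lifetimes(accesses).values())
    return sorted(counts.items())
```

The sorted list of (bucket, member count) pairs is the kind of data that could feed either a tabular report or a histogram.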
Some of this work involved writing complex SQL queries, such as the following:
    select /*+ ORDERED INDEX_DESC(t2 ML_HOSTACC) */
           t1.host_addr, t1.member_id, begin_date, max(t2.acc_date)
      from (select host_addr, member_id, min(acc_date) begin_date
              from member_logins
             where (host_addr, member_id) in
                   (select host_addr, member_id
                      from member_logins
                     where acc_date > '1996-12-20')
             group by host_addr, member_id
            having max(acc_date) - min(acc_date) > 14) t1,
           member_logins t2
     where t1.host_addr = t2.host_addr
       and begin_date <= t2.acc_date
       and t1.member_id <> t2.member_id (+)
     group by t1.host_addr, t1.member_id, t1.begin_date
Rujith de Silva 1997-05-13