User Activity Logging

Specification & Architecture

1 Introduction

This document summarizes a year-long discussion between UZH and BPS about how to do Activity Logging in OLAT, and describes the next steps necessary to finish the implementation.

Terminology

There will still be two different kinds of logging:


This document affects only user activity logging.

2 Architecture

IUserActivityLogger, ThreadLocalUserActivityLoggerInstaller and ThreadLocalUserActivityLogger

User Activity Logging During Event Handling

Initialization of IUserActivityLogger


Logging Targets: distributed, into OLAT DB or Separate DB

Logging is done into a database. This can be either the OLAT database or a separate one.

Each OLAT node logs directly into the (logging) DB, i.e. logging is distributed. This performs better, since sending an event to a single central service which then does the DB transaction adds overhead. The main reason for the RemoteAuditLogger (which logged into a log file from a singleton service) was to make sure the log file was not garbled. With the DB this is no longer an issue, so logging can be distributed and the RemoteAuditLogger will no longer be used.

Regarding logging into the OLAT DB or a separate DB: this comes down to managing the lifecycle of the logging data. As the logging table grows constantly and rather rapidly, one needs to plan how this big table is handled. Three different concepts could be considered:
  1. Logging happens into the OLAT DB, with replication configured to forward the o_loggingtable to a separate, long-term-logging-db. The latter can then do periodic compressing, normalization etc.
  2. Logging happens into a separate DB which does periodic compressing, normalization etc.
  3. Logging happens into a separate DB which is configured with replication and forwards its logging table to a separate, long-term-logging-db, which then does compressing and normalization.
Which setup to implement is a deployment choice.

Logged Information

The following is the complete list of information which is logged. As shown above, the logging can happen to the usual OLAT DB or to a separate logging DB.

Note that not all of the fields below are mandatory.

Technical Fields

ID            Type       Length  Mandatory  Description
log_id        bigint     20      true       globally unique id of this log entry
creationdate  timestamp  -       true       date and time when this log happened
sourceclass   varchar    512     true       the class which triggered this log

Session and User Fields

ID              Type     Length  Mandatory  Description
sessionid       varchar  255     true       JSessionID
user_id         bigint   20      false      o_user.user_id
username        varchar  255     false
userproperty1   varchar  255     false      customizable property 1
..              varchar  255     false      ..
userproperty12  varchar  255     false      customizable property 12

Action

ID                   Type     Length  Mandatory  Description
actionCRUDType       varchar  1       true       CRUD: (C)reate, (R)etrieve, (U)pdate, (D)elete, (E)xit
actionVerb           varchar  16      true       verb describing this action. This comes from a limited, olat-wide defined enum. e.g.: add, remove, edit, launch, denied, move, copy, view...
actionObject         varchar  32      true       object of this action. usually corresponds to targetrestype but the latter might not always exist. e.g.: course, node, editor, groupmanagement, forumthread, owner, participant...
resourceadminaction  boolean  1       true       formerly known as logStream - true for ADMIN, false for USER
simpleDuration       bigint   20      true       -1 by default, otherwise the time between the next and this log action in this session
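As an illustration of how the one-character actionCRUDType column could be mapped in code, here is a minimal sketch. The enum name CrudType and its methods are hypothetical and not taken from the OLAT code base; the lowercase codes are an assumption based on the value 'r' in the example log line.

```java
import java.util.Optional;

/** Hypothetical enum mirroring the one-character actionCRUDType column. */
enum CrudType {
    CREATE('c'), RETRIEVE('r'), UPDATE('u'), DELETE('d'), EXIT('e');

    private final char code;

    CrudType(char code) {
        this.code = code;
    }

    /** The single character that would be stored in the varchar(1) column (assumed lowercase). */
    char code() {
        return code;
    }

    /** Looks up a CrudType from a stored column value, ignoring case. */
    static Optional<CrudType> fromCode(String stored) {
        if (stored == null || stored.length() != 1) {
            return Optional.empty();
        }
        char c = Character.toLowerCase(stored.charAt(0));
        for (CrudType t : values()) {
            if (t.code == c) {
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }
}
```

Returning an Optional rather than throwing keeps report code robust against rows that carry an unexpected value.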

Scope

ID                       Type     Length  Mandatory  Description
businesspath             varchar  2048    false      REST-like, full business path
targetrestype            varchar  32      false      target olat resourceable type (e.g. forum,wiki)
targetresid              varchar  64      false      target olat resourceable id
targetresname            varchar  256     false      target olat resourceable name
parentrestype            varchar  32      false      the parent olat resourceable type (e.g. node)
parentresid              varchar  64      false      the parent olat resourceable id
parentresname            varchar  256     false      the parent olat resourceable name
grandparentrestype       varchar  32      false      the grand parent olat resourceable type (e.g. course)
grandparentresid         varchar  64      false      the grand parent olat resourceable id
grandparentresname       varchar  256     false      the grand parent olat resourceable name
greatgrandparentrestype  varchar  32      false      the great grand parent olat resourceable type (e.g. course)
greatgrandparentresid    varchar  64      false      the great grand parent olat resourceable id
greatgrandparentresname  varchar  256     false      the great grand parent olat resourceable name

A note on olat resourceables:
A note on target, parent, grandparent, greatgrandparent

Example 'Log-Line'

*************************** 2. row ***************************
                 log_id: 1703939
           creationdate: 2009-11-03 09:54:51
            sourceclass: org.olat.course.run.navigation.NavigationHandler
              sessionid: CF2F6ABEEBD1CC3112196ABB3699E07A
                user_id: 229376
               username: administrator
          userproperty1: NULL
          userproperty2: NULL
          userproperty3: NULL
          userproperty4: NULL
          userproperty5: NULL
          userproperty6: NULL
          userproperty7: NULL
          userproperty8: NULL
          userproperty9: NULL
         userproperty10: NULL
         userproperty11: NULL
         userproperty12: NULL
         actioncrudtype: r
                 action: NODE_ACCESS
         simpleduration: 2404
    resourceadminaction: 0
           businesspath: [RepositoryEntry:393222][CourseNode:70448659388630]
greatgrandparentrestype: NULL
  greatgrandparentresid: NULL
greatgrandparentresname: NULL
     grandparentrestype: NULL
       grandparentresid: NULL
     grandparentresname: NULL
          parentrestype: CourseModule
            parentresid: 80387775267358
          parentresname: Course template small
          targetrestype: st
            targetresid: 70448659388630
          targetresname: Course template small
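
The businesspath value in the row above is a sequence of [Type:id] segments. For reporting it can be useful to split it into its parts. The following is a minimal sketch; the class BusinessPathParser and its nested Entry type are hypothetical names. It assumes every segment has the form [Type:numericId] as in the example, and real business paths may contain other forms that this sketch skips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that splits a business path into its [Type:id] segments. */
final class BusinessPathParser {

    /** One segment of the path, e.g. type "CourseNode" with id 70448659388630. */
    static final class Entry {
        final String type;
        final long id;

        Entry(String type, long id) {
            this.type = type;
            this.id = id;
        }
    }

    // Assumption: a segment is '[' + type + ':' + decimal id + ']'.
    private static final Pattern SEGMENT = Pattern.compile("\\[([^:\\]]+):(\\d+)\\]");

    /** Returns the segments in path order; anything not matching the pattern is ignored. */
    static List<Entry> parse(String businessPath) {
        List<Entry> entries = new ArrayList<>();
        if (businessPath == null) {
            return entries;
        }
        Matcher m = SEGMENT.matcher(businessPath);
        while (m.find()) {
            entries.add(new Entry(m.group(1), Long.parseLong(m.group(2))));
        }
        return entries;
    }
}
```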

Duration Estimation

In order to estimate the duration which a user spent on a particular page we can use the following algorithm:

Note that a session timeout will set the duration of the user's last click to session_timeout by default. This must be taken into account when doing reports. Also, large values of duration might not be useful since the user could have done something else in the meantime.
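
To make the idea concrete, here is a minimal sketch of one way to derive durations, based only on the simpleDuration description in the Action table (-1 by default, otherwise the time between this log action and the next one in the same session). The class DurationEstimator and the reporting cap are hypothetical, and the millisecond unit is an assumption. The sketch leaves the last entry of a session at -1 and does not model the session-timeout behaviour described above.

```java
import java.util.Arrays;

/** Hypothetical sketch: per-entry durations for the log entries of ONE session. */
final class DurationEstimator {

    /** Default value for entries without a known successor. */
    static final long UNKNOWN = -1L;

    /**
     * @param clickTimesMillis timestamps of one session's log entries, in ascending order
     *                         (the millisecond unit is an assumption of this sketch)
     * @return one duration per entry; the last entry keeps UNKNOWN
     */
    static long[] estimate(long[] clickTimesMillis) {
        long[] durations = new long[clickTimesMillis.length];
        Arrays.fill(durations, UNKNOWN);
        for (int i = 0; i + 1 < clickTimesMillis.length; i++) {
            // Duration of entry i = time until the next entry in the same session.
            durations[i] = clickTimesMillis[i + 1] - clickTimesMillis[i];
        }
        return durations;
    }

    /** Reporting helper: ignore unknown durations and those at or above a chosen cap. */
    static boolean isUsable(long duration, long capMillis) {
        return duration >= 0 && duration < capMillis;
    }
}
```

A report would typically pass a cap to isUsable so that very long gaps, where the user was probably doing something else, do not distort averages.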

Configuration

The following is an overview of what needs to be configured for user activity logging, and how:

  1. hibernate configuration of the database in spring (a la DBModule)
  2. configuration of the user properties


3 Deployment @ UZH

Early measurements of log traffic to the o_loggingtable have resulted in an estimate of 15-20GB of log data per month (down from an original 100GB/month). With the requirement of keeping a backlog of about two years' worth of logging data, this results in quite a big database table.

Statistical analysis, i.e. select statements on the loggingtable, can generate considerable, albeit only temporary, load on the database.

This brought up the need to look for alternative database setups, especially a separate database specifically for logging. While it would be possible to make two databases configurable in OLAT, such a setup would mean dealing with two sets of configurations and two sets of database connection pools, fixing the DBImpl/DBInstance singleton, and making commits on two connections (although the transactional requirements on logging are not as strict as those on the other data).

With the above in mind we came up with the following draft of deployment which we'll aim for @UZH:



A few notes on the logging DB setup

All normal OLAT instances are still configured with only 1 database connection - nothing changes on that front.

There are now two databases used in this setup:

Main OLAT DB

Logging DB


OLAT Log(-Reporting)&Statistics node

Generating Statistics

There will be functionality - implemented by BPS - which processes logging data into some form of statistics. This data will be stored in a table - let's call it o_statistics. This table is configured to exist in the Logging DB and in a read-only-federated mode also back on the Main OLAT DB.

The statistics are triggered via a Quartz-Job which is only configured on this Singleton Statistics Service in the Cluster.

Generating the (legacy) log file

The Log & Statistics node will also take care of extracting filtered logging data from the logging table (it is the only node which is configured to have access to the logging DB) and of providing the information in the form required. This could include:

Preferred way to go: store the result in a course resource folder.

SimpleDuration Update

A note on updating simpleDuration: in the UZH setup the o_loggingtable on the Main OLAT DB is configured with BLACKHOLE, so the logging entries we insert 'disappear' there (and get replicated to the logging DB). A later updateObject() would therefore fail, because Hibernate does a safety check on how many rows were updated and cannot find the previously inserted logging object anymore.

To work around this issue, the code now does the update manually via createQuery and executeQuery. This way we don't assume that the log row we're updating actually exists in the database: if it exists, it is updated; if it doesn't, nothing happens. Our setup with replication still forwards the update command to the slave, however, so the row is updated accordingly there.
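
The difference between the two update styles can be illustrated with a small, database-free simulation. This is only a sketch: the classes below are hypothetical stand-ins, not OLAT or Hibernate code. LogTable imitates a table that reports the number of affected rows, strictUpdate imitates an update that insists on exactly one affected row, and tolerantUpdate imitates the manual query that accepts zero.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, in-memory illustration of strict versus tolerant updates. */
final class SimpleDurationUpdateSketch {

    /** Stand-in for a logging table: log_id -> simpleDuration. */
    static final class LogTable {
        private final Map<Long, Long> rows = new HashMap<>();

        void insert(long logId) {
            rows.put(logId, -1L); // -1 is the default simpleDuration
        }

        /** Behaves like "UPDATE ... WHERE log_id = ?": returns the number of affected rows. */
        int updateDuration(long logId, long duration) {
            if (!rows.containsKey(logId)) {
                return 0;
            }
            rows.put(logId, duration);
            return 1;
        }

        Long durationOf(long logId) {
            return rows.get(logId);
        }
    }

    /** Imitates an update that fails unless exactly one row was affected. */
    static void strictUpdate(LogTable table, long logId, long duration) {
        int affected = table.updateDuration(logId, duration);
        if (affected != 1) {
            throw new IllegalStateException("expected 1 updated row, got " + affected);
        }
    }

    /** Imitates the manual query: zero affected rows is not an error. */
    static int tolerantUpdate(LogTable table, long logId, long duration) {
        return table.updateDuration(logId, duration);
    }
}
```

An empty LogTable plays the role of the BLACKHOLE table on the Main OLAT DB: strictUpdate throws, while tolerantUpdate simply returns 0. A LogTable that contains the row plays the role of the replicated logging DB, where the same update takes effect.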

4 Deployment @ BPS

[OPEN]

5 Deployment @ Demo Installer



In the demo installer case there is only one tomcat node and all services run in that node. Therefore the log & statistics service runs in that one tomcat node as well. The other difference to cluster setups is that no JMS is used: MultiUserEvents are sent directly to singleton services internally.

With both the log-reporting and the statistics service being singleton services (in OLAT terms), this setup becomes a simple configuration task.

Note that the OLAT DB contains the usual tables as well as the o_loggingtable. The o_loggingtable might grow rapidly and we should add documentation about this fact - and how we deal with this in our installations.

6 Tasks

Design

BPS-1229 Branch

Framework


[DONE]

Integration into ManagedTransaction

Migration

Controllers, Managers

User Properties

User interface

Review

Workflows

7 Estimates

The estimates below require a review further down the road.