BotSplit Concept
BotSplit is a new type of log analysis software that reads log files
and emits log files. It is a log file pipe. Because its output is still a log file, you can use BotSplit with your
existing log analysis software.
BotSplit reads a single log file and creates two output files, one containing records of robotic visitors and the other containing records of human visitors. Optionally, BotSplit can rewrite IP addresses into a canonical form so that subsequent analysis is more useful.
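The split described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not BotSplit's actual code: the names `canonical_ip` and `split_log` are hypothetical, the visitor is assumed to be the first space-separated field of each record, and the "canonical form" is assumed to mean masking the last octet of an IPv4 address.

```python
def canonical_ip(ip):
    """Mask the last octet of an IPv4 address (an assumed canonical form)."""
    parts = ip.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return ".".join(parts[:3] + ["xxx"])
    return ip  # leave host names and anything else untouched


def split_log(lines, is_robot, rewrite_ip=False):
    """Return (robot_lines, human_lines) for an iterable of log lines.

    is_robot is a callable that takes the visitor key (the first field,
    canonicalized when rewrite_ip is set) and returns True for robots.
    """
    robots, humans = [], []
    for line in lines:
        host, sep, rest = line.partition(" ")
        key = canonical_ip(host) if rewrite_ip else host
        out = key + sep + rest if rewrite_ip else line
        (robots if is_robot(key) else humans).append(out)
    return robots, humans
```

Each of the two returned lists could then be written to its own output file and handed to an existing log analyzer.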
BotSplit Tests
BotSplit is not a black box. It has a number of adjustable or selectable options.
The available options are:
- compares requested files against an editable list; any access to a listed file marks the visitor as a robot (e.g. robots.txt)
- compares requested files against an editable list; a visitor whose access is confined solely to that list is a robot (e.g. a file linked from some other site)
- compares file extensions; a visitor must access at least one file with a listed extension to qualify as human (e.g. a graphics image)
- uses version 1.0 of the HTTP protocol
- IP address morphs (e.g. 111.111.111.xxx, where xxx varies)
- referrer field missing from all requests
- browser field missing from all requests
Perhaps you can imagine outlier cases where such rules will classify a real human visitor
as a robot. Our experience is that these rules are very good at
ensuring that whatever remains after applying them is really human.
You can deselect any of these rules.
Notice that none of these rules relies on how the visitor identifies itself in
the browser (user agent) field of the log record.
The behavioral rules we use correlate well with robot self-identification but do not rely on goodwill.
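As a rough illustration of how such behavioral rules might be combined and individually deselected, here is a minimal Python sketch. Everything in it is an assumption made for the example: the rule names, the default lists, the request dictionary keys, and the use of "-" for a missing field are hypothetical, and the HTTP 1.0 rule is interpreted as "any request uses HTTP/1.0". The IP address morph rule is left out because it compares visitors with one another rather than inspecting a single visitor.

```python
import os

# Hypothetical rule names and defaults; none of these come from BotSplit itself.
DEFAULT_RULES = {
    "robot_files": {"/robots.txt"},            # any access marks a robot
    "confined_files": {"/linked-page.html"},   # robot if nothing else is requested
    "human_extensions": {".gif", ".jpg", ".png"},
    "http_1_0": True,
    "no_referrer": True,
    "no_agent": True,
}


def is_robot(requests, rules=DEFAULT_RULES):
    """Classify one visitor from its list of request dicts.

    Each request has the keys path, protocol, referrer and agent;
    "-" stands for a missing field. A rule is deselected by setting
    its entry to None or False.
    """
    paths = [r["path"] for r in requests]
    if rules.get("robot_files") and any(p in rules["robot_files"] for p in paths):
        return True
    if rules.get("confined_files") and all(p in rules["confined_files"] for p in paths):
        return True
    if rules.get("human_extensions"):
        exts = {os.path.splitext(p)[1].lower() for p in paths}
        if not exts & rules["human_extensions"]:
            return True  # no graphics (or other listed) file was ever fetched
    if rules.get("http_1_0") and any(r["protocol"] == "HTTP/1.0" for r in requests):
        return True
    if rules.get("no_referrer") and all(r["referrer"] == "-" for r in requests):
        return True
    if rules.get("no_agent") and all(r["agent"] == "-" for r in requests):
        return True
    return False
```

Note that the sketch checks only whether the agent field is present, never what it says, which matches the point that the rules do not depend on self-identification. Deselecting a rule is a matter of passing a modified copy, for example `dict(DEFAULT_RULES, human_extensions=None)`.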
Multi-site file management
BotSplit has added functions for managing multi-sites: sites that have multiple domain names in a single ZIP file. LogMap allows you to select a single site, all sites, or any subset of sites, from all ZIP files in a directory or from a subset of them. The selected logs are then consolidated into a single analysis stream. Batch allows you to integrate and manage ordinary Windows batch files for functions such as file rename, copy, and delete.
Once set up, you may run any task from the command line without needing to invoke the interactive interface.
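The idea of consolidating selected sites from several ZIP files into one stream can be sketched in Python with the standard `zipfile` module. This is not how LogMap is implemented; the function name `consolidated_lines` and the archive layout (one member per site, named `<domain>.log`) are assumptions made purely for illustration.

```python
import zipfile
from pathlib import Path


def consolidated_lines(zip_paths, sites=None):
    """Yield log lines from the chosen sites across the given ZIP files.

    Assumes (hypothetically) one archive member per site, named
    "<domain>.log". sites=None selects every site.
    """
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for name in sorted(archive.namelist()):
                if not name.endswith(".log"):
                    continue
                site = Path(name).name[: -len(".log")]
                if sites is not None and site not in sites:
                    continue
                with archive.open(name) as member:
                    for raw in member:
                        yield raw.decode("utf-8", errors="replace").rstrip("\r\n")
```

Passing a set such as `sites={"b.example"}` restricts the stream to that site, while omitting `sites` merges every site found in the listed archives.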