Skip Navigation

You are here: Home > Projects > Procmail Log to SQLite Database

Procmail Log to SQLite Database

Pmlog2db will extract useful information from a Procmail log file and add it to an existing SQLite database, where it can be processed in a meriod of wonderful ways all involving SQL.

Procmail Log Files

This script processes log files from Procmail, which have entries similar to the following.
From wrivera_ip@tip.com.pl Wed May 26 12:30:49 2004
Subject: Pharmacy with smart prices
Folder: /net/u/1/c/ccn/Mail/spam/sa 4290
This entry contains information for a single message, including the originating email address, date of receipt, subject, folder where the message was placed, and size in bytes. The script expects each series of three lines to be information for a single message, and that the log file to be processed will contain no extraneous lines.

Script Actions

The script expects one command-line argument, which is the name of a Procmail log file. It expects a SQLite database named msgs.db to be in the current directory. Provided that these conditions are met, the log file is processed.

  1. Does the file passed as a command-line argument exist?
    • If so, continue.
    • If not, exit.
  2. Does each set of three lines look like it relates to a single message?
  3. Does the msgs.db file exist in the current directory?
    • If so, continue.
    • If not, exit.
  4. Re-scan the file and, for each message, extract useful information.
    • If information looks valid, continue.
    • If not, save the lines involving that message to problems.log for later review by the user.
  5. Add the information for the individual message to the SQLite database.
  6. Print statistics on number of messages processed and number successfully added to the database.

Database

The database is composed of only one table. The database and table can be created with a command similar to the following.
sqlite msgs.db 'create table msg(addr, mon, day, hour, min, sec, year, sub, folder, size);' The fields in table msgs have these meanings:

addr
The email address which the message was sent from
mon
The month in which the message was received
day
The day on which the message was received
hour
the hour in which the message was received
min
the minute in which the message was received
sec
the second at which the message was received
year
The year in which the message was received
sub
the message subject
folder
the folder in which Procmail deposited the message
size
the size of the message in bytes

Download

This script is available as pmlog2db.pl, and is made available under the BSD license.

Usefulness

Given the amount of data retained about each message, and the pouwer of SQL, it is possible to ask questions such as:

Conclusion

I hope this script is useful in some way. Feel free to send a message if you find it useful or have a comment.

Changelog