User Tools

Site Tools


sql_analysis

This is an old revision of the document!


Presence and count of certain headers

This query extracts, for each of the header fields from a fixed list, the number of occurrences of all distinct values of the field.

SELECT
 split_part(substr(LINES, 1+POSITION(E'\n'||FIELD IN LINES), 200), E'\n', 1) AS ct,
 COUNT(*)
FROM header, (VALUES ('X-Priority'), ('Importance'), ('Precedence'), ('Priority'), ('X-MSMail-Priority'), ('X-MS-Priority')) AS h(FIELD)
WHERE POSITION(E'\n'||FIELD IN LINES)>0
GROUP BY split_part(substr(LINES, 1+POSITION(E'\n'||FIELD IN LINES), 200), E'\n', 1)

Duplicate messages

This query finds each message that share the exact same headers than another message with a lower mail_id, which means that it's a duplicate.

SELECT h1.mail_id FROM header h1, header h2 WHERE h1.lines=h2.lines AND h1.mail_id > h2.mail_id
sql_analysis.1282434342.txt.gz · Last modified: 2010/08/21 23:45 by daniel