User Tools

Site Tools


sql_analysis

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
Next revisionBoth sides next revision
sql_analysis [2010/08/12 22:52] – created danielsql_analysis [2012/01/28 02:55] daniel
Line 1: Line 1:
- ====== Presence and count of certain headers ======+ ====== Presence and count of specific mail headers ====== 
 +This query extracts, for each of the header fields from a fixed list, the number of occurrences of all distinct values of the field.
  
 <code sql> <code sql>
Line 9: Line 10:
 group by split_part(substr(lines, 1+position(E'\n'||field in lines), 200), E'\n', 1) group by split_part(substr(lines, 1+position(E'\n'||field in lines), 200), E'\n', 1)
 </code> </code>
 +
 +Sample output:
 +<code>
 +            ct             | count 
 +---------------------------+-------
 + Importance: high          |    32
 + Importance: High          |   130
 + Importance: low               1
 + Importance: Medium        |     3
 + Importance: normal        |    44
 + Importance: Normal        |   802
 + Precedence: bulk          | 19987
 + Precedence: bulk          |     1
 + Precedence: fm-user           3
 + Precedence: junk          |     3
 + Precedence: list          |  4026
 + Priority: non-urgent      |    11
 + Priority: Non-Urgent      |     1
 + Priority: normal          |   464
 + Priority: urgent          |   603
 + Priority: Urgent          |    12
 + X-MSMail-Priority: High     121
 + X-MSMail-Priority: Low    |     1
 + X-MSMail-Priority: Lowest |    11
 + X-MSMail-Priority: Medium |     3
 + X-MSMail-Priority: Middle |     8
 + X-MSMail-Priority: Normal |   923
 + X-Priority: 0                 1
 + X-Priority: 1               102
 + X-Priority: 1 (High)      |     2
 + X-Priority: 1 (Highest)     120
 + X-Priority: 2                 6
 + X-Priority: 2 (High)      |     1
 + X-Priority: 3              2892
 + X-Priority: 3 (Normal)    |   635
 + X-Priority: 5                14
 + X-Priority: Normal        |     5
 +</code>
 +====== Duplicate messages ======
 +This query finds each message that share the exact same headers than another message with a lower mail_id, which means that it's a duplicate.
 +<code sql>
 +select h1.mail_id from header h1, header h2 where h1.lines=h2.lines and h1.mail_id > h2.mail_id
 +</code>
 +
 +====== Messages with specific attachment types ======
 +To retrieve all messages containing pdf files or any image file:
 +<code sql>
 +select distinct mail_id FROM attachments WHERE content_type='application/pdf' OR content_type like 'image/%';
 +</code>
 +
 +====== Messages sent or received today  ======
 +<code>
 +select mail_id from mail where msg_date>=date_trunc('day',now());
 +</code>
 +
  
sql_analysis.txt · Last modified: 2019/01/24 17:55 by daniel