<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.4.3">Jekyll</generator><link href="http://www.manitou-mail.org/blog/feed.xml" rel="self" type="application/atom+xml" /><link href="http://www.manitou-mail.org/blog/" rel="alternate" type="text/html" /><updated>2017-11-08T17:32:21+01:00</updated><id>http://www.manitou-mail.org/blog/</id><title type="html">Manitou-Mail Blog</title><subtitle>Blog on the use and development of the Manitou-Mail software
</subtitle><entry><title type="html">Version 1.7.0 released</title><link href="http://www.manitou-mail.org/blog/2017/11/version-1-7-0-released/" rel="alternate" type="text/html" title="Version 1.7.0 released" /><published>2017-11-08T14:04:00+01:00</published><updated>2017-11-08T14:04:00+01:00</updated><id>http://www.manitou-mail.org/blog/2017/11/version-1-7-0-released</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2017/11/version-1-7-0-released/">&lt;p&gt;&lt;strong&gt;Manitou-Mail 1.7.0&lt;/strong&gt; is released and available to &lt;a href=&quot;/download&quot;&gt;download&lt;/a&gt;.
The main changes are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The counts of archived messages per tag are now constantly visible in
the interface.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Outgoing messages can be “sent later”. As detailed in &lt;a href=&quot;/blog/2017/08/send-later-feature&quot;&gt;a previous post&lt;/a&gt;, the sending is done by manitou-mdx, so it works even when the user
interface is closed.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is now a built-in image viewer for pictures in attachments.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The login dialog allows forcing an &lt;a href=&quot;/blog/2017/07/secure-connections/&quot;&gt;encrypted session&lt;/a&gt;
 with the database.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The segmentation of large lists of messages has been improved and
 exposed through the “Fetch previous/next segment” commands and
 the “Query selection” dialog box.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It is now possible to use an external editor to compose messages.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;In the database, the cache/staging table &lt;em&gt;mail_status&lt;/em&gt; has been obsoleted.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pre-compiled binaries for Windows and macOS come ready-to-use with
Qt-5.5 libraries (including WebKit), and binary packages for Linux
Debian 8 and 9 and Ubuntu 14.04 and 16.04 are also available through the
APT repository.
Check out the &lt;a href=&quot;http://www.manitou-mail.org/download&quot;&gt;download page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Please note that &lt;strong&gt;the APT key has changed&lt;/strong&gt;. The signing key
 has been regenerated with SHA-512 due to SHA-1 being obsoleted.
The new key is &lt;a href=&quot;/download/E330764A.asc&quot;&gt;E330764A.asc&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As always, if upgrading from a previous version, make sure to run the
server-side command:&lt;/p&gt;

&lt;pre&gt;manitou-mgr --upgrade-schema&lt;/pre&gt;</content><author><name></name></author><summary type="html">Manitou-Mail 1.7.0 is released and available to download. The main changes are:</summary></entry><entry><title type="html">The “send later” feature</title><link href="http://www.manitou-mail.org/blog/2017/08/send-later-feature/" rel="alternate" type="text/html" title="The &quot;send later&quot; feature" /><published>2017-08-09T13:06:36+02:00</published><updated>2017-08-09T13:06:36+02:00</updated><id>http://www.manitou-mail.org/blog/2017/08/send-later-feature</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2017/08/send-later-feature/">&lt;p&gt;Commits &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-ui/commit/810b9e4cbc68d0c6ce94871bed3862aecd0ba16a&quot;&gt;810b9e4&lt;/a&gt; and &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-mdx/commit/399d720c3556d6ff391e89c7c794550f6baf3765&quot;&gt;399d720&lt;/a&gt; added the “send later” functionality,
which allows setting a date in the future for an outgoing message.&lt;/p&gt;

&lt;p&gt;There are multiple use cases for this feature:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;delay delivery until the actual day of an event.&lt;/li&gt;
  &lt;li&gt;send reminder emails to yourself.&lt;/li&gt;
  &lt;li&gt;adjust to the timezone of a recipient to increase the likelihood of the message being read.&lt;/li&gt;
  &lt;li&gt;avoid disclosing your working hours to correspondents.&lt;/li&gt;
  &lt;li&gt;have a grace period to cancel an outgoing message before it leaves.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the user interface, deferring the sending happens simply by choosing 
&lt;em&gt;“Schedule_delivery”&lt;/em&gt; instead of &lt;em&gt;“Send mail”&lt;/em&gt; in the Message menu,
after composing a new message.&lt;/p&gt;

&lt;p&gt;Since the actual sending is done by manitou-mdx, the user interface
does not need to be open at the time of a delivery. This is an advantage
over most desktop mail user agents implementing this functionality.&lt;/p&gt;

&lt;p&gt;When a message is scheduled for future delivery, internally it has
a specific status bit set (bit 10), along with an entry in the jobs_queue table.&lt;/p&gt;

&lt;p&gt;In the list of messages, its status is indicated by a clock icon:
&lt;img src=&quot;/blog/wp-content/uploads/2017/08/clock.png&quot; alt=&quot;clock&quot; width=&quot;16&quot; height=&quot;16&quot; /&gt;
until it gets submitted to the mail system.&lt;/p&gt;

&lt;p&gt;It’s also included by the &lt;em&gt;“Sent mail”&lt;/em&gt; selector in the quick
selection panel (which should probably be renamed &lt;em&gt;“Outgoing mail”&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;To cancel the submission or change its future date and time,
the &lt;em&gt;Message-&amp;gt;Properties&lt;/em&gt; now has a &lt;em&gt;“Scheduled”&lt;/em&gt; checkbox,
and a &lt;em&gt;“Send after”&lt;/em&gt; datetime field:&lt;/p&gt;

&lt;p&gt;&lt;img class=&quot;aligncenter&quot; src=&quot;/blog/wp-content/uploads/2017/08/properties-scheduled.png&quot; alt=&quot;properties-scheduled&quot; width=&quot;279&quot; height=&quot;378&quot; /&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Commits 810b9e4 and 399d720 added the “send later” functionality, that allows to set a date in the future for an outgoing message.</summary></entry><entry><title type="html">Secure connections</title><link href="http://www.manitou-mail.org/blog/2017/07/secure-connections/" rel="alternate" type="text/html" title="Secure connections" /><published>2017-07-17T15:56:00+02:00</published><updated>2017-07-17T15:56:00+02:00</updated><id>http://www.manitou-mail.org/blog/2017/07/secure-connections</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2017/07/secure-connections/">&lt;p&gt;Since commit &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-ui/commit/5881ed4193d76e803586c28b9d8ab95daa13caf5&quot;&gt;5881ed4&lt;/a&gt;, there is now an &lt;strong&gt;Encrypted session&lt;/strong&gt; tri-state checkbox in the login dialog, to be certain that the connection to the mail database is encrypted.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;When the checkbox is neither checked nor unchecked (it’s in the third state), it means to use the default, which generally consists of trying an encrypted connection first, and if that’s not accepted by the server, an unencrypted connection next.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;When the checkbox is checked, it means to attempt an encrypted connection exclusively.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Finally when it’s unchecked, a non-encrypted connection is requested. This can be a good choice when the transport channel is already encrypted, such as with a VPN or an SSH tunnel, or if it’s local.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
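
&lt;p&gt;The three states can be read as a mapping onto libpq’s &lt;em&gt;sslmode&lt;/em&gt; values. The sketch below is only an illustration under that assumption: the function name and the state names are hypothetical, and the actual code in the user interface may choose differently. It relies on the documented libpq behavior that &lt;em&gt;prefer&lt;/em&gt; is the default mode.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: map the tri-state checkbox to a libpq sslmode value.
# The state names are made up for this sketch; the real UI code may differ.
checkbox_to_sslmode() {
  case "$1" in
    checked)   echo "require" ;;  # encrypted connection only
    unchecked) echo "disable" ;;  # unencrypted connection only
    *)         echo "prefer"  ;;  # third state: libpq default (TLS first, then plain)
  esac
}

for state in checked unchecked default; do
  printf '%s -> sslmode=%s\n' "$state" "$(checkbox_to_sslmode "$state")"
done
```

&lt;p&gt;Note that &lt;em&gt;require&lt;/em&gt; encrypts the session but does not verify the server certificate.&lt;/p&gt;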

&lt;p&gt;It’s still possible, as it has always been, to have finer control over encryption by setting &lt;em&gt;sslmode&lt;/em&gt; directly as a libpq parameter in the “More parameters” text field of the dialog.&lt;/p&gt;

&lt;p&gt;In that case, the new checkbox should be left in the “neither checked nor unchecked” state, so as not to conflict with the setting in the other field.&lt;/p&gt;

&lt;p&gt;As an alternative, the environment variable &lt;em&gt;PGSSLMODE&lt;/em&gt; will be taken into account if it is set. See &lt;a href=&quot;https://www.postgresql.org/docs/current/static/libpq-envars.html&quot;&gt;Environment Variables&lt;/a&gt; and &lt;a href=&quot;https://www.postgresql.org/docs/current/static/libpq-ssl.html&quot;&gt;SSL Support&lt;/a&gt; in the PostgreSQL documentation for all the details.&lt;/p&gt;

&lt;p&gt;Manually setting &lt;em&gt;sslmode&lt;/em&gt; is necessary to use the more specific modes &lt;em&gt;verify-ca&lt;/em&gt; or &lt;em&gt;verify-full&lt;/em&gt;, which in addition to requesting an encrypted connection, require that the server-side certificate is signed by a trusted authority.&lt;/p&gt;

&lt;p&gt;Now, what if the server is not set up to support TLS? There’s still the possibility of encrypting the connection through an SSH tunnel, provided you have a shell account on the server, or at least on a proxy server closer to the database server, and which itself can connect securely to it.&lt;/p&gt;

&lt;p&gt;An SSH tunnel requires finding a free-to-use TCP port on the client machine running the Manitou-Mail user interface. It can be 5432, the default PostgreSQL port, if there’s no local PostgreSQL instance on that host, otherwise an unused port should be taken. For example, if using 4000, this command would do:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;ssh -N -L4000:localhost:5432 dbserver.example.org&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then in the Manitou-Mail connection window, the host should be set to &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost&lt;/code&gt; and the port to 4000 through the “More parameters” field. When the user interface connects to localhost:4000, ssh does the rest by having the remote server connect to its own localhost at port 5432, and from then on passes all the traffic between the user interface and the database through itself, all encrypted.&lt;/p&gt;
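
&lt;p&gt;To make the relation between the two ends explicit, here is a small sketch that derives both the tunnel command and the matching connection parameters from a single local port number. It assumes the “More parameters” field takes libpq keyword/value pairs (&lt;em&gt;host&lt;/em&gt; and &lt;em&gt;port&lt;/em&gt; are standard libpq keywords); the variable names are made up, and nothing is actually executed, the strings are only printed.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: build the ssh tunnel command and the matching libpq parameters.
# Nothing is run here; both strings are only printed for inspection.
local_port=4000                   # any free TCP port on the client machine
db_gateway=dbserver.example.org   # host taken from the example above

tunnel_cmd="ssh -N -L${local_port}:localhost:5432 ${db_gateway}"
conninfo="host=localhost port=${local_port}"

echo "$tunnel_cmd"
echo "$conninfo"
```

&lt;p&gt;Keeping the port in one variable avoids the common mistake of opening the tunnel on one port and pointing the client at another.&lt;/p&gt;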

&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-374&quot; src=&quot;/blog/wp-content/uploads/2017/07/manitou-connect-ssh.png&quot; alt=&quot;manitou-connect-ssh&quot; width=&quot;322&quot; height=&quot;265&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In addition to the encryption, this method also alleviates the need for pg_hba.conf to allow the IP address of the remote user interface, since what connects to it is the ssh server running locally on the database server itself, or close to it in the case of a gateway to the LAN.&lt;/p&gt;</content><author><name></name></author><summary type="html">Since commit 5881ed4, there is now an Encrypted session tri-state checkbox in the login dialog, to be certain that the connection to the mail database is encrypted.</summary></entry><entry><title type="html">Note to query writers about mail_status</title><link href="http://www.manitou-mail.org/blog/2017/07/note-to-query-writers-about-mail_status/" rel="alternate" type="text/html" title="Note to query writers about mail_status" /><published>2017-07-08T15:31:29+02:00</published><updated>2017-07-08T15:31:29+02:00</updated><id>http://www.manitou-mail.org/blog/2017/07/note-to-query-writers-about-mail_status</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2017/07/note-to-query-writers-about-mail_status/">&lt;h3 id=&quot;the-mail_status-table&quot;&gt;The mail_status table&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;mail_status&lt;/code&gt;&lt;/strong&gt; was a (mail_id, status) table containing the subset of the mail that is “current”, which in terms of status meant, technically:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;status &amp;amp; (16+32+256) = 0&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Commit &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-mdx/commit/de2ee18b29ae45d71173a88129f6a6690d96a24b&quot;&gt;de2ee18&lt;/a&gt; and related commit &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-ui/commit/7804642c1b247c53822fd20642bfae380a650dfd&quot;&gt;7804642&lt;/a&gt; in the user interface remove that table in favor of a partial index on the mail table with the expression:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;(status &amp;amp; 32 = 0)&lt;/code&gt; which means exactly: “not archived”.&lt;/p&gt;
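
&lt;p&gt;For readers less used to bit masks, the following sketch evaluates both conditions on a few sample status values (16 is “trashed”, 32 is “archived”, 256 is “sent”). It is an illustration outside the database, with hypothetical function names; it uses integer division and modulo, which is equivalent to a bitwise test when the mask is a single power of two.&lt;/p&gt;

```shell
#!/bin/sh
# has_bit STATUS MASK prints 1 when the bit is set, 0 otherwise.
# MASK must be a power of two (16, 32, 256...).
has_bit() {
  echo $(( ($1 / $2) % 2 ))
}

# Old definition of "current": none of trashed (16), archived (32), sent (256).
is_current_old() {
  echo $(( (1 - $(has_bit "$1" 16)) * (1 - $(has_bit "$1" 32)) * (1 - $(has_bit "$1" 256)) ))
}

for status in 1 33 257 17; do
  printf 'status=%s archived=%s current_old=%s\n' \
    "$status" "$(has_bit "$status" 32)" "$(is_current_old "$status")"
done
```

&lt;p&gt;Statuses 257 and 17 are the interesting cases: they are not archived, so the new index covers them, yet the old expression did not consider them current.&lt;/p&gt;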

&lt;p&gt;This might raise a few questions among users who have developed their own set of queries. Hopefully the rest of this post will answer them in advance.&lt;/p&gt;

&lt;h3 id=&quot;can-we-keep-the-old-queries-involving-mail_status-unchanged&quot;&gt;Can we keep the old queries (involving mail_status) unchanged?&lt;/h3&gt;

&lt;p&gt;Yes, by creating a &lt;code class=&quot;highlighter-rouge&quot;&gt;mail_status&lt;/code&gt; view emulating the old table, taking advantage of the new index:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE VIEW mail_status AS
 SELECT mail_id, status FROM mail WHERE status&amp;amp;32=0 AND status&amp;amp;(16+256)=0;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Simple views like that are normally inlined by the PostgreSQL optimizer, so this should perform pretty well. When in doubt, use EXPLAIN in SQL to check the execution plan.&lt;/p&gt;

&lt;h3 id=&quot;what-is-the-motivation-behind-the-change-&quot;&gt;What is the motivation behind the change?&lt;/h3&gt;

&lt;p&gt;Mostly performance. According to &lt;code class=&quot;highlighter-rouge&quot;&gt;EXPLAIN ANALYZE&lt;/code&gt;, joining against a &lt;code class=&quot;highlighter-rouge&quot;&gt;mail_status&lt;/code&gt; real table populated with a few thousand messages is fast (which is why this table existed in the first place), but avoiding the join and using the partial index instead is faster.&lt;/p&gt;

&lt;p&gt;Also mail_status was maintained by triggers on INSERT, UPDATE, DELETE, and these triggers were not free in execution time. Now they’re no longer necessary and have been removed in the above-mentioned commits.&lt;/p&gt;

&lt;h3 id=&quot;why-is-the-index-on-status320-instead-of-status16322560-&quot;&gt;Why is the index on status&amp;amp;32=0, instead of status&amp;amp;(16+32+256)=0?&lt;/h3&gt;

&lt;p&gt;For simplicity. The triggers maintaining &lt;code class=&quot;highlighter-rouge&quot;&gt;mail_status&lt;/code&gt; used the latter expression, but a message with the status “sent” (256) or “trashed” (16), but not “archived”, is a bit of a weird case, because there’s generally no action pending on a message that was sent or moved into the trashcan. It’s easier to reason about this new index knowing that it partitions the mail simply between archived and not archived, matching exactly the “archived” bit in the status.&lt;/p&gt;

&lt;p&gt;In most cases, &lt;code class=&quot;highlighter-rouge&quot;&gt;status&amp;amp;32=0&lt;/code&gt; is the expression that should be used to mean this message is “current”. For exact compatibility with the old expression, &lt;code class=&quot;highlighter-rouge&quot;&gt;status&amp;amp;32=0 AND status&amp;amp;(16+256)=0&lt;/code&gt; should be used, so that the PostgreSQL optimizer can use the new index.&lt;/p&gt;</content><author><name></name></author><summary type="html">The mail_status table</summary></entry><entry><title type="html">Version 1.6.0 released</title><link href="http://www.manitou-mail.org/blog/2017/03/version-1-6-0-released/" rel="alternate" type="text/html" title="Version 1.6.0 released" /><published>2017-03-20T11:36:21+01:00</published><updated>2017-03-20T11:36:21+01:00</updated><id>http://www.manitou-mail.org/blog/2017/03/version-1-6-0-released</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2017/03/version-1-6-0-released/">&lt;p&gt;&lt;strong&gt;Manitou-Mail 1.6.0&lt;/strong&gt; is released and available to &lt;a href=&quot;/download&quot;&gt;download&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This version adds operators in the searchbar (on message senders, recipients, status, dates, attachments, tags), a statistics panel with charts and exportable results, and the creation of users and groups within the interface.&lt;/p&gt;

&lt;p&gt;The users management features also include access rights checked at the database level, and the possibility of restricting certain accounts to certain identities, using policies with PostgreSQL’s Row Level Security feature.&lt;/p&gt;

&lt;p&gt;Pre-compiled binaries for Windows and macOS come ready-to-use with Qt-5.5 libraries (including WebKit), and binary packages for Linux Debian 8 and Ubuntu 14.04 and 16.04 are also available through the APT repository (see the &lt;a href=&quot;http://www.manitou-mail.org/download&quot;&gt;download page&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;If upgrading from a previous version, make sure to run the server-side command:&lt;/p&gt;

&lt;pre&gt;manitou-mgr --upgrade-schema&lt;/pre&gt;</content><author><name></name></author><summary type="html">Manitou-Mail 1.6.0 is released and available to download.</summary></entry><entry><title type="html">Improvements in mail deduplication</title><link href="http://www.manitou-mail.org/blog/2016/10/improvements-in-mail-deduplication/" rel="alternate" type="text/html" title="Improvements in mail deduplication" /><published>2016-10-04T17:08:51+02:00</published><updated>2016-10-04T17:08:51+02:00</updated><id>http://www.manitou-mail.org/blog/2016/10/improvements-in-mail-deduplication</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2016/10/improvements-in-mail-deduplication/">&lt;p&gt;The &lt;strong&gt;no_duplicate&lt;/strong&gt; plugin tracks exact duplicates, precisely incoming mail files having the same SHA1 fingerprint as a previously imported mail file.&lt;/p&gt;

&lt;p&gt;Up to now, such duplicates could be discarded by simply declaring in the manitou-mdx configuration file:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[name@example.com]
incoming_preprocess_plugins = no_duplicate
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
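
&lt;p&gt;The notion of “same SHA1 fingerprint” can be tried out on the command line. This sketch is not the plugin’s code: it only shows, with throwaway files and the standard &lt;code class=&quot;highlighter-rouge&quot;&gt;sha1sum&lt;/code&gt; utility, that byte-identical mail files share a fingerprint while any change in content yields a different one.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: byte-identical files have the same SHA1 fingerprint.
tmpdir=$(mktemp -d)
printf 'Subject: hello\n\nbody\n' > "$tmpdir/msg1"
cp "$tmpdir/msg1" "$tmpdir/msg2"                        # exact copy
printf 'Subject: hello\n\nother body\n' > "$tmpdir/msg3"  # different content

fp1=$(sha1sum "$tmpdir/msg1" | cut -d' ' -f1)
fp2=$(sha1sum "$tmpdir/msg2" | cut -d' ' -f1)
fp3=$(sha1sum "$tmpdir/msg3" | cut -d' ' -f1)

if [ "$fp1" = "$fp2" ]; then echo "msg2 is a duplicate of msg1"; fi
if [ "$fp1" != "$fp3" ]; then echo "msg3 is a new message"; fi
rm -r "$tmpdir"
```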

&lt;p&gt;But when a manitou-mail database is used to only sync new messages from an IMAP server, with hierarchical tags reflecting folders, a message move across IMAP folders is interpreted as a duplicate coming in.&lt;/p&gt;

&lt;p&gt;It’s fine and actually desirable not to import the message again, but ideally we’d want to see it in its new folder.&lt;/p&gt;

&lt;p&gt;The no_duplicate plugin can now do that by acting both as an incoming_preprocess_plugin and as an incoming_postprocess_plugin.&lt;/p&gt;

&lt;p&gt;The first step recognizes the duplicate, and optionally, updates the tags of the message instance already in the database.&lt;/p&gt;

&lt;p&gt;The second step associates the SHA1 fingerprint of a newly imported message with its unique ID, which is necessary for the optional tags update to work if a duplicate of this message comes in later with different tags.&lt;/p&gt;

&lt;p&gt;The declaration taking advantage of this new feature looks like:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[name@example.com]
incoming_preprocess_plugins = no_duplicate({update_tags=&amp;gt;1})
incoming_postprocess_plugins = no_duplicate
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For more information on manitou-mdx plugins, see the &lt;a href=&quot;/doc/mdx/mdx.plugins.html&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;</content><author><name></name></author><summary type="html">The no_duplicate plugin tracks exact duplicates, precisely incoming mail files having the same SHA1 fingerprint as a previously imported mail file.</summary></entry><entry><title type="html">Users management in the interface</title><link href="http://www.manitou-mail.org/blog/2016/09/users-management-in-the-interface/" rel="alternate" type="text/html" title="Users management in the interface" /><published>2016-09-21T22:50:16+02:00</published><updated>2016-09-21T22:50:16+02:00</updated><id>http://www.manitou-mail.org/blog/2016/09/users-management-in-the-interface</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2016/09/users-management-in-the-interface/">&lt;p&gt;Starting with version 1.6, Manitou-Mail will allow the creation of users and groups from within the user interface, as shown in the screenshot below:&lt;/p&gt;

&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-337&quot; src=&quot;/blog/wp-content/uploads/2016/09/ui-users-1.png&quot; alt=&quot;ui-users-1&quot; width=&quot;500&quot; height=&quot;666&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Although PostgreSQL merged users and groups into “roles” years ago (in version 8.1), we intentionally stick to the old terminology, because it makes the model easier to describe. Users are accounts for persons connecting to the database. Groups are entities having a set of permissions, and to which users are assigned. Users can belong to multiple groups.&lt;/p&gt;

&lt;p&gt;Under the hood, users correspond to PostgreSQL roles having the LOGIN attribute, so they can log in (assuming proper permissions), whereas groups are roles that don’t have this attribute. Permissions are associated with groups through SQL GRANT commands.&lt;/p&gt;

&lt;p&gt;Manitou-Mail chooses to associate permissions with groups, instead of individual users, given its focus on teamwork and shared mail corpuses. For a set of permissions that apply to a single person, a group has to be created with only that user as member.&lt;/p&gt;

&lt;p&gt;The set of permissions currently handled is shown in the screenshot below. More fine-grained permissions will probably be added in the future depending on users’ needs.&lt;/p&gt;

&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-338&quot; src=&quot;/blog/wp-content/uploads/2016/09/ui-users-2.png&quot; alt=&quot;ui-users-2&quot; width=&quot;278&quot; height=&quot;525&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Version 1.6 can be built from the git repository; pre-compiled binaries and packages will be made available soon.&lt;/p&gt;</content><author><name></name></author><summary type="html">Starting with version 1.6, Manitou-Mail will allow the creation of users and groups from within the user interface, as shown in the screenshot below:</summary></entry><entry><title type="html">Parallel import</title><link href="http://www.manitou-mail.org/blog/2016/07/parallel-import/" rel="alternate" type="text/html" title="Parallel import" /><published>2016-07-18T17:37:43+02:00</published><updated>2016-07-18T17:37:43+02:00</updated><id>http://www.manitou-mail.org/blog/2016/07/parallel-import</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2016/07/parallel-import/">&lt;p&gt;Importing in parallel from a single source is really enabled in manitou-mdx since &lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-mdx/commit/76a860ed53fa713a3f75d781c7faacf34277b91a&quot;&gt;commit 76a860e&lt;/a&gt;, under the following conditions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;parallelism is driven from the outside: manitou-mdx instances run concurrently, but don’t fork and manage child workers. Workers don’t share anything. Fortunately &lt;a href=&quot;https://www.gnu.org/software/parallel/&quot;&gt;GNU parallel&lt;/a&gt; can easily handle this part.&lt;/li&gt;
  &lt;li&gt;the custom full text indexing is done once the contents are imported, not during the import. The reason is that it absolutely needs a cache for performance, and such a cache wouldn’t work in the share-nothing implementation mentioned above.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href=&quot;/blog/2016/07/mass-importing-case-the-enron-mail-database/&quot;&gt;previous post&lt;/a&gt; showed how to create a list of all mail files to import from the Enron sample database.&lt;/p&gt;

&lt;p&gt;Now instead of that, let’s create a list split into chunks of 25k messages, which will be fed separately to the parallel workers:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;
$ find . -type f | split -d -l 25000 - /data/enron/list-
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The result is 21 numbered files of 25000 lines each, except for the last one, list-20 containing 17401 lines.&lt;/p&gt;
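
&lt;p&gt;These numbers follow directly from the size of the mailset, as this short arithmetic check shows (the variable names are arbitrary):&lt;/p&gt;

```shell
#!/bin/sh
total=517401   # files in the Enron mailset
chunk=25000    # lines per list file (split -l 25000)

files=$(( (total + chunk - 1) / chunk ))   # ceiling division
last=$(( total - (files - 1) * chunk ))    # lines left for the final file

echo "files=$files last=$last"
```

&lt;p&gt;It prints &lt;code class=&quot;highlighter-rouge&quot;&gt;files=21 last=17401&lt;/code&gt;, matching list-00 to list-20.&lt;/p&gt;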

&lt;p&gt;The main command is essentially the same as before. As a shell variable:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;
cmd=&quot;mdx/script/manitou-mdx --import-list={} \
--import-basedir=$basedir/maildir \
--conf=$basedir/enron-mdx.conf \
--status=33&quot;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Based on this, a parallel import with 8 workers can be launched through a single command:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;
ls &quot;$basedir&quot;/list-* | parallel -j 8 $cmd
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This invocation will automatically launch manitou-mdx processes and feed them each with a different list of mails to import (through the --import-list={} argument). It will also take care that there are always 8 such running processes if possible, launching a new one when another terminates.&lt;/p&gt;
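
&lt;p&gt;The placeholder mechanism can be observed without importing anything. The sketch below deliberately swaps GNU parallel for &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs -P&lt;/code&gt;, which is more widely installed and substitutes &lt;code class=&quot;highlighter-rouge&quot;&gt;{}&lt;/code&gt; in a comparable way with &lt;code class=&quot;highlighter-rouge&quot;&gt;-I{}&lt;/code&gt;; &lt;code class=&quot;highlighter-rouge&quot;&gt;echo&lt;/code&gt; stands in for the real command and the list names are placeholders.&lt;/p&gt;

```shell
#!/bin/sh
# Stand-in for GNU parallel: xargs -P keeps up to N commands running at once,
# and -I{} replaces {} with each input line.
# echo replaces the real manitou-mdx invocation, so nothing is imported.
printf '%s\n' list-00 list-01 list-02 |
  xargs -P 2 -I{} echo "would run: manitou-mdx --import-list={}"
```

&lt;p&gt;Each input line produces one command, and the order of the output lines may vary from run to run since the commands execute concurrently.&lt;/p&gt;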

&lt;p&gt;This is very effective, compared to a serial import. Here are the times spent to import the entire mailset (517401 messages) for various degrees of parallelism, on a small server with a Xeon D-1540 @ 2.00GHz processor (8 cores, 16 threads).&lt;/p&gt;

&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-330&quot; src=&quot;/blog/wp-content/uploads/2016/07/parallel-mdx.png&quot; alt=&quot;parallel-mdx&quot; width=&quot;638&quot; height=&quot;393&quot; /&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Importing in parallel from a single source  is really enabled in manitou-mdx since commit 6a860e, under the following conditions:</summary></entry><entry><title type="html">Mass-importing case: the Enron mail database</title><link href="http://www.manitou-mail.org/blog/2016/07/mass-importing-case-the-enron-mail-database/" rel="alternate" type="text/html" title="Mass-importing case: the Enron mail database" /><published>2016-07-12T14:45:18+02:00</published><updated>2016-07-12T14:45:18+02:00</updated><id>http://www.manitou-mail.org/blog/2016/07/mass-importing-case-the-enron-mail-database</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2016/07/mass-importing-case-the-enron-mail-database/">&lt;p&gt;Importing mail messages en masse works best when fiddling a bit with the configuration, rather than pushing the mail messages into the normal feed.&lt;/p&gt;

&lt;p&gt;As an example, we’re going to use the mails from Enron, the energy company that famously collapsed in 2001 amidst a fraud scandal.&lt;/p&gt;

&lt;p&gt;The mail corpus has been made public by the judicial process:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.cs.cmu.edu/~enron/&quot;&gt;http://www.cs.cmu.edu/~enron/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It has been stripped of all attachments, in addition to another cleaning process to remove potentially sensitive personal information, done by &lt;a href=&quot;https://www.nuix.com/&quot;&gt;Nuix&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The archive format is a 423MB .tar.gz file with an MH-style layout:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;one top-level directory per account.&lt;/li&gt;
  &lt;li&gt;inside each account, files and directories with mail folders.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It contains 3500 directories for 151 accounts, and a total of 517401 files, taking 2.6GB on disk once uncompressed.&lt;/p&gt;

&lt;p&gt;After unpacking the archive, follow these steps to import the mailset from scratch:&lt;/p&gt;

&lt;h3 id=&quot;1-create-the-list-of-files&quot;&gt;1) Create the list of files&lt;/h3&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;
$ cd /data/enron/maildir
$ find . -type f &amp;gt; /data/enron/00-list-all
&lt;/code&gt;&lt;/p&gt;
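
&lt;p&gt;Before importing, it is worth checking that the list has one line per mail file. Here is a sketch of that check on a small throwaway tree rather than on the real mailset; the directory and file names are invented. On the Enron archive the same count should give 517401.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch on a throwaway directory: build the list, then count its lines.
workdir=$(mktemp -d)
mkdir -p "$workdir/maildir/alice/inbox" "$workdir/maildir/bob/sent"
touch "$workdir/maildir/alice/inbox/1." "$workdir/maildir/alice/inbox/2." \
      "$workdir/maildir/bob/sent/1."

# The list is written outside maildir so that it does not list itself.
(cd "$workdir/maildir"; find . -type f) > "$workdir/00-list-all"

listed=$(wc -l "$workdir/00-list-all" | awk '{print $1}')
echo "listed=$listed"
rm -r "$workdir"
```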

&lt;h3 id=&quot;2-create-a-database-and-a-dedicated-configuration-file-for-manitou-mdx&quot;&gt;2) Create a database and a dedicated configuration file for manitou-mdx&lt;/h3&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Run this as a user with enough privileges to create
# a database (generally, postgres should do)
$ manitou-mgr --create-database --db-name=enron
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Create a specific configuration file with some optimizations for mass import:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ cat enron-mdx.conf
[common]
db_connect_string = Dbi:Pg:dbname=enron;user=manitou
update_runtime_info = no
update_addresses_last = no
apply_filters = no
index_words = no
preferred_datetime = sender
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;update_runtime_info&lt;/em&gt; is set to no to avoid needlessly updating
timestamps in the &lt;code class=&quot;highlighter-rouge&quot;&gt;runtime_info&lt;/code&gt; table for every imported message.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;update_addresses_last&lt;/em&gt; set to no will also avoid some unnecessary writes.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;apply_filters&lt;/em&gt; set to no is again a micro-optimization to avoid querying for filters on every message. On the other hand, it should be left to yes if you happen to have defined filters and want them to be used during this import.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;index_words&lt;/em&gt; is &lt;strong&gt;key to performance&lt;/strong&gt;. Running the full-text indexing after the import instead of during it makes it 3x faster. Also the full-text indexing as a separate process can be parallelized (more on that below).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;preferred_datetime&lt;/em&gt; set to &lt;strong&gt;sender&lt;/strong&gt; indicates that the date of a message is given by its header Date field, as opposed to the file creation time.&lt;/p&gt;

&lt;p&gt;If we were importing into a pre-existing manitou-mdx instance running in the background, we would stop it at this point, as several instances of manitou-mdx cannot work on the same database because of caching, except in specific circumstances (also more on that later).&lt;/p&gt;

&lt;h3 id=&quot;3-run-the-actual-import-command&quot;&gt;3) Run the actual import command&lt;/h3&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ cd /data/enron/maildir
$ time manitou-mdx --import-list=../00-list-all --conf=../enron-mdx.conf
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;On a low-end server, it takes about 70 minutes to import the 517401 messages with this configuration and PostgreSQL 9.5.&lt;/p&gt;

&lt;p&gt;We can check with psql that all messages came in:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ psql -d enron -U manitou
psql (9.5.3)
Type &quot;help&quot; for help.

enron=&amp;gt; select count(*) from mail;
 count
--------
 517401
(1 row)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;4-run-the-full-text-indexing&quot;&gt;4) Run the full text indexing&lt;/h3&gt;

&lt;p&gt;As it’s a new database with no preexisting index, we don’t have to worry about existing partitions. We let manitou-mgr index the messages with 4 jobs in parallel:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;
$ time manitou-mgr --conf=enron-mdx.conf  --reindex-full-text --reindex-jobs=4
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Output from time:&lt;/p&gt;

&lt;pre&gt;
real    10m41.855s
user    28m22.744s
sys     1m8.476s
&lt;/pre&gt;

&lt;p&gt;So this part of the process takes about 10 minutes.&lt;/p&gt;
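
&lt;p&gt;The figures reported by &lt;code class=&quot;highlighter-rouge&quot;&gt;time&lt;/code&gt; also give a rough idea of how well the 4 jobs were kept busy: dividing the CPU time (user + sys) by the elapsed time gives the average number of cores in use. This is a back-of-the-envelope estimate, not a measurement from the original run, and it only counts CPU time attributed to the measured process and its children.&lt;/p&gt;

```shell
#!/bin/sh
# CPU time (user + sys) divided by wall-clock time (real), all in seconds.
ratio=$(awk 'BEGIN {
  real = 10*60 + 41.855
  cpu  = (28*60 + 22.744) + (1*60 + 8.476)
  printf "%.2f", cpu / real
}')
echo "average busy cores: $ratio"
```

&lt;p&gt;The result is about 2.76, out of the 4 jobs requested.&lt;/p&gt;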

&lt;h3 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;With manitou-mgr, we can check the final size of the database and its main tables:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ manitou-mgr --conf=enron-mdx.conf --print-size
-----------------------------------
addresses           :    13.52 MB
attachment_contents :     0.02 MB
attachments         :     0.02 MB
body                :   684.98 MB
header              :   402.45 MB
inverted_word_index :  2664.77 MB
mail                :   250.12 MB
mail_addresses      :   441.17 MB
mail_tags           :     0.01 MB
pg_largeobject      :     0.01 MB
raw_mail            :     0.01 MB
words               :   106.52 MB
-----------------------------------
Total database size : 4633 MB
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Future posts will show how it compares to the full mailset (with attachments, 18GB of .pst files), and how to parallelize the main import itself.&lt;/p&gt;</content><author><name></name></author><summary type="html">Importing mail messages en masse works best when fiddling a bit with the configuration, rather than pushing the mail messages into the normal feed.</summary></entry><entry><title type="html">Operators in the search bar</title><link href="http://www.manitou-mail.org/blog/2016/07/operators-in-the-search-bar/" rel="alternate" type="text/html" title="Operators in the search bar" /><published>2016-07-09T18:30:28+02:00</published><updated>2016-07-09T18:30:28+02:00</updated><id>http://www.manitou-mail.org/blog/2016/07/operators-in-the-search-bar</id><content type="html" xml:base="http://www.manitou-mail.org/blog/2016/07/operators-in-the-search-bar/">&lt;p&gt;Until now, the search bar in the user interface did not support query
terms to search on metadata.&lt;/p&gt;

&lt;p&gt;I’m glad to say that commits
&lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-ui/commit/2ddddaae684bb7d4f17f069a6d4a08a803e52f12&quot;&gt;2ddddaae&lt;/a&gt;
and
&lt;a href=&quot;https://github.com/manitou-mail/manitou-mail-ui/commit/a1cbe72a8a9fc38f3f53e6efe8d05378d6482632&quot;&gt;a1cbe72a&lt;/a&gt;
add support for filtering by date and message status right from the
search bar, introducing five operators:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;“date:” must be followed by an ISO 8601 date (format &lt;em&gt;YYYY-MM-DD&lt;/em&gt;),
or by a specific month (format &lt;em&gt;YYYY-MM&lt;/em&gt;), or just a year (&lt;em&gt;YYYY&lt;/em&gt;).
It selects the messages from that day, month, or year respectively.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;“before:” has the same format but selects messages dated
    from this day/month/year or an earlier date.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;“after:” is of course the opposite, selecting messages past
        the date that follows.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;“is:” must be followed by a status among read, replied, forward, archived, sent.
      Criteria can be combined by using the operator several times, as statuses are cumulative, not mutually exclusive.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;“isnot:” is of course the opposite of “is”. It accepts the same arguments  and filters out the messages that have the corresponding status bit.&lt;/p&gt;

    &lt;p&gt;“is:” and “isnot:” can also be combined, for instance: “is:archived isnot:sent”.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
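
&lt;p&gt;The three accepted date formats are easy to tell apart by their shape alone. As an illustration only (this is a hypothetical shell function, not the parser used in the user interface), the granularity of an argument to “date:”, “before:” or “after:” could be classified like this:&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical classifier for the argument of date:, before: and after:.
# It checks the shape only, not whether the month or day is in range.
date_granularity() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) echo day ;;
    [0-9][0-9][0-9][0-9]-[0-9][0-9])            echo month ;;
    [0-9][0-9][0-9][0-9])                       echo year ;;
    *)                                          echo invalid ;;
  esac
}

for arg in 2016-07-09 2016-07 2016 07/09/2016; do
  printf 'date:%s -> %s\n' "$arg" "$(date_granularity "$arg")"
done
```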

&lt;p&gt;A few more search bar operators are likely to be added to that list, as it’s a pretty handy and fast way to express basic queries.&lt;/p&gt;</content><author><name></name></author><summary type="html">Until now, the search bar in the user interface did not support query terms to search on metadata.</summary></entry></feed>