For some reason I wanted to do some mass-managing of my inbox. If it had been in the file system I would have written a shell script - or used Python. But my mail reader focuses on being user friendly for normal users, not on scripting for power users. A wise decision; they can't cover all use cases and preferences for language of choice etc.
With Python as my favorite tool, imaplib seemed like the obvious choice. imaplib is, however, not that intuitive to use - unless you know http://tools.ietf.org/html/rfc3501 by heart...
Making it work was a bit too hard, so here comes the script I came up with - it can easily be customized for other tasks:
import imaplib

# With no arguments IMAP4() connects to localhost on the standard IMAP port;
# pass a host name to reach a remote server
M = imaplib.IMAP4()
print('logging in:', M.login('username', 'password'))
print('selecting folder:', M.select('HomeInbox'))
_type, data = M.search(None, 'ALL')
# or to search for string in field: (None, 'Subject', 'Re:')
message_ids = set()
for num in data[0].split():
    # fetch seems quite obscure - fetching one field at a time to make it "simpler"
    date = M.fetch(num, '(BODY[HEADER.FIELDS (DATE)])')[1][0][1].strip()
    subject = M.fetch(num, '(BODY[HEADER.FIELDS (SUBJECT)])')[1][0][1].strip()
    message_id = M.fetch(num, '(BODY[HEADER.FIELDS (Message-ID)])')[1][0][1].strip()
    if message_id in message_ids:
        # second or later copy of a message we have already seen
        print('deleting', (num, date, subject, message_id))
        #M.store(num, '+FLAGS', '\\Deleted') # uncomment to delete
    else:
        message_ids.add(message_id)
# expunge permanently removes everything flagged \Deleted in the selected folder
M.expunge()
M.close()
M.logout()
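As it is, this script finds all duplicates based on Message-ID and only prints them; uncomment the M.store line to flag them as deleted, so that the final expunge actually removes them.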
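The three fetches per message could probably be collapsed into one, with the standard email module doing the parsing. The following is only a sketch under a few assumptions: the helper names parse_headers and find_duplicates are made up for illustration, and it assumes the server returns the usual single (envelope, raw headers) tuple as the first element of the fetch data. It uses BODY.PEEK rather than BODY, which, per RFC 3501, avoids implicitly setting the \Seen flag on the messages it looks at.

```python
import email


def parse_headers(fetch_data):
    """Return (date, subject, message_id) from an imaplib fetch response.

    Assumes fetch_data looks like [(b'1 (BODY[...] {n}', raw_headers), b')'],
    which is the shape imaplib normally hands back for a single message.
    """
    raw_headers = fetch_data[0][1]
    msg = email.message_from_bytes(raw_headers)
    # Header lookup is case-insensitive; missing headers come back as None
    return msg.get('Date'), msg.get('Subject'), msg.get('Message-ID')


def find_duplicates(messages):
    """Given (num, message_id) pairs, return the nums of repeated Message-IDs.

    The first occurrence of each Message-ID is kept; later ones are reported.
    """
    seen = set()
    duplicates = []
    for num, message_id in messages:
        if message_id in seen:
            duplicates.append(num)
        else:
            seen.add(message_id)
    return duplicates


# Hypothetical use inside the loop of the script above (M is the connection):
#   _type, data = M.fetch(num, '(BODY.PEEK[HEADER.FIELDS (DATE SUBJECT MESSAGE-ID)])')
#   date, subject, message_id = parse_headers(data)
```

Keeping the duplicate detection in a plain function like this also means it can be tried out on canned data without touching a real mailbox.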