With Python as my favorite tool, imaplib seemed like the obvious choice. imaplib is, however, not that intuitive to use unless you know RFC 3501 (http://tools.ietf.org/html/rfc3501) by heart...
Making it work was harder than it should have been, so here is the script I came up with. It can easily be customized for other tasks:
import imaplib

M = imaplib.IMAP4()  # no arguments: connects to the local host on the default IMAP port
print('logging in:', M.login('username', 'password'))
print('selecting folder:', M.select('HomeInbox'))
_type, data = M.search(None, 'ALL')
# or to search for a string in a field: M.search(None, 'Subject', 'Re:')
message_ids = set()
for num in data[0].split():
    # fetch seems quite obscure - fetching one field at a time to make it "simpler"
    date = M.fetch(num, '(BODY[HEADER.FIELDS (DATE)])')[1][0][1].strip()
    subject = M.fetch(num, '(BODY[HEADER.FIELDS (SUBJECT)])')[1][0][1].strip()
    message_id = M.fetch(num, '(BODY[HEADER.FIELDS (Message-ID)])')[1][0][1].strip()
    if message_id in message_ids:
        # this Message-ID was already seen on an earlier message, so this one is a duplicate
        print('deleting', (num, date, subject, message_id))
        #M.store(num, '+FLAGS', '\\Deleted') # uncomment to delete
    else:
        message_ids.add(message_id)
M.expunge()  # permanently removes every message flagged \Deleted
M.close()
M.logout()
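As it stands, the script only reports duplicates, identified by Message-ID. Uncomment the M.store line to actually flag them as deleted so that M.expunge() removes them.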
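One thing to be aware of: per RFC 3501, fetching with BODY[...] implicitly sets the \Seen flag, while BODY.PEEK[...] returns the same data without touching it. The following is a minimal sketch, assuming Python 3, of how the three per-message fetches could be combined into one PEEK request and parsed with the standard email module. The names FETCH_SPEC, parse_headers and fetch_headers are made up for this illustration and are not part of imaplib; the indexing data[0][1] assumes the same response layout the script above relies on.

```python
import email

# BODY.PEEK fetches the headers without setting the \Seen flag
FETCH_SPEC = '(BODY.PEEK[HEADER.FIELDS (DATE SUBJECT MESSAGE-ID)])'


def parse_headers(fetch_data):
    """Return (date, subject, message_id) from the data part of a fetch response.

    fetch_data is the second element of what imaplib's fetch() returns;
    its first item is assumed to be a (envelope, raw_header_bytes) tuple.
    """
    raw = fetch_data[0][1]
    msg = email.message_from_bytes(raw)
    return msg['Date'], msg['Subject'], msg['Message-ID']


def fetch_headers(conn, num):
    """Fetch all three headers of message `num` in one round trip (hypothetical helper)."""
    _type, data = conn.fetch(num, FETCH_SPEC)
    return parse_headers(data)
```

Inside the loop this would be used as date, subject, message_id = fetch_headers(M, num). A header that is missing from a message comes back as None, so messages without a Message-ID would need separate handling before being compared.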
1 comment:
Note that this will mark all messages as read.