Parsing Outlook .msg files with Python
Even though this is an old thread, I hope this information helps someone looking for a solution to exactly what the thread subject says. I strongly recommend the solution by mattgwwalker on GitHub, which requires the OleFileIO_PL module to be installed separately.
I succeeded in extracting the relevant fields from MS Outlook files (.msg) using the msg-extractor utility by Matt Walker.
Prerequisites
pip install extract-msg
Note that you may need to install additional modules. In my case, imapclient was also required:
pip install imapclient
Usage
import extract_msg
f = r'MS_Outlook_file.msg' # Replace with yours
msg = extract_msg.Message(f)
msg_sender = msg.sender
msg_date = msg.date
msg_subj = msg.subject
msg_message = msg.body
print('Sender: {}'.format(msg_sender))
print('Sent On: {}'.format(msg_date))
print('Subject: {}'.format(msg_subj))
print('Body: {}'.format(msg_message))
There are many other goodies in the MsgExtractor utility to explore, but this is a good place to start.
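As a small sketch of how the fields above might be collected for further processing, the helper below gathers them into a dictionary. The names `summarize_msg` and `load_summary` are hypothetical and not part of extract-msg; the only things assumed from the library are `extract_msg.Message` and the `sender`, `date`, `subject` and `body` attributes already used above.

```python
def summarize_msg(msg):
    """Collect the basic fields of a message-like object into a dict.

    `msg` is assumed to expose the attributes used in the snippet above
    (sender, date, subject, body); any that are missing come back as None.
    """
    fields = ("sender", "date", "subject", "body")
    return {name: getattr(msg, name, None) for name in fields}


def load_summary(path):
    """Open a .msg file with extract_msg and summarize it (hypothetical helper)."""
    import extract_msg  # imported lazily so summarize_msg can be used without it

    msg = extract_msg.Message(path)
    return summarize_msg(msg)


# Example usage (replace the path with your own file):
# summary = load_summary(r'MS_Outlook_file.msg')
# print(summary['subject'])
```

Keeping the field collection separate from the file loading means the same helper can be tried on any object with those attributes, which makes it easy to check without a real .msg file at hand.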
Note
I had to comment out lines 3 to 8 within the file C:\Anaconda3\Scripts\ExtractMsg.py:
#"""
#ExtractMsg:
# Extracts emails and attachments saved in Microsoft Outlook's .msg files
#
#https://github.com/mattgwwalker/msg-extractor
#"""
Error message was:
line 3
ExtractMsg:
^
SyntaxError: invalid syntax
After commenting out those lines, the error message disappeared and the code worked just fine.
This works for me:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
msg = outlook.OpenSharedItem(r"C:\test_msg.msg")
print(msg.SenderName)
print(msg.SenderEmailAddress)
print(msg.SentOn)
print(msg.To)
print(msg.CC)
print(msg.BCC)
print(msg.Subject)
print(msg.Body)
count_attachments = msg.Attachments.Count
if count_attachments > 0:
    for item in range(count_attachments):
        # Outlook collections are 1-based, hence item + 1
        print(msg.Attachments.Item(item + 1).Filename)
del outlook, msg
Please refer to the related post regarding methods to access the email addresses, and not just the names (e.g. "John Doe"), from the To, CC and BCC properties.
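One possible approach, sketched here under assumptions rather than taken from that post, is to walk the item's `Recipients` collection instead of reading the `To`, `CC` and `BCC` strings. The function name `recipient_addresses` is hypothetical; `Recipients`, `Count`, `Item`, `Type` and `Address` are members of the Outlook object model, and the numeric values 1, 2 and 3 correspond to To, CC and BCC. For Exchange accounts, `Address` may return an internal Exchange-style address rather than an SMTP one.

```python
# Outlook's OlMailRecipientType values: 1 = To, 2 = CC, 3 = BCC.
RECIPIENT_TYPES = {1: "To", 2: "CC", 3: "BCC"}


def recipient_addresses(msg):
    """Group the recipient addresses of an Outlook mail item by To/CC/BCC.

    `msg` is assumed to be the object returned by OpenSharedItem above.
    """
    grouped = {"To": [], "CC": [], "BCC": []}
    recipients = msg.Recipients
    # Outlook collections are 1-based, hence the range from 1 to Count.
    for index in range(1, recipients.Count + 1):
        recipient = recipients.Item(index)
        label = RECIPIENT_TYPES.get(recipient.Type)
        if label is not None:  # skip other types such as the originator
            grouped[label].append(recipient.Address)
    return grouped


# Example usage with the msg object opened above:
# print(recipient_addresses(msg)["To"])
```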