I wanted to analyze my emails since I joined IBM, and see the flow of messages in and out of my inbox.
Given my preference for Jupyter Notebooks, I built a small notebook for the analysis.
Steps
Open IBM Lotus Notes Rich Client
Open the Notes Database with the View you want to analyze.
Select the View you are interested in, for instance the All Documents view (my inbox is obfuscated on purpose).
Click File > Export
Enter a file name – email.csv
Select Format “Comma Separated Value”
Click Export
Upload the Notebook to your Jupyter server
The notebook describes the flow through my process. If you encounter ValueError: ('Unknown string format:', '12/10/2018 08:34 AM'), you can refer to https://stackoverflow.com/a/8562577/1873438 and clean the exported file with the following command:
iconv -c -f utf-8 -t ascii email.csv > email.csv.clean
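If iconv is not available on your machine, roughly the same cleanup can be sketched in Python. This is a minimal sketch, not part of my notebook: the helper names strip_non_ascii and clean_file are hypothetical, and the demo writes a throwaway sample file instead of touching a real email.csv.

```python
import os
import tempfile


def strip_non_ascii(text):
    """Drop every character that has no ASCII representation."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def clean_file(src, dst):
    """Write an ASCII-only copy of src to dst, line by line."""
    with open(src, encoding="utf-8", errors="ignore", newline="") as fin, \
         open(dst, "w", encoding="ascii", newline="") as fout:
        for line in fin:
            fout.write(strip_non_ascii(line))


# Demo on a temporary file (hypothetical sample row, not real export data).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "email.csv")
    dst = os.path.join(tmp, "email.csv.clean")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write('"Alice","12/10/2018 08:34 AM","caf\u00e9 \u2013 ok"\n')
    clean_file(src, dst)
    with open(dst, encoding="ascii", newline="") as f:
        cleaned = f.read()

print(cleaned)
```

Like iconv -c, this silently discards characters instead of transliterating them, so accented names lose letters in the cleaned file.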
The notebook then breaks the data into year, month, and day columns, and you can peek at the results with df_emailA.head()
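A minimal sketch of that breakdown follows. The column names (sender, date, subject), the missing header row, and the month-first date format are all assumptions; the real columns depend on the Notes view you exported. The in-memory sample stands in for email.csv.clean.

```python
import io

import pandas as pd

# Hypothetical sample standing in for email.csv.clean.
sample_csv = io.StringIO(
    '"Alice","12/10/2018 08:34 AM","Status update"\n'
    '"Bob","12/10/2018 09:15 AM","Re: Status update"\n'
    '"Carol","01/03/2019 04:02 PM","Lunch"\n'
)

# Assumed layout: no header row, three columns.
df_emailA = pd.read_csv(sample_csv, header=None,
                        names=["sender", "date", "subject"])

# Parse with an explicit format (assumed month/day/year, 12-hour clock)
# so pandas does not have to guess how to read the string.
df_emailA["date"] = pd.to_datetime(df_emailA["date"],
                                   format="%m/%d/%Y %I:%M %p")

# Split the timestamp into the columns used for grouping later.
df_emailA["year"] = df_emailA["date"].dt.year
df_emailA["month"] = df_emailA["date"].dt.month
df_emailA["day"] = df_emailA["date"].dt.day

print(df_emailA.head())
```

For a real export, replace sample_csv with the path to the cleaned file and adjust the column names to match your view.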
When you run the final cell, the code generates a Year-Month-Day count as a bar graph.
# Title: Volume of emails per day.
# Plots the number of emails for each known year-month-day.
# Only dates that have at least one email appear in the graph.
# Kind is a bar graph, so that the (year, month, day) labels can be read.
import matplotlib.pyplot as plt

# Set the figure size before plotting; rcParams only affects figures created afterwards.
plt.rcParams['figure.figsize'] = [20, 200]
y_m_df = df_emailA.groupby(['year','month','day']).year.count()
y_m_df.plot(kind="bar")
plt.title('Numbers submitted By YYYY-MM-DD')
# The grouped dates run along the x-axis and the counts along the y-axis.
plt.xlabel('Year-Month-Day')
plt.ylabel('Email Flow')
plt.autoscale(enable=True, axis='both', tight=False)
You’ll see the trend of emails I receive over the years.