I wanted to analyze my emails since I joined IBM and see the flow of messages in and out of my inbox.
Given my preference for Jupyter Notebooks, I built a small notebook for the analysis.
Steps
Open IBM Lotus Notes Rich Client
Open the Notes Database with the View you want to analyze.
![](https://bastide.org/wp-content/uploads/2019/01/image-2.png)
Select the View you are interested in, for instance the ‘All Documents’ view. My inbox in the screenshot is obfuscated on purpose.
![](https://bastide.org/wp-content/uploads/2019/01/image-3.png)
Click File > Export
Enter a file name – email.csv
Select Format “Comma Separated Value”
Click Export
![](https://bastide.org/wp-content/uploads/2019/01/image-4.png)
Upload the Notebook to your Jupyter server
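The notebook describes the flow through my process. If you encounter ValueError: (‘Unknown string format:’, ’12/10/2018 08:34 AM’), you can refer to https://stackoverflow.com/a/8562577/1873438 and clean the export with iconv, whose -c flag drops the characters that cannot be converted to ASCII: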
iconv -c -f utf-8 -t ascii email.csv > email.csv.clean
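If you would rather do the cleanup inside the notebook, here is a minimal sketch of the same idea in Python. It assumes the error comes from stray non-ASCII characters around the timestamp (which is what the iconv command strips) and that the export uses month/day/year order. The helper names strip_non_ascii and parse_notes_date are mine, not part of the notebook.

```python
from datetime import datetime


def strip_non_ascii(text: str) -> str:
    """Drop every character that has no ASCII representation (similar to iconv -c)."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def parse_notes_date(raw: str) -> datetime:
    """Parse a timestamp such as '12/10/2018 08:34 AM'.

    Assumption: the export writes dates as month/day/year.
    """
    cleaned = strip_non_ascii(raw).strip()
    return datetime.strptime(cleaned, "%m/%d/%Y %I:%M %p")


# A value with a hidden zero-width space (U+200B) in front of the date
sample = "\u200b12/10/2018 08:34 AM"
parsed = parse_notes_date(sample)
print(parsed.isoformat())
```

If your Notes client exports dates in day/month/year order, swap %m and %d in the format string.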
You can break the data into a year-month-day analysis with the following, and peek at the results with df_emailA.head().
![](https://bastide.org/wp-content/uploads/2019/01/image-5-1024x297.png)
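As a rough sketch of that step, the snippet below builds the year, month and day columns with pandas. The sample rows and the column names sender, subject and date are hypothetical stand-ins, since the real columns depend on the Notes view you exported. The month/day/year order is also an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for email.csv.clean; the column names are assumptions,
# because the real export depends on the columns of the chosen Notes view.
csv_text = io.StringIO(
    "sender,subject,date\n"
    "alice@example.com,Status,12/10/2018 08:34 AM\n"
    "bob@example.com,Re: Status,12/10/2018 03:15 PM\n"
    "carol@example.com,Agenda,01/02/2019 09:00 AM\n"
)

df_emailA = pd.read_csv(csv_text)

# Parse the timestamp with an explicit format (assumes month/day/year order)
df_emailA["date"] = pd.to_datetime(df_emailA["date"], format="%m/%d/%Y %I:%M %p")

# Split the timestamp into the columns used for grouping
df_emailA["year"] = df_emailA["date"].dt.year
df_emailA["month"] = df_emailA["date"].dt.month
df_emailA["day"] = df_emailA["date"].dt.day

print(df_emailA.head())
```

With a real export, replace the io.StringIO object with the path to email.csv.clean and adjust the column names to match your view.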
When you run the final cell, the code generates a Year-Month-Day count as a bar graph.
# Title: Volume of emails per day.
# Plots the volume for each known year-month-day;
# a date only appears in the chart if it has data.
# Kind is a bar graph, so that each (year, month, day) label can be read on the x-axis.
import matplotlib.pyplot as plt

y_m_df = df_emailA.groupby(['year','month','day']).year.count()
y_m_df.plot(kind="bar")
plt.title('Numbers submitted By YYYY-MM-DD')
# The grouped dates are on the x-axis and the counts are on the y-axis
plt.xlabel('Year-Month-Day')
plt.ylabel('Email Flow')
plt.autoscale(enable=True, axis='both', tight=False)
# figure.figsize only applies to figures created after this line; re-run the cell to pick it up
plt.rcParams['figure.figsize'] = [20, 200]
You’ll see the trend of emails I receive over the years.