Fun with Patent Data: Thomas Edison Jupyter Notebook

Thomas Alva Edison (1847-1931) was a famous American inventor and businessman, “described as America’s greatest inventor”, and one of the most prolific inventors in US history. He is credited with 1084 patents.[1] He’s just one cool inventor: lamps, light bulbs, the phonograph and so many more life-changing inventions.

Google Patents has a wonderful depth of patent history, and the history is searchable with custom search strings:

  • inventor:(Thomas Edison) before:priority:19310101
  • inventor:(Paul R Bastide) after:priority:2009-01-01

Google provides a seriously cool feature: a downloadable CSV of the search results. Pandas anyone? The content is provided under an agreement between the USPTO and Google, and Google also offers it as part of the Google APIs/Platform. The data is fundamentally public, and Google has made it very accessible with some GitHub examples. [2] The older patent data is more difficult to search, as the text was extracted using Optical Character Recognition.
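For the pandas-inclined, here is a minimal sketch of loading such an export. It assumes the CSV begins with one line recording the search URL, followed by a header row, and that the columns include `id`, `title`, `inventor/author` and `priority date`; check the downloaded file, since the layout may differ. The rows below are made-up placeholders, not real patent records, and `load_patents` is a hypothetical helper name.

```python
import io

import pandas as pd

# Stand-in for the downloaded file; layout and rows are illustrative assumptions.
SAMPLE_CSV = """search URL:,https://patents.google.com/?inventor=Thomas+Edison
id,title,inventor/author,priority date
US-0000001-A,Placeholder lamp,"Thomas A. Edison",1880-01-01
US-0000002-A,Placeholder phonograph,"Thomas A. Edison, Jane Doe",1878-06-15
"""


def load_patents(source):
    """Read a patent export, skipping the leading search-URL line."""
    df = pd.read_csv(source, skiprows=1, parse_dates=["priority date"])
    # Decade buckets make it easy to see when the inventor was most active.
    df["decade"] = (df["priority date"].dt.year // 10) * 10
    return df


patents = load_patents(io.StringIO(SAMPLE_CSV))
print(patents[["id", "decade"]])
```

To use it on a real download, pass the path of the CSV file instead of the `io.StringIO` wrapper.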

I have found the intersection of three things I am very interested in: History, Inventing and Data Science. Time to see what cool things the Edison data reveals.

Steps

To start playing with the data, one must install Jupyter.

python3 -m pip install --upgrade pip
python3 -m pip install jupyter

Launch Jupyter and navigate to http://localhost:8888/tree

jupyter notebook

Load and Launch the notebook

  1. Download the Edison.ipynb.zip
  2. Unzip the Edison.ipynb.zip
  3. Upload the Edison.ipynb to Jupyter
  4. Launch the Edison notebook and follow along with the cells.

The notebook renders some interesting insights using numpy, pandas, matplotlib and scipy. It includes a cell to install the Python libraries; once one executes that prerequisites cell, everything is loaded.
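As a rough illustration of what a prerequisites check can look like (this is not the notebook's actual cell, and `missing_packages` is a hypothetical helper), the sketch below reports which of the four libraries cannot be imported in the current environment.

```python
import importlib.util


def missing_packages(names):
    """Return the packages from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["numpy", "pandas", "matplotlib", "scipy"]
missing = missing_packages(required)
if missing:
    # In a notebook cell one would typically run: %pip install <names>
    print("Install before continuing:", " ".join(missing))
else:
    print("All prerequisites are available.")
```

Checking before installing keeps the cell quick to re-run once everything is already in place.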

The Jupyter notebook loads the data using an input cell. Once it has run, the analytics enable me to see the number of co-inventors (but one needs to cleanse the data first).
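The cleansing step can be sketched in plain Python so the logic is easy to follow. This assumes each record carries a comma-separated inventor field and that the noise is limited to stray periods, spacing and capitalization; the names below are made up, and `clean_name` and `co_inventor_counts` are hypothetical helpers rather than the notebook's own code.

```python
from collections import Counter


def clean_name(raw):
    """Collapse whitespace, drop periods, and title-case a name."""
    return " ".join(raw.replace(".", " ").split()).title()


def co_inventor_counts(inventor_fields, primary):
    """Count how often each co-inventor appears alongside `primary`."""
    counts = Counter()
    for field in inventor_fields:
        # A set ensures a name repeated within one record is counted once.
        names = {clean_name(n) for n in field.split(",") if n.strip()}
        names.discard(clean_name(primary))
        counts.update(names)
    return counts


# Made-up inventor fields showing the kind of noise OCR can leave behind.
fields = [
    "Thomas A. Edison",
    "THOMAS A EDISON,  Jane Doe",
    "Thomas A. Edison, jane doe, John Roe",
]
print(co_inventor_counts(fields, "Thomas A. Edison").most_common())
```

Title-casing is a blunt instrument (it turns "McDonald" into "Mcdonald"), so real data would likely need a few hand-written corrections on top.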

One notices that Thomas Alva is not listed as an inventor in those results; as such, one needs to modify the notebook to use the API with more recent inventors. With the comprehensive APIs from the USPTO, one can extract patent data through any of a number of JSON REST APIs. [3][4] Kudos to the USPTO for really opening up the data and the API.
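To give a feel for what such a JSON REST query looks like, here is a sketch that only builds a request URL and does not send it. The endpoint, the `q` and `f` parameters, and the field names (`inventor_last_name`, `patent_date` and so on) are assumptions modeled on the query style PatentsView [4] has documented; the service may have changed since, so verify everything against the current API documentation before relying on it. `build_query_url` is a hypothetical helper.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint; verify against the current API documentation.
BASE_URL = "https://api.patentsview.org/patents/query"


def build_query_url(last_name, since, fields):
    """Compose a GET URL with JSON-encoded search criteria and field list."""
    criteria = {
        "_and": [
            {"inventor_last_name": last_name},
            {"_gte": {"patent_date": since}},  # assumed field: grant date
        ]
    }
    params = {"q": json.dumps(criteria), "f": json.dumps(fields)}
    return BASE_URL + "?" + urlencode(params)


url = build_query_url(
    "Bastide", "2009-01-01", ["patent_number", "patent_title", "patent_date"]
)
print(url)
```

The resulting URL could then be fetched with any HTTP client and the JSON response loaded into pandas for the same kind of analysis.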

Conclusion

All in, the APIs, Python, Jupyter Notebook and analysis are for fun, and provide insight into the patent data of one focused individual: Thomas Edison.

References

[1] Prolific Inventors https://en.wikipedia.org/wiki/List_of_prolific_inventors (the number appears to conflict with https://en.wikipedia.org/wiki/List_of_Edison_patents, which reports 1093, inclusive of design patents)
[2] Google / USPTO Patent Data https://www.google.com/googlebooks/uspto-patents.html
[3] USPTO Open Data https://developer.uspto.gov/about-open-data and https://developer.uspto.gov/api-catalog
[4] PatentsView http://www.patentsview.org/api/faqs.html
