Machine learning and natural language processing on the patent corpus: Data, tools, and new measures