The Electrolyte Genome project: A big data approach in battery materials discovery
We present a high-throughput infrastructure for the automated calculation of molecular properties with a focus on battery electrolytes. The infrastructure is largely open-source and handles both practical aspects (input file generation, output file parsing, and information management) as well as more complex problems (structure matching, salt complex generation, and failure recovery). Using this infrastructure, we have computed the ionization potential (IP) and electron affinities (EA) of 4830 molecules relevant to battery electrolytes (encompassing almost 55,000 quantum mechanics calculations) at the B3LYP/ 6-31+G⁄ level. We describe automated workflows for computing redox potential, dissociation constant, and salt-molecule binding complex structure generation. We present routines for automatic recovery from calculation errors, which brings the failure rate from 9.2% to 0.8% for the QChem DFT code. Automated algorithms to check duplication between two arbitrary molecules and structures are described. We present benchmark data on basis sets and functionals on the G2-97 test set; one finding is that a IP/EA calculation method that combines PBE geometry optimization and B3LYP energy evaluation requires less computational cost and yields nearly identical results as compared to a full B3LYP calculation, and could be suitable for the calculation of large molecules. Our data indicates that among the 8 functionals tested, XYGJ-OS and B3LYP are the two best functionals to predict IP/EA with an RMSE of 0.12 and 0.27 eV, respectively. Application of our automated workflow to a large set of quinoxaline derivative molecules shows that functional group effect and substitution position effect can be separated for IP/EA of quinoxaline derivatives, and the most sensitive position is different for IP and EA.