MIT: New computational chemistry techniques accelerate the prediction of molecules and materials

Back in the old days — the really old days — the task of designing materials was laborious. Investigators, over the course of 1,000-plus years, tried to make gold by combining things like lead, mercury, and sulfur, mixed in what they hoped would be just the right proportions. Even famous scientists like Tycho Brahe, Robert Boyle, and Isaac Newton tried their hands at the fruitless endeavor we call alchemy.

Materials science has, of course, come a long way. For the past 150 years, researchers have had the benefit of the periodic table of elements to draw upon, which tells them that different elements have different properties, and one can’t magically transform into another. Moreover, in the past decade or so, machine learning tools have considerably boosted our capacity to determine the structure and physical properties of various molecules and substances. New research by a group led by Ju Li — the Tokyo Electric Power Company Professor of Nuclear Engineering at MIT and professor of materials science and engineering — offers the promise of a major leap in capabilities that can facilitate materials design. The results of their investigation are reported in a December 2024 issue of Nature Computational Science.

At present, most of the machine-learning models that are used to characterize molecular systems are based on density functional theory (DFT), which offers a quantum mechanical approach to determining the total energy of a molecule or crystal by looking at the electron density distribution — which is, basically, the average number of electrons located in a unit volume around each given point in space near the molecule. (Walter Kohn, who co-invented this theory 60 years ago, received a Nobel Prize in Chemistry for it in 1998.) While the method has been very successful, it has some drawbacks, according to Li: “First, the accuracy is not uniformly great. And, second, it only tells you one thing: the lowest total energy of the molecular system.”
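In symbols, and using standard textbook notation rather than anything taken from the article, the electron density can be written as ρ(r) and the total number of electrons as N. The density then integrates to N over all space, and DFT obtains the ground-state energy E_0 by minimizing an energy functional E[ρ] over admissible densities:

```latex
\int \rho(\mathbf{r})\, d^{3}r = N,
\qquad
E_0 = \min_{\rho} E[\rho]
```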

“Couples therapy” to the rescue

Li's team is now relying on a different computational chemistry technique, also derived from quantum mechanics, known as coupled-cluster theory, or CCSD(T). “This is the gold standard of quantum chemistry,” Li comments. The results of CCSD(T) calculations are much more accurate than what you get from DFT calculations, and they can be as trustworthy as those currently obtainable from experiments. The problem is that carrying out these calculations on a computer is very slow, he says, “and the scaling is bad: If you double the number of electrons in the system, the computations become 100 times more expensive.” For that reason, CCSD(T) calculations have normally been limited to molecules with a small number of atoms, on the order of 10. Anything much beyond that would simply take too long.
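The "100 times" figure is consistent with the textbook formal scaling of CCSD(T), whose cost grows roughly as the seventh power of system size. The short sketch below works through that arithmetic. The exponents (7 for CCSD(T), 3 for conventional DFT) are standard textbook values assumed here, not numbers stated in the article, and the function name is hypothetical.

```python
# Assumed formal scaling exponents (textbook values, not from the article).
CCSD_T_EXPONENT = 7  # CCSD(T) cost grows roughly as N**7
DFT_EXPONENT = 3     # conventional DFT cost grows roughly as N**3


def cost_ratio(size_factor: float, exponent: int) -> float:
    """Return how many times more expensive a calculation becomes
    when the system size is multiplied by `size_factor`."""
    return size_factor ** exponent


if __name__ == "__main__":
    for factor in (2, 4, 10):
        ccsdt = cost_ratio(factor, CCSD_T_EXPONENT)
        dft = cost_ratio(factor, DFT_EXPONENT)
        print(f"size x{factor}: CCSD(T) cost x{ccsdt:,.0f}, DFT cost x{dft:,.0f}")
```

Under these assumptions, doubling the system multiplies the CCSD(T) cost by 2^7 = 128, the same order of magnitude as the factor Li quotes, while the DFT cost rises only eightfold.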

That’s where machine learning comes in. CCSD(T) calculations are first performed on conventional computers, and the results are then used to train a neural network with a novel architecture specially devised by Li and his colleagues. After training, the neural network can perform these same calculations much faster by taking advantage of approximation techniques. What’s more, their neural network model can extract much more information about a molecule than just its energy. “In previous work, people have used multiple different models to assess different properties,” says Hao Tang, an MIT PhD student in materials science and engineering. “Here we use just one model to evaluate all of these properties, which is why we call it a ‘multi-task’ approach.”
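The general idea of a "multi-task" model, one shared representation feeding several property predictions, can be illustrated with a toy NumPy sketch. This is not the architecture devised by Li's team, which the article does not describe in detail. The descriptor size, layer width, property names, and output shapes below are all hypothetical, and the weights are random rather than trained on CCSD(T) data.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # hypothetical size of a per-molecule descriptor
N_HIDDEN = 16    # hypothetical width of the shared layer

# Shared weights: every property is predicted from the same hidden layer.
W_shared = rng.normal(scale=0.1, size=(N_FEATURES, N_HIDDEN))

# One small linear "head" per property; output sizes are illustrative.
heads = {
    "energy": rng.normal(scale=0.1, size=(N_HIDDEN, 1)),
    "dipole": rng.normal(scale=0.1, size=(N_HIDDEN, 3)),
    "excitation_gap": rng.normal(scale=0.1, size=(N_HIDDEN, 1)),
}


def predict(features):
    """Map descriptors of shape (n_molecules, N_FEATURES) to all properties."""
    hidden = np.tanh(features @ W_shared)  # shared representation
    return {name: hidden @ W for name, W in heads.items()}


def multitask_loss(predictions, targets, weights):
    """Weighted sum of the per-property mean squared errors."""
    return sum(
        weights[name] * np.mean((predictions[name] - targets[name]) ** 2)
        for name in predictions
    )


# Random stand-in descriptors for five hypothetical molecules.
features = rng.normal(size=(5, N_FEATURES))
predictions = predict(features)
```

In a real training loop, the targets would be reference values such as those from CCSD(T) calculations, and minimizing the single combined loss would update the shared weights and all heads together, instead of fitting a separate model per property.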

The “Multi-task Electronic Hamiltonian network,” or MEHnet, sheds light on a number of electronic properties, such as the dipole and quadrupole moments, electronic polarizability, and the optical excitation gap — the amount of energy needed to take an electron from the ground state to the lowest excited state. “The excitation gap affects the optical properties of materials,” Tang explains, “because it determines the frequency of light that can be absorbed by a molecule.”
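Tang's point about frequency follows from the Planck relation E = hν: a photon must carry at least the gap energy to be absorbed, which sets a threshold frequency and a corresponding wavelength. The sketch below converts a gap in electronvolts to both. The 3 eV example value is hypothetical and is not a result from the study.

```python
# Exact SI values of the physical constants.
PLANCK_J_S = 6.62607015e-34          # Planck constant, in J*s
SPEED_OF_LIGHT_M_S = 299_792_458.0   # speed of light, in m/s
JOULES_PER_EV = 1.602176634e-19      # one electronvolt, in joules


def gap_to_frequency_hz(gap_ev: float) -> float:
    """Frequency of a photon whose energy equals the gap (nu = E / h)."""
    return gap_ev * JOULES_PER_EV / PLANCK_J_S


def gap_to_wavelength_nm(gap_ev: float) -> float:
    """Wavelength of that photon in nanometers (lambda = c / nu)."""
    return SPEED_OF_LIGHT_M_S / gap_to_frequency_hz(gap_ev) * 1e9


if __name__ == "__main__":
    example_gap_ev = 3.0  # hypothetical gap, chosen only for illustration
    print(f"{gap_to_frequency_hz(example_gap_ev):.3e} Hz")
    print(f"{gap_to_wavelength_nm(example_gap_ev):.1f} nm")
```

A 3 eV gap corresponds to a wavelength of about 413 nm, at the violet end of the visible spectrum. Larger gaps push the absorption threshold into the ultraviolet.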