Accelerating cathode optimization shortens development cycles for high‑energy lithium‑ion batteries, a critical bottleneck for electric‑vehicle and grid‑storage markets. The method also reduces reliance on scarce elements, lowering supply‑chain risk.
Battery research has long been hampered by the sheer size of compositional space. Traditional trial‑and‑error methods rely on chemists’ intuition, which can only sample a few dozen candidates while millions remain unexplored. High‑voltage cathodes such as LiCoPO₄ promise greater energy density, yet their poor conductivity and stability have stalled commercial adoption. By framing materials discovery as a data‑driven optimization problem, AI can systematically navigate these vast landscapes, turning a combinatorial nightmare into a tractable search.
The McGill‑Mila team combined a set‑transformer neural network, pre‑trained on 100,000 inorganic compounds, with a multi‑task Gaussian process to predict four key electrochemical metrics simultaneously. An active‑learning loop evaluated all 14.2 million possible triple‑doped formulations in under 20 minutes on a single GPU, then selected the most promising 63 candidates for robotic synthesis. Within three rounds and fewer than 200 physical experiments, the workflow uncovered compositions achieving a figure of merit of 5.1, representing a fivefold gain over the undoped baseline and delivering near‑theoretical capacity with dramatically lower overpotential.
The implications extend beyond a single material system. Rapid, low‑cost screening reduces time‑to‑market for next‑generation lithium‑ion batteries, a decisive advantage for electric‑vehicle manufacturers and grid‑scale storage providers. Moreover, the AI’s shift away from indium toward more abundant dopants like chromium and niobium mitigates supply‑chain vulnerabilities. As the surrogate model is agnostic to the host material, the same closed‑loop architecture can be deployed for solid‑state electrolytes, anode alloys, or entirely new chemistries, heralding a new era where computational insight, not chemical intuition alone, drives breakthroughs in energy storage.
Feb 20 2026 · A machine‑learning loop searched 14 million battery cathode compositions and found a fivefold composite performance gain across four metrics using fewer than 200 experiments. · Michael Berger
Nanowerk Spotlight – Among millions of untested chemical compositions, there almost certainly exists a battery cathode that charges faster, lasts longer, and wastes less energy than any currently known. The problem is finding it. A chemist armed with deep expertise and sound intuition might pick a few dozen promising candidates from that vast landscape, synthesize them, test them, and learn something useful. But “something useful” is a far cry from “optimal,” and the gap between the two has become one of the defining frustrations of modern battery research.
The cathode, the electrode that stores and releases lithium ions during each charge‑discharge cycle, is where much of the action happens in a lithium‑ion cell. It is also where the most stubborn performance trade‑offs live. Boost a cathode’s energy density and its cycle life often suffers. Reduce the voltage penalty during operation and the material may become structurally unstable.
Years of incremental progress, from the original lithium cobalt oxide commercialized by Sony in 1991 to today’s nickel‑rich layered oxides, have been driven largely by trial, error, and chemical intuition. Doping, the deliberate substitution of small amounts of one element for another in the crystal lattice, is one of the most powerful levers available. But when researchers combine two or three dopants at varying concentrations, the number of possible formulations becomes staggering, and intuition alone cannot chart a course through millions of candidates.
A collaboration between McGill University, Mila‑Quebec AI Institute, and Université de Montréal has now shown that a machine can. In a study published in Advanced Materials (“Navigating Ternary Doping in Li‑ion Cathodes With Closed‑Loop Multi‑Objective Bayesian Optimization”), the team describes a closed‑loop system that couples robotic high‑throughput synthesis with multi‑objective machine learning to search a space of approximately 14.2 million unique triple‑doped variants of lithium cobalt phosphate (LiCoPO₄), a high‑voltage cathode material.
That number arises from the combinatorics of choosing three dopants from a pool of 56 candidate elements, each at eight possible concentration levels. Where previous work on the same material managed to optimize only one electrochemical property at a time, this approach simultaneously improved four.
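That figure is straightforward to reproduce. A minimal sanity check in Python, using only the counts quoted above (56 candidate elements, three distinct dopants, eight concentration levels each):

```python
from math import comb

# Choose 3 distinct dopants from 56 candidate elements, then assign each
# of the three an independent concentration level out of 8 possibilities.
dopant_triples = comb(56, 3)         # 27,720 element combinations
level_combinations = 8 ** 3          # 512 concentration assignments per triple
print(dopant_triples * level_combinations)  # 14,192,640, i.e. roughly 14.2 million
```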
Workflow used in this project, in which experimental electrochemical data is supplemented and guided by a machine‑learning loop. (Image: Reproduced from DOI:10.1002/adma.202519790, CC BY)
LiCoPO₄ operates at a high voltage, which translates directly to high energy density, a desirable trait for applications where weight and volume matter. But the material has well‑documented shortcomings. Poor electronic conductivity prevents it from delivering its full theoretical capacity. Side reactions during the first charge cycle consume lithium irreversibly. The gap between charging and discharging voltages, known as overpotential, is large. And the material degrades with repeated cycling.
In earlier work (Advanced Energy Materials, “Accelerated Development of High Voltage Li‑Ion Cathodes”), the same research group tested 47 individual dopants and identified indium as especially effective at boosting conductivity. They then paired indium with molybdenum, producing a codoped cathode that delivered 160 mAh g⁻¹ of capacity, near the theoretical ceiling of 167 mAh g⁻¹, and retained 76 % of that capacity over ten cycles. Both numbers represented major gains over the undoped material, which managed about 100 mAh g⁻¹ and 50 % retention. But even that optimized composition failed to improve all four performance metrics: its irreversible capacity actually worsened compared to the baseline.
The new study set out to overcome this limitation by searching a much larger composition space. The team allowed triple doping with elements drawn from the full pool of 56 candidates, each at eight concentration levels from 0.01 to 0.08 per formula unit, yielding roughly 14.2 million unique compositions.
To search this space efficiently, the researchers built a surrogate model with two coupled components. The first is a set transformer, a neural‑network architecture designed to process unordered collections of inputs and produce the same output regardless of the order in which the inputs are presented. Each triple‑doped composition is fed to this network as three element–concentration pairs.
The transformer was pretrained on approximately 100,000 inorganic compounds from the publicly available Materials Project database, where it learned general relationships between elemental makeup and electronic structure. The second component is a multi‑task Gaussian process, a statistical framework that predicts not just a single best estimate but also a measure of confidence in that estimate. This Gaussian process takes the transformer’s output and simultaneously predicts all four battery metrics.
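The published surrogate is more elaborate than can be reproduced here, but its key design property, permutation invariance over the dopant set, is easy to illustrate. The toy sketch below uses simple sum pooling in place of the paper’s set transformer, with made-up layer sizes and no Gaussian-process head, purely to show why the three element–concentration pairs can be presented in any order:

```python
import torch
import torch.nn as nn

class ToySetEncoder(nn.Module):
    """Illustrative permutation-invariant encoder for (element, concentration) pairs.

    The published model is a pretrained set transformer; this sketch uses plain
    sum pooling (Deep Sets style) with hypothetical dimensions, only to show
    why dopant ordering cannot affect the output.
    """
    def __init__(self, n_elements=56, d_embed=32, d_out=64):
        super().__init__()
        self.element_embed = nn.Embedding(n_elements, d_embed)
        self.per_dopant = nn.Sequential(
            nn.Linear(d_embed + 1, 64), nn.ReLU(), nn.Linear(64, d_out)
        )

    def forward(self, element_ids, concentrations):
        # element_ids: (batch, 3) integer indices; concentrations: (batch, 3) floats
        x = torch.cat(
            [self.element_embed(element_ids), concentrations.unsqueeze(-1)], dim=-1
        )
        # Summing over the dopant dimension makes the representation order-free.
        return self.per_dopant(x).sum(dim=1)

encoder = ToySetEncoder()
ids = torch.tensor([[4, 12, 30]])          # three arbitrary dopant indices
conc = torch.tensor([[0.02, 0.05, 0.01]])  # per-formula-unit concentrations
perm = torch.tensor([2, 0, 1])             # same dopants, different order
same = torch.allclose(encoder(ids, conc), encoder(ids[:, perm], conc[:, perm]), atol=1e-6)
print(same)  # True: the encoding does not depend on the order of the pairs
```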
The team then ran three rounds of active learning. They began with 222 samples of single‑ and double‑doped data from their prior studies, plus 125 randomly chosen triple‑doped compositions. In each round, the model scored all 14 million candidates using an acquisition function that balanced expected performance against prediction uncertainty, then flagged the top 63 compositions for synthesis and testing. A liquid‑handling robot dispensed precursor solutions into 64 small crucibles per batch. The samples were processed through sol‑gel synthesis and sintered at 850 °C.
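The article describes the acquisition step only at that level of detail, so the sketch below stands in an upper-confidence-bound rule, a common way to trade predicted performance against uncertainty. The batch size of 63 comes from the study; the weighting factor kappa and the toy predictions are illustrative assumptions:

```python
import numpy as np

def select_batch(pred_mean, pred_std, batch_size=63, kappa=2.0):
    """Pick the candidates with the best mean-plus-uncertainty score.

    UCB-style scoring is an assumption; the study only states that the
    acquisition function balances expected performance against uncertainty.
    """
    score = pred_mean + kappa * pred_std          # reward good and unexplored regions
    return np.argsort(score)[::-1][:batch_size]   # indices of the top candidates

# Toy demonstration on a small random slice of the composition space;
# in the study the surrogate scored all ~14.2 million candidates each round.
rng = np.random.default_rng(0)
pred_mean = rng.normal(loc=1.0, scale=0.4, size=100_000)  # predicted figure of merit
pred_std = rng.uniform(0.05, 0.5, size=100_000)           # predictive uncertainty
batch = select_batch(pred_mean, pred_std)
print(batch.shape)  # (63,) compositions flagged for robotic synthesis
```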
To compare materials across batches, the team defined a figure of merit that multiplies the performance ratios of all four metrics relative to the undoped baseline. By construction, an undoped sample scores 1.0. Their best previously published material scored 1.3. For perspective, a hypothetical 50 % improvement in every metric would yield a score of about 5.
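In code, that composite score is just a product of ratios. The short sketch below assumes that, for metrics where lower is better (irreversible capacity and overpotential), the ratio is inverted so an improvement always pushes the score above one; the metric names are paraphrased and the exact convention in the paper may differ:

```python
def figure_of_merit(sample, baseline):
    """Product of per-metric improvement ratios relative to the undoped baseline."""
    higher_is_better = ("capacity", "retention")
    lower_is_better = ("irreversible_capacity", "overpotential")
    fom = 1.0
    for m in higher_is_better:
        fom *= sample[m] / baseline[m]
    for m in lower_is_better:
        fom *= baseline[m] / sample[m]   # invert so reductions count as gains
    return fom

# Sanity check of the rule of thumb quoted above: 50 % better on every metric
print(1.5 ** 4)  # 5.0625, i.e. "a score of about 5"
```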
The first prediction round delivered immediate results. One composition, a combination of aluminum, cesium, and indium at specific concentrations, achieved a figure of merit of 5.1. Its capacity retention reached 98 % over five cycles, its irreversible capacity fell from 73 to 39 mAh g⁻¹, and its overpotential dropped from 0.713 V to 0.445 V.
Across successive rounds, the share of predicted samples that beat the undoped material on all four metrics rose from 8.8 % among random compositions to 44.4 % in the final round. The average figure of merit climbed from 1.05 for random samples to 2.13 in the last prediction round, while the variance among top performers shrank, indicating that the model was homing in on reliably good compositions rather than getting lucky.
The algorithm also shifted the palette of useful dopants. The prior dataset leaned heavily on indium, since no other single element had proved as effective at overcoming the conductivity bottleneck. But the active‑learning loop increasingly favored chromium, niobium, scandium, and phosphorus, while reducing its selection of indium.
Four of the highest‑performing materials in the study contained no indium at all. This diversification carries practical significance: as battery production scales globally, dependence on any single element creates supply‑chain risk.
X‑ray diffraction patterns confirmed that most top‑performing compositions were nearly single‑phase, meaning the dopants had integrated into the host crystal structure rather than forming separate, electrochemically inactive compounds. This happened despite the model receiving no structural information whatsoever; it was trained exclusively on composition and battery performance data.
The one notable exception, a sample appearing to show extremely low overpotential, turned out on closer inspection to have an anomalous feature in its electrochemistry. Human review of the raw data immediately identified it as misleading, a reminder that automated optimization benefits from expert oversight.
The full enumeration of all 14 million candidates finished in under 20 minutes on a single GPU. With fewer than 200 experimentally tested compositions, the system identified materials representing a fivefold composite improvement over the undoped baseline. Because the surrogate model accepts any number of target properties and is not tied to a specific host material, the workflow is designed to generalize well beyond LiCoPO₄. When a single GPU can score 14 million candidates in 20 minutes and fewer than 200 experiments can identify a fivefold improvement, the bottleneck in battery materials discovery starts to look less like chemistry and more like the decision of where to search next.
Michael Berger – author of four books published by the Royal Society of Chemistry.