pKa, the negative base-10 logarithm of the acid dissociation constant (Ka), is a fundamental molecular property. The pKa of a molecule or functional group determines whether it will be protonated or deprotonated at a given pH, and thus what the overall charge and reactivity of the system will be. pKa can also be used to predict solubility, membrane permeability, and a variety of other properties through linear free-energy relationships.
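To make the link between pKa and protonation state concrete, here is a minimal sketch based on the Henderson-Hasselbalch relationship for a single acidic site, HA ⇌ H+ + A-. The function name and the example values are illustrative assumptions, and the sketch ignores multiple interacting sites and non-ideal solution behavior.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid HA present as A- at the given pH.

    Uses the Henderson-Hasselbalch relationship:
        [A-] / [HA] = 10 ** (pH - pKa)
    Assumes a single, independent ionizable site in ideal dilute solution.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # When pH equals pKa, the acid is exactly half deprotonated.
    print(fraction_deprotonated(pka=5.0, ph=5.0))  # 0.5

    # Illustrative case: an acid with pKa 4.76 (roughly acetic acid) at pH 7.4.
    print(f"{fraction_deprotonated(pka=4.76, ph=7.4):.4f}")
```

Each unit of difference between pH and pKa shifts the ratio of the two forms by a factor of ten, which is why even a modest error in pKa can change the predicted dominant charge state.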
In the laboratory, titration experiments can be used to determine the pKa values of a given compound. In general, this is the most accurate way to determine pKa values. Unfortunately, these experiments are also time-consuming and require specialized equipment and a sample of the compound in question (which must be purchased or synthesized).
For these reasons, it's not practical to experimentally measure pKa values for lots of potential compounds. Instead, pKa values can sometimes be found in reference tables, like the Evans pKa table. For simple molecules, it's often possible to find an exact match in a reference table—but for complex molecules, scientists have to try to find the closest match and then guess how the pKa will change between the reference compound and the actual compound. Since pKa values are so sensitive to substitution and electronic effects, this can lead to surprisingly large errors.
Computational chemistry allows scientists to estimate the pKa of molecules without relying on a database of reference compounds. The difference in energy between an acid and its conjugate base can be computed using various theoretical methods, and this value can be converted into a predicted pKa value. While these predictions are not perfect, they're often quite close to experimental values, and can be used to make data-driven decisions about structure–activity relationships or to inform downstream synthesis.
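As a rough illustration of how an energy difference becomes a pKa, the sketch below applies the textbook thermodynamic relation pKa = ΔG / (RT ln 10), where ΔG is the aqueous free energy of deprotonation. This is an assumption made for illustration and not a description of Rowan's workflow: in practice the free energy of the solvated proton is hard to compute directly, so many methods apply an empirical correction or a linear fit against experimental data. The function and constant names are hypothetical.

```python
import math

# Gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 8.314462618e-3


def free_energy_per_pka_unit(temperature_k: float = 298.15) -> float:
    """Free energy (kJ/mol) corresponding to one pKa unit: RT * ln(10)."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(10.0)


def pka_from_free_energy(delta_g_kj_per_mol: float,
                         temperature_k: float = 298.15) -> float:
    """Convert an aqueous deprotonation free energy into a pKa.

    Assumes delta_g_kj_per_mol is the full free energy change for
    HA -> H+ + A- in water, including the solvated proton. Real
    workflows usually need an empirical correction on top of this.
    """
    return delta_g_kj_per_mol / free_energy_per_pka_unit(temperature_k)


if __name__ == "__main__":
    print(f"{free_energy_per_pka_unit():.3f} kJ/mol per pKa unit")
    print(f"pKa for a 30 kJ/mol deprotonation: {pka_from_free_energy(30.0):.2f}")
```

At 298.15 K, one pKa unit corresponds to about 5.7 kJ/mol (roughly 1.36 kcal/mol), so an error of only a few kJ/mol in the computed energy difference shifts the predicted pKa noticeably.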
However, developing a robust algorithm that can predict pKa from molecular structure is more challenging than it seems: the most accurate quantum chemical methods can take days, weeks, or months to give results, and faster machine learning–based methods suffer from limited accuracy on unseen classes of compounds. (The complexity of pKa prediction is likely why popular cheminformatics packages like RDKit intentionally avoid pKa prediction.)
Rowan uses physics-based machine learning to vastly accelerate the prediction of pKa values. We employ neural network potentials as a low-cost way to run physics-based calculations, thus mimicking the accuracy of quantum mechanics-based modeling for a tiny fraction of the cost. We've tested the accuracy of Rowan's pKa prediction workflow on a variety of challenging benchmark sets and studied the performance: see our preprint for a full assessment.
Rowan's free online pKa prediction workflow can generate pKa values for each functional group on a molecule, starting from a 3D structure or a SMILES string. Rowan also makes it easy to visualize the geometry of the neutral molecule and each conjugate acid and base produced by protonation or deprotonation events, allowing scientists to understand the effect of structure on acidity and basicity.
Create a free account and try Rowan's pKa predictor within minutes!