Screening Conformer Ensembles with PRISM Pruner

by Nicolò Tampellini · Nov 25, 2025

This is a guest post by Nicolò Tampellini, the author of the PRISM Pruner conformer screening package we use here at Rowan. Nicolò is currently a Ph.D. student in Scott Miller's lab at Yale University, and has worked on the computational modeling of multiple conformationally complex reactions in the context of asymmetric catalysis.

Many properties in computational chemistry are obtained from conformational ensembles: sets of many spatial arrangements of the same molecule (or aggregate) that are processed as a whole to accurately model the desired property. Working with ensembles is essential when targeting the lowest energy conformations (which often influence reactivity), calculating conformational entropies, or modeling any observable property that is modulated by conformations, e.g. the shielding tensors from which to obtain NMR chemical shifts.

In many instances, the generation of such ensembles and their refinement occur in separate steps and with different levels of theory. For example, a conformational search might be carried out with an inexpensive force field or semiempirical method, while further refinement of the ensemble needs to be carried out with high-level DFT to achieve chemical accuracy. In these multi-level workflows, multiple geometries often converge to the same local minimum, and a pruning step is necessary to remove duplicates and carry forward only the minimal number of structures, keeping the computational cost as low as possible.

The best-known and most widely used metric to compare conformations is the root-mean-squared deviation (RMSD) of atomic positions. This strategy often works well, but there are some tricky caveats: because RMSD matches atoms by index, rotamers that differ only by exchanging equivalent atoms (for example, a methyl group rotated by 120°) have artificially high RMSD values while being chemically identical! A more ingenious solution is to compare the moments of inertia along the principal axes. This strategy is indexing-invariant, and therefore should circumvent the degenerate rotamers issue, on top of also being faster to compute.
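To make the difference between the two metrics concrete, here is a minimal NumPy sketch. It is not PRISM Pruner's code: the function names `kabsch_rmsd` and `principal_moments`, the random toy "molecule", and the masses are all assumptions made for illustration. It relabels three equal-mass atoms without moving anything, then compares how each metric reacts.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (N, 3) arrays after optimal rigid-body alignment.

    Atoms are matched by index, which is why relabeling changes the result.
    """
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)
    u, s, vt = np.linalg.svd(p.T @ q)
    # Flip the smallest singular value if the best fit would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(p * p) + np.sum(q * q) - 2.0 * s.sum()) / len(p)
    return float(np.sqrt(max(msd, 0.0)))


def principal_moments(coords, masses):
    """Eigenvalues (ascending) of the inertia tensor about the center of mass."""
    r = coords - np.average(coords, axis=0, weights=masses)
    r2 = np.einsum("ij,ij->i", r, r)
    tensor = np.sum(masses * r2) * np.eye(3) - np.einsum("i,ij,ik->jk", masses, r, r)
    return np.linalg.eigvalsh(tensor)


# Hypothetical toy system: two heavy atoms and six light ones.
rng = np.random.default_rng(0)
coords = rng.normal(size=(8, 3))
masses = np.array([12.0, 12.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])

# Same geometry, but the labels of three equal-mass atoms are cycled,
# as happens when a methyl group rotates by 120 degrees.
relabeled = coords[[0, 1, 3, 4, 2, 5, 6, 7]]

rmsd_relabeled = kabsch_rmsd(coords, relabeled)
moi_shift = np.abs(
    principal_moments(coords, masses) - principal_moments(relabeled, masses)
).max()
print(f"RMSD after relabeling: {rmsd_relabeled:.3f}")
print(f"Largest change in principal moments: {moi_shift:.2e}")
```

The index-matched RMSD reports a difference even though no atom has moved, while the principal moments agree to within floating-point noise. The moments also need only a 3×3 eigenvalue problem per structure and no pairwise alignment, which is consistent with the speed advantage mentioned above.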

The popular conformational search engine CREST features an ensemble sorting routine called CREGEN, which implements both of these metrics. Although it can be used as a standalone program, it focuses on the removal of duplicate structures while retaining all rotamers. This is necessary for some tasks like the calculation of conformational entropy, but can enormously inflate the size of ensembles if you are not interested in rotamers. Imagine modeling some organocatalyst with a dozen tert-butyl groups! CREGEN is also written in Fortran, which can lead to difficulties when integrating into existing Python pipelines.

Out of necessity after working with large conformational ensembles, I started writing a conformational pruning implementation in Python years ago (initially as part of FIRECODE, a modular ensemble optimization driver). Rowan indicated a need for an open-source, standalone conformer screening tool in their recent "Open-Source Projects We Wish Existed" blog post, and I volunteered to convert my existing code into a standalone package. Working with Jonathon from Rowan, I extracted and polished the code into PRISM Pruner.

The code implements a cached, iterative, divide-and-conquer approach on increasingly larger subsets of the ensemble and removes duplicates as assessed by the two metrics above, RMSD and moment of inertia on the principal axes. On top of that, a third mode uses a rotamer-corrected RMSD metric, for cases where the moment of inertia alone is not sufficient to weed out redundant conformations. Comparing every structure to every other requires a lot of costly evaluations, and has O(N²) scaling (where N is the number of structures). If there are a lot of similar conformers, a divide-and-conquer strategy can drastically reduce the number of calls: each small chunk needs only a few pairwise evaluations, and every duplicate removed at that stage shrinks the N that later, larger chunks have to handle.
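As a rough worked count, assume (hypothetically) N = 1000 structures and a first pass over chunks of k = 10 structures each. Neither value is taken from PRISM Pruner's defaults.

```latex
\begin{align*}
\text{all-to-all:} \quad & \frac{N(N-1)}{2} = \frac{1000 \cdot 999}{2} = 499{,}500 \\
\text{first chunked pass:} \quad & \frac{N}{k} \cdot \frac{k(k-1)}{2} = \frac{N(k-1)}{2} = \frac{1000 \cdot 9}{2} = 4{,}500
\end{align*}
```

The first pass is therefore about a hundred times cheaper than a full comparison. How much is saved overall depends on how many duplicates that pass removes before the larger chunks are evaluated.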

If energies are available, we sort the ensemble before dividing it into chunks to have the best chance of grouping similar structures together early. After all chunks are evaluated, the leftover structures are used to repeat the process with larger chunks, until all active structures are included in the final evaluation. In many instances this results in significantly fewer comparisons, and a faster and more scalable algorithm. Even in the worst case of no similar structures, the use of a cache ensures that we don't ever perform more calls than a simple all-to-all algorithm would.
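The procedure described above can be sketched as follows. This is an illustrative reading of the text, not the PRISM Pruner source: the function name `prune`, the `start_chunks` parameter, the halving schedule for the number of chunks, and the one-dimensional toy "structures" are all assumptions.

```python
from itertools import combinations


def prune(items, is_similar, energies=None, start_chunks=8):
    """Return (indices of kept items, number of is_similar calls made)."""
    active = list(range(len(items)))
    if not active:
        return [], 0
    if energies is not None:
        # Sorting by energy puts likely duplicates next to each other.
        active.sort(key=lambda i: energies[i])

    cache = {}  # pair of indices -> result of a comparison already made
    calls = 0
    n_chunks = max(1, start_chunks)
    while True:
        size = -(-len(active) // n_chunks)  # ceiling division
        removed = set()
        for start in range(0, len(active), size):
            chunk = active[start:start + size]
            for i, j in combinations(chunk, 2):
                if i in removed or j in removed:
                    continue
                if (i, j) not in cache:
                    cache[(i, j)] = is_similar(items[i], items[j])
                    calls += 1
                if cache[(i, j)]:
                    removed.add(j)  # keep the earlier (lower-energy) one
        active = [i for i in active if i not in removed]
        if n_chunks == 1:
            break  # the last pass compared all survivors with each other
        n_chunks //= 2  # fewer, larger chunks on the next pass
    return active, calls


def similar(a, b):
    """Toy similarity test on one-dimensional stand-ins for structures."""
    return abs(a - b) < 0.5


values = [0.0, 0.1, 5.0, 5.1, 10.0, 0.05, 5.05, 10.1]
kept, calls = prune(values, similar, energies=values, start_chunks=4)
print("kept:", [values[i] for i in kept])
print("calls:", calls, "of", len(values) * (len(values) - 1) // 2, "possible")
```

Because every pair is looked up in the cache before it is evaluated, the call count can never exceed the N(N−1)/2 of an all-to-all comparison, and it drops below that whenever an early pass removes duplicates.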

Our initial comparisons against CREGEN are very positive, particularly for larger ensembles with many identical rotamers, where the divide-and-conquer approach really shines. The worst-case scaling is still O(N²) if all conformers are different, but in practice the cost is much lower for most conformational ensembles.

[Figure: Conformer pruning via MOI]

[Figure: Conformer pruning via RMSD]

We have also added a convenience function to perform sequential pruning, using reasonable default values for each step: starting with the fast moment of inertia mode, it follows with RMSD-based pruning and then an optional, final, rotamer-corrected RMSD pruning. Processing ensembles with ≈1,000 structures of ≈150 atoms using these settings takes seconds, and removes many rotamers from ensembles obtained from CREST. Here are two examples from my Ph.D. work, showing how much a conformational ensemble can be inflated by undesired rotamers—the second one is really pathological! The DFT time saved by processing these ensembles before the next step is significant.

[Figure: Conformer ensemble pruning]

The future of this project is also in your hands: if you are interested in contributing with new features, feel free to reach out to me or open a request on GitHub! For example, more similarity evaluation metrics could be implemented to screen for specific conformational attributes.

The performance of some sections could also be improved, if needed: while the MOI-based evaluation of similarity is really fast, the RMSD evaluation with NumPy alone could be faster. The original FIRECODE implementation of the RMSD metric relies on Numba, which compiles low-level code at runtime and achieves a ≈7x speedup on the numerically intensive RMSD calculation. Although Numba performs very well, it is a heavy dependency and can complicate integration into packages that already have many dependencies, so we decided not to include it. If you see further room for improvement in the code performance, we'd love to hear from you!
