Quickstart: from SMILES to report in minutes
This guide runs the full “happy path” on the included sample file (examples/example.smi). It produces a
wide results table, then creates a shareable report bundle and HTML picklists.
1) Install (recommended: conda)
Conda is the most reliable setup for public installs of the RDKit-based calculators.
git clone https://github.com/kelokely/-molprop-toolkit.git molprop-toolkit
cd molprop-toolkit
conda env create -f environment.yml
conda activate molprop-toolkit
pip install -e ".[dev,parquet]"
2) Generate a results table (v5)
The v5 calculator includes solubility/permeability/PK heuristics and supports optional 3D descriptors.
Output format is inferred from the filename extension: .csv, .tsv, or .parquet.
# CSV
molprop-calc-v5 examples/example.smi -o results.csv
# Parquet (recommended for large libraries)
molprop-calc-v5 examples/example.smi -o results.parquet
# sanity check: list analysis categories present
molprop-analyze results.parquet --list
Parquet requires pyarrow. Install via pip install "molprop-toolkit[parquet]" or pip install pyarrow.
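The extension-based format selection described above can be pictured with a small Python sketch. This is not the toolkit's actual code: the function name infer_output_format and the mapping table are assumptions made for illustration, and only the three extensions listed in this guide are covered.

```python
from pathlib import Path

# Hypothetical mapping; the toolkit's real dispatch may be organized differently.
_FORMATS = {".csv": "csv", ".tsv": "tsv", ".parquet": "parquet"}


def infer_output_format(path: str) -> str:
    """Return the output format implied by the filename extension."""
    # Lower-casing the suffix is an assumption of this sketch.
    suffix = Path(path).suffix.lower()
    if suffix not in _FORMATS:
        supported = ", ".join(sorted(_FORMATS))
        raise ValueError(f"Unsupported extension {suffix!r}; expected one of: {supported}")
    return _FORMATS[suffix]
```

Raising an error on an unknown extension, rather than silently defaulting to CSV, is a choice made in this sketch; the real CLI may behave differently.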
3) Build a report bundle (Markdown + HTML)
This creates a timestamped folder under reports/ containing the report and its plots. Input may be CSV, TSV, or Parquet.
molprop-report results.parquet
Look for an output path like reports/results_report_YYYYMMDD_HHMMSS/report.html.
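If you build several bundles, the newest one can be located from Python. The sketch below assumes only the naming pattern shown above (a folder named <stem>_report_YYYYMMDD_HHMMSS that contains report.html); latest_report is a hypothetical helper, not a toolkit function.

```python
from pathlib import Path
from typing import Optional, Union


def latest_report(reports_dir: Union[str, Path] = "reports",
                  stem: str = "results") -> Optional[Path]:
    """Return the newest <stem>_report_*/report.html, or None if there is none."""
    root = Path(reports_dir)
    if not root.is_dir():
        return None
    # YYYYMMDD_HHMMSS timestamps sort chronologically when compared as text,
    # so ordering by folder name is enough here.
    candidates = sorted(root.glob(f"{stem}_report_*/report.html"),
                        key=lambda p: p.parent.name)
    return candidates[-1] if candidates else None
```

Calling latest_report() from the directory where you ran molprop-report returns the path to open in a browser, or None if no bundle matches.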
4) Generate picklists (CSV + HTML)
Picklists are operational “decision lists” derived from filters and sorts. Accepts CSV/TSV/Parquet.
# built-in picklists
molprop-picklists results.parquet --html
# list built-ins
molprop-picklists results.parquet --list-builtins
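To make "derived from filters and sorts" concrete, here is a minimal Python sketch of the idea. It does not reproduce the toolkit's picklist engine: build_picklist, the sample rows, and the column names Compound_ID, MolWt, and LogP are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def build_picklist(rows: List[Row],
                   keep: Callable[[Row], bool],
                   sort_by: str,
                   descending: bool = False,
                   limit: Optional[int] = None) -> List[Row]:
    """Filter rows with `keep`, order them by `sort_by`, and optionally truncate."""
    selected = [row for row in rows if keep(row)]
    selected.sort(key=lambda row: row[sort_by], reverse=descending)
    return selected if limit is None else selected[:limit]


# Hypothetical rows and column names, for illustration only.
rows = [
    {"Compound_ID": "A", "MolWt": 310.4, "LogP": 2.1},
    {"Compound_ID": "B", "MolWt": 540.7, "LogP": 4.9},
    {"Compound_ID": "C", "MolWt": 275.3, "LogP": 1.2},
]

# Keep compounds under 500 Da, lowest LogP first.
picks = build_picklist(rows, keep=lambda r: r["MolWt"] < 500, sort_by="LogP")
print([r["Compound_ID"] for r in picks])
```

The built-in picklists presumably encode rules of this general shape; use --list-builtins to see what is actually shipped.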
Optional: retrosynthesis planning (offline)
If you install AiZynthFinder and download its public model/stock bundle, you can generate retrosynthesis routes and browse them as an HTML site.
pip install -e ".[retro]"
# download AiZynthFinder public data + config
download_public_data aizynth_data
# run retrosynthesis on your results table
molprop-retro results.csv --config aizynth_data/config.yml --top-routes 5 --nproc 4
Notes: structure-of-record SMILES
Many tools need a single “structure-of-record” SMILES column. By default, MolProp Toolkit will prefer
Calc_Canonical_SMILES when present (the exact structure used for descriptor calculation), and will fall back
to Calc_Base_SMILES → Canonical_SMILES → Input_Canonical_SMILES → SMILES. You can override
this in individual tools with --smiles-col.
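The fallback order described above amounts to "first preferred column that exists". A minimal Python sketch follows; the column names come from this guide, while pick_smiles_column and its error handling are assumptions made for illustration and not the toolkit's implementation.

```python
from typing import Iterable, Optional

# Preference order as listed in this guide, most preferred first.
SMILES_PRIORITY = (
    "Calc_Canonical_SMILES",
    "Calc_Base_SMILES",
    "Canonical_SMILES",
    "Input_Canonical_SMILES",
    "SMILES",
)


def pick_smiles_column(columns: Iterable[str], override: Optional[str] = None) -> str:
    """Choose the structure-of-record column from a table's column names."""
    available = set(columns)
    # An explicit override (the role --smiles-col plays) wins, but must exist.
    if override is not None:
        if override not in available:
            raise KeyError(f"Requested SMILES column {override!r} not found")
        return override
    for name in SMILES_PRIORITY:
        if name in available:
            return name
    raise KeyError("No known SMILES column found in the table")
```

For a table with only Canonical_SMILES and SMILES columns, this sketch selects Canonical_SMILES, since it ranks higher in the list.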
Common gotchas (public installs)
If the calculators fail to import RDKit, confirm you activated the conda environment before running the CLI.
If you only want to analyze an existing results table, you can run molprop-analyze, molprop-report,
and molprop-picklists directly, without rerunning the calculator, as long as your file contains the expected columns (CSV/TSV/Parquet supported).
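For the RDKit import gotcha, a quick check from inside Python shows whether the active interpreter can see the packages this guide depends on. The sketch uses only the standard library; missing_modules is a hypothetical helper and not part of the toolkit.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: check RDKit (calculators) and pyarrow (Parquet I/O).
# missing = missing_modules(["rdkit", "pyarrow"])
# print("Interpreter:", sys.executable)
# print("Missing:", missing or "none")
```

If rdkit is reported missing and the interpreter path does not point inside the molprop-toolkit conda environment, activating the environment is the likely fix.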