Quickstart: from SMILES to report in minutes

This guide runs the full “happy path” on the included sample file (examples/example.smi). It produces a wide results table, then creates a shareable report bundle and HTML picklists.

1) Install (recommended: conda)

Conda is the most reliable way to set up the RDKit-based calculators from the public repository.

git clone https://github.com/kelokely/-molprop-toolkit.git molprop-toolkit
cd molprop-toolkit

conda env create -f environment.yml
conda activate molprop-toolkit

pip install -e ".[dev,parquet]"

2) Generate a results table (v5)

The v5 calculator includes solubility/permeability/PK heuristics and supports optional 3D descriptors. Output format is inferred from the filename extension: .csv, .tsv, or .parquet.

# CSV
molprop-calc-v5 examples/example.smi -o results.csv

# Parquet (recommended for large libraries)
molprop-calc-v5 examples/example.smi -o results.parquet

# sanity check: list analysis categories present
molprop-analyze results.parquet --list

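Parquet output requires pyarrow. The editable install in step 1 already requests the parquet extra; in other environments, install it with pip install "molprop-toolkit[parquet]" or pip install pyarrow.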
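If you want to load the results table in your own scripts, the same extension convention can be mirrored in a few lines. This is a minimal sketch, not the toolkit's implementation: infer_format and load_results are hypothetical helper names, the case-insensitive match is an assumption, and pandas is assumed to be installed (with pyarrow for Parquet).

```python
from pathlib import Path

# Assumption: this mapping mirrors the three extensions listed in this guide;
# it is not copied from the toolkit's source.
FORMATS = {".csv": "csv", ".tsv": "tsv", ".parquet": "parquet"}


def infer_format(path):
    """Return 'csv', 'tsv', or 'parquet' based on the filename extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return FORMATS[suffix]


def load_results(path):
    """Load a results table with pandas (imported lazily; Parquet needs pyarrow)."""
    import pandas as pd

    fmt = infer_format(path)
    if fmt == "parquet":
        return pd.read_parquet(path)
    # CSV and TSV differ only in the delimiter.
    return pd.read_csv(path, sep="\t" if fmt == "tsv" else ",")
```

Calling load_results("results.parquet") would then return a DataFrame whose columns you can inspect before running the report tools.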

3) Build a report bundle (Markdown + HTML)

This creates a timestamped folder under reports/ containing the report and its plots. The command accepts CSV, TSV, or Parquet input.

molprop-report results.parquet

Look for an output path like reports/results_report_YYYYMMDD_HHMMSS/report.html.
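After several runs, reports/ will hold one folder per run. The sketch below picks the most recent one, assuming folders follow the results_report_YYYYMMDD_HHMMSS pattern shown above; latest_report is a hypothetical helper, not part of the toolkit.

```python
from pathlib import Path


def latest_report(reports_dir="reports", stem="results"):
    """Return the newest report.html for the given input stem, or None."""
    # Assumption: folder names look like <stem>_report_YYYYMMDD_HHMMSS.
    candidates = list(Path(reports_dir).glob(f"{stem}_report_*/report.html"))
    # YYYYMMDD_HHMMSS sorts chronologically as plain text, so the largest
    # folder name belongs to the most recent run.
    return max(candidates, key=lambda p: p.parent.name, default=None)
```

The returned path can be handed to webbrowser.open() or printed for sharing.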

4) Generate picklists (CSV + HTML)

Picklists are operational “decision lists” derived from filters and sorts. The command accepts CSV, TSV, or Parquet input.

# built-in picklists
molprop-picklists results.parquet --html

# list built-ins
molprop-picklists results.parquet --list-builtins
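Steps 2 to 4 can also be chained from a script, which is convenient for repeated runs. The following is a rough sketch that only uses the commands and flags shown in this guide; build_commands and run_pipeline are hypothetical names, and the CLIs are assumed to be on PATH (that is, the conda environment is active).

```python
import subprocess


def build_commands(smiles_file, table="results.parquet"):
    """Return the quickstart commands (steps 2-4) as argument lists."""
    return [
        ["molprop-calc-v5", smiles_file, "-o", table],
        ["molprop-report", table],
        ["molprop-picklists", table, "--html"],
    ]


def run_pipeline(smiles_file, table="results.parquet"):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(smiles_file, table):
        print("+", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

For the sample file, run_pipeline("examples/example.smi") would execute the same three commands as above.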

Optional: retrosynthesis planning (offline)

If you install AiZynthFinder and download its public model/stock bundle, you can generate retrosynthesis routes and browse them as an HTML site.

pip install -e ".[retro]"

# download AiZynthFinder public data + config
download_public_data aizynth_data

# run retrosynthesis on your results table
molprop-retro results.csv --config aizynth_data/config.yml --top-routes 5 --nproc 4

Next steps

Calculator options: 3D mode, ionization/protomers, stereo/tautomer handling, output columns.

Columns reference: schema-driven definitions for every column.

Feature families: prefix-based cheat sheet (Tox_*, Met_*, Dev_*, etc.).

Notes: structure-of-record SMILES

Many tools need a single “structure-of-record” SMILES column. By default, MolProp Toolkit prefers Calc_Canonical_SMILES when present (the exact structure used for descriptor calculation). Otherwise it falls back, in order, to Calc_Base_SMILES → Canonical_SMILES → Input_Canonical_SMILES → SMILES. You can override this in individual tools with --smiles-col.
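The fallback order above amounts to "first match wins". The sketch below illustrates that rule for use in your own scripts; pick_smiles_column is a hypothetical helper, and how the toolkit handles a missing override column is an assumption here (this version raises an error).

```python
# Preference order as documented in this guide.
SMILES_PREFERENCE = [
    "Calc_Canonical_SMILES",
    "Calc_Base_SMILES",
    "Canonical_SMILES",
    "Input_Canonical_SMILES",
    "SMILES",
]


def pick_smiles_column(columns, override=None):
    """Return the structure-of-record column name for a table's columns."""
    available = set(columns)
    if override is not None:
        # Plays the role of --smiles-col: an explicit choice always wins.
        if override not in available:
            raise KeyError(f"column {override!r} not found")
        return override
    for name in SMILES_PREFERENCE:
        if name in available:
            return name
    raise KeyError("no SMILES column found")
```

For a table with only Canonical_SMILES and SMILES columns, this returns Canonical_SMILES.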

Common gotchas (public installs)

If the calculators fail to import RDKit, confirm you activated the conda environment before running the CLI.

If you only want to analyze an existing results table, you can run molprop-analyze, molprop-report, and molprop-picklists as long as your file contains the expected columns (CSV/TSV/Parquet supported).
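For the RDKit import failure, a quick way to see which interpreter is active and whether it can find RDKit is a check like the one below. It is a small diagnostic sketch using only the standard library; module_available and describe_environment are hypothetical names.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def describe_environment():
    """Summarize whether RDKit is visible to the current interpreter."""
    status = "found" if module_available("rdkit") else "NOT found"
    # sys.executable shows which Python is running; if it does not point
    # into the conda environment, the environment is probably not activated.
    return f"RDKit {status} for interpreter {sys.executable}"
```

Printing describe_environment() from the same shell where the CLI failed usually makes a missing conda activate obvious.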