Examples
This page shows how to run the example inputs stored in tests_run and what
outputs to expect from each step.
Fixture files
The repository ships with small ROOT fixtures:
tests_run/test_mc.root
tests_run/test_data.root
These are the inputs used by the example configuration files:
tests_run/run_reweighting_config.yaml
tests_run/apply_weights_config.yaml
tests_run/throughput_config.yaml
If you want to regenerate the fixture ROOT files from larger inputs, use:
python tests_run/make_test_root_samples.py \
--input-data <source_data.root> \
--input-mc <source_mc.root> \
--output-data tests_run/test_data.root \
--output-mc tests_run/test_mc.root \
--tree DecayTree \
--n-events 5000
Expected result:
two ROOT files are written under tests_run/;
each file contains the first n-events entries of the requested tree;
by default the output object type is TTree.
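If the fixtures are regenerated from a script rather than by hand, the command above can be assembled programmatically. The following is a minimal sketch that only builds the argument list from the flags shown above; the helper name `build_fixture_command` and its defaults are illustrative assumptions, not part of the repository.

```python
import shlex
import sys


def build_fixture_command(input_data, input_mc, tree="DecayTree",
                          n_events=5000, out_dir="tests_run"):
    """Return the argv list for tests_run/make_test_root_samples.py.

    Hypothetical helper: it only mirrors the flags documented above.
    """
    return [
        sys.executable, "tests_run/make_test_root_samples.py",
        "--input-data", str(input_data),
        "--input-mc", str(input_mc),
        "--output-data", f"{out_dir}/test_data.root",
        "--output-mc", f"{out_dir}/test_mc.root",
        "--tree", tree,
        "--n-events", str(n_events),
    ]


cmd = build_fixture_command("source_data.root", "source_mc.root")
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

To actually run the script, pass the list to `subprocess.run(cmd, check=True)` from the repository root.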
Example 1: train reweighters
The main training example uses tests_run/run_reweighting_config.yaml.
Run it with:
run-reweight --config tests_run/run_reweighting_config.yaml
or, in a Pixi environment:
pixi run run-reweight --config tests_run/run_reweighting_config.yaml
What this config does
It trains the following methods on the fixture sample:
ONNXGB
GB
NN
Bins
using these four training variables:
B_DTF_Jpsi_P
B_DTF_Jpsi_PT
nPVs
nLongTracks
and these monitoring variables:
B_PHI
B_ETA
The config also enables:
transform: yeo-johnson
n_trials: 5
shap: true
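For orientation, the settings listed above can be mirrored as a plain Python dictionary. This is only a sketch: the flat layout is an assumption, and the real YAML file may nest these keys differently. The `check_settings` helper is hypothetical and is not provided by the package.

```python
# Assumed flat mirror of the options described above; the real YAML
# schema in tests_run/run_reweighting_config.yaml may be nested differently.
example_settings = {
    "methods": ["ONNXGB", "GB", "NN", "Bins"],
    "training_vars": ["B_DTF_Jpsi_P", "B_DTF_Jpsi_PT", "nPVs", "nLongTracks"],
    "monitoring_vars": ["B_PHI", "B_ETA"],
    "transform": "yeo-johnson",
    "n_trials": 5,
    "shap": True,
}


def check_settings(settings):
    """Hypothetical sanity check: return the variables listed in both groups."""
    overlap = set(settings["training_vars"]) & set(settings["monitoring_vars"])
    return sorted(overlap)


# Monitoring variables are meant to be independent of the training set,
# so an empty overlap is the expected result.
print(check_settings(example_settings))
```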
Expected outputs
By default the example writes into sample-specific subdirectories:
weights/bd_jpsikst_ee/ for trained models and serialized weight arrays;
plots/bd_jpsikst_ee/ for validation and diagnostic plots.
Warning
Bins is included here as a lightweight baseline because the fixture uses
only four variables. For production use, treat it as a low-dimensional
method; it is much more fragile than the model-based reweighters once the
dimensionality or sparsity increases.
For the four configured methods, you should expect model files such as:
weights/gbr_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
weights/onnxgb_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_meta.pkl
weights/onnxgb_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_stages/
weights/inn_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_meta.pkl
weights/inn_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_stages/
weights/binning_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_meta.pkl
weights/binning_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_edges.npy
weights/binning_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks_ratio.npy
and predicted MC weight arrays such as:
weights/gbr_weights_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
weights/onnxgb_weights_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
weights/inn_weights_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
weights/onnx_binning_weights_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
You should also expect diagnostic plots such as:
plots/corr_data.png
plots/corr_mc.png
plots/input_features_training.png
plots/input_features_testing.png
plots/input_features_training_transformed.png
plots/input_features_testing_transformed.png
plots/other_vars_training.png
plots/other_vars_testing.png
plots/input_features_gb_weighted.png
plots/input_features_onnxgb_weighted.png
plots/input_features_nn_weighted.png
plots/input_features_binning_weighted.png
plots/roc_curve.png
plots/classifier_output.png
plots/weight_distributions.png
plots/training_throughput.json
plots/training_throughput.png
plots/training_memory.json
plots/training_memory.png
In practice, those files are written under weights/bd_jpsikst_ee/ and
plots/bd_jpsikst_ee/ because the CLI appends the configured sample name to
the root output directories.
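The artifact names listed above follow a visible pattern: a method prefix, an artifact kind, and the training variables joined by underscores, all placed under the sample subdirectory. The sketch below reproduces that pattern for the single-file `.pkl` cases. It is inferred from the file names on this page only; `artifact_path` is a hypothetical helper, and the package's own naming code may differ.

```python
from pathlib import PurePosixPath

TRAINING_VARS = ["B_DTF_Jpsi_P", "B_DTF_Jpsi_PT", "nPVs", "nLongTracks"]


def artifact_path(root, sample, prefix, kind, training_vars, ext=".pkl"):
    """Build <root>/<sample>/<prefix>_<kind>_<var1>_<var2>...<ext>.

    Hypothetical helper inferred from the file names listed above.
    """
    tag = "_".join(training_vars)
    return PurePosixPath(root) / sample / f"{prefix}_{kind}_{tag}{ext}"


print(artifact_path("weights", "bd_jpsikst_ee", "gbr", "model", TRAINING_VARS))
# weights/bd_jpsikst_ee/gbr_model_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl
```

A helper like this is mainly useful in tests or scripts that need to locate an artifact without hard-coding the long variable suffix.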
Because shap: true is enabled, non-folding methods also produce feature
importance plots, for example:
plots/feature_importance_GB.png
plots/feature_importance_ONNXGB.png
plots/feature_importance_NN.png
plots/feature_importance_Bins.png
What a successful run looks like
A successful run should:
read both ROOT inputs without raising I/O errors;
split the sample into train and test subsets;
train all requested methods;
serialize models and weight arrays under weights/;
create a non-empty set of PNG plots under plots/;
write plots/training_throughput.json summarizing fit timing and event rates;
write plots/training_memory.json summarizing peak resident memory usage during each fit.
The exact numerical weights are not fixed, especially for methods with classifier training or Optuna tuning, but the general expectation is that the reweighted training and testing distributions should move closer to the target data sample in the output plots.
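The file-level part of the checklist above can be automated. The following is a minimal sketch, assuming the sample-specific directories described earlier; `check_training_outputs` is a hypothetical helper, and it only tests that files exist and that the JSON parses, not that the weights are physically sensible.

```python
import json
from pathlib import Path


def check_training_outputs(plots_dir, weights_dir):
    """Return a list of human-readable problems; an empty list means OK."""
    plots_dir, weights_dir = Path(plots_dir), Path(weights_dir)
    problems = []

    # Models and weight arrays: any entry in the weights directory counts.
    if not weights_dir.is_dir() or not any(weights_dir.iterdir()):
        problems.append(f"no artifacts found in {weights_dir}")

    # At least one non-empty PNG plot is expected.
    pngs = list(plots_dir.glob("*.png")) if plots_dir.is_dir() else []
    if not any(p.stat().st_size > 0 for p in pngs):
        problems.append(f"no non-empty PNG plots in {plots_dir}")

    # Both summary files must exist and contain valid JSON.
    for name in ("training_throughput.json", "training_memory.json"):
        path = plots_dir / name
        if not path.is_file():
            problems.append(f"missing {path}")
            continue
        try:
            json.loads(path.read_text())
        except json.JSONDecodeError:
            problems.append(f"{path} is not valid JSON")
    return problems
```

For the default example this would be called as `check_training_outputs("plots/bd_jpsikst_ee", "weights/bd_jpsikst_ee")`.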
Example 2: apply a trained model
The application example uses tests_run/apply_weights_config.yaml.
Run it with:
apply-weights --config tests_run/apply_weights_config.yaml
or:
pixi run apply-weights --config tests_run/apply_weights_config.yaml
Important note
This config requests method: XGB. That means the corresponding XGB model
must already exist in weights/bd_jpsikst_ee/ before the command can succeed.
The default training example above does not train XGB. To make this example
work, either:
run training with a config that includes XGB; or
override the application method to one of the methods already trained by tests_run/run_reweighting_config.yaml, for example ONNXGB.
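The two options above amount to a simple pre-flight decision: use the requested method if it was trained, otherwise fall back to one that was. A minimal sketch of that logic follows. It is not how apply-weights itself resolves methods; `resolve_apply_method` is a hypothetical helper for wrapper scripts.

```python
def resolve_apply_method(requested, trained_methods, fallback=None):
    """Pick the method to apply, given the methods that were actually trained.

    Hypothetical helper: returns `requested` when available, else `fallback`,
    and raises if neither has a trained model.
    """
    if requested in trained_methods:
        return requested
    if fallback is not None and fallback in trained_methods:
        return fallback
    raise ValueError(
        f"no trained model for {requested!r}; trained methods: {sorted(trained_methods)}"
    )


# Methods trained by the default training example on this page.
trained = ["ONNXGB", "GB", "NN", "Bins"]
print(resolve_apply_method("XGB", trained, fallback="ONNXGB"))  # ONNXGB
```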
Expected outputs
For a successful application run, expect:
a serialized normalized weight array in weights/bd_jpsikst_ee/mcweights_B_DTF_Jpsi_P_B_DTF_Jpsi_PT_nPVs_nLongTracks.pkl;
an output ROOT file named test_applied_weights.root;
a new branch named mult_and_kin_weights_XGB in the output tree;
comparison plots such as:
plots/bd_jpsikst_ee/mc_vars_reweighting.png
plots/bd_jpsikst_ee/mc_other_vars_reweighting.png
plots/bd_jpsikst_ee/input_features_reweighted.png
plots/bd_jpsikst_ee/other_vars_reweighted.png
The expected behavior is that the output ROOT file keeps the original event content and adds the requested weight branch for the rows that survived the input loading mask.
Example 3: throughput and memory sweep
The benchmarking example uses tests_run/throughput_config.yaml and is meant
to exercise all available methods on a small sample while recording both
training speed and memory usage.
Run it with:
run-reweight --config tests_run/throughput_config.yaml
This config enables:
GB and Folding
ONNXGB and ONNXFolding
XGB and XGBFolding
NN and NNFolding
Bins
Expected outputs
This run should produce:
one trained model and one weight-array artifact per method;
plots/training_throughput.json containing per-method timing and throughput summaries;
plots/training_throughput.png with a visual summary of relative training speed;
plots/training_memory.json containing per-method peak RSS summaries;
plots/training_memory.png with a visual summary of relative memory consumption;
the usual validation plots comparing the different methods.
This is the best example to use when you want to compare backends side by side or verify that the full method registry is still working.
What is measured
The throughput summary reports:
fit wall-clock time for each method;
dataset events per second, defined as the number of training events processed per fit second.
The memory summary reports:
peak RSS (resident set size) reached by the process while fitting each method.
Peak RSS is the highest amount of physical memory occupied by the process during the fit. It is the most useful metric when comparing methods for CI stability or for estimating whether a given workflow will fit in RAM on a target machine.
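To make the two quantities concrete, the sketch below times an arbitrary fit callable and reads the process peak RSS from the standard library. It illustrates the definitions only and is not the package's own instrumentation; `measure_fit` and the result keys are assumptions. The `resource` module is available on Unix-like systems only.

```python
import resource
import sys
import time


def measure_fit(fit, n_events):
    """Time one call to `fit` and report throughput and peak RSS.

    Hypothetical helper. Note that ru_maxrss is a high-water mark for the
    whole process lifetime, so per-method peaks are only cleanly separated
    when each fit runs in its own process.
    """
    start = time.perf_counter()
    fit()
    seconds = time.perf_counter() - start

    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024

    return {
        "fit_seconds": seconds,
        # Dataset events per second: training events per fit second.
        "events_per_second": n_events / seconds if seconds > 0 else float("inf"),
        "peak_rss_mb": peak_bytes / 1024**2,
    }


# Stand-in workload; a real call would wrap a reweighter's fit method.
stats = measure_fit(lambda: sum(i * i for i in range(200_000)), n_events=5000)
print(stats)
```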
Practical ways to reduce runtime and memory
When a run is too slow or too heavy for the available machine, the most useful config changes are usually:
reduce the number of requested methods and compare backends in separate runs instead of training everything at once;
disable shap unless feature-importance plots are specifically needed;
lower n_trials when using Optuna, since each trial performs an additional full training pass;
avoid folding methods, or lower n_folds, because folding trains multiple reweighters per method;
reduce the number of training_vars and especially monitoring_vars, as all requested columns are loaded into memory and several diagnostics scale with the feature count;
for the Bins method, reduce n_bins or the number of input features, since the histogram size grows quickly with dimensionality;
use smaller benchmark-style configs first to compare methods, then rerun only the most promising ones on the full sample.
In practice, the easiest low-cost speedup is often to start with a single
method such as GB or XGB, set shap: false, and keep n_trials at
0 or 1 until the rest of the workflow is validated.
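The low-cost starting point described above can be derived from an existing settings dictionary. This is a sketch under the assumption that the options live in a flat mapping with the keys `methods`, `shap`, and `n_trials`; the real YAML layout may differ, and `lighten_config` is a hypothetical helper.

```python
import copy


def lighten_config(settings, method="GB"):
    """Return a cheaper copy of `settings` for a first validation run.

    Hypothetical helper: keeps a single method, disables SHAP plots,
    and caps Optuna trials at 1. The input mapping is left unchanged.
    """
    light = copy.deepcopy(settings)
    light["methods"] = [method]
    light["shap"] = False
    light["n_trials"] = min(int(light.get("n_trials", 0)), 1)
    return light


full = {"methods": ["ONNXGB", "GB", "NN", "Bins"], "shap": True, "n_trials": 5}
print(lighten_config(full))
# {'methods': ['GB'], 'shap': False, 'n_trials': 1}
```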
Reading the outputs
The most useful files to inspect after running the examples are:
plots/input_features_*_weighted.png to see whether the reweighted MC moves toward the data distribution on the training variables;
plots/other_vars_*_weighted.png to see whether improvements transfer to monitoring variables not used directly for training;
plots/roc_curve.png and plots/classifier_output.png to assess post-reweighting separability;
plots/weight_distributions.png to check whether the learned weights are numerically well behaved;
plots/training_throughput.json to compare computational cost across methods;
plots/training_memory.json to compare peak memory usage across methods.
In short, the expected qualitative outcome is not a specific number but a set of artifacts showing that:
training completed;
models were saved;
weights were produced;
reweighted MC is generally closer to the data than the original MC;
no method generated obviously pathological weight distributions.
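The last point, pathological weight distributions, can be checked numerically as well as by eye. The sketch below computes a few common diagnostics, including the Kish effective sample size, (sum of w)^2 / (sum of w^2). It accepts any sequence of numbers; how the package's .pkl files are structured is not specified here, so loading them is left to the caller. The name `weight_diagnostics` and the choice of diagnostics are assumptions, not thresholds used by the package.

```python
import math


def weight_diagnostics(weights):
    """Summarize a weight array: bad values, tail size, effective sample size."""
    values = [float(w) for w in weights]
    finite = [w for w in values if math.isfinite(w)]
    if not finite:
        raise ValueError("no finite weights to summarize")

    total = sum(finite)
    sum_sq = sum(w * w for w in finite)
    if total == 0 or sum_sq == 0:
        raise ValueError("weights sum to zero; diagnostics are undefined")

    mean = total / len(finite)
    return {
        "n": len(values),
        "n_nonfinite": len(values) - len(finite),
        "n_negative": sum(1 for w in finite if w < 0),
        # A large ratio means a few events dominate the reweighted sample.
        "max_over_mean": max(finite) / mean,
        # Kish effective sample size; equals n for uniform weights.
        "effective_sample_size": total * total / sum_sq,
    }


print(weight_diagnostics([1.0, 1.0, 1.0, 1.0])["effective_sample_size"])  # 4.0
```

An effective sample size far below the number of events, or a very large max-to-mean ratio, suggests that a handful of events carry most of the weight and that the corresponding method deserves a closer look in plots/weight_distributions.png.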