pycompressor package

Subpackages

Submodules

pycompressor.app module

pycompressor.compressing module

pycompressor.compressor module

class pycompressor.compressor.Compress(prior, enhanced, estdic, nbred, idx, estm, fldr, rnd)

Bases: object

Compress the prior set of replicas into a subset that faithfully reproduces the statistical properties of the prior (in other words, the subset that gives the best value of the error function).

Parameters
  • prior (array_like) – Prior PDF replicas

  • estdic (dict) – Dictionary containing the list of estimators

  • nbred (int) – Size of the reduced/compressed replicas

error_function(index)

Sample a subset of replicas as given by the index, then compute the corresponding ERF value.

Parameters

index (array_like) – Array containing the index of the replicas

Returns

Value of the ERF

Return type

float
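As a rough illustration of the select-then-score idea, the sketch below picks replicas by index with NumPy fancy indexing and scores the subset with a stand-in error function. The `toy_erf` helper, the array sizes, and the use of the mean as the only estimator are assumptions made for this example, not pycompressor's implementation.

```python
import numpy as np

def toy_erf(prior, reduced):
    """Hypothetical stand-in ERF: squared difference of the replica means."""
    diff = reduced.mean(axis=0) - prior.mean(axis=0)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(0)
prior = rng.normal(size=(100, 3, 5))   # (replicas, flavours, x-grid)
index = np.array([0, 10, 20, 30])      # indices of the selected replicas

subset = prior[index]                  # shape (4, 3, 5)
erf_value = toy_erf(prior, subset)
```

Scoring the full prior against itself gives zero with this stand-in, which is a convenient sanity check.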

all_error_function(index)

Sample a subset of replicas as given by the index, then compute the corresponding ERFs for all estimators.

Parameters

index (array_like) – Array containing the index of the replicas

Returns

Value of the ERF

Return type

float

final_erfs(index)

Compute the final ERF after minimization.

Parameters

index (array_like) – Array containing the index of the selected replicas

Returns

Dictionary containing the list of estimators and their respective values.

Return type

dict

genetic_algorithm(nb_mut=5)

Look for the combination of replicas that gives the best total ERF value. When enhanced_already_exists is set to True, the starting index is set to be the best from the standard compression.

Parameters

nb_mut (int, optional) – Number of mutations

Returns

The first element is the value of the best ERF; the second contains the index of the reduced PDF.

Return type

tuple(float, array_like)
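The search for a good combination of replicas can be pictured with a minimal mutation-only loop. This is a sketch under stated assumptions, not the method used by `genetic_algorithm`: the function `toy_genetic_search`, the `n_gen` parameter, the toy objective, and the reading of `nb_mut` as the number of mutants tried per generation are all hypothetical.

```python
import random

def toy_genetic_search(erf, n_prior, n_reduced, nb_mut=5, n_gen=200, seed=0):
    """Hypothetical search: accept any mutant that lowers the ERF."""
    rnd = random.Random(seed)
    best_index = rnd.sample(range(n_prior), n_reduced)
    best_erf = erf(best_index)
    for _ in range(n_gen):
        for _ in range(nb_mut):
            mutant = list(best_index)
            # Replace one selected replica with one that is not yet selected.
            unused = [i for i in range(n_prior) if i not in mutant]
            mutant[rnd.randrange(n_reduced)] = rnd.choice(unused)
            value = erf(mutant)
            if value < best_erf:
                best_erf, best_index = value, mutant
    return best_erf, best_index

def toy_objective(index):
    """Toy ERF: the subset mean should match the mean of 0..49 (24.5)."""
    return (sum(index) / len(index) - 24.5) ** 2

best_erf, best_index = toy_genetic_search(toy_objective, n_prior=50, n_reduced=10)
```

The return value mirrors the documented `tuple(float, array_like)` layout: best ERF first, selected indices second.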

cma_algorithm(std_dev=0.3, seed=0, verbosity=0, min_itereval=1000, max_itereval=15000)

Define the ERF function that is going to be minimized.

Parameters

index (array_like) – Array containing the index of the replicas

Returns

Value of the ERF

Return type

float

pycompressor.errfunction module

pycompressor.errfunction.randomize_rep(replica, number, rndgen)

Extract a random subset of replicas from the prior in a non-redundant way (no duplicates).

Parameters
  • replica (array_like) – Prior set of replicas shape=(replicas, flavours, x-grid)

  • number (int) – Number of replicas in the subset

Returns

Randomized array of shape=(number, flavours, x-grid)

Return type

array_like
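Drawing without duplicates can be sketched with NumPy's `Generator.choice` and `replace=False`. The example assumes that `rndgen` behaves like a `numpy.random.Generator`, which is not confirmed here; the array sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
replica = rng.normal(size=(20, 3, 5))   # (replicas, flavours, x-grid)
number = 8

# replace=False guarantees that no replica index is drawn twice.
index = rng.choice(replica.shape[0], size=number, replace=False)
subset = replica[index]                 # shape (number, flavours, x-grid)
```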

pycompressor.errfunction.compute_cfd68(reslt_trial)

Compute the confidence interval of an array of randomized trials.

Parameters

reslt_trial (array_like) – Array of shape=(size_trials)

Returns

Value of the cfd

Return type

array_like
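One common way to obtain a central 68% interval from a set of trial values is to take the 16th and 84th percentiles. The sketch below shows that approach only; which quantities `compute_cfd68` actually returns, and in what order, is not stated in this reference, so the percentile choice is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
reslt_trial = rng.normal(loc=1.0, scale=0.1, size=1000)  # one ERF per trial

# Central 68% interval: 16th to 84th percentile of the trial values.
lower, upper = np.percentile(reslt_trial, [16, 84])
```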

pycompressor.errfunction.compute_erfm(prior, nset)

Non-normalized error function. The ERF of the moment estimators given by eq.(6) of https://arxiv.org/pdf/1504.06469.

Parameters
  • prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)

  • nset (array_like) – Reduced or Random set of replica shape=(flavours, x-grid)

Returns

Value of the error Estimation

Return type

float
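A non-normalized ERF of this kind can be pictured as a sum of squared relative differences between the estimator on the reduced set and on the prior. The sketch below is an illustration only: `toy_erfm` is a hypothetical name, and skipping entries where the prior estimator is zero is an assumption of this example.

```python
import numpy as np

def toy_erfm(prior, nset):
    """Sum of squared relative differences over flavours and x-grid points."""
    prior = np.asarray(prior, dtype=float)
    nset = np.asarray(nset, dtype=float)
    mask = prior != 0                      # assumption: skip zero entries
    rel = (nset[mask] - prior[mask]) / prior[mask]
    return float(np.sum(rel ** 2))

prior_est = np.array([[1.0, 2.0], [4.0, 0.0]])   # (flavours, x-grid)
reduc_est = np.array([[1.1, 2.0], [3.0, 0.5]])
value = toy_erfm(prior_est, reduc_est)           # 0.1**2 + 0.25**2
```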

pycompressor.errfunction.compute_erfs(prior, nset)

Non-normalized error function for Statistical estimators.

The Kolmogorov-Smirnov ERF is given by eq.(13) of https://arxiv.org/pdf/1504.06469.

Parameters
  • prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)

  • nset (array_like) – Array of shape (flavor, x-grid, regions)

Returns

Value of the error Estimation

Return type

float

pycompressor.errfunction.compute_erfc(prior, nset)

Non-normalized error function for correlation estimators.

The correlation ERF is given by eq.(21) of https://arxiv.org/pdf/1504.06469.

Parameters
  • prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)

  • nset (array_like) – Array of shape (NxCorr*flavors, NxCorr*flavors)

Returns

Value of the error Estimation

Return type

float

pycompressor.errfunction.estimate(prior, est_dic)

Compute estimators for the PRIOR set.

Parameters
  • prior (array_like) – Prior set of shape=(replicas, flavours, x-grid)

  • est_dic (dict) – Contains the list of all estimators

Returns

Array of shape=(flavours, x-grid)

Return type

float

pycompressor.errfunction.normalization(prior, est_prior, rndm_size, est_dic, trials, folder, rndgen)

Compute the normalization for each estimator. The normalization is computed by calculating the ERF of the given estimator for each trial, as given generally by eq.(9) of the paper (https://arxiv.org/pdf/1504.06469).

Parameters
  • prior (array_like) – Prior set of replicas of shape=(replicas, flavours, x-grid)

  • est_prior (dict) – Dictionary containing the values of the estimated results

  • rndm_size (int) – Size of random replicas

  • est_dic (dict) – Contains the list of estimators

  • trials (int) – Number of random trials

Returns

Normalization value for each estimator

Return type

float

class pycompressor.errfunction.ErfComputation(prior, est_dic, nreduc, folder, rndgen, trials=1000, norm=True)

Bases: object

Class that computes the normalized Error Functions. The complete ERF expression is generally given by eq.(6) of https://arxiv.org/pdf/1504.06469.

When this class is initialized, the Estimators and the normalization factors are computed.

Parameters
  • prior (array_like) – Prior set of replicas of shape=(replicas, flavours, x-grid)

  • est_dic (dict) – Contains the list of all the Estimators

  • nreduc (int) – Size of reduced replicas

  • trials (int) – Number of trials

compute_tot_erf(reduc)

Compute the total normalized Error Function which is given by the sum of all the normalized estimators.

Parameters

reduc (array_like) – Reduced set of replicas of shape=(replica, flavours, x-grid)

Returns

Value of the total normalized ERF

Return type

float
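The "sum of normalized estimators" can be sketched with two dictionaries, one holding the raw ERF per estimator and one holding its normalization factor. The helper `toy_total_erf`, the dictionary keys, and the numbers are hypothetical and only illustrate the arithmetic.

```python
def toy_total_erf(erfs, norms):
    """Sum each estimator's ERF divided by its normalization factor."""
    return sum(erfs[name] / norms[name] for name in erfs)

# Hypothetical per-estimator values; the keys are illustrative names.
erfs = {"mean": 0.02, "stdev": 0.06, "skewness": 0.5}
norms = {"mean": 0.1, "stdev": 0.2, "skewness": 2.0}
total = toy_total_erf(erfs, norms)   # 0.2 + 0.3 + 0.25
```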

compute_all_erf(reduc)

Compute the total normalized Error Function which is given by the sum of all the normalized estimators.

Parameters

reduc (array_like) – Reduced set of replicas of shape=(replica, flavours, x-grid)

Returns

Value of the total normalized ERF

Return type

float

pycompressor.estimators module

class pycompressor.estimators.Estimators(replicas, axs=0)

Bases: object

Class containing the different types of statistical estimators.

This class takes a set of PDF replicas (prior/compressed/random) of shape (replicas, flavours, x-grid) and computes the value of the estimators with respect to the PDF replicas.

Parameters
  • replicas (array) – Prior or Reduced PDF replicas of shape=(replicas, flavours, x-grid)

  • axs (int) – Axis along which the estimator is computed. Defaults to zero, which computes along the direction of the PDF replicas

static moment(replicas, mean, stdev, order)

Compute skewness in the standard way following exactly eq.(11) of the paper.

Parameters
  • replicas (array_like) – Array of PDF replicas (prior/reduced/random)

  • mean (array_like) – Array with the mean values of replicas

  • stdev (array_like) – Array with the values of standard deviation of replicas

  • nb_regions (int, optional) – Number of regions. This is by default set to 6

Returns

Array of the value of the n-order moment

Return type

array_like
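A standardized central moment of a given order can be sketched as the replica average of `((f - mean) / stdev) ** order`; order 3 gives the skewness. The helper `toy_moment` is hypothetical, and the use of the population standard deviation (NumPy's default `ddof=0`) is an assumption of this example.

```python
import numpy as np

def toy_moment(replicas, mean, stdev, order):
    """n-th standardized central moment along the replica axis (axis 0)."""
    return np.mean(((replicas - mean) / stdev) ** order, axis=0)

rng = np.random.default_rng(2)
replicas = rng.normal(size=(500, 3, 5))        # (replicas, flavours, x-grid)
mean = replicas.mean(axis=0)
stdev = replicas.std(axis=0)

skewness = toy_moment(replicas, mean, stdev, order=3)
second = toy_moment(replicas, mean, stdev, order=2)
```

With the population standard deviation, the second standardized moment equals one at every grid point, which makes a quick consistency check.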

static kolmogorov(replicas, mean, stdev)

Compute the Kolmogorov-Smirnov (KS) estimator as in the C-implementation of the compressor:

https://github.com/scarrazza/compressor/blob/master/src/Estimators.cc#L122

This function counts the number of replicas (for every flavour and every x in the x-grid) that fall in the regions given by eq.(14) of https://arxiv.org/abs/1504.06469 and normalizes the result by the total number of replicas.

As opposed to the above implementation, this computes the KS for all replicas, flavours and x-grid.

Parameters
  • replicas (array_like) – PDF replicas (prior/reduced/random)

  • mean (array_like) – Array with the mean values of replicas

  • stdev (array_like) – Array with the values of standard deviation of replicas

Returns

Array containing the number of replicas that fall into a region

Return type

array_like
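The region counting can be sketched as follows. The six regions used here, bounded at the mean and at one and two standard deviations on either side, are an assumption of this example, as are the helper name `toy_kolmogorov` and the convention of closing each region on its upper edge.

```python
import numpy as np

def toy_kolmogorov(replicas, mean, stdev):
    """Fraction of replicas in six regions bounded by mean + k*stdev."""
    edges = [-np.inf, -2.0, -1.0, 0.0, 1.0, 2.0, np.inf]   # in units of stdev
    z = (replicas - mean) / stdev
    counts = [((z > lo) & (z <= hi)).sum(axis=0)
              for lo, hi in zip(edges[:-1], edges[1:])]
    # Stack regions last: shape (flavours, x-grid, regions).
    return np.stack(counts, axis=-1) / replicas.shape[0]

rng = np.random.default_rng(4)
replicas = rng.normal(size=(300, 3, 5))        # (replicas, flavours, x-grid)
ks = toy_kolmogorov(replicas, replicas.mean(axis=0), replicas.std(axis=0))
```

Because every replica lands in exactly one region, the fractions sum to one for each flavour and x-grid point.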

static correlation(replicas)

Compute the correlation matrix of a given set of PDF replicas as in eq.(16) of https://arxiv.org/pdf/1504.06469.

Parameters

replicas (array_like) – Array of PDF replicas (prior/reduced/random)

Returns

Correlation matrix

Return type

array_like
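A correlation matrix across flavours and x-grid points can be sketched with `numpy.corrcoef` after flattening the last two axes. This example uses every x-grid point, which is an assumption: the shape quoted for `compute_erfc` suggests that only a selection of x points may enter the actual computation.

```python
import numpy as np

rng = np.random.default_rng(3)
replicas = rng.normal(size=(200, 3, 4))        # (replicas, flavours, x-grid)

# Flatten (flavours, x-grid) into one axis so each column is one observable.
flat = replicas.reshape(replicas.shape[0], -1)  # shape (200, 12)

# rowvar=False: columns are variables, rows (replicas) are observations.
corr = np.corrcoef(flat, rowvar=False)          # shape (12, 12)
```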

compute_for(estm_name)

Method that maps the called estimator to the correct one.

Parameters

estm_name (str) – Name of the estimator

pycompressor.pdfgrid module

pycompressor.postgans module

pycompressor.utils module

pycompressor.utils.remap_index(index, shuffled)

pycompressor.utils.extract_estvalues(comp_size)

Extract the result from the prior for a given compressed set (w.r.t the size).

Parameters

comp_size (int) – Size of the compressed set

pycompressor.utils.extract_index(pdfname, comp_size)

Extract the list of indices for a given compressed set (w.r.t the size)

Parameters
  • pdfname (str) – Name of the original/input PDF

  • comp_size (int) – Size of the compressed set

pycompressor.utils.extract_bestErf(pdfname, comp_size)

Extract the best/final ERF value for a given compressed set (w.r.t the size).

Parameters
  • pdfname (str) – Name of the original/input PDF

  • comp_size (int) – Size of the compressed set

pycompressor.utils.compare_estimators(est1, est2)

Check whether the values of all the estimators in est1 are smaller than or equal to those in est2 (est1 <= est2), and return True if that is the case.

Parameters
  • est1 – Value of the first estimator

  • est2 – Value of the second estimator
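The element-wise comparison can be sketched as below, assuming that `est1` and `est2` are dictionaries keyed by estimator name; the types are not given in this reference, and `toy_compare` and the keys are hypothetical.

```python
def toy_compare(est1, est2):
    """True if every estimator value in est1 is <= its counterpart in est2."""
    return all(est1[name] <= est2[name] for name in est1)

# Hypothetical estimator dictionaries; keys are illustrative.
a = {"mean": 0.1, "stdev": 0.2}
b = {"mean": 0.1, "stdev": 0.3}

a_not_worse = toy_compare(a, b)
b_not_worse = toy_compare(b, a)
```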

pycompressor.utils.get_best_estimator(list_ests)

Get the best estimator from a list of dictionaries containing values of all the different estimators.

Parameters

list_ests (list) – List of dictionaries containing the results of all the statistical estimators

Module contents