pycompressor package

Subpackages

pycompressor.scripts package

Submodules

pycompressor.app module

pycompressor.compressing module

pycompressor.compressor module

class pycompressor.compressor.Compress(prior, enhanced, estdic, nbred, idx, estm, fldr, rnd)

Bases: object

Compress the Prior set of replicas into a subset of replicas that faithfully contains the statistical properties of the prior (in other words a subset that gives the best value of the error function).

Parameters

prior (array_like) – Prior PDF replicas
estdic (dict) – Dictionary containing the list of estimators
nbred (int) – Size of the reduced/compressed replicas

error_function(index)

Sample a subset of replicas as given by the index, then compute the corresponding ERF value.

Parameters: index (array_like) – Array containing the index of the replicas
Returns: Value of the ERF
Return type: float
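To illustrate what sampling a subset by index means, the sketch below picks replicas out of a prior array with NumPy integer indexing and evaluates a toy error function on them. The name toy_erf, the array shapes, and the choice of comparing only the means are assumptions made for illustration; this is not the actual Compress.error_function.

```python
import numpy as np

def toy_erf(prior, index):
    """Hypothetical stand-in for an ERF: squared relative difference of the means."""
    reduced = prior[np.asarray(index)]      # shape (len(index), flavours, x-grid)
    mean_prior = prior.mean(axis=0)         # estimator on the full prior
    mean_reduced = reduced.mean(axis=0)     # same estimator on the subset
    return float(np.sum(((mean_reduced - mean_prior) / mean_prior) ** 2))

rng = np.random.default_rng(0)
prior = rng.uniform(1.0, 2.0, size=(20, 3, 5))   # (replicas, flavours, x-grid)
full = toy_erf(prior, np.arange(20))             # selecting everything
sub = toy_erf(prior, [0, 3, 7, 11])              # selecting four replicas
```

Selecting every replica reproduces the prior exactly, so the toy ERF vanishes; any proper subset gives a non-negative value.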

all_error_function(index)

Sample a subset of replicas as given by the index, then compute the corresponding ERFs for all estimators.

Parameters: index (array_like) – Array containing the index of the replicas
Returns: Value of the ERF
Return type: float

final_erfs(index)

Compute the final ERF after minimization.

Parameters: index (array_like) – Array containing the index of the selected replicas
Returns: Dictionary containing the list of estimators and their respective values.
Return type: dict

genetic_algorithm(nb_mut=5)

Look for the combination of replicas that gives the best total ERF value. When enhanced_already_exists is set to True, the starting index is set to be the best from the standard compression.

Parameters: nb_mut (int, optional) – Number of mutations
Returns: The first argument is the value of the best ERF while the second contains the index of the reduced PDF
Return type: tuple(float, array_like)
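A search of this kind can be sketched as a simple mutate-and-keep-the-best loop. Everything below is an assumption for illustration: the function name toy_genetic_search, the n_gen parameter, and the mutation rule (swap up to nb_mut selected replicas for unused ones) are hypothetical and may differ from what genetic_algorithm actually does.

```python
import numpy as np

def toy_genetic_search(erf, n_prior, n_reduced, nb_mut=5, n_gen=200, seed=0):
    """Hypothetical sketch: mutate the current best index, keep improvements."""
    rng = np.random.default_rng(seed)
    best_index = rng.choice(n_prior, size=n_reduced, replace=False)
    best_erf = erf(best_index)
    for _ in range(n_gen):
        trial = best_index.copy()
        # Swap between 1 and nb_mut entries, limited by what is available.
        n_swap = min(int(rng.integers(1, nb_mut + 1)), n_reduced, n_prior - n_reduced)
        positions = rng.choice(n_reduced, size=n_swap, replace=False)
        unused = np.setdiff1d(np.arange(n_prior), trial)
        trial[positions] = rng.choice(unused, size=n_swap, replace=False)
        trial_erf = erf(trial)
        if trial_erf < best_erf:            # keep the mutation only if it helps
            best_erf, best_index = trial_erf, trial
    return best_erf, best_index

values = np.random.default_rng(1).normal(size=30)
toy_erf = lambda idx: abs(values[idx].mean() - values.mean())
best_erf, best_index = toy_genetic_search(toy_erf, n_prior=30, n_reduced=5)
```

The return value mirrors the documented tuple: the best ERF found and the index of the selected replicas.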

cma_algorithm(std_dev=0.3, seed=0, verbosity=0, min_itereval=1000, max_itereval=15000)

Define the ERF function that is going to be minimized.

Parameters: index (array_like) – Array containing the index of the replicas
Returns: Value of the ERF
Return type: float

pycompressor.errfunction module

pycompressor.errfunction.randomize_rep(replica, number, rndgen)

Extract a random subset of replicas from the prior in a non-redundant way (no duplicates).

Parameters

replica (array_like) – Prior set of replicas shape=(replicas, flavours, x-grid)
number (int) – Number of replicas in the subset

Returns

Randomized array of shape=(number, flavours, x-grid)

Return type

array_like
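Drawing replicas without duplicates can be sketched with NumPy's Generator.choice and replace=False. The helper name sample_replicas is hypothetical, and treating rndgen as a numpy.random.Generator is an assumption.

```python
import numpy as np

def sample_replicas(replica, number, rng):
    """Hypothetical sketch: pick `number` distinct replicas along axis 0."""
    index = rng.choice(replica.shape[0], size=number, replace=False)
    return replica[index]                   # shape (number, flavours, x-grid)

# Each replica is filled with its own index so duplicates would be visible.
prior = np.arange(10)[:, None, None] * np.ones((10, 2, 4))
subset = sample_replicas(prior, 4, np.random.default_rng(0))
```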

pycompressor.errfunction.compute_cfd68(reslt_trial)

Compute the confidence interval of an array of randomized trials.

Parameters: reslt_trial (array_like) – Array of shape=(size_trials)
Returns: Value of the cfd
Return type: array_like
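One common way to obtain a confidence interval from a set of trials is to take percentiles of the sorted results. The sketch below computes a central 68% interval from the 16th and 84th percentiles; the function name and this particular definition are assumptions and not necessarily what compute_cfd68 implements.

```python
import numpy as np

def central_68_interval(trials):
    """Hypothetical sketch: bounds of the central 68% of the trial results."""
    lower, upper = np.percentile(np.asarray(trials, dtype=float), [16.0, 84.0])
    return lower, upper

lower, upper = central_68_interval(np.arange(101))
```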

pycompressor.errfunction.compute_erfm(prior, nset)

Non-normalized error function. The ERF of the moment estimators given by eq.(6) of https://arxiv.org/pdf/1504.06469.

Parameters

prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)
nset (array_like) – Reduced or random set of replicas of shape=(flavours, x-grid)

Returns

Value of the error Estimation

Return type

float
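A non-normalized error function of this kind can be sketched as a sum of squared relative differences between the estimator evaluated on the prior and on the reduced set. The function name and the decision to skip grid points where the prior value is zero are assumptions made for illustration.

```python
import numpy as np

def toy_moment_erf(prior_est, reduced_est):
    """Hypothetical sketch: sum over (flavours, x-grid) of ((C - O) / O)**2."""
    prior_est = np.asarray(prior_est, dtype=float)
    reduced_est = np.asarray(reduced_est, dtype=float)
    mask = prior_est != 0                   # assumption: ignore zero prior values
    ratio = (reduced_est[mask] - prior_est[mask]) / prior_est[mask]
    return float(np.sum(ratio ** 2))

erf_value = toy_moment_erf([[1.0, 2.0], [4.0, 0.0]], [[2.0, 2.0], [2.0, 5.0]])
```

In this toy input the entries contribute 1, 0 and 0.25, and the point with a zero prior value is skipped.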

pycompressor.errfunction.compute_erfs(prior, nset)

Non-normalized error function for Statistical estimators.

The Kolmogorov-Smirnov estimator is given by eq.(13) of https://arxiv.org/pdf/1504.06469.

Parameters

prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)
nset (array_like) – Array of shape (flavor, x-grid, regions)

Returns

Value of the error Estimation

Return type

float

pycompressor.errfunction.compute_erfc(prior, nset)

Non-normalized error function for correlation estimators.

The correlation ERF is given by eq.(21) of https://arxiv.org/pdf/1504.06469.

Parameters

prior (array_like) – Prior set of replicas of shape=(flavours, x-grid)
nset (array_like) – Array of shape (NxCorr*flavors, NxCorr*flavors)

Returns

Value of the error Estimation

Return type

float

pycompressor.errfunction.estimate(prior, est_dic)

Compute estimators for the PRIOR set.

Parameters

prior (array_like) – Prior set of shape=(replicas, flavours, x-grid)
est_dic (dict) – Contains the list of all estimators

Returns

Array of shape=(flavours, x-grid)

Return type

float
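Evaluating estimators on the prior amounts to reducing the replica axis. The sketch below does this for two simple estimators; the function name, the estimator names "mean" and "stdev", and the layout of the returned dictionary are all hypothetical and do not describe the real est_dic format.

```python
import numpy as np

def toy_estimate(prior, est_names):
    """Hypothetical sketch: reduce over the replica axis for each estimator."""
    available = {
        "mean": lambda p: p.mean(axis=0),
        "stdev": lambda p: p.std(axis=0),
    }
    return {name: available[name](prior) for name in est_names}

prior = np.random.default_rng(0).normal(size=(50, 3, 5))   # (replicas, flavours, x-grid)
est_prior = toy_estimate(prior, ["mean", "stdev"])
```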

pycompressor.errfunction.normalization(prior, est_prior, rndm_size, est_dic, trials, folder, rndgen)

Compute the normalization for each estimator. The normalization is computed by calculating the ERF of the given estimator for each trial, as given by eq.(9) of the paper (https://arxiv.org/pdf/1504.06469).

Parameters

prior (array_like) – Prior set of replicas of shape=(replicas, flavours, x-grid)
est_prior (dict) – Dictionary containing the values of the estimated results
rndm_size (int) – Size of random replicas
est_dic (dict) – Contains the list of estimators
trials (int) – Number of random trials

Returns

Normalization value for each estimator

Return type

float

class pycompressor.errfunction.ErfComputation(prior, est_dic, nreduc, folder, rndgen, trials=1000, norm=True)

Bases: object

Class that computes the normalized Error Functions. The complete ERF expression is generally given by eq.(6) of https://arxiv.org/pdf/1504.06469.

When this class is initialized, the Estimators and the normalization factors are computed.

Parameters

prior (array_like) – Prior set of replicas of shape=(replicas, flavours, x-grid)
est_dic (dict) – Contains the list of all the Estimators
nreduc (int) – Size of reduced replicas
trials (int) – Number of trials

compute_tot_erf(reduc)

Compute the total normalized Error Function which is given by the sum of all the normalized estimators.

Parameters: reduc (array_like) – Reduced set of replicas of shape=(replica, flavours, x-grid)
Returns: Value of the total normalized ERF
Return type: float
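The sum of normalized estimators can be sketched as follows, assuming the raw ERF values and the normalization factors are both available as dictionaries keyed by estimator name. The function name and the dictionary layout are hypothetical.

```python
def total_normalized_erf(raw_erfs, norms):
    """Hypothetical sketch: divide each raw ERF by its normalization and sum."""
    return sum(raw_erfs[name] / norms[name] for name in raw_erfs)

raw_erfs = {"mean": 2.0, "stdev": 3.0}      # made-up raw ERF values
norms = {"mean": 4.0, "stdev": 6.0}         # made-up normalization factors
total = total_normalized_erf(raw_erfs, norms)
```

Dividing by a per-estimator normalization puts estimators of very different magnitude on a comparable scale before they are summed.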

compute_all_erf(reduc)

Compute the total normalized Error Function which is given by the sum of all the normalized estimators.

Parameters: reduc (array_like) – Reduced set of replicas of shape=(replica, flavours, x-grid)
Returns: Value of the total normalized ERF
Return type: float

pycompressor.estimators module

class pycompressor.estimators.Estimators(replicas, axs=0)

Bases: object

Class containing the different types of statistical estimators.

This class takes a set of PDF replicas (prior/compressed/random) of shape (replicas, flavours, x-grid) and then computes the value of the estimators with respect to those replicas.

Parameters

replicas (array) – Prior or Reduced PDF replicas of shape=(replicas, flavours, x-grid)
axs (int) – Axis along which the estimator is computed. Defaults to zero, i.e. along the direction of the PDF replicas

static moment(replicas, mean, stdev, order)

Compute skewness in the standard way following exactly eq.(11) of the paper.

Parameters

replicas (array_like) – Array of PDF replicas (prior/reduced/random)
mean (array_like) – Array with the mean values of replicas
stdev (array_like) – Array with the values of standard deviation of replicas
nb_regions (int, optional) – Number of regions. This is by default set to 6

Returns

Array of the value of the n-order moment

Return type

array_like
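A standardized moment of a given order can be sketched as the replica average of ((f - mean) / stdev) ** order. The function name and this exact normalization are assumptions; the precise convention of eq.(11) should be checked against the paper.

```python
import numpy as np

def standardized_moment(replicas, mean, stdev, order):
    """Hypothetical sketch: average of the standardized deviations to a power."""
    return np.mean(((replicas - mean) / stdev) ** order, axis=0)

replicas = np.random.default_rng(0).normal(size=(100, 3, 5))
mean = replicas.mean(axis=0)
stdev = replicas.std(axis=0)                # population standard deviation
second = standardized_moment(replicas, mean, stdev, 2)
skewness = standardized_moment(replicas, mean, stdev, 3)
```

With the population standard deviation, the order-2 moment equals one at every grid point, which is a convenient sanity check.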

static kolmogorov(replicas, mean, stdev)

Compute the Kolmogorov-Smirnov (KS) estimator as in the C-implementation of the compressor:

https://github.com/scarrazza/compressor/blob/master/src/Estimators.cc#L122

This function counts the number of replicas (for every flavour and every x in the x-grid) which fall in the regions given by eq.(14) of https://arxiv.org/abs/1504.06469 and normalizes the result by the total number of replicas.

As opposed to the above implementation, this computes the KS for all replicas, flavours and x-grid.

Parameters

replicas (array_like) – PDF replicas (prior/reduced/random)
mean (array_like) – Array with the mean values of replicas
stdev (array_like) – Array with the values of standard deviation of replicas

Returns

Array containing the number of replicas that fall into a region

Return type

array_like
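The counting step can be sketched with boolean masks over regions bounded by multiples of the standard deviation around the mean. The six regions used here (edges at -2, -1, 0, 1 and 2 standard deviations), the boundary convention, and the function name are assumptions for illustration.

```python
import numpy as np

def ks_fractions(replicas, mean, stdev):
    """Hypothetical sketch: fraction of replicas per region, per flavour and x."""
    edges = [-np.inf, -2.0, -1.0, 0.0, 1.0, 2.0, np.inf]   # in units of stdev
    z = (replicas - mean) / stdev
    counts = [
        np.sum((z > lo) & (z <= hi), axis=0)               # count along replicas
        for lo, hi in zip(edges[:-1], edges[1:])
    ]
    return np.stack(counts, axis=-1) / replicas.shape[0]   # (flavours, x-grid, 6)

replicas = np.random.default_rng(0).normal(size=(200, 3, 5))
fractions = ks_fractions(replicas, replicas.mean(axis=0), replicas.std(axis=0))
```

Because the regions partition the real line, the fractions at each flavour and x point sum to one.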

static correlation(replicas)

Compute the correlation matrix of a given set of PDF replicas as in eq.(16) of https://arxiv.org/pdf/1504.06469.

Parameters: replicas (array_like) – Array of PDF replicas (prior/reduced/random)
Returns: Correlation matrix
Return type: array_like
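A correlation matrix between all (flavour, x) points can be sketched by flattening those two axes and calling numpy.corrcoef across replicas. The function name is hypothetical, and using every x-grid point (rather than a selected subset of points) is an assumption.

```python
import numpy as np

def replica_correlation(replicas):
    """Hypothetical sketch: correlations between all (flavour, x) pairs."""
    n_rep = replicas.shape[0]
    flat = replicas.reshape(n_rep, -1)      # (replicas, flavours * x-grid)
    return np.corrcoef(flat, rowvar=False)  # columns are the variables

replicas = np.random.default_rng(0).normal(size=(50, 3, 5))
corr = replica_correlation(replicas)
```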

compute_for(estm_name)

Method that maps the requested estimator name to the correct estimator.

Parameters: estm_name (str) – Name of the estimator

pycompressor.pdfgrid module

pycompressor.postgans module

pycompressor.utils module

pycompressor.utils.remap_index(index, shuffled)

pycompressor.utils.extract_estvalues(comp_size)

Extract the result from the prior for a given compressed set (w.r.t the size).

Parameters: comp_size (int) – Size of the compressed set

pycompressor.utils.extract_index(pdfname, comp_size)

Extract the list of indices for a given compressed set (w.r.t the size)

Parameters

pdfname (str) – Name of the original/input PDF
comp_size (int) – Size of the compressed set

pycompressor.utils.extract_bestErf(pdfname, comp_size)

Extract the best/final ERF value for a given compressed set (w.r.t the size).

Parameters

pdfname (str) – Name of the original/input PDF
comp_size (int) – Size of the compressed set

pycompressor.utils.compare_estimators(est1, est2)

Check whether the value of every estimator in est1 is smaller than or equal to the corresponding value in est2 (est1 <= est2); return True if that is the case.

Parameters

est1 – Value of the first estimator
est2 – Value of the second estimator
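Such an element-wise comparison can be sketched as below, assuming est1 and est2 are dictionaries that map estimator names to scalar values. The function name, the dictionary layout, and the sample values are hypothetical.

```python
def all_smaller_or_equal(est1, est2):
    """Hypothetical sketch: True if every value in est1 is <= its match in est2."""
    return all(est1[name] <= est2[name] for name in est1)

first = {"mean": 0.1, "stdev": 0.2}         # made-up estimator values
second = {"mean": 0.1, "stdev": 0.3}
better = all_smaller_or_equal(first, second)
worse = all_smaller_or_equal(second, first)
```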

pycompressor.utils.get_best_estimator(list_ests)

Get the best estimator from a list of dictionaries containing values of all the different estimators.

Parameters: list_ests (list) – List of dictionaries containing the results of all the statistical estimators
