`linna.util`

Module Contents

Classes

`CPU_Unpickler`	This takes a binary file for reading a pickle data stream.
`chtoPool`	A reimplementation of `schwimmbad.MPIPool` that does not broadcast redundant functions
`chtoMultiprocessPool`	Pool class for use with multiprocessing
`Transform`	Transform parameters so that every prior is Gaussian with zero mean and unit variance
`invTransform`	Inverse of the `Transform` function.
`ArrayDataset`	Prepare data for torch
`Y_transform_data`	Transform data vector from y-->y/sigma
`Y_invtransform_data`	Transform data vector from y-->y*sigma (API is the same as `Y_transform_data`)
`X_transform_class`	Transform parameters from x --> (x-mean)/std or x --> (log10(x)-mean)/std
`Y_transform_class`	Transform data vector: y-->y*std+mean or y-->np.exp(y*std+mean)
`Y_invtransform_class`	Transform data vector: y-->(y-mean)/std or y-->(np.log(y)-mean)/std (API is the same as `Y_transform_class`)
`_FunctionWrapper`	Only for internal use
`NN_samplerv1`	A class to perform neural network sampling for each iteration
`Log_prob`	Class that evaluates the log-likelihood
`Dlnp`	Class that evaluates the derivative of the log-likelihood (API the same as `Log_prob`)
`Ddlnp`	Class that evaluates the second derivative of the log-likelihood (API the same as `Log_prob`)
`Auxilleryfunc`	Class for internal use
`Loss_fn`	Class defining the loss function
`Val_metric_fn`	Class for validation metric (API the same as loss)
`LogPrior`	Priors handling

Functions

`makepositivedefinite`(cov, fcut=0.99)
`get_good_walker_list`(log_prob_samples)
`read_chain_and_cut`(chainname, nk, ntimes=20, walkercut=False, method='emcee', flat=False)
`_dummy_callback`(x)
`gauss2unif`(x)	Transform a Gaussian-distributed random variable to a uniformly distributed variable
`invgauss2unif`(x)	Inverse of `gauss2unif`: transform a uniformly distributed variable back to a Gaussian-distributed variable
`retrieve_model`(outdir, inshape, outshape, nnmodel_in=ChtoModelv2)	Retrieve the trained model
`retrieve_model_wrapper_in`(outdir, nnmodel_in=ChtoModelv2, no_grad=True)	Retrieve the trained model (more user friendly than retrieve_model)
`gaussianlogliklihood`(m, data, invcov)
`lnprior`(x)	internal function
`generate_training_point`(theory, nnsampler, pool, outdir, ntrain, nval, data, invcov, chain=None, nsigma=1, omegab2cut=None, options=0, negloglike=None, nbest_in=None, chisqcut=None)	Generate training point
`chisqcut_all`(data, invcov, chisqcut, fnamey, fnamex)	Internal function
`train_nn`(outdir, model, train_x, train_y, val_x, val_y, X_transform, y_transform, loss_fn, val_metric_fn, dev='cpu', verbose=False, retrain=True, pool=None, nocpu=False, size=0, rank=0, params=None)	Internal function
`median_absolute_deviation`(y, median, dim)	Internal function
`train_NN`(nnsampler, cov, inv_cov, sigma, outdir_in, outdir_list, data, dolog10index=None, ypositive=False, retrain=True, norder=2, temperature=None, docuda=False, pool=None, tsize=1, nnmodel_in=None, params=None, usebest=False)	Internal function
`run_mcmc`(nnsampler, outdir, method, ndim, nwalkers, init, log_prob, dlnp=None, ddlnp=None, pool=None, transform=None, ntimes=50, tautol=0.01, meanshift=0.1, stdshift=0.1, nk=2)	Run mcmc using the trained model
`logp_theory_data`(samples, theory, data, invcov, logprior)	Internal function

Attributes

initialize

linna.util.initialize = False[source]

linna.util.makepositivedefinite(cov, fcut=0.99)[source]

class linna.util.CPU_Unpickler[source]

Bases: linna.nn.pickle.Unpickler

This takes a binary file for reading a pickle data stream.

The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.

The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes. Thus file can be a binary file object opened for reading, an io.BytesIO object, or any other custom object that meets this interface.

Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is True, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.

find_class(self, module, name)[source]

Return an object from a specified module.

If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).

This method is called whenever a class or a function object is needed. Both arguments passed are str objects.
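
The override hook described above can be illustrated with a minimal sketch. It is not LINNA's `CPU_Unpickler`; it only shows how a subclass of the standard-library `pickle.Unpickler` can override `find_class`. The names `RestrictedUnpickler`, `SAFE_BUILTINS` and `restricted_loads` are hypothetical.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these builtins may be resolved while unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only an allow-listed set of globals."""

    def find_class(self, module, name):
        # Called whenever the stream references a class or function by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Unpickle a bytes object through the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps({"grid": range(3), "tags": {"a", "b"}})
restored = restricted_loads(payload)
```

The same hook can be used to redirect a lookup to a different object instead of rejecting it, which is the general mechanism a device-remapping unpickler would rely on.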

linna.util.get_good_walker_list(log_prob_samples)[source]

linna.util.read_chain_and_cut(chainname, nk, ntimes=20, walkercut=False, method='emcee', flat=False)[source]

linna.util._dummy_callback(x)[source]

class linna.util.chtoPool(comm=None)[source]

Bases: schwimmbad.MPIPool

A reimplementation of schwimmbad.MPIPool that does not broadcast redundant functions

wait(self)[source]: Walkers will listen to the main process

map(self, worker, tasks, callback=None)[source]

Evaluate a function or callable on each task in parallel using MPI.

The callable, worker, is called on each element of the tasks iterable. The results are returned in the expected order (symmetric with tasks).

Parameters

worker (callable) – A function or callable object that is executed on each element of the specified tasks iterable. This object must be picklable (i.e. it can’t be a function scoped within a function or a lambda function). This should accept a single positional argument and return a single object.
tasks (iterable) – A list or iterable of tasks. Each task can be itself an iterable (e.g., tuple) of values or data to pass in to the worker function.
callback (callable, optional) – An optional callback function (or callable) that is called with the result from each worker run and is executed on the master process. This is useful for, e.g., saving results to a file, since the callback is only called on the master thread.

Returns

A list of results from the output of each worker() call.

Return type

list
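
The calling convention of `map` can be tried without MPI using a serial stand-in. The `SerialPool` class and the `chi2_worker` function below are hypothetical and exist only for illustration; they mimic the documented contract (one positional argument per task, results in task order, callback invoked with each result) and do not reproduce the MPI communication of `chtoPool`.

```python
class SerialPool:
    """Hypothetical serial stand-in with the same map() signature."""

    def map(self, worker, tasks, callback=None):
        results = []
        for task in tasks:
            result = worker(task)  # worker receives one positional argument
            if callback is not None:
                callback(result)  # runs in the calling (master) process
            results.append(result)
        return results  # same order as tasks


def chi2_worker(task):
    """Module-level function, so it stays picklable for a real MPI pool."""
    model, data, sigma = task
    return ((model - data) / sigma) ** 2


tasks = [(1.0, 1.5, 0.5), (2.0, 2.0, 1.0), (3.0, 1.0, 2.0)]
seen = []
results = SerialPool().map(chi2_worker, tasks, callback=seen.append)
```

Each task is a tuple that the worker unpacks itself, which matches the requirement that the worker accept a single positional argument.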

noduplicate_close(self)[source]: Reset no duplicate function

bcast(self, worker, args, sizemax)[source]

Broadcast function to all the workers:

Parameters

worker (callable) – a function which you want to parallelize
args (list) – list of things to be passed to worker
sizemax (int) – number of worker you wish to use

class linna.util.chtoMultiprocessPool(nwalker)[source]

Pool class for use with multiprocessing

map(self, worker, tasks, callback=None)[source]

Parameters

worker (function) – function of worker
tasks (list of objects) – list of tasks

Returns

list of objects

noduplicate_close(self)[source]: close the pool

is_master(self)[source]

linna.util.gauss2unif(x)[source]

Transform a Gaussian-distributed random variable to a uniformly distributed variable

Parameters: x (torch.tensor) – input
Returns: output
Return type: torch.tensor

linna.util.invgauss2unif(x)[source]

Inverse of gauss2unif: transform a uniformly distributed variable back to a Gaussian-distributed variable

Parameters: x (torch.tensor) – input
Returns: output
Return type: torch.tensor
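
A minimal sketch of the underlying idea, assuming a standard normal input and a target interval of (0, 1): the normal CDF maps a Gaussian variable to a uniform one, and the quantile function maps it back. The helper names are hypothetical, and the LINNA functions operate on torch tensors and may use a different interval.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def gauss_to_unif(x):
    """Map a standard-normal value to (0, 1) through the normal CDF."""
    return _STD_NORMAL.cdf(x)


def unif_to_gauss(u):
    """Inverse map: quantile function of the standard normal."""
    return _STD_NORMAL.inv_cdf(u)


u = gauss_to_unif(0.0)
x_back = unif_to_gauss(gauss_to_unif(1.3))
```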

class linna.util.Transform(priors)[source]

Transform parameters so that every prior is Gaussian with zero mean and unit variance

__call__(self, x, returnnumpy=True, inputnumpy=True)[source]

Transform parameters so that every prior is Gaussian with zero mean and unit variance

Parameters

x (nd array or torch array) – array of parameters
returnnumpy (bool) – If true, then the return value will be in numpy array. Otherwise, the return value will be in torch tensor
inputnumpy (bool) – If true, the input is expected to be a numpy array. Otherwise, the input is expected to be a torch tensor

Returns

depends on the input parameters returnnumpy

Return type

numpy array or torch tensor

class linna.util.invTransform(priors)[source]

Inverse of the `Transform` function.

__call__(self, x, returnnumpy=True, inputnumpy=True)[source]

Parameters

x (nd array or torch array) – array of parameters
returnnumpy (bool) – If true, then the return value will be in numpy array. Otherwise, the return value will be in torch tensor
inputnumpy (bool) – If true, the input is expected to be a numpy array. Otherwise, the input is expected to be a torch tensor

Returns

depends on the input parameters returnnumpy

Return type

numpy array or torch tensor
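
One way to obtain a unit Gaussian prior for every parameter, and to undo the mapping, is sketched below. The prior description format (`("flat", low, high)` or `("gauss", mean, std)`) and the function names are assumptions made for this illustration; the format of `priors` expected by `Transform` and `invTransform` is not documented here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Hypothetical prior description, one entry per parameter.
PRIORS = [("flat", 0.1, 0.9), ("gauss", 0.96, 0.02)]


def to_unit_gaussian(x, priors):
    """Map each parameter so that its prior becomes N(0, 1)."""
    z = []
    for value, (kind, a, b) in zip(x, priors):
        if kind == "flat":
            # Flat prior on [a, b]: rescale to (0, 1), then apply the normal quantile.
            z.append(_STD_NORMAL.inv_cdf((value - a) / (b - a)))
        else:
            # Gaussian prior with mean a and standard deviation b.
            z.append((value - a) / b)
    return z


def from_unit_gaussian(z, priors):
    """Inverse of to_unit_gaussian."""
    x = []
    for value, (kind, a, b) in zip(z, priors):
        if kind == "flat":
            x.append(a + (b - a) * _STD_NORMAL.cdf(value))
        else:
            x.append(a + b * value)
    return x


theta = [0.5, 0.98]
z = to_unit_gaussian(theta, PRIORS)
theta_back = from_unit_gaussian(z, PRIORS)
```

Sampling in the transformed space has the practical advantage that a sampler never steps outside a flat prior's bounds, because the inverse map always lands inside them.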

class linna.util.ArrayDataset(X, y)[source]

Bases: torch.utils.data.Dataset

Prepare data for torch

__len__(self)[source]

__getitem__(self, i)[source]

class linna.util.Y_transform_data(sigma, device)[source]

Transform data vector from y-->y/sigma

__call__(self, y)[source]

Parameters: y (torch tensor) – data vector
Returns: y/sigma
Return type: torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters: path (string) – name of the pickle file

transform_cov(self, cov)[source]

Transform the associated covariance matrix when the data vector is transformed by 1/sigma

Parameters: cov (2d array) – covariance matrix
Returns: transformed covariance matrix
Return type: torch(2d array)
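
If the data vector is scaled element-wise by 1/sigma, the covariance matrix scales as cov[i][j] / (sigma[i] * sigma[j]). The sketch below uses plain Python lists rather than torch tensors; the helper names are hypothetical, and taking sigma as the square root of the covariance diagonal is an assumption for the example.

```python
import math


def scale_data(y, sigma):
    """y --> y / sigma, element by element."""
    return [yi / si for yi, si in zip(y, sigma)]


def scale_cov(cov, sigma):
    """Covariance of the scaled data: cov[i][j] / (sigma[i] * sigma[j])."""
    n = len(sigma)
    return [[cov[i][j] / (sigma[i] * sigma[j]) for j in range(n)] for i in range(n)]


cov = [[4.0, 1.0], [1.0, 9.0]]
# Assumption: sigma is the square root of the diagonal of the covariance.
sigma = [math.sqrt(cov[i][i]) for i in range(2)]
y_scaled = scale_data([2.0, 3.0], sigma)
cov_scaled = scale_cov(cov, sigma)
```

With this choice of sigma the transformed covariance has a unit diagonal, so every element of the data vector carries comparable weight.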

class linna.util.Y_invtransform_data(sigma, device)[source]

Transform data vector from y-->y*sigma (API is the same as Y_transform_data)

__call__(self, y)[source]

pickle(self, path)[source]

class linna.util.X_transform_class(X_mean, X_std, device, dolog10index=None)[source]

Transform parameters from x --> (x-mean)/std or x --> (log10(x)-mean)/std

__call__(self, X)[source]

Parameters: X (torch tensor) – array with first dimension the same as the length of X_mean
Returns: torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters: path (string) – name of the pickle file
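
The standardization can be written out for a single parameter vector as follows. This is a sketch with plain Python lists; `standardize_x` is a hypothetical name, and it assumes that the mean and std of a log-scaled parameter are given in log10 space.

```python
import math


def standardize_x(x, mean, std, dolog10index=None):
    """x --> (x - mean) / std, using log10(x) for the listed indices."""
    log_idx = set(dolog10index or [])
    out = []
    for i, (xi, mi, si) in enumerate(zip(x, mean, std)):
        value = math.log10(xi) if i in log_idx else xi
        out.append((value - mi) / si)
    return out


x = [0.3, 1e-9]
mean = [0.25, -9.0]  # second entry is a mean in log10 space (assumption)
std = [0.05, 0.5]
x_std = standardize_x(x, mean, std, dolog10index=[1])
```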

class linna.util.Y_transform_class(y_mean, y_std, dev, ypositive=False)[source]

Transform data vector: y-->y*std+mean or y-->np.exp(y*std+mean)

__call__(self, y)[source]

Parameters: y (torch tensor) – data
Returns: transformed data
Return type: torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters: path (string) – name of the pickle file

class linna.util.Y_invtransform_class(y_mean, y_std, data_tensor, dev, ypositive=False)[source]

Transform data vector: y-->(y-mean)/std or y-->(np.log(y)-mean)/std (API is the same as Y_transform_class)

__call__(self, y)[source]

transform_cov(self, cov)[source]

Transform the associated covariance matrix when the data vector is transformed by 1/sigma

Parameters: cov (2d array) – covariance matrix
Returns: transformed covariance matrix
Return type: torch(2d array)

pickle(self, path)[source]
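
The two formulas for `Y_transform_class` and `Y_invtransform_class` are inverses of each other, which a short round trip makes explicit. The functions below are hypothetical list-based stand-ins for the torch implementations; they assume that `ypositive=True` selects the exp/log variant.

```python
import math


def y_forward(y, mean, std, ypositive=False):
    """y --> y*std + mean, or exp(y*std + mean) when ypositive is True."""
    out = [yi * si + mi for yi, mi, si in zip(y, mean, std)]
    return [math.exp(v) for v in out] if ypositive else out


def y_inverse(y, mean, std, ypositive=False):
    """y --> (y - mean)/std, or (log(y) - mean)/std when ypositive is True."""
    vals = [math.log(yi) for yi in y] if ypositive else list(y)
    return [(v - mi) / si for v, mi, si in zip(vals, mean, std)]


mean, std = [1.0, -2.0], [0.5, 2.0]
y_norm = [0.2, -1.5]
roundtrip_linear = y_inverse(y_forward(y_norm, mean, std), mean, std)
roundtrip_log = y_inverse(y_forward(y_norm, mean, std, True), mean, std, True)
```

The exp variant guarantees a strictly positive output, which suits data vectors that must stay positive.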

class linna.util._FunctionWrapper(f, args, kwargs)[source]

Bases: object

Only for internal use

__call__(self, x)[source]

linna.util.retrieve_model(outdir, inshape, outshape, nnmodel_in=ChtoModelv2)[source]

Retrieve the trained model

Parameters

outdir (string) – output directory
inshape (int) – input vector size of the model
outshape (int) – output vector size of the model
nnmodel_in (callable, optional) – neural network instance defined in nn.py

Returns

model, together with a linna.util.Y_invtransform_data callable that transforms the model output to the same space as the data vector

Return type

linna.predictor_gpu.Predictor

linna.util.retrieve_model_wrapper_in(outdir, nnmodel_in=ChtoModelv2, no_grad=True)[source]

Retrieve the trained model (more user friendly than retrieve_model)

Parameters

outdir (string) – output directory
nnmodel_in (callable, optional) – neural network instance defined in nn.py
no_grad (bool) – True: do not keep gradient information; False: keep gradient information

Returns

A function that takes cosmological and nuisance parameters (torch tensor) and returns the neural-network prediction of the data vector. The output is a torch.tensor, so it can be differentiated.

Return type

model (callable)

class linna.util.NN_samplerv1(outdir, prior_range)[source]

A class to perform neural network sampling for each iteration

generate_training_data(self, samples, model, pool=None, args=None, kwargs=None)[source]

Generate predicted data vector from a set of parameters

Parameters

samples (ndarray) – 2d array containing data with float type. Set of parameters in each row
model (function) – a function that takes a row of samples, args and kwargs and returns the predicted data vector
pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables).
args (lists, optional) – positional arguments to be passed to model
kwargs (lists, optional) – keyword arguments to be passed to model

Returns

each row corresponds to the output of model at each row of the samples.

Return type

ndarray

gensample_flat(self, Nsamples, omegab2cut=None)[source]

Generate parameters for training and validation using latin hypercube.

Parameters

Nsamples (int) – number of samples to be generated.
omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray
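
Latin hypercube sampling splits each parameter range into Nsamples equal strata and places exactly one point in each stratum. The sketch below is a self-contained illustration with the standard library; `latin_hypercube` and its `bounds` and `seed` arguments are hypothetical and do not describe how `gensample_flat` is implemented.

```python
import random


def latin_hypercube(nsamples, bounds, seed=None):
    """Return nsamples points, one per stratum along every dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One point in each stratum [k/n, (k+1)/n), then shuffle the order.
        unit = [(k + rng.random()) / nsamples for k in range(nsamples)]
        rng.shuffle(unit)
        columns.append([low + (high - low) * u for u in unit])
    # Transpose so that each row is one parameter set.
    return [list(row) for row in zip(*columns)]


bounds = [(0.1, 0.9), (0.6, 0.9)]  # illustrative prior ranges
samples = latin_hypercube(8, bounds, seed=0)
```

Compared with independent uniform draws, this layout covers each one-dimensional range evenly even for small sample sizes.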

gensample_chain(self, Nsamples, chain_in, nsigma, omegab2cut=None)[source]

Generate parameters for training and validation from a chain using a Latin hypercube.

Parameters

Nsamples (int) – number of samples to be generated.
chain_in (ndarray) – a mcmc chain.
nsigma (int) – up to this number an mcmc chain is generated
omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray

gensample_chain_randomsample(self, Nsamples, chain_in, nsigma, omegab2cut=None)[source]

Generate parameters for training and validation from a chain using a Latin hypercube.

Parameters

Nsamples (int) – number of samples to be generated.
chain_in (ndarray) – a mcmc chain.
nsigma (int) – up to this number an mcmc chain is generated
omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray

emcee_sample(self, log_prob, ndim, nwalkers, init, pool, transform, ntimes=50, tautol=0.01, dlnp=None, ddlnp=None, meanshift=0.1, stdshift=0.1, nk=1)[source]

Generate MCMC chains using emcee.

Parameters

log_prob (function) – function of posterior.
ndim (int) – the dimension of posterior
nwalkers (int) – number of mcmc walkers
init (ndarray) – array of init points of the sampler
pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables)
transform (function) – maps MCMC samples to the actual parameters

Zeus_sample(self, log_prob, ndim, nwalkers, init, pool, transform, ntimes=50, tautol=0.01, dlnp=None, ddlnp=None, meanshift=0.1, stdshift=0.1, nk=1)[source]

Generate MCMC chains using zeus.

Parameters

log_prob (function) – function of posterior.
ndim (int) – the dimension of posterior
nwalkers (int) – number of mcmc walkers
init (ndarray) – array of init points of the sampler
pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables)
transform (function) – maps MCMC samples to the actual parameters

_HMC_sample(self, log_prob, dlnp, ddlnp, ndim, nwalkers, init, pool, transform, samp_steps, samp_eps)[source]

_NUTS_sample(self, log_prob, dlnp, ddlnp, ndim, nwalkers, init, pool, transform, Madapt)[source]

linna.util.gaussianlogliklihood(m, data, invcov)[source]

class linna.util.Log_prob(data_new, invcov_new, model, y_invtransform_data, transform, temperature, loglikelihoodfunc, nograd=False)[source]

Class that evaluates the log-likelihood

__call__(self, x, returntorch=True, inputnumpy=True)[source]

Parameters

x (numpy array or tensor) –
returntorch (bool) – whether to return torch tensor or numpy array
inputnumpy (bool) – whether the input is in numpy array or torch tensor

Returns

tensor or numpy array
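
For orientation, a Gaussian log-likelihood built from a model vector, a data vector and an inverse covariance is usually written as -0.5 (m - d)^T C^{-1} (m - d), up to a constant. The sketch below assumes this standard form; the source does not document the body of `gaussianlogliklihood` or how `temperature` enters `Log_prob`, so neither is modelled here, and `gaussian_loglike` is a hypothetical name.

```python
def gaussian_loglike(model, data, invcov):
    """Return -0.5 * (m - d)^T C^{-1} (m - d), omitting the normalization."""
    resid = [m - d for m, d in zip(model, data)]
    n = len(resid)
    chi2 = sum(resid[i] * invcov[i][j] * resid[j] for i in range(n) for j in range(n))
    return -0.5 * chi2


data = [1.0, 2.0]
invcov = [[4.0, 0.0], [0.0, 1.0]]
logl = gaussian_loglike([1.5, 1.0], data, invcov)
```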

class linna.util.Dlnp(data_new, invcov_new, model, y_invtransform_data, transform, temperature)[source]

Class that evaluates the derivative of the log-likelihood (API the same as Log_prob)

__call__(self, x, lnP=None, returntorch=None, inputnumpy=None)[source]

class linna.util.Ddlnp(data_new, invcov_new, model, y_invtransform_data, transform, temperature)[source]

Class that evaluates the second derivative of the log-likelihood (API the same as Log_prob)

__call__(self, x)[source]

class linna.util.Auxilleryfunc(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class for internal use

__call__(self, y_pred, y_target)[source]

class linna.util.Loss_fn(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class defining the loss function

__call__(self, y_pred, y_target)[source]

Parameters

y_pred (tensor) – predicted data vector by nn
y_target (tensor) – targeted data vector

Returns

loss

Return type

torch.float

class linna.util.Val_metric_fn(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class for validation metric (API the same as loss)

__call__(self, y_pred, y_target)[source]

class linna.util.LogPrior(prior)[source]

Priors handling

__call__(self, xlist)[source]

Parameters: xlist (list) – data
Returns: prior
Return type: float
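
A log-prior over flat and Gaussian priors can be sketched as below. The prior description format and the name `log_prior` are assumptions for illustration only; the structure of the `prior` argument accepted by `LogPrior` is not specified in this page.

```python
import math


def log_prior(xlist, priors):
    """Sum of log prior densities; -inf outside the support of a flat prior."""
    total = 0.0
    for x, (kind, a, b) in zip(xlist, priors):
        if kind == "flat":
            # Uniform density 1 / (b - a) on [a, b], zero elsewhere.
            if not a <= x <= b:
                return -math.inf
            total -= math.log(b - a)
        else:
            # Gaussian density with mean a and standard deviation b.
            total += -0.5 * ((x - a) / b) ** 2 - math.log(b * math.sqrt(2.0 * math.pi))
    return total


# Hypothetical prior list: ("flat", low, high) or ("gauss", mean, std).
priors = [("flat", 0.0, 2.0), ("gauss", 0.0, 1.0)]
inside = log_prior([1.0, 0.0], priors)
outside = log_prior([3.0, 0.0], priors)
```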

linna.util.lnprior(x)[source]: internal function

linna.util.generate_training_point(theory, nnsampler, pool, outdir, ntrain, nval, data, invcov, chain=None, nsigma=1, omegab2cut=None, options=0, negloglike=None, nbest_in=None, chisqcut=None)[source]

Generate training point

Parameters

theory (callable) – model
nnsampler (NN_samplerv1 instance) –
pool (chtoPool instance) – mpi pool
outdir (string) – output directory
ntrain (int) – number of training data
nval (int) – number of validation data
data (1d array) – float array, data vector
invcov (2d array) – float array, inverse of covariance matrix
chain (array or None) – if None, generate from the prior; if an array, the training sample will be generated using this chain
nsigma (int) – if options == 0, build a Latin hypercube in the nsigma region of the chain.
omegab2cut (list or None) – additional cut on omegabh2, if list, omegab2cut = [index of omegab, index of h, lower limit, upper limit]
options (int) – if 0, generate using a Latin hypercube. If 1, randomly sample the chain
negloglike (callable) – if not None, generate one sample using the maximum likelihood method (minimize negloglike)
nbest_in (int, optional) – number of samples to be included in the training set according to the optimizer
chisqcut (float, optional) – cut the training data if their chisq is greater than this value
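
The `omegab2cut` list described above can be read as a filter on the product omegab*h^2. The sketch below applies such a filter to rows of parameters; `apply_omegab2cut` is a hypothetical helper, and treating both limits as inclusive is an assumption.

```python
def apply_omegab2cut(samples, omegab2cut):
    """Keep rows whose omegab * h**2 lies within [lower, upper]."""
    i_ob, i_h, low, high = omegab2cut
    return [row for row in samples if low <= row[i_ob] * row[i_h] ** 2 <= high]


# Columns (illustrative): [omegam, omegab, h]
samples = [
    [0.30, 0.045, 0.70],  # omegab*h^2 = 0.02205, kept
    [0.30, 0.060, 0.80],  # omegab*h^2 = 0.0384, rejected
    [0.30, 0.030, 0.60],  # omegab*h^2 = 0.0108, rejected
]
kept = apply_omegab2cut(samples, omegab2cut=[1, 2, 0.02, 0.03])
```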

linna.util.chisqcut_all(data, invcov, chisqcut, fnamey, fnamex)[source]: Internal function

linna.util.train_nn(outdir, model, train_x, train_y, val_x, val_y, X_transform, y_transform, loss_fn, val_metric_fn, dev='cpu', verbose=False, retrain=True, pool=None, nocpu=False, size=0, rank=0, params=None)[source]: Internal function

linna.util.median_absolute_deviation(y, median, dim)[source]: Internal function

linna.util.train_NN(nnsampler, cov, inv_cov, sigma, outdir_in, outdir_list, data, dolog10index=None, ypositive=False, retrain=True, norder=2, temperature=None, docuda=False, pool=None, tsize=1, nnmodel_in=None, params=None, usebest=False)[source]: Internal function

linna.util.run_mcmc(nnsampler, outdir, method, ndim, nwalkers, init, log_prob, dlnp=None, ddlnp=None, pool=None, transform=None, ntimes=50, tautol=0.01, meanshift=0.1, stdshift=0.1, nk=2)[source]

Run mcmc using the trained model

Parameters

nnsampler (NN_samplerv1 instance) –
outdir (string) – output directory
method (string) – mcmc method, only emcee or zeus is supported
ndim (int) – dimension of input parameters
nwalkers (int) – number of walkers
init (array) – numpy array with shape (nwalkers, ndim)
log_prob (callable) – likelihood
pool (None or chtoPool) – mpi pool
transform (Transform or None) – function to transform input
ntimes (int) – number of autocorrelation times
tautol (float) – tolerance of autocorrelation time error
meanshift (float) – maximum shift of the parameter mean estimates between the first half and second half of the chains, in units of sigma
stdshift (float) – maximum shift of the parameter error estimates between the first half and second half of the chains, in units of percent
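
The stopping parameters above suggest two kinds of checks: the chain should span many autocorrelation times with a stable autocorrelation estimate, and the two halves of the chain should agree. The sketch below is one possible reading of those descriptions for a single parameter, not the criterion implemented in `run_mcmc`. The function names are hypothetical, and `stdshift` is treated as a fractional change, which is an assumption.

```python
from statistics import mean, stdev


def halves_agree(chain, meanshift=0.1, stdshift=0.1):
    """Compare the first and second half of a one-parameter chain."""
    half = len(chain) // 2
    first, second = chain[:half], chain[half:]
    sigma = stdev(chain)
    # Shift of the mean between halves, in units of the chain's sigma.
    mean_ok = abs(mean(first) - mean(second)) / sigma <= meanshift
    # Fractional change of the error estimate between halves (assumed meaning).
    std_ok = abs(stdev(first) - stdev(second)) / stdev(second) <= stdshift
    return mean_ok and std_ok


def long_enough(nsteps, tau, tau_previous, ntimes=50, tautol=0.01):
    """Chain covers ntimes autocorrelation times and tau has stabilized."""
    tau_stable = abs(tau - tau_previous) / tau <= tautol
    return nsteps >= ntimes * tau and tau_stable


stationary = [(-1) ** i for i in range(200)]  # toy chain with no drift
drifting = list(range(200))  # toy chain whose mean keeps moving
```

A chain would be accepted only when both checks pass, for example `halves_agree(chain) and long_enough(nsteps, tau, tau_previous)`.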

linna.util.logp_theory_data(samples, theory, data, invcov, logprior)[source]: Internal function
