linna.util

Module Contents

Classes

CPU_Unpickler

This takes a binary file for reading a pickle data stream.

chtoPool

A reimplementation of schwimmbad.MPIPool that does not broadcast redundant functions

chtoMultiprocessPool

Pool class for multiprocessing

Transform

Transform parameters so that every prior is Gaussian with zero mean and unit variance

invTransform

Inverse of the `Transform` function.

ArrayDataset

Prepare data for torch

Y_transform_data

Transform data vector from y-->y/sigma

Y_invtransform_data

Transform data vector from y-->y*sigma (API is the same as Y_transform_data)

X_transform_class

Transform parameters from x --> (x-mean)/std or x --> (log10(x)-mean)/std

Y_transform_class

Transform data vector: y-->y*std+mean or y-->np.exp(y*std+mean)

Y_invtransform_class

Transform data vector: y-->(y-mean)/std or y-->(np.log(y)-mean)/std (API is the same as ``Y_transform_class``)

_FunctionWrapper

Only for internal use

NN_samplerv1

A class to perform neural network sampling for each iteration

Log_prob

Class that evaluates the log-likelihood

Dlnp

Class that evaluates the derivative of the log-likelihood (API is the same as Log_prob)

Ddlnp

Class that evaluates the second derivative of the log-likelihood (API is the same as Log_prob)

Auxilleryfunc

Class for internal use

Loss_fn

Class that defines the loss function

Val_metric_fn

Class for the validation metric (API is the same as Loss_fn)

LogPrior

Priors handling

Functions

makepositivedefinite(cov, fcut=0.99)

get_good_walker_list(log_prob_samples)

read_chain_and_cut(chainname, nk, ntimes=20, walkercut=False, method='emcee', flat=False)

_dummy_callback(x)

gauss2unif(x)

Transform a Gaussian-distributed random variable to a uniformly distributed variable

invgauss2unif(x)

Inverse of the transform from a Gaussian-distributed random variable to a uniformly distributed variable

retrieve_model(outdir, inshape, outshape, nnmodel_in=ChtoModelv2)

Retrieve the trained model

retrieve_model_wrapper_in(outdir, nnmodel_in=ChtoModelv2, no_grad=True)

Retrieve the trained model (more user friendly than retrieve_model)

gaussianlogliklihood(m, data, invcov)

lnprior(x)

internal function

generate_training_point(theory, nnsampler, pool, outdir, ntrain, nval, data, invcov, chain=None, nsigma=1, omegab2cut=None, options=0, negloglike=None, nbest_in=None, chisqcut=None)

Generate training points

chisqcut_all(data, invcov, chisqcut, fnamey, fnamex)

Internal function

train_nn(outdir, model, train_x, train_y, val_x, val_y, X_transform, y_transform, loss_fn, val_metric_fn, dev='cpu', verbose=False, retrain=True, pool=None, nocpu=False, size=0, rank=0, params=None)

Internal function

median_absolute_deviation(y, median, dim)

Internal function

train_NN(nnsampler, cov, inv_cov, sigma, outdir_in, outdir_list, data, dolog10index=None, ypositive=False, retrain=True, norder=2, temperature=None, docuda=False, pool=None, tsize=1, nnmodel_in=None, params=None, usebest=False)

Internal function

run_mcmc(nnsampler, outdir, method, ndim, nwalkers, init, log_prob, dlnp=None, ddlnp=None, pool=None, transform=None, ntimes=50, tautol=0.01, meanshift=0.1, stdshift=0.1, nk=2)

Run mcmc using the trained model

logp_theory_data(samples, theory, data, invcov, logprior)

Internal function

Attributes

initialize

linna.util.initialize = False[source]
linna.util.makepositivedefinite(cov, fcut=0.99)[source]
class linna.util.CPU_Unpickler[source]

Bases: linna.nn.pickle.Unpickler

This takes a binary file for reading a pickle data stream.

The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.

The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes. Thus file can be a binary file object opened for reading, an io.BytesIO object, or any other custom object that meets this interface.

Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is True, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.

find_class(self, module, name)[source]

Return an object from a specified module.

If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).

This method is called whenever a class or a function object is needed. Both arguments passed are str objects.
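As a rough illustration of what overriding `find_class` looks like, here is a minimal sketch built only on the standard-library `pickle` module. It does not reproduce what `CPU_Unpickler` does (the text above does not say how it resolves objects); the allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are hypothetical names introduced for this example.

```python
import builtins
import io
import pickle

# Hypothetical allow-list for this sketch; not taken from linna.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that only resolves a small set of builtins."""

    def find_class(self, module, name):
        # Called by the unpickler whenever a class or function is needed;
        # both arguments are str objects.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Analogue of pickle.loads() that goes through RestrictedUnpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

A subclass can use the same hook to redirect a lookup to a different object instead of rejecting it, which is the general mechanism available to any `Unpickler` subclass.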

linna.util.get_good_walker_list(log_prob_samples)[source]
linna.util.read_chain_and_cut(chainname, nk, ntimes=20, walkercut=False, method='emcee', flat=False)[source]
linna.util._dummy_callback(x)[source]
class linna.util.chtoPool(comm=None)[source]

Bases: schwimmbad.MPIPool

A reimplementation of schwimmbad.MPIPool that does not broadcast redundant functions

wait(self)[source]

Workers will listen to the main process

map(self, worker, tasks, callback=None)[source]

Evaluate a function or callable on each task in parallel using MPI.

The callable, worker, is called on each element of the tasks iterable. The results are returned in the expected order (symmetric with tasks).

Parameters
  • worker (callable) – A function or callable object that is executed on each element of the specified tasks iterable. This object must be picklable (i.e. it can’t be a function scoped within a function or a lambda function). This should accept a single positional argument and return a single object.

  • tasks (iterable) – A list or iterable of tasks. Each task can be itself an iterable (e.g., tuple) of values or data to pass in to the worker function.

  • callback (callable, optional) – An optional callback function (or callable) that is called with the result from each worker run and is executed on the master process. This is useful for, e.g., saving results to a file, since the callback is only called on the master thread.

Returns

A list of results from the output of each worker() call.

Return type

list
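The `map(worker, tasks, callback=None)` contract described above (one call per task, results in task order, callback run on the calling side) can be sketched with a serial stand-in. `SerialPoolSketch` and `square` are hypothetical names for this illustration; no MPI is involved and this is not the `chtoPool` implementation.

```python
class SerialPoolSketch:
    """Serial stand-in for the map(worker, tasks, callback) contract."""

    def map(self, worker, tasks, callback=None):
        results = []
        for task in tasks:
            result = worker(task)   # worker takes a single positional argument
            if callback is not None:
                callback(result)    # runs in the calling (master) process
            results.append(result)
        return results              # same order as tasks


def square(x):
    # Defined at module level, so it would also be picklable for a real MPI pool.
    return x * x


saved = []
pool = SerialPoolSketch()
out = pool.map(square, [1, 2, 3], callback=saved.append)
```

Writing code against this small interface makes it possible to swap a serial pool for a parallel one without touching the worker function.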

noduplicate_close(self)[source]

Reset no duplicate function

bcast(self, worker, args, sizemax)[source]

Broadcast function to all the workers:

Parameters
  • worker (callable) – a function which you want to parallelize

  • args (list) – list of things to be passed to worker

  • sizemax (int) – number of workers you wish to use

class linna.util.chtoMultiprocessPool(nwalker)[source]

Pool class for multiprocessing

map(self, worker, tasks, callback=None)[source]
Parameters
  • worker (function) – function of worker

  • tasks (list of objects) – list of tasks

Returns

list of objects

noduplicate_close(self)[source]

close the pool

is_master(self)[source]
linna.util.gauss2unif(x)[source]

Transform a Gaussian-distributed random variable to a uniformly distributed variable

Parameters

x (torch.tensor) – input

Returns

output

Return type

torch.tensor

linna.util.invgauss2unif(x)[source]

Inverse of the transform from a Gaussian-distributed random variable to a uniformly distributed variable

Parameters

x (torch.tensor) – input

Returns

output

Return type

torch.tensor
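One common way to realize such a pair of maps is the standard normal CDF and its inverse. The sketch below uses scalars and the standard library rather than torch tensors, and it assumes a standard normal input and a target interval of (0, 1); the range convention used by `gauss2unif` itself is not stated above. The names `gauss_to_unif` and `unif_to_gauss` are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def gauss_to_unif(x):
    """Map a standard-normal variate to (0, 1) via the normal CDF."""
    return _STD_NORMAL.cdf(x)


def unif_to_gauss(u):
    """Inverse map: a uniform variate in (0, 1) to a standard-normal variate."""
    return _STD_NORMAL.inv_cdf(u)
```

Applying one function after the other returns the starting value up to floating-point error.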

class linna.util.Transform(priors)[source]

Transform parameters so that every prior is Gaussian with zero mean and unit variance

__call__(self, x, returnnumpy=True, inputnumpy=True)[source]

Transform parameters so that every prior is Gaussian with zero mean and unit variance

Parameters
  • x (nd array or torch array) – array of parameters

  • returnnumpy (bool) – If true, the return value is a numpy array. Otherwise, the return value is a torch tensor

  • inputnumpy (bool) – If true, the input x is expected to be a numpy array. Otherwise, x is expected to be a torch tensor

Returns

Transformed parameters; the container type depends on returnnumpy

Return type

numpy array or torch tensor

class linna.util.invTransform(priors)[source]

Inverse of the `Transform` function.

__call__(self, x, returnnumpy=True, inputnumpy=True)[source]
Parameters
  • x (nd array or torch array) – array of parameters

  • returnnumpy (bool) – If true, the return value is a numpy array. Otherwise, the return value is a torch tensor

  • inputnumpy (bool) – If true, the input x is expected to be a numpy array. Otherwise, x is expected to be a torch tensor

Returns

Inverse-transformed parameters; the container type depends on returnnumpy

Return type

numpy array or torch tensor
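To make the idea behind `Transform` and `invTransform` concrete, the following sketch maps each parameter to a space where its prior is a unit Gaussian and back again. The prior description used here, tuples of the form `("gauss", mean, sigma)` or `("flat", lower, upper)`, is an assumption made for the example and is not the format of the `priors` argument, which is not documented above. The function names are hypothetical as well.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Hypothetical prior description: (kind, a, b), where
#   ("gauss", mean, sigma) is a Gaussian prior and
#   ("flat", lower, upper) is a flat prior.
PRIORS = [("gauss", 0.3, 0.05), ("flat", 0.5, 1.0)]


def to_unit_gaussian(x, priors):
    """Map parameters so that each prior becomes N(0, 1)."""
    out = []
    for value, (kind, a, b) in zip(x, priors):
        if kind == "gauss":
            out.append((value - a) / b)
        else:
            # Flat prior: rescale to (0, 1), then apply the inverse normal CDF.
            # Values exactly on a bound are outside the open interval and raise.
            out.append(_STD_NORMAL.inv_cdf((value - a) / (b - a)))
    return out


def from_unit_gaussian(z, priors):
    """Inverse of to_unit_gaussian."""
    out = []
    for value, (kind, a, b) in zip(z, priors):
        if kind == "gauss":
            out.append(a + b * value)
        else:
            out.append(a + (b - a) * _STD_NORMAL.cdf(value))
    return out


z = to_unit_gaussian([0.35, 0.75], PRIORS)
x_back = from_unit_gaussian(z, PRIORS)
```

Sampling in the transformed space lets a sampler treat every direction on the same scale, which is presumably the motivation for the class.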

class linna.util.ArrayDataset(X, y)[source]

Bases: torch.utils.data.Dataset

Prepare data for torch

__len__(self)[source]
__getitem__(self, i)[source]
class linna.util.Y_transform_data(sigma, device)[source]

Transform data vector from y–>y/sigma

__call__(self, y)[source]
Parameters

y (torch tensor) – data vector

Returns

y/sigma

Return type

torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters

path (string) – name of the pickle file

transform_cov(self, cov)[source]

Transform the associated covariance matrix when the data vector is transformed by 1/sigma

Parameters

cov (2d array) – covariance matrix

Returns

transformed covariance matrix

Return type

torch(2d array)
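When a data vector is rescaled element-wise as y/sigma, its covariance rescales as C_ij/(sigma_i sigma_j). A small sketch with plain Python lists follows; it assumes `sigma` holds one positive scale per data element, and the helper names `scale_data` and `scale_cov` are hypothetical.

```python
def scale_data(y, sigma):
    """Element-wise y --> y / sigma."""
    return [yi / si for yi, si in zip(y, sigma)]


def scale_cov(cov, sigma):
    """Covariance of the rescaled vector: C_ij / (sigma_i * sigma_j)."""
    n = len(sigma)
    return [[cov[i][j] / (sigma[i] * sigma[j]) for j in range(n)]
            for i in range(n)]


sigma = [2.0, 4.0]
cov = [[4.0, 2.0], [2.0, 16.0]]
scaled_y = scale_data([2.0, 8.0], sigma)
scaled_cov = scale_cov(cov, sigma)
```

If `sigma` is the square root of the diagonal of `cov`, as in this example, the rescaled matrix has ones on its diagonal.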

class linna.util.Y_invtransform_data(sigma, device)[source]

Transform data vector from y–>y*sigma (API is the same as Y_transform_data)

__call__(self, y)[source]
pickle(self, path)[source]
class linna.util.X_transform_class(X_mean, X_std, device, dolog10index=None)[source]

Transform parameters from x –> (x-mean)/std or x –> (log10(x)-mean)/std

__call__(self, X)[source]
Parameters

X (torch tensor) – array with first dimension the same as the length of X_mean

Returns

torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters

path (string) – name of the pickle file
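The standardization described for `X_transform_class` can be sketched without torch as follows. It assumes that `dolog10index` lists the positions of the parameters that are standardized in log10 space and that the corresponding mean and std entries were computed in log10 space too; neither detail is confirmed above. `standardize_params` is a hypothetical name.

```python
import math


def standardize_params(x, mean, std, dolog10index=None):
    """x --> (x - mean)/std, or (log10(x) - mean)/std for selected indices."""
    log_idx = set(dolog10index or [])
    out = []
    for i, (xi, mi, si) in enumerate(zip(x, mean, std)):
        value = math.log10(xi) if i in log_idx else xi
        out.append((value - mi) / si)
    return out


scaled = standardize_params([3.0, 100.0], [1.0, 1.0], [2.0, 0.5],
                            dolog10index=[1])
```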

class linna.util.Y_transform_class(y_mean, y_std, dev, ypositive=False)[source]

Transform data vector: y –> y*std+mean or y –> np.exp(y*std+mean)

__call__(self, y)[source]
Parameters

y (torch tensor) – data

Returns

transformed data

Return type

torch tensor

pickle(self, path)[source]

Pickle the transform

Parameters

path (string) – name of the pickle file
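The map documented for `Y_transform_class` and the inverse map documented for `Y_invtransform_class` form a round trip. The sketch below uses plain lists and assumes that `ypositive=True` selects the exp/log branch, which the signatures suggest but the text does not state. The names `denormalize` and `normalize` are hypothetical.

```python
import math


def denormalize(y, mean, std, ypositive=False):
    """y --> y*std + mean, followed by exp() when ypositive is True."""
    out = [yi * si + mi for yi, mi, si in zip(y, mean, std)]
    return [math.exp(v) for v in out] if ypositive else out


def normalize(y, mean, std, ypositive=False):
    """Inverse map: y --> (y - mean)/std, taking log(y) first when ypositive is True."""
    src = [math.log(v) for v in y] if ypositive else list(y)
    return [(v - mi) / si for v, mi, si in zip(src, mean, std)]


mean, std = [1.0, -2.0], [0.5, 2.0]
z = [0.2, -1.0]
roundtrip = normalize(denormalize(z, mean, std, ypositive=True),
                      mean, std, ypositive=True)
```

Working in log space is a common way to guarantee that a network trained on normalized targets always predicts positive values once the output is mapped back.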

class linna.util.Y_invtransform_class(y_mean, y_std, data_tensor, dev, ypositive=False)[source]

Transform data vector: y –> (y-mean)/std or y –> (np.log(y)-mean)/std (API is the same as ``Y_transform_class``)

__call__(self, y)[source]
transform_cov(self, cov)[source]

Transform the associated covariance matrix when the data vector is transformed by 1/sigma

Parameters

cov (2d array) – covariance matrix

Returns

transformed covariance matrix

Return type

torch(2d array)

pickle(self, path)[source]
class linna.util._FunctionWrapper(f, args, kwargs)[source]

Bases: object

Only for internal use

__call__(self, x)[source]
linna.util.retrieve_model(outdir, inshape, outshape, nnmodel_in=ChtoModelv2)[source]

Retrieve the trained model

Parameters
  • outdir (string) – output directory

  • inshape (int) – input vector size of the model

  • outshape (int) – output vector size of the model

  • nnmodel_in (callable, optional) – neural network instance defined in nn.py

Returns

model, and a linna.util.Y_invtransform_data callable that transforms the model output to the same space as the data vector

Return type

linna.predictor_gpu.Predictor

linna.util.retrieve_model_wrapper_in(outdir, nnmodel_in=ChtoModelv2, no_grad=True)[source]

Retrieve the trained model (more user friendly than retrieve_model)

Parameters
  • outdir (string) – output directory

  • nnmodel_in (callable, optional) – neural network instance defined in nn.py

  • no_grad (bool) – True: do not keep gradient information; False: keep gradient information

Returns

A function that takes cosmological and nuisance parameters (torch tensor) and returns the neural-network prediction of the data vector. The output is a torch.tensor, so it can be differentiated.

Return type

model (callable)

class linna.util.NN_samplerv1(outdir, prior_range)[source]

A class to perform neural network sampling for each iteration

generate_training_data(self, samples, model, pool=None, args=None, kwargs=None)[source]

Generate predicted data vector from a set of parameters

Parameters
  • samples (ndarray) – 2d array containing data with float type. Set of parameters in each row

  • model (function) – a function that takes a row of samples, args and kwargs and returns the predicted data vector

  • pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables).

  • args (lists, optional) – args or kwargs to be passed to model

  • kwargs (lists, optional) – args or kwargs to be passed to model

Returns

each row corresponds to the output of model at each row of the samples.

Return type

ndarray

gensample_flat(self, Nsamples, omegab2cut=None)[source]

Generate parameters for training and validation using a Latin hypercube.

Parameters
  • Nsamples (int) – number of samples to be generated.

  • omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray
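For readers unfamiliar with Latin hypercube sampling, a minimal standard-library sketch is given below. It draws one point per equal-width stratum in every dimension and shuffles the strata independently per dimension. It does not apply any `omegab2cut` selection and is not the `gensample_flat` implementation; `latin_hypercube` and its `bounds` argument are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples rows, one value per (lower, upper) pair in bounds."""
    rng = random.Random(seed)
    columns = []
    for lower, upper in bounds:
        # One point inside each of the n_samples equal-width strata of [0, 1).
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(points)
        columns.append([lower + p * (upper - lower) for p in points])
    # Transpose: each row is one parameter set.
    return [list(row) for row in zip(*columns)]


samples = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], seed=0)
```

Compared with independent uniform draws, this guarantees that every one-dimensional slice of the prior range is covered by the training set.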

gensample_chain(self, Nsamples, chain_in, nsigma, omegab2cut=None)[source]

Generate parameters for training and validation from a chain using a Latin hypercube.

Parameters
  • Nsamples (int) – number of samples to be generated.

  • chain_in (ndarray) – a mcmc chain.

  • nsigma (int) – up to this number an mcmc chain is generated

  • omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray

gensample_chain_randomsample(self, Nsamples, chain_in, nsigma, omegab2cut=None)[source]

Generate parameters for training and validation from a chain using a Latin hypercube.

Parameters
  • Nsamples (int) – number of samples to be generated.

  • chain_in (ndarray) – a mcmc chain.

  • nsigma (int) – up to this number an mcmc chain is generated

  • omegab2cut (list of int) – 2 elements containing the lower and upper limits of omegab*h^2

Returns

a 2d array containing data with float type. Parameters for training and validation

Return type

ndarray

emcee_sample(self, log_prob, ndim, nwalkers, init, pool, transform, ntimes=50, tautol=0.01, dlnp=None, ddlnp=None, meanshift=0.1, stdshift=0.1, nk=1)[source]

Generate MCMC chains using emcee.

Parameters
  • log_prob (function) – function of posterior.

  • ndim (int) – the dimension of posterior

  • nwalkers (int) – number of mcmc walkers

  • init (ndarray) – array of initial points for the sampler

  • pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables)

  • transform (function) – mapping from mcmc samples to the actual parameters

Zeus_sample(self, log_prob, ndim, nwalkers, init, pool, transform, ntimes=50, tautol=0.01, dlnp=None, ddlnp=None, meanshift=0.1, stdshift=0.1, nk=1)[source]

Generate MCMC chains using zeus.

Parameters
  • log_prob (function) – function of posterior.

  • ndim (int) – the dimension of posterior

  • nwalkers (int) – number of mcmc walkers

  • init (ndarray) – array of initial points for the sampler

  • pool (mpi pool, optional) – an MPI pool instance that can do pool.map(function, iterables)

  • transform (function) – mapping from mcmc samples to the actual parameters

_HMC_sample(self, log_prob, dlnp, ddlnp, ndim, nwalkers, init, pool, transform, samp_steps, samp_eps)[source]
_NUTS_sample(self, log_prob, dlnp, ddlnp, ndim, nwalkers, init, pool, transform, Madapt)[source]
linna.util.gaussianlogliklihood(m, data, invcov)[source]
class linna.util.Log_prob(data_new, invcov_new, model, y_invtransform_data, transform, temperature, loglikelihoodfunc, nograd=False)[source]

Class that evaluates the log-likelihood

__call__(self, x, returntorch=True, inputnumpy=True)[source]
Parameters
  • x (numpy array or tensor) –

  • returntorch (bool) – whether to return torch tensor or numpy array

  • inputnumpy (bool) – whether the input is in numpy array or torch tensor

Returns

tensor or numpy array
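The argument names (`data_new`, `invcov_new`, `loglikelihoodfunc`) and the function `gaussianlogliklihood(m, data, invcov)` point to a Gaussian likelihood. A minimal sketch of that standard formula, with the normalization constant dropped, is shown here on plain lists. How `Log_prob` combines it with the prior and with `temperature` is not stated above and is left out; `gaussian_loglike` is a hypothetical name.

```python
def gaussian_loglike(model, data, invcov):
    """Return -0.5 * r^T C^{-1} r with r = model - data (constant term dropped)."""
    r = [m - d for m, d in zip(model, data)]
    n = len(r)
    chi2 = sum(r[i] * invcov[i][j] * r[j] for i in range(n) for j in range(n))
    return -0.5 * chi2


value = gaussian_loglike([1.0, 2.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 0.25]])
```

In this example the residual is (1, 2), so chi-square is 1*1 + 0.25*4 = 2 and the log-likelihood is -1.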

class linna.util.Dlnp(data_new, invcov_new, model, y_invtransform_data, transform, temperature)[source]

Class that evaluates the derivative of the log-likelihood (API is the same as Log_prob)

__call__(self, x, lnP=None, returntorch=None, inputnumpy=None)[source]
class linna.util.Ddlnp(data_new, invcov_new, model, y_invtransform_data, transform, temperature)[source]

Class that evaluates the second derivative of the log-likelihood (API is the same as Log_prob)

__call__(self, x)[source]
class linna.util.Auxilleryfunc(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class for internal use

__call__(self, y_pred, y_target)[source]
class linna.util.Loss_fn(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class that defines the loss function

__call__(self, y_pred, y_target)[source]
Parameters
  • y_pred (tensor) – predicted data vector by nn

  • y_target (tensor) – targeted data vector

Returns

loss

Return type

torch.float

class linna.util.Val_metric_fn(data_in, cov_tensor, inv_cov_tensor, y_transform_data, y_inv_transform, device)[source]

Class for the validation metric (API is the same as Loss_fn)

__call__(self, y_pred, y_target)[source]
class linna.util.LogPrior(prior)[source]

Priors handling

__call__(self, xlist)[source]
Parameters

xlist (list) – data

Returns

prior

Return type

float

linna.util.lnprior(x)[source]

Internal function

linna.util.generate_training_point(theory, nnsampler, pool, outdir, ntrain, nval, data, invcov, chain=None, nsigma=1, omegab2cut=None, options=0, negloglike=None, nbest_in=None, chisqcut=None)[source]

Generate training points

Parameters
  • theory (callable) – model

  • nnsampler (NN_samplerv1 instance) –

  • pool (chtoPool instance) – mpi pool

  • outdir (string) – output directory

  • ntrain (int) – number of training data

  • nval (int) – number of validation data

  • data (1d array) – float array, data vector

  • invcov (2d array) – float array, inverse of covariance matrix

  • chain (array or None) – if None, generate from the prior; if an array, the training sample is generated using this chain

  • nsigma (int) – if options == 0, a Latin hypercube is built within the nsigma region of the chain

  • omegab2cut (list or None) – additional cut on omegabh2; if a list, omegab2cut = [index of omegab, index of h, lower limit, upper limit]

  • options (int) – if 0, generate using a Latin hypercube. If 1, randomly sample the chain

  • negloglike (callable) – if not None, generate one sample using the maximum likelihood method (minimize negloglike)

  • nbest_in (int, optional) – number of samples to be included in the training set according to the optimizer

  • chisqcut (float, optional) – cut training data whose chisq is greater than this value

linna.util.chisqcut_all(data, invcov, chisqcut, fnamey, fnamex)[source]

Internal function

linna.util.train_nn(outdir, model, train_x, train_y, val_x, val_y, X_transform, y_transform, loss_fn, val_metric_fn, dev='cpu', verbose=False, retrain=True, pool=None, nocpu=False, size=0, rank=0, params=None)[source]

Internal function

linna.util.median_absolute_deviation(y, median, dim)[source]

Internal function

linna.util.train_NN(nnsampler, cov, inv_cov, sigma, outdir_in, outdir_list, data, dolog10index=None, ypositive=False, retrain=True, norder=2, temperature=None, docuda=False, pool=None, tsize=1, nnmodel_in=None, params=None, usebest=False)[source]

Internal function

linna.util.run_mcmc(nnsampler, outdir, method, ndim, nwalkers, init, log_prob, dlnp=None, ddlnp=None, pool=None, transform=None, ntimes=50, tautol=0.01, meanshift=0.1, stdshift=0.1, nk=2)[source]

Run mcmc using the trained model

Parameters
  • nnsampler (NN_samplerv1 instance) –

  • outdir (string) – output directory

  • method (string) – mcmc method; only emcee and zeus are supported

  • ndim (int) – dimension of input parameters

  • nwalkers (int) – number of walkers

  • init (array) – numpy array with shape (nwalkers, ndim)

  • log_prob (callable) – likelihood

  • pool (None or chtoPool) – mpi pool

  • transform (Transform or None) – function to transform input

  • ntimes (int) – number of autocorrelation times

  • tautol (float) – tolerance of autocorrelation time error

  • meanshift (float) – maximum shifts of parameter mean estimations between first half and second half of the chains in unit of sigma

  • stdshift (float) – maximum shift of parameter error estimation between first half and second half of the chains in unit of percent
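The `meanshift` and `stdshift` criteria compare the first and second halves of a chain. One possible reading is sketched below for a single parameter: the difference of the half-chain means is measured in units of the full-chain standard deviation, and the difference of the half-chain standard deviations is taken as a fraction of it. Both normalizations are assumptions, as is the helper name `halves_agree`; the check performed inside `run_mcmc` may differ.

```python
from statistics import mean, stdev


def halves_agree(chain, meanshift=0.1, stdshift=0.1):
    """Compare first and second half of a 1-D chain of samples."""
    half = len(chain) // 2
    first, second = chain[:half], chain[half:]
    sigma = stdev(chain)  # full-chain scatter used as the unit
    mean_ok = abs(mean(first) - mean(second)) / sigma < meanshift
    std_ok = abs(stdev(first) - stdev(second)) / sigma < stdshift
    return mean_ok and std_ok


stable = halves_agree([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
drifting = halves_agree([float(i) for i in range(100)])
```

A chain whose two halves disagree by more than the tolerances, like the steadily drifting one above, would be treated as not yet converged under this reading.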

linna.util.logp_theory_data(samples, theory, data, invcov, logprior)[source]

Internal function