`pemtk.fit.fitClass`

PEMtk fitting base classes

Development for fitting class, take pemtk.dataClass as base and add some subselection & fitting methods.

10/05/21 v1b added fitting routines using ePSproc and lmfit libraries. Adapted from dev code notebook (functional forms), still needs some tidying up & wrapping for class. For dev code see http://127.0.0.1:8888/lab/tree/dev/PEMtk/fitting/fitting_routines_dev_PEMtk_300421.ipynb

13/10/20 v1

TODO:

Clean up/finalise data scheme. Currently mix of dictionary style, self.data[dataset][datatype] and class attribs self.datatype. Better to go with former for flexibility, or latter for using as simple base class to wrap later? Former works with some existing plotting fns., but complicated.
afblmMatEfit() defaults similarly messy, although basically working.
More analysis tools for fitting & results to add, currently just self.fit() to run.

Module Contents

Classes

pemtkFit

Class prototype for pemtkFit class. Dev version builds on dataClass, and adds some basic subselection & fitting capabilities.

Attributes

snsFlag

pemtk.fit.fitClass.snsFlag = True[source]

class pemtk.fit.fitClass.pemtkFit(matE=None, data=None, ADM=None, backend='afblmXprod', **kwargs)[source]

Bases: pemtk.data.dataClasses.dataClass

Class prototype for pemtkFit class. Dev version builds on dataClass, and adds some basic subselection & fitting capabilities.

backends(backend=None)[source]

Set backends (model functions) for fitting & select for use.

Pass backend = ‘name of backend’ to select and set backend (model function) from the presets.

Pass None to return dict of available backends.

If a function is passed it is set directly.

Settings are pushed to self.backend (name) and self.fitOpts[‘backend’] (function handle).

01/09/22 v2, modified to set directly, rather than by return. 22/08/22 v1
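The three calling modes described above (None, a preset name, or a function) can be sketched as a small dispatch routine. This is only an illustration of the pattern, not the PEMtk implementation: `select_backend`, `PRESETS` and the two placeholder model functions are hypothetical names introduced for this example.

```python
def afblm_model(x):
    """Placeholder model function (hypothetical stand-in for a real backend)."""
    return ("AF", x)


def mfblm_model(x):
    """Placeholder model function (hypothetical stand-in for a real backend)."""
    return ("MF", x)


# Hypothetical preset table: backend name -> function handle.
PRESETS = {"afblmXprod": afblm_model, "mfblmXprod": mfblm_model}


def select_backend(backend=None, presets=PRESETS):
    """Return (name, function) for a backend, or the preset dict if backend is None."""
    if backend is None:
        return dict(presets)                 # report what is available
    if callable(backend):
        return backend.__name__, backend     # function passed directly
    if backend in presets:
        return backend, presets[backend]     # preset selected by name
    raise ValueError(f"Unknown backend {backend!r}; available: {sorted(presets)}")


name, fn = select_backend("mfblmXprod")
```

In the class itself the selected name and function handle are stored on the instance (self.backend and self.fitOpts[‘backend’]) rather than returned, as noted in the v2 changelog entry.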

setData(keyExpt=None, keyData=None)[source]

Data prototype - this will be used to set experimental data to the master structure. For now, set data by passing, and computational data can be set for testing, by passing a key. This basically assumes that the expt. provides AFBLMs.

TO CONSIDER:

Data format, file IO for HDF5?
Routines to read VMI images and process etc, planned for experimental code-base.
Further simulation options, e.g. add noise etc., for testing.

setWeights(wConfig=None, keyExpt=None, keyData=None, **kwargs)[source]

Wrapper for setting weights for/from data. Basically follows self.setData, with some additional options.

Will set self.data[keyExpt][‘weights’] from existing data if keyData is a string, or from keyData as passed otherwise.

Parameters:

wConfig (optional, str, default = None) –

Additional handling for weights.

’poission’, set Poissonian weights to match data dims using self.setPoissWeights()
’errors’, set weights as 1/(self.data[keyExpt][‘weights’]**2)
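The two wConfig options can be illustrated with plain Python lists. This is a minimal sketch under stated assumptions: the helper names (`weights_from_errors`, `poisson_sample`, `poisson_weights`) are hypothetical, the Poisson draw uses Knuth's algorithm from the standard library rather than whatever self.setPoissWeights() does internally, and the mean `lam=1.0` is an arbitrary choice for the example.

```python
import math
import random


def weights_from_errors(sigma):
    """Inverse-variance weights, w = 1/sigma**2, for a list of uncertainties."""
    return [1.0 / (s ** 2) for s in sigma]


def poisson_sample(lam, rng):
    """Draw one Poisson-distributed integer (Knuth's algorithm, suited to small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def poisson_weights(n, lam=1.0, seed=None):
    """One Poisson(lam) weight per data point, e.g. for a Poisson bootstrap."""
    rng = random.Random(seed)
    return [poisson_sample(lam, rng) for _ in range(n)]


w_err = weights_from_errors([0.5, 2.0])   # [4.0, 0.25]
w_boot = poisson_weights(5, seed=42)      # five non-negative integers
```

A weight of zero in the Poisson case simply drops that data point from the resampled fit, while a weight of two counts it twice.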

setMatE(**kwargs)[source]: Thin wrapper for ep.setMatE.setMatE(), pass args & set returns to self.data[‘matE’]

setADMs(**kwargs)[source]: Thin wrapper for ep.setADMs(), pass args & set returns to self.data[‘ADM’]

setPolGeoms(**kwargs)[source]: Thin wrapper for ep.setPolGeoms(), pass args & set returns to self.data[‘pol’]

setSubset(dataKey, dataType, sliceParams=None, subKey=None, resetSelectors=False, **kwargs)[source]

Threshold and subselect on matrix elements.

Wrapper for epsproc.Esubset() and epsproc.matEleSelector(), to handle data array slicing & subselection from params dict.

Subselected elements are set to self.data[subKey][dataType], where subKey defaults to self.subKey (uses existing .data structure for compatibility with existing functions!)

To additionally slice data, set dict of parameters sliceParams = {‘sliceDim’:[start, stop, step]}
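As a rough illustration of the sliceParams format, the [start, stop, step] lists map naturally onto Python slice objects. The helper `build_slices` and the example values are hypothetical, and whether the wrapped ePSproc routines slice by label or by position is not stated here, so the sketch applies the slice to a plain list only.

```python
def build_slices(sliceParams):
    """Convert {'dim': [start, stop, step]} into {'dim': slice(start, stop, step)}."""
    return {dim: slice(*bounds) for dim, bounds in sliceParams.items()}


sliceParams = {"Eke": [0, 10, 2]}
slices = build_slices(sliceParams)

# Applied positionally to a plain list standing in for the 'Eke' coordinate.
eke = list(range(20))
eke_subset = eke[slices["Eke"]]   # [0, 2, 4, 6, 8]
```

Keeping one entry per dimension name means several dimensions can be sliced in a single call by adding further keys to the dict.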

To reset existing parameters, pass resetSelectors = True.

To do: better slice handling - will likely have issues with different slices for different dataTypes in current form.

setMatEFit(matE=None, paramsCons='auto', refPhase=0, colDim='it', verbose=1)[source]

Convert an input Xarray into (mag,phase) array of matrix elements for fitting routine.

Parameters:

matE (Xarray) – Input set of matrix elements, used to set allowed (l,m,mu) and input parameters. If not passed, use self.data[self.subKey][‘matE’].
paramsCons (dict, optional, default = 'auto') – Input dictionary of constraints (expressions) to be set for the parameters. See https://lmfit.github.io/lmfit-py/constraints.html If ‘auto’, parameters will be set via self.symCheck()
refPhase (int or string, default = 0) – Set reference phase by integer index or name (string). If set to None (or other types) no reference phase will be set.
colDim (dict, default = 'it') –
Quick hack to allow for restacking via ep.multiDimXrToPD, this will set to cols = ‘it’, then restack to 1D dataframe. This should always work for setting matE > fit parameters, but can be overridden if required.

This is convenient for converting to Pandas > lmfit inputs, but should redo directly from Xarray for more robust treatment. For ePS matrix elements the default should always work, although it will drop degenerate cases (it>1), but this shouldn’t matter here.

TODO:
- make this better, support for multiple selectors.
- For eps case, matE.pd may already be set?

Returns:

params (lmfit parameters object) – Set of fitting parameters.
lmmu (dict) – List of states and mappings from states to fitting parameters (names & indexes).

29/06/21: Adapted to use ‘it’ on restack, then set to single-column with dummy dim. No selection methods, use self.setSubset() for this.
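The (mag, phase) representation and the role of a reference phase can be sketched with the standard library alone. This is an illustration under assumptions, not the class code: `to_mag_phase` and `from_mag_phase` are hypothetical helpers, the reference is chosen by dictionary key rather than by integer index or parameter name, and the (l, m, mu) labels and values are invented for the example.

```python
import cmath


def to_mag_phase(matE, refKey=None):
    """Map {label: complex} to {label: (magnitude, phase)}, phases relative to refKey."""
    refPhase = cmath.phase(matE[refKey]) if refKey is not None else 0.0
    out = {}
    for label, value in matE.items():
        mag, phase = cmath.polar(value)
        # Wrap the relative phase back into the range [-pi, pi].
        rel = cmath.phase(cmath.rect(1.0, phase - refPhase))
        out[label] = (mag, rel)
    return out


def from_mag_phase(params):
    """Inverse mapping: rebuild complex values from (magnitude, phase) pairs."""
    return {label: cmath.rect(mag, phase) for label, (mag, phase) in params.items()}


# Hypothetical (l, m, mu) labels and values, for illustration only.
matE = {(1, 0, 0): 1 + 1j, (3, 0, 0): 0.5j}
params = to_mag_phase(matE, refKey=(1, 0, 0))
rebuilt = from_mag_phase(params)
```

Fixing one phase to zero removes the overall phase, which the fit cannot determine; the rebuilt values therefore match the inputs in magnitude and relative phase but not in absolute phase.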

reconParams(params=None, lmmuList=None)[source]

Convert parameters object > Xarray for tensor multiplication.

VERY UGLY! Should be a neater way to do this with existing Xarray and just replace/link values (i.e. pointers)?

… but it does work.

randomizeParams()[source]: Set random values for self.params.

afblmMatEfit(matE=None, data=None, lmmuList=None, basis=None, ADM=None, pol=None, resetBasis=False, selDims={}, thres=None, thresDims='Eke', lmModelFlag=False, XSflag=True, weights=None, backend=None, debug=False, **kwargs)[source]

Wrap epsproc.geomFunc.afblmXprod() for use with lmfit fitting routines.

Parameters:

matE (Xarray or lmfit Parameters object) – Matrix elements to use in calculation. For Parameters object, also require lmmuList to convert to Xarray. If not passed, use self.data[self.subKey][‘matE’].
data (Xarray, optional, default = None) – Data for fitting. If set, return residual. If not set, return model result.
lmmuList (list, optional, default = None) – Mapping for parameters. Uses self.lmmu if not passed.
basis (dict, optional, default = None) – Pre-computed basis set to use for calculations. If not set try to use self.basis, or passed set of ADMs. NOTE: currently defaults to self.basis if it exists, pass resetBasis=True to force overwrite.
ADM (Xarray) – Set of ADMs (alignment parameters). Not required if basis is set.
pol (Xarray) – NOTE: currently NOT used for epsproc.geomFunc.afblmXprod() Set of polarization geometries (Euler angles). Not required if basis is set. (If not set, defaults to ep.setPolGeoms())
resetBasis (bool, optional, default=False) – Force self.basis overwrite with updated values. NOT YET IMPLEMENTED
selDims (dict, default = {}) – Selectors passed to backend. TODO: should use global options here.
thres (default = None) – Selectors passed to backend. TODO: should use global options here.
thresDims (default = 'Eke') – Selectors passed to backend. TODO: should use global options here.
lmModelFlag (bool, optional, default=False) – Output option for flat results structure for lmfit testing.
XSflag (bool, optional, default=True) – Use absolute cross-section (XS) in fitting? This is passed to backends as BLMRenorm flag. If true, use passed B00(t) values in fit, and do not renormalise. If false, renorm by B00(t), i.e. all values will be set to unity (B00(t)=1).
weights (int, Xarray or np.array, optional, default = None) –
Weights to use for residual calculation.
- If set, return np.sqrt(weights) * residual. (Must match size of data along key dimension(s).)
- If None, try to use self.data[self.subKey][‘weights’].

If that is not found, or is None, an unweighted residual will be returned.

For bootstrap sampling, Poissonian weights can be used, see https://en.wikipedia.org/wiki/Bootstrapping_(statistics)#Poisson_bootstrap
Use self.setWeights() for this, e.g. weights = rng.poisson(weights, data.t.size)
To use uncertainties from the data, set weights = 1/(sigma^2)

backend (function, optional, default = None) –

UPDATED 21/08/22 - now default = None, uses self.fitOpts[‘backend’]. Set at class init, see also self.backends().

Testing 12/08/22: Supports backend = afblmXprod or mfblmXprod, and tests OK with the latter. NOTE - when passing a function externally, it may need to be defined in the base namespace. (Default case uses locally defined function.) E.g.

data.afblmMatEfit(backend = ep.mfblmXprod) should be OK following import epsproc as ep
data.afblmMatEfit(backend = mfblmXprod) will fail

debug (bool, optional, default = False) –

Print additional debug output for testing.

**kwargs (optional) –

Additional args passed to backends.
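The return behaviour set by the data and weights parameters above (model result without data, residual with data, scaled by sqrt(weights) when weights are available) can be sketched as follows. The function `model_or_residual` is a hypothetical stand-in using plain lists, and the sign convention (model minus data) is an assumption of this sketch.

```python
import math


def model_or_residual(model, data=None, weights=None):
    """Return the model if no data is given, else a (weighted) residual.

    The sign convention (model - data) is an assumption of this sketch.
    """
    if data is None:
        return list(model)
    residual = [m - d for m, d in zip(model, data)]
    if weights is None:
        return residual                       # unweighted residual
    if isinstance(weights, (int, float)):
        weights = [weights] * len(residual)   # broadcast a scalar weight
    return [math.sqrt(w) * r for w, r in zip(weights, residual)]


model = [1.0, 2.0, 3.0]
data = [0.5, 2.0, 4.0]
res = model_or_residual(model, data, weights=[4.0, 1.0, 0.25])   # [1.0, 0.0, -0.5]
```

The square root appears because a least-squares minimizer squares each returned element, so the summed objective becomes the sum of weights * residual**2.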

NOTE:

some assumptions here, will probably need to run once to setup (with ADMs), then fit using basis returned.
Currently fitting abs matrix elements and renorm Betas. This sort-of works, but gives big errors on |matE|. Should add options to renorm matE for this case, and options for known B00 values.

TODO:

Consolidate weights to main data structure.
04/05/22: added to basis return as basis[‘weights’], may want to pipe back to self.data[self.subKey][‘weights’], or just set elsewhere?
More sophisticated bootstrapping methods, maybe with https://github.com/smartass101/xr-random and https://arch.readthedocs.io/en/latest/index.html

21/08/22: now with improved backend handling, working for AF and MF case.
12/08/22: testing for MF fitting. Initial tests for case where BASIS PASSED ONLY, otherwise still runs AF calc.
02/05/22: added weights options and updated docs.

fit(fcn_args=None, fcn_kws=None, fitInd=None, keepSubset=False, **kwargs)[source]

Wrapper to run lmfit.Minimizer, for details see https://lmfit.github.io/lmfit-py/fitting.html#lmfit.minimizer.Minimizer

Uses preset self.params for parameters, and self.data[self.subKey] for data.

Default case runs a Levenberg-Marquardt minimization (method=’leastsq’), which lmfit implements via scipy.optimize.leastsq() (see the Scipy docs for more options), using the AF fitting model epsproc.geomFunc.afblmXprod() calculation routine. For MF fitting backend set fcn_kws[‘backend’] = ep.geomFunc.mfblmXprod

Parameters:

fcn_args (tuple, optional, default = None) – Positional arguments to pass to the fitting function. If None, will be set as (self.data[self.subKey][‘AFBLM’], self.lmmu, self.basis)
fcn_kws (dict, optional, default = {}) – Keyword arguments to pass to the fitting function. For MF fitting backend set fcn_kws[‘backend’] = ep.geomFunc.mfblmXprod

fitInd (int, optional, default = None) –

If None, will use self.fitInd. For parallel usage, supply explicit fitInd instead of using class var to ensure unique key per fit.

keepSubset (bool, optional, default = False) –

If True, keep a copy of self.data[self.subKey] in self.data[fitInd][self.subKey]

**kwargs

Passed to the fitting functions, for options see:

For lmfit options and defaults see https://lmfit.github.io/lmfit-py/fitting.html
For scipy (lmfit backend) see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html
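The note on fitInd for parallel usage can be illustrated with a generic pattern: each task receives its own explicit index, so results collected from several workers never share a key. This is a hedged sketch only; `run_fit` is a hypothetical stand-in that returns dummy values, and the use of ThreadPoolExecutor is an assumption for the example rather than the mechanism PEMtk uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fit(fitInd):
    """Stand-in for one fit; returns its key together with a dummy result."""
    return fitInd, {"success": True, "chisqr": 1.0 / (fitInd + 1)}


# Each task gets an explicit, unique fitInd, so results never share a key.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(run_fit, range(8)))
```

Relying on a shared counter instead (the equivalent of self.fitInd) would risk two workers reading the same value and overwriting each other's entry.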

18/08/22: debugged for MF fitting case, now can pass MF backend via fcn_kws[‘backend’] = ep.geomFunc.mfblmXprod

02/05/22: added **kwargs for backends.

07/09/21: updating for parallel use. Note that main outputs (self.results etc.) are now dropped. May want to set to last result?
