Basic PEMtk fitting class demo

03/06/21

Tidy version for docs, derived from http://127.0.0.1:8888/lab/tree/dev/PEMtk/fitting/fitting_routines_class-demo_v1_110521-tidy-StimpyTest_020621.ipynb

11/05/21

First version of class test/demo.

See dev notebook for background.

Outline of this notebook:

Load required packages.
Set up pemtkFit object.
- Set various parameters, either from existing data or new values.
- Data is handled as a set of dictionaries within the class, self.data[key][dataType], where key is an arbitrary label for, e.g. a specific experiment, calculation etc, and dataType contains a set of values, parameters etc. (Should become clear below!)
- Methods operate on all self.data items in general, with some special cases: self.data['subset'] contains data to be used in fitting.
Simulate data
- Use ePSproc to simulate aligned-frame measurements.
Fit data
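
As a rough picture of the data layout described in the outline, the nested dictionaries can be mimicked with plain Python. This is only an illustrative sketch: the placeholder strings and the helper set_subset are stand-ins invented for the example, and the real class stores Xarray objects and further metadata rather than strings.

```python
# Illustrative stand-in for the self.data[key][dataType] layout.
# Keys and values here are placeholders, not output from PEMtk.
data = {
    'sim':    {'AFBLM': 'simulated AF-BLM values'},
    'pol':    {'pol': 'polarisation geometries'},
    'subset': {},   # filled with whatever is selected for fitting
}

def set_subset(data, key, data_type):
    """Copy one item into data['subset'] (hypothetical helper, loosely mimicking setSubset)."""
    data['subset'][data_type] = data[key][data_type]
    return data['subset'][data_type]

set_subset(data, 'sim', 'AFBLM')
print(sorted(data['subset']))   # dataTypes currently selected for fitting
```

The point of the sketch is only that 'subset' sits alongside the other keys and holds whichever items the fit should use.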

Prerequisites

Working installation of ePSproc + PEMtk (or local copies of the Git repos, which can be pointed at for setup below).
Test/demo data, from ePSproc Github repo.

Setup

Imports

A few standard imports…

[1]:

import sys
import os
from pathlib import Path
# import numpy as np
# import epsproc as ep
# import xarray as xr

from datetime import datetime as dt
timeString = dt.now()

And local module imports. This should work either for installed versions (e.g. via pip install), or for test code via setting the base path below to point at your local copies.

[2]:

# For module testing, include path to module here, otherwise use global installation
if sys.platform == "win32":
    modPath = Path(r'D:\code\github')  # Win test machine
    winFlag = True
else:
    modPath = Path(r'/home/femtolab/github')  # Linux test machine
    winFlag = False

# Append to sys path
sys.path.append((modPath/'ePSproc').as_posix())
sys.path.append((modPath/'PEMtk').as_posix())

[3]:

# ePSproc
import epsproc as ep

# Set data path
# Note this is set here from ep.__path__, but may not be correct in all cases - depends on where the Github repo is.
epDemoDataPath = Path(ep.__path__[0]).parent/'data'

[4]:

# PEMtk
# import pemtk as pm
# from pemtk.data.dataClasses import dataClass

# Import fitting class
from pemtk.fit.fitClass import pemtkFit

[5]:

# Set HTML output style for Xarray in notebooks (optional), may also depend on version of Jupyter notebook or lab, or Xr
# See http://xarray.pydata.org/en/stable/generated/xarray.set_options.html
# if isnotebook():
# xr.set_options(display_style = 'html')

[6]:

# Set some plot options
ep.plot.hvPlotters.setPlotters()

Polarisation geometry/ies

This wraps ep.setPolGeoms. This defaults to (x,y,z) polarization geometries. Values are set in self.data['pol'].

Note: if this is not set, the default value will be used, which is likely not very useful for the fit!
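
For orientation, the quaternions listed in the output of the next cell can be reproduced by hand from the Euler angles shown there. The sketch below assumes a z-y-z Euler angle convention, which is consistent with the printed values; euler_to_quat is a hypothetical helper written for this example and is not the routine ePSproc uses internally.

```python
import math

def euler_to_quat(alpha, beta, gamma):
    """Quaternion (w, x, y, z) for z-y-z Euler angles in radians (assumed convention)."""
    w = math.cos(beta / 2) * math.cos((alpha + gamma) / 2)
    x = -math.sin(beta / 2) * math.sin((alpha - gamma) / 2)
    y = math.sin(beta / 2) * math.cos((alpha - gamma) / 2)
    z = math.cos(beta / 2) * math.sin((alpha + gamma) / 2)
    return (w, x, y, z)

# Euler angles for the default geometries, labelled as in the notebook output
geoms = {
    'z': (0.0, 0.0, 0.0),
    'x': (0.0, math.pi / 2, 0.0),
    'y': (math.pi / 2, math.pi / 2, 0.0),
}

for label, angs in geoms.items():
    print(label, [round(q, 6) for q in euler_to_quat(*angs)])
```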

[14]:

data.setPolGeoms()
data.data['pol']['pol']

[14]:

<xarray.DataArray (Labels: 3)>
array([quaternion(1, -0, 0, 0),
       quaternion(0.707106781186548, -0, 0.707106781186548, 0),
       quaternion(0.5, -0.5, 0.5, 0.5)], dtype=quaternion)
Coordinates:
    Euler    (Labels) object (0.0, 0.0, 0.0) ... (1.5707963267948966, 1.5707963267948966, 0.0)
    Labels   (Labels) <U18 'z' 'x' 'y'
Attributes:
    dataType:  Euler

[15]:

# # data.setPolGeoms(eulerAngs = [[0,0,0]], labels = ['z'])
# data.setPolGeoms(eulerAngs = [0,0,0], labels = 'z')
# data.data['pol']['pol']  #.swap_dims({'Euler':'Labels'})

[16]:

# data.data['pol']['pol'] = data.data['pol']['pol'].swap_dims({'Euler':'Labels'})
# data.selOpts['pol'] = {'inds': {'Labels': 'z'}}
# data.setSubset(dataKey = 'pol', dataType = 'pol')

Compute AF-\(\beta_{LM}\)s

[23]:

# data.afblmMatEfit(data = None)  # OK
BetaNormX, basis = data.afblmMatEfit()  # OK, uses default polarizations & ADMs as set in data['subset']
# BetaNormX, basis = data.afblmMatEfit(ADM = data.data['subset']['ADM'])  # OK, but currently using default polarizations
# BetaNormX, basis = data.afblmMatEfit(ADM = data.data['subset']['ADM'], pol = data.data['pol']['pol'].sel(Labels=['x']))
# BetaNormX, basis = data.afblmMatEfit(ADM = data.data['subset']['ADM'], pol = data.data['pol']['pol'].sel(Labels=['x','y']))  # This fails for a single label...?
# BetaNormX, basis = data.afblmMatEfit(RX=data.data['pol']['pol'])  # This currently fails, need to check for consistency in ep.sphCalc.WDcalc()
                                                                    # - looks like set values and inputs are not consistent in this case? Not passing angs correctly, or overriding?
                                                                    # - See also recently-added sfError flag, which may cause additional problems.

Set the data to fit

Here we’ll use the values calculated above as our test data. This currently needs to be set as self.data['subset']['AFBLM'] for fitting.

[28]:

# data.data['subset']['AFBLM'] = BetaNormX  # Set manually

data.setData('sim', BetaNormX)  # Set simulated data to master structure as "sim"
data.setSubset('sim','AFBLM')   # Set to 'subset' to use for fitting.

Subselected from dataset 'sim', dataType 'AFBLM': 195 from 195 points (100.00%)

[29]:

# Set basis functions
data.basis = basis

Setting up the fit parameters

In this case, we can work from the existing matrix elements to speed up parameter creation, although in practice this may need to be approached ab initio. Nonetheless, the method will be the same, and the ab initio case is detailed later.

[30]:

# Input set, as defined earlier
data.data['subset']['matE'].pd

[30]:

Cont  Targ  Total  it  l  m   mu
PU    SG    PU     1   1  -1   1    1.162941-1.353670j
                           1  -1    1.162941-1.353670j
                       3  -1   1   -0.802725-0.016979j
                           1  -1   -0.802725-0.016979j
SU    SG    SU     1   1   0   0   -2.317060+1.358736j
                       3   0   0    1.105722-0.087176j
dtype: complex128

[31]:

# data.setMatEFit()  # Need to fix self.subset usage
data.setMatEFit(data.data['subset']['matE'])  #, Eke=1.1) # Some hard-coded things to fix here! Now roughly working.

Set 6 complex matrix elements to 12 fitting params, see self.params for details.

name	value	initial value	min	max	vary
m_PU_SG_PU_1_n1_1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_1_1_n1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_1_n1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.68606212	2.686062120382649	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.10915311	1.109153108617096	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	False
p_PU_SG_PU_1_1_n1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	True
p_PU_SG_PU_3_1_n1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	2.61122920	2.611229196458127	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	-0.07867828	-0.07867827542158025	-3.14159265	3.14159265	True

This sets self.params from the matrix elements as a set of (real) parameters for lmfit, stored as a Parameters object.

Note that:

- The input matrix elements are converted to magnitude-phase form, hence there are twice as many parameters as input matrix elements. These are labelled m or p accordingly, along with a name based on the full set of QNs/indexes.
- One phase is set to vary=False, which defines a reference phase. This defaults to the first phase item.
- Min and max values are defined; by default the ranges are 1e-4<mag<5, -pi<phase<pi.
- No relationships between the parameters are set by default (apart from the single fixed phase), but these can be set manually, see section below.
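
The magnitude-phase conversion can be checked by hand with the standard library. The sketch below takes three of the complex matrix elements from the input set shown above and reproduces the corresponding m and p values. The naming pattern (index values joined by underscores, with -1 written as n1) is inferred from the parameter table, and a plain dictionary stands in for the lmfit Parameters object; neither is taken from the PEMtk source.

```python
import cmath

# Matrix elements from the input set shown above, keyed by (Cont, Targ, Total, l, m, mu)
matE = {
    ('PU', 'SG', 'PU', 1, -1, 1): 1.162941 - 1.353670j,
    ('PU', 'SG', 'PU', 3, -1, 1): -0.802725 - 0.016979j,
    ('SU', 'SG', 'SU', 1, 0, 0): -2.317060 + 1.358736j,
}

def label(index):
    """Join index values with '_', writing negative integers as n<abs> (e.g. -1 -> n1)."""
    parts = []
    for v in index:
        parts.append(f"n{abs(v)}" if isinstance(v, int) and v < 0 else str(v))
    return "_".join(parts)

params = {}
for index, value in matE.items():
    mag, phase = cmath.polar(value)      # phase is returned in [-pi, pi]
    params["m_" + label(index)] = mag
    params["p_" + label(index)] = phase

for name, value in params.items():
    print(f"{name:20s} {value: .8f}")
```

Each complex input yields one magnitude and one phase, which is why 6 matrix elements become 12 fitting parameters.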

[32]:

data.params

[32]:

name	value	initial value	min	max	vary
m_PU_SG_PU_1_n1_1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_1_1_n1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_1_n1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.68606212	2.686062120382649	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.10915311	1.109153108617096	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	False
p_PU_SG_PU_1_1_n1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	True
p_PU_SG_PU_3_1_n1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	2.61122920	2.611229196458127	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	-0.07867828	-0.07867827542158025	-3.14159265	3.14159265	True

Running a fit…

With the parameters and data set, just call self.fit()!

Statistics and outputs are handled by lmfit, which includes uncertainty estimates and correlations in the fitted parameters.
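
The summary statistics in the fit report can be cross-checked by hand. The sketch below assumes the definitions given in the lmfit documentation (reduced chi-square = chi-square / (N - N_vary), AIC = N ln(chi-square / N) + 2 N_vary, BIC = N ln(chi-square / N) + ln(N) N_vary) and applies them to the chi-square value from the first fit output in this section. fit_stats is a hypothetical helper, not part of lmfit or PEMtk.

```python
import math

def fit_stats(chisqr, ndata, nvarys):
    """Reduced chi-square, AIC and BIC from chi-square, number of data points and free variables."""
    redchi = chisqr / (ndata - nvarys)
    n_log_term = ndata * math.log(chisqr / ndata)
    aic = n_log_term + 2 * nvarys
    bic = n_log_term + math.log(ndata) * nvarys
    return redchi, aic, bic

# Values from the first fit reported in this section
redchi, aic, bic = fit_stats(chisqr=5.3662e-31, ndata=195, nvarys=11)
print(f"reduced chi-square = {redchi:.4e}")
print(f"AIC = {aic:.2f}, BIC = {bic:.2f}")
```

Small differences in the last digits relative to the report are expected, since the chi-square value used here is the rounded one from the printed table.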

[33]:

data.fit()

[34]:

# Check fit outputs
data.result

[34]:

Fit Statistics

fitting method	leastsq
# function evals	13
# data points	195
# variables	11
chi-square	5.3662e-31
reduced chi-square	2.9164e-33
Akaike info crit.	-14597.7369
Bayesian info crit.	-14561.7339

Variables

name	value	standard error	relative error	initial value	min	max	vary
m_PU_SG_PU_1_n1_1	1.78461575	1.4711e-09	(0.00%)	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_1_1_n1	1.78461575	1.4711e-09	(0.00%)	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	0.80290495	2.6141e-09	(0.00%)	0.802904951323892	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_1_n1	0.80290495	2.6141e-09	(0.00%)	0.802904951323892	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.68606212	6.0138e-16	(0.00%)	2.686062120382649	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.10915311	1.4590e-15	(0.00%)	1.109153108617096	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.86104140	0.00000000	(0.00%)	-0.8610414024232179	-3.14159265	3.14159265	False
p_PU_SG_PU_1_1_n1	-0.86104140	2.5882e-15	(0.00%)	-0.8610414024232179	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	-3.12044446	2.9686e-09	(0.00%)	-3.1204444620772467	-3.14159265	3.14159265	True
p_PU_SG_PU_3_1_n1	-3.12044446	2.9686e-09	(0.00%)	-3.1204444620772467	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	2.61122920	5.6788e-16	(0.00%)	2.611229196458127	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	-0.07867828	1.5863e-15	(0.00%)	-0.07867827542158025	-3.14159265	3.14159265	True

Correlations (unreported correlations are < 0.100)

m_PU_SG_PU_1_n1_1	m_PU_SG_PU_1_1_n1	-1.0000
m_PU_SG_PU_3_n1_1	m_PU_SG_PU_3_1_n1	-1.0000
p_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_1_n1	-1.0000
m_SU_SG_SU_1_0_0	m_SU_SG_SU_3_0_0	-0.9896
m_SU_SG_SU_3_0_0	p_SU_SG_SU_3_0_0	-0.8970
m_SU_SG_SU_1_0_0	p_SU_SG_SU_3_0_0	0.8887
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_1_0_0	-0.7707
p_SU_SG_SU_1_0_0	p_SU_SG_SU_3_0_0	-0.7543
m_SU_SG_SU_1_0_0	p_SU_SG_SU_1_0_0	-0.6756
m_SU_SG_SU_3_0_0	p_SU_SG_SU_1_0_0	0.6633
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_3_1_n1	0.4627
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	-0.4627
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_3_n1_1	-0.4627
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_n1_1	0.4627
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_1_1_n1	-0.4471
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_1_1_n1	0.4471
m_PU_SG_PU_1_n1_1	m_PU_SG_PU_3_1_n1	-0.4414
m_PU_SG_PU_1_1_n1	m_PU_SG_PU_3_1_n1	0.4414
m_PU_SG_PU_1_n1_1	m_PU_SG_PU_3_n1_1	0.4414
m_PU_SG_PU_1_1_n1	m_PU_SG_PU_3_n1_1	-0.4414
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_1_1_n1	0.3751
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_1_1_n1	-0.3751
p_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	-0.3431
p_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_n1_1	0.3431
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_3_0_0	-0.3314
m_PU_SG_PU_3_n1_1	m_SU_SG_SU_3_0_0	0.3314
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_1_0_0	0.3303
m_PU_SG_PU_3_n1_1	m_SU_SG_SU_1_0_0	-0.3303
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	0.2973
m_SU_SG_SU_1_0_0	p_PU_SG_PU_1_1_n1	0.2958
m_PU_SG_PU_1_n1_1	m_SU_SG_SU_3_0_0	0.2929
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_3_0_0	-0.2929
m_SU_SG_SU_3_0_0	p_PU_SG_PU_1_1_n1	-0.2914
m_PU_SG_PU_1_n1_1	m_SU_SG_SU_1_0_0	-0.2913
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_1_0_0	0.2913
m_SU_SG_SU_3_0_0	p_PU_SG_PU_3_1_n1	0.2865
m_SU_SG_SU_3_0_0	p_PU_SG_PU_3_n1_1	-0.2865
m_PU_SG_PU_1_n1_1	p_SU_SG_SU_1_0_0	0.2840
m_PU_SG_PU_1_1_n1	p_SU_SG_SU_1_0_0	-0.2840
m_PU_SG_PU_3_1_n1	p_SU_SG_SU_3_0_0	0.2821
m_PU_SG_PU_3_n1_1	p_SU_SG_SU_3_0_0	-0.2821
m_PU_SG_PU_3_1_n1	p_SU_SG_SU_1_0_0	-0.2818
m_PU_SG_PU_3_n1_1	p_SU_SG_SU_1_0_0	0.2818
m_SU_SG_SU_1_0_0	p_PU_SG_PU_3_1_n1	-0.2814
m_SU_SG_SU_1_0_0	p_PU_SG_PU_3_n1_1	0.2814
m_PU_SG_PU_1_n1_1	p_SU_SG_SU_3_0_0	-0.2200
m_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	0.2200
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_3_1_n1	-0.2023
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_1_n1	0.2023
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_3_n1_1	0.2023
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_n1_1	-0.2023
p_PU_SG_PU_3_1_n1	p_SU_SG_SU_1_0_0	0.1343
p_PU_SG_PU_3_n1_1	p_SU_SG_SU_1_0_0	-0.1343
p_PU_SG_PU_3_1_n1	p_SU_SG_SU_3_0_0	-0.1313
p_PU_SG_PU_3_n1_1	p_SU_SG_SU_3_0_0	0.1313

Results vs. data can be (crudely) plotted with self.BLMfitPlot() (better plotting routines to follow!).

By default this plots the most recent fit results against the subset data used as input to the fitting routine.

[35]:

# Plot data subset (--x) plus fit (solid lines)
data.BLMfitPlot()

Dataset: subset, AFBLM
Dataset: 0, AFBLM

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_59_1.png

[36]:

# Fit results are currently added to the main data dict by an index number
data.data.keys()

[36]:

dict_keys(['orb6', 'orb5', 'ADM', 'pol', 'subset', 'sim', 0])

[37]:

# Plot results with lmPlot wrapper - this also defaults to show data subset + most recent fit results
data.lmPlotFit()

Plotting data n2_3sg_0.1-50.1eV_A2.inp.out, pType=a, thres=0.01, with Seaborn
Plotting data (No filename), pType=a, thres=0.01, with Seaborn

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_61_1.png

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_61_2.png

Fitting with randomised parameter inputs

Here we might expect some variation in the results, depending on various properties (ionizing channel, dataset size etc.), and also for the fitting to take a bit longer (see benchmarks later).

TODO:

More careful analysis routines here, run vs. number of input points & test fidelity.
Parallelize.
Variation over runs/general statistical analysis.
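
As a rough picture of what a randomised start means here: in the fit results below, the initial values of all varying parameters lie in the [0,1] interval, while the fixed reference phase keeps its value. The sketch mimics that behaviour with a simplified parameter set. How randomizeParams() works internally is an assumption; randomize and the list-based parameter layout are hypothetical stand-ins.

```python
import random

# Simplified stand-in for the parameter set: name -> [value, vary]
params = {
    'm_PU_SG_PU_1_n1_1': [1.78461575, True],
    'm_SU_SG_SU_1_0_0':  [2.68606212, True],
    'p_PU_SG_PU_1_n1_1': [-0.86104140, False],   # reference phase, held fixed
    'p_SU_SG_SU_1_0_0':  [2.61122920, True],
}

def randomize(params, rng=random):
    """Replace the value of every varying parameter with a uniform draw on [0, 1)."""
    for entry in params.values():
        if entry[1]:
            entry[0] = rng.random()
    return params

randomize(params, random.Random(42))   # seeded only to make the sketch repeatable
for name, (value, vary) in params.items():
    print(f"{name:20s} {value: .6f}  vary={vary}")
```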

[38]:

# Basic randomize routine, [0,1] interval
data.randomizeParams()

[39]:

data.fit()
data.result

[39]:

Fit Statistics

fitting method	leastsq
# function evals	687
# data points	195
# variables	11
chi-square	0.00178259
reduced chi-square	9.6880e-06
Akaike info crit.	-2240.52413
Bayesian info crit.	-2204.52113

Variables

name	value	standard error	relative error	initial value	min	max	vary
m_PU_SG_PU_1_n1_1	1.0000e-04	0.02878885	(28788.85%)	0.02129591573558398	1.0000e-04	5.00000000	True
m_PU_SG_PU_1_1_n1	2.65953073	0.09359366	(3.52%)	0.5003341770940677	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	1.77943189	0.83227191	(46.77%)	0.6217616830375992	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_1_n1	2.13431007	0.99444910	(46.59%)	0.5114874598511951	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.71674517	0.04721051	(1.74%)	0.3903702867415594	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.05873885	0.12108910	(11.44%)	0.1518467511430327	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.86104140	0.00000000	(0.00%)	-0.8610414024232179	-3.14159265	3.14159265	False
p_PU_SG_PU_1_1_n1	1.14805770	0.06955446	(6.06%)	0.045436187967926034	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	-3.14159265	0.02623102	(0.83%)	0.4122447260821296	-3.14159265	3.14159265	True
p_PU_SG_PU_3_1_n1	3.14159265	0.02336103	(0.74%)	0.09510612061972745	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	-2.02120566	0.05776807	(2.86%)	0.30505876118085085	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	1.77009004	0.15150596	(8.56%)	0.8346487505349893	-3.14159265	3.14159265	True

Correlations (unreported correlations are < 0.100)

m_PU_SG_PU_3_n1_1	m_PU_SG_PU_3_1_n1	-0.9995
m_SU_SG_SU_1_0_0	m_SU_SG_SU_3_0_0	-0.9941
m_SU_SG_SU_3_0_0	p_SU_SG_SU_3_0_0	0.9395
m_SU_SG_SU_1_0_0	p_SU_SG_SU_3_0_0	-0.9376
m_PU_SG_PU_1_1_n1	m_PU_SG_PU_3_n1_1	0.8858
m_PU_SG_PU_1_1_n1	m_PU_SG_PU_3_1_n1	-0.8829
p_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	-0.7629
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_3_0_0	0.7511
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_1_0_0	-0.7489
m_PU_SG_PU_3_n1_1	m_SU_SG_SU_3_0_0	-0.7485
p_PU_SG_PU_3_1_n1	p_SU_SG_SU_1_0_0	-0.7475
m_PU_SG_PU_3_n1_1	m_SU_SG_SU_1_0_0	0.7465
m_PU_SG_PU_3_1_n1	p_SU_SG_SU_3_0_0	0.6694
m_PU_SG_PU_3_n1_1	p_SU_SG_SU_3_0_0	-0.6644
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_1_n1	-0.5807
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_3_1_n1	0.5726
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_1_1_n1	0.5688
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	-0.5174
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_1_0_0	0.5150
p_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_n1_1	0.4827
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_3_0_0	-0.4446
p_PU_SG_PU_3_n1_1	p_SU_SG_SU_1_0_0	0.4399
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_1_0_0	0.4394
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_1_1_n1	0.4066
m_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	-0.3893
p_SU_SG_SU_1_0_0	p_SU_SG_SU_3_0_0	0.3701
p_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_1_n1	-0.3492
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_1_1_n1	0.3439
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_1_1_n1	-0.3308
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	0.3272
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_3_1_n1	-0.2833
m_SU_SG_SU_3_0_0	p_PU_SG_PU_3_1_n1	0.2446
m_SU_SG_SU_1_0_0	p_PU_SG_PU_3_1_n1	-0.2388
m_PU_SG_PU_1_n1_1	p_SU_SG_SU_1_0_0	0.1740
m_PU_SG_PU_1_n1_1	m_PU_SG_PU_1_1_n1	0.1679
m_PU_SG_PU_1_n1_1	p_PU_SG_PU_3_n1_1	-0.1568
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_n1_1	0.1397
m_PU_SG_PU_1_1_n1	p_SU_SG_SU_1_0_0	-0.1345
m_SU_SG_SU_3_0_0	p_PU_SG_PU_3_n1_1	-0.1326
m_SU_SG_SU_1_0_0	p_PU_SG_PU_3_n1_1	0.1259
m_SU_SG_SU_1_0_0	p_PU_SG_PU_1_1_n1	-0.1236
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_3_n1_1	-0.1203
m_PU_SG_PU_1_n1_1	m_PU_SG_PU_3_n1_1	0.1195
m_SU_SG_SU_3_0_0	p_PU_SG_PU_1_1_n1	0.1193
p_PU_SG_PU_3_n1_1	p_SU_SG_SU_3_0_0	0.1174
m_PU_SG_PU_3_n1_1	p_PU_SG_PU_3_n1_1	0.1150
m_SU_SG_SU_3_0_0	p_SU_SG_SU_1_0_0	0.1076
m_SU_SG_SU_1_0_0	p_SU_SG_SU_1_0_0	-0.1062

[40]:

# Note that if keys is not set, BLMfitPlot will show only most recent fit run results.
data.BLMfitPlot()

Dataset: subset, AFBLM
Dataset: 1, AFBLM

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_65_1.png

Timing for the test fit (next cell) - note this is currently just running with defaults, single core only, and we’re not checking for good fidelity here. Fits take between 1 and 12 mins, approximately.

Results may vary depending on the inputs… for the current test case:

1st test: The slowest run took 12.88 times longer than the fastest. This could mean that an intermediate result is being cached. 50.7 s ± 1min 2s per loop (mean ± std. dev. of 7 runs, 1 loop each)
2nd test: The slowest run took 105.38 times longer than the fastest. This could mean that an intermediate result is being cached. 4min 13s ± 7min 6s per loop (mean ± std. dev. of 7 runs, 1 loop each)

TODO:

More careful testing & benchmarks.
Timing vs. input dataset size.
Fitting statistics over fits (fidelity vs. fit vs. dataset size etc.).

[41]:

# %%timeit

# data.randomizeParams()
# data.fit()

Setting parameter relations/constraints

If we know that some of the matrix elements (parameters) are related, this can be set using constraints on the parameters.

In this test case, we know some terms are equal… this should speed up the fitting, and also improve fidelity.

TODO: automatic relation setting in cases derived from computational matrix elements.

[42]:

# With constraints
# Set param constraints as dict
paramsCons = {}
paramsCons['m_PU_SG_PU_1_n1_1'] = 'm_PU_SG_PU_1_1_n1'
paramsCons['p_PU_SG_PU_1_n1_1'] = 'p_PU_SG_PU_1_1_n1'

paramsCons['m_PU_SG_PU_3_n1_1'] = 'm_PU_SG_PU_3_1_n1'
paramsCons['p_PU_SG_PU_3_n1_1'] = 'p_PU_SG_PU_3_1_n1'

data.setMatEFit(paramsCons = paramsCons)

Set 6 complex matrix elements to 12 fitting params, see self.params for details.

name	value	initial value	min	max	vary	expression
m_PU_SG_PU_1_n1_1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	False	m_PU_SG_PU_1_1_n1
m_PU_SG_PU_1_1_n1	1.78461575	1.784615753610107	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	False	m_PU_SG_PU_3_1_n1
m_PU_SG_PU_3_1_n1	0.80290495	0.802904951323892	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.68606212	2.686062120382649	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.10915311	1.109153108617096	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	False	p_PU_SG_PU_1_1_n1
p_PU_SG_PU_1_1_n1	-0.86104140	-0.8610414024232179	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	False	p_PU_SG_PU_3_1_n1
p_PU_SG_PU_3_1_n1	-3.12044446	-3.1204444620772467	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	2.61122920	2.611229196458127	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	-0.07867828	-0.07867827542158025	-3.14159265	3.14159265	True
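
The effect of these constraints on the size of the problem can be sketched with simple bookkeeping. The example below uses the parameter names from the table above and the same paramsCons dictionary, and counts how many parameters remain free with and without the constraints. free_names and apply_constraints are hypothetical helpers that only handle simple equality; lmfit expressions can be more general than this.

```python
# Parameter names as listed in the table above
names = [
    'm_PU_SG_PU_1_n1_1', 'm_PU_SG_PU_1_1_n1', 'm_PU_SG_PU_3_n1_1', 'm_PU_SG_PU_3_1_n1',
    'm_SU_SG_SU_1_0_0', 'm_SU_SG_SU_3_0_0',
    'p_PU_SG_PU_1_n1_1', 'p_PU_SG_PU_1_1_n1', 'p_PU_SG_PU_3_n1_1', 'p_PU_SG_PU_3_1_n1',
    'p_SU_SG_SU_1_0_0', 'p_SU_SG_SU_3_0_0',
]

# Same equality constraints as set in the cell above: tied name -> name it follows
paramsCons = {
    'm_PU_SG_PU_1_n1_1': 'm_PU_SG_PU_1_1_n1',
    'p_PU_SG_PU_1_n1_1': 'p_PU_SG_PU_1_1_n1',
    'm_PU_SG_PU_3_n1_1': 'm_PU_SG_PU_3_1_n1',
    'p_PU_SG_PU_3_n1_1': 'p_PU_SG_PU_3_1_n1',
}

def free_names(names, constraints, fixed=()):
    """Names still varied by the optimiser: neither tied by a constraint nor fixed."""
    return [n for n in names if n not in constraints and n not in fixed]

def apply_constraints(values, constraints):
    """Return a copy of values in which each tied parameter takes its target's value."""
    out = dict(values)
    for name, target in constraints.items():
        out[name] = out[target]
    return out

unconstrained = free_names(names, {}, fixed=['p_PU_SG_PU_1_n1_1'])
constrained = free_names(names, paramsCons)
print(len(unconstrained), len(constrained))

values = {n: 0.0 for n in names}
values['m_PU_SG_PU_1_1_n1'] = 1.5
print(apply_constraints(values, paramsCons)['m_PU_SG_PU_1_n1_1'])
```

The two counts correspond to the "# variables" entries in the fit reports: 11 for the unconstrained fits and 8 for the constrained one.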

[43]:

data.randomizeParams()
data.fit()
data.result

[43]:

Fit Statistics

fitting method	leastsq
# function evals	161
# data points	195
# variables	8
chi-square	1.1026e-17
reduced chi-square	5.8961e-20
Akaike info crit.	-8626.26451
Bayesian info crit.	-8600.08051

Variables

name	value	standard error	relative error	initial value	min	max	vary	expression
m_PU_SG_PU_1_n1_1	1.78461575	8.1194e-10	(0.00%)	0.6840385724334991	1.0000e-04	5.00000000	False	m_PU_SG_PU_1_1_n1
m_PU_SG_PU_1_1_n1	1.78461575	8.1194e-10	(0.00%)	0.6840385724334991	1.0000e-04	5.00000000	True
m_PU_SG_PU_3_n1_1	0.80290495	1.8017e-09	(0.00%)	0.5216275685614492	1.0000e-04	5.00000000	False	m_PU_SG_PU_3_1_n1
m_PU_SG_PU_3_1_n1	0.80290495	1.8017e-09	(0.00%)	0.5216275685614492	1.0000e-04	5.00000000	True
m_SU_SG_SU_1_0_0	2.68606212	1.7019e-09	(0.00%)	0.22227352468309725	1.0000e-04	5.00000000	True
m_SU_SG_SU_3_0_0	1.10915311	4.1387e-09	(0.00%)	0.6562471595616396	1.0000e-04	5.00000000	True
p_PU_SG_PU_1_n1_1	-0.65962973	3.2020e-09	(0.00%)	0.6826881478354998	-3.14159265	3.14159265	False	p_PU_SG_PU_1_1_n1
p_PU_SG_PU_1_1_n1	-0.65962973	3.2020e-09	(0.00%)	0.6826881478354998	-3.14159265	3.14159265	True
p_PU_SG_PU_3_n1_1	1.59977333	4.1546e-09	(0.00%)	0.8003201107667578	-3.14159265	3.14159265	False	p_PU_SG_PU_3_1_n1
p_PU_SG_PU_3_1_n1	1.59977333	4.1546e-09	(0.00%)	0.8003201107667578	-3.14159265	3.14159265	True
p_SU_SG_SU_1_0_0	2.15128498	3.2382e-09	(0.00%)	0.801557156402618	-3.14159265	3.14159265	True
p_SU_SG_SU_3_0_0	-1.44199286	3.9100e-09	(0.00%)	0.901150896749868	-3.14159265	3.14159265	True

Correlations (unreported correlations are < 0.100)

m_SU_SG_SU_1_0_0	m_SU_SG_SU_3_0_0	-0.9736
m_PU_SG_PU_1_1_n1	m_PU_SG_PU_3_1_n1	-0.9157
m_SU_SG_SU_1_0_0	p_SU_SG_SU_1_0_0	-0.7969
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_1_0_0	0.7919
m_SU_SG_SU_3_0_0	p_SU_SG_SU_1_0_0	0.7866
p_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	0.7545
p_PU_SG_PU_3_1_n1	p_SU_SG_SU_3_0_0	0.7212
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_3_1_n1	0.6668
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_3_1_n1	-0.6463
m_PU_SG_PU_1_1_n1	p_PU_SG_PU_1_1_n1	-0.6318
m_PU_SG_PU_3_1_n1	p_PU_SG_PU_1_1_n1	0.6010
m_SU_SG_SU_3_0_0	p_SU_SG_SU_3_0_0	-0.5259
m_SU_SG_SU_1_0_0	p_SU_SG_SU_3_0_0	0.4990
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_3_0_0	-0.3999
m_PU_SG_PU_3_1_n1	p_SU_SG_SU_3_0_0	0.3943
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_1_0_0	-0.3921
p_PU_SG_PU_3_1_n1	p_SU_SG_SU_1_0_0	0.3847
m_PU_SG_PU_3_1_n1	m_SU_SG_SU_1_0_0	0.3579
m_PU_SG_PU_1_1_n1	m_SU_SG_SU_3_0_0	0.3445
m_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	-0.3393
m_SU_SG_SU_3_0_0	p_PU_SG_PU_1_1_n1	0.3276
m_SU_SG_SU_1_0_0	p_PU_SG_PU_1_1_n1	-0.3107
p_PU_SG_PU_1_1_n1	p_SU_SG_SU_3_0_0	0.2764
m_SU_SG_SU_3_0_0	p_PU_SG_PU_3_1_n1	-0.1626
m_SU_SG_SU_1_0_0	p_PU_SG_PU_3_1_n1	0.1538

[44]:

# Note that if keys is not set, BLMfitPlot will show only most recent fit run results.
data.BLMfitPlot()

Dataset: subset, AFBLM
Dataset: 2, AFBLM

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_71_1.png

Timing for the test fit (next cell, same method as earlier) - note this is currently just running with defaults, single core only, and we’re not checking for good fidelity here. Note, also, that these results are in the ~10s range, compared to ~many minutes for the unconstrained case.

1st test: 10.7 s ± 2.86 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
2nd test: 7.9 s ± 3.5 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

[45]:

%%timeit

data.randomizeParams()
data.fit()

The slowest run took 7.96 times longer than the fastest. This could mean that an intermediate result is being cached.
17.8 s ± 17.1 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

[46]:

# Plot multiple fit sets
# Note that if keys = 'all' is set, BLMfitPlot will show ALL fit run results.
# TODO: use Seaborn/HV for better plotting here!
data.BLMfitPlot(keys = 'all')

Dataset: subset, AFBLM
Dataset: 0, AFBLM
Dataset: 1, AFBLM
Dataset: 2, AFBLM
Dataset: 3, AFBLM
Dataset: 4, AFBLM
Dataset: 5, AFBLM
Dataset: 6, AFBLM
Dataset: 7, AFBLM
Dataset: 8, AFBLM
Dataset: 9, AFBLM
Dataset: 10, AFBLM

../_images/fitting_PEMtk_fitting_basic_demo_030621-full_74_1.png

Versions

[47]:

import scooby
scooby.Report(additional=['epsproc', 'pemtk', 'xarray', 'jupyter'])

[47]:

Sat Jun 05 12:49:56 2021 Eastern Daylight Time
OS	Windows	CPU(s)	32	Machine	AMD64
Architecture	64bit	RAM	63.9 GB	Environment	Jupyter
Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
epsproc	1.3.0-dev	pemtk	0.0.1	xarray	0.15.0
jupyter	Version unknown	numpy	1.19.2	scipy	1.3.0
IPython	7.12.0	matplotlib	3.3.1	scooby	0.5.6
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191125 for Intel(R) 64 architecture applications

[48]:

# Check current Git commit for local ePSproc version
!git -C {Path(ep.__file__).parent} branch
!git -C {Path(ep.__file__).parent} log --format="%H" -n 1

* dev
  master
  numba-tests
da12376cc36f640d8974f5ce2c121be3d391caab

[49]:

# Check current remote commits
!git ls-remote --heads git://github.com/phockett/ePSproc
# !git ls-remote --heads git://github.com/phockett/epsman

da12376cc36f640d8974f5ce2c121be3d391caab        refs/heads/dev
82d12cf35b19882d4e9a2cde3d4009fe679cfaee        refs/heads/master
69cd89ce5bc0ad6d465a4bd8df6fba15d3fd1aee        refs/heads/numba-tests
ea30878c842f09d525fbf39fa269fa2302a13b57        refs/heads/revert-9-master

[50]:

# Check current Git commit for local PEMtk version
import pemtk
!git -C {Path(pemtk.__file__).parent} branch
!git -C {Path(pemtk.__file__).parent} log --format="%H" -n 1

* master
fca744ca18b98ecd49fbf17fc79247ebee6b9c3a

[51]:

# Check current remote commits
!git ls-remote --heads git://github.com/phockett/PEMtk
# !git ls-remote --heads git://github.com/phockett/epsman

fca744ca18b98ecd49fbf17fc79247ebee6b9c3a        refs/heads/master
