Fit fidelity & analysis

22/06/21

Here we’ll look in detail at a batch of fit results (see the batch run notebook for initial setup & running of the fits), and at the associated statistics.

TODO: final uncertainty analysis, bootstrapping and testing with noise etc.
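The bootstrap approach mentioned above can be sketched generically: resample the data with replacement, refit each resample, and take the spread of the fitted parameters as an uncertainty estimate. Here is a minimal NumPy sketch with a toy linear model — purely illustrative, not the PEMtk fitting machinery:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic noisy data, y = 2x + 1 + noise (a stand-in for real fit data)
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + rng.normal(0, 0.1, x.size)

# Bootstrap: resample (x, y) pairs with replacement and refit each sample
nBoot = 200
slopes = np.empty(nBoot)
for i in range(nBoot):
    idx = rng.integers(0, x.size, x.size)
    slopes[i], _ = np.polyfit(x[idx], y[idx], 1)  # [slope, intercept]

# The spread of the bootstrapped slopes estimates the parameter uncertainty
print(f"slope = {slopes.mean():.3f} +/- {slopes.std():.3f}")
```

The same resample-and-refit loop applies to any fitting routine, at the cost of one full fit per bootstrap sample.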

Batch results

There are many things to consider when assessing the quality/fidelity of the fit results. In particular, we might want to investigate:

  • \(\chi^2\) values.

  • Estimated uncertainties.

  • Uniqueness of fit.

This is generally aided by having a large batch of fit results, with randomised initial parameters, to allow for statistical analysis and to ensure a full probing of the solution hyperspace.
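Randomising the initial parameters for a batch of fits can be sketched with NumPy directly; the bounds here mirror those shown in the parameter table later in this notebook (magnitudes on \((10^{-4}, 5)\), phases on \((-\pi, \pi)\)), but the array shapes and names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw random starting points for a batch of fits: one row per fit,
# one column per (complex) matrix element parameter.
nFits = 100
nParams = 6
mags = rng.uniform(1e-4, 5.0, (nFits, nParams))      # magnitude seeds
phases = rng.uniform(-np.pi, np.pi, (nFits, nParams))  # phase seeds

# Each row would then seed one independent fit in the batch.
```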

For further discussion, see, for example:

[1] Hockett, Quantum Metrology with Photoelectrons, Volume 2: Applications and Advances. IOP Publishing, 2018. doi: 10.1088/978-1-6817-4688-3.

[2] Hockett, “Photoionization dynamics of polyatomic molecules,” University of Nottingham, 2009. Available: http://eprints.nottingham.ac.uk/10857/

Load sample dataset

[1]:
# Init a blank fit object (only required if running from scratch)
import pemtk as pm
from pemtk.fit.fitClass import pemtkFit
data = pemtkFit()
*** ePSproc installation not found, setting for local copy.
[2]:
# Load sample dataset
# The full path to the file may be required here; the sample data is in repo/demos/fitting
import pickle
from pathlib import Path

dataFile = 'dataDump_100fitTests_10t_randPhase_130621.pickle'
dataPath = Path(pm.__path__[0]).parent/Path('demos','fitting')

with open(dataPath/dataFile, 'rb') as handle:
    data.data = pickle.load(handle)
[3]:
# The sample data dictionary contains 100 fits, as well as the data used to set things up.
data.data.keys()
[3]:
dict_keys(['orb6', 'orb5', 'ADM', 'pol', 'subset', 'sim', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
[4]:
# Set the max fit index for reference
data.fitInd = 99

# Set reference values from input matrix elements (data.data['subset']['matE'])
data.setMatEFit()
Set 6 complex matrix elements to 12 fitting params, see self.params for details.
name value initial value min max vary
m_PU_SG_PU_1_n1_1 1.78461575 1.784615753610107 1.0000e-04 5.00000000 True
m_PU_SG_PU_1_1_n1 1.78461575 1.784615753610107 1.0000e-04 5.00000000 True
m_PU_SG_PU_3_n1_1 0.80290495 0.802904951323892 1.0000e-04 5.00000000 True
m_PU_SG_PU_3_1_n1 0.80290495 0.802904951323892 1.0000e-04 5.00000000 True
m_SU_SG_SU_1_0_0 2.68606212 2.686062120382649 1.0000e-04 5.00000000 True
m_SU_SG_SU_3_0_0 1.10915311 1.109153108617096 1.0000e-04 5.00000000 True
p_PU_SG_PU_1_n1_1 -0.86104140 -0.8610414024232179 -3.14159265 3.14159265 False
p_PU_SG_PU_1_1_n1 -0.86104140 -0.8610414024232179 -3.14159265 3.14159265 True
p_PU_SG_PU_3_n1_1 -3.12044446 -3.1204444620772467 -3.14159265 3.14159265 True
p_PU_SG_PU_3_1_n1 -3.12044446 -3.1204444620772467 -3.14159265 3.14159265 True
p_SU_SG_SU_1_0_0 2.61122920 2.611229196458127 -3.14159265 3.14159265 True
p_SU_SG_SU_3_0_0 -0.07867828 -0.07867827542158025 -3.14159265 3.14159265 True

Process results

Here we’ll use Seaborn and Holoviews to examine the fit results output by lmfit; for this, the results will first be restacked to a long-form Pandas DataFrame.
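The wide-to-long restacking can be sketched with a toy DataFrame and `pandas.melt`; the parameter names below follow the convention seen in the fit table above, but the frame itself is purely illustrative (the real conversion is handled within PEMtk):

```python
import pandas as pd

# Wide-form: one row per fit, one column per parameter (illustrative values)
wide = pd.DataFrame({
    "fit": [0, 1],
    "m_SU_SG_SU_1_0_0": [2.69, 2.70],
    "p_SU_SG_SU_1_0_0": [2.61, 2.60],
})

# Long-form: one row per (fit, parameter) pair, ready for Seaborn/Holoviews
longForm = wide.melt(id_vars="fit", var_name="param", value_name="value")
print(longForm)
```

Long-form data is the natural input for Seaborn's categorical plots (e.g. violin or strip plots of parameter values across the batch).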

[5]:
# Additional imports for data analysis
import xarray as xr
import numpy as np
import pandas as pd
import string

# pd.options.display.max_rows = 50
pd.set_option("display.max_rows", 50)

# For AFBLM > PD conversion
import epsproc as ep

# For plotting
import seaborn as sns
import holoviews as hv
from holoviews import opts

# Some additional default plot settings
# TODO: should set a version of this for PEMtk; currently this just sets various default plotters & HV backends.
from epsproc.plot import hvPlotters
hvPlotters.setPlotters()