The parameters of pyqz
pyqz is designed to be quick and easy to use, without withholding any
information from the user. As such, all parameters of importance for
deriving the estimates of LogQ and Tot[O]+12 can be modified via
dedicated keywords. Here, we present some basic examples to clarify what
does what. In addition to these examples, the documentation also
contains a detailed list of the functions of pyqz, along with a brief
description of each keyword.
First things first, let’s import pyqz, its plotting module pyqz_plots (to display the figures) and numpy.
In [2]:
%matplotlib inline
import pyqz
import pyqz.pyqz_plots as pyqzp
import numpy as np
Parameter 1: srs
srs defines the size of the random sample of line fluxes generated by
pyqz. This keyword is essential to the propagation of the observational
errors associated with each line flux measurement. In other words, srs
is the number of discrete estimates of the probability density function
(in the {LogQ vs. Tot[O]+12} plane) associated with one diagnostic grid.
Hence, the joint probability density function (combining \(n\)
diagnostic grids) will be reconstructed via a Kernel Density Estimation
routine from \(n\cdot\)srs points. srs=400 is the default value,
suitable for error levels of \(\sim\)5%. We suggest srs=800 for errors
at the 10%-15% level. Basically, larger errors result in wider
probability density peaks, and thus require more srs points to be
properly discretized, at the cost of additional computation time. Try
changing the value of my_srs in the example below, and watch the number
of black dots vary accordingly in the KDE diagram.
In [12]:
my_srs = 800
pyqz.get_global_qz(np.array([[ 1.00e+00, 5.00e-02, 2.38e+00, 1.19e-01, 5.07e+00, 2.53e-01,
                               5.67e-01, 2.84e-02, 5.11e-01, 2.55e-02, 2.88e+00, 1.44e-01]]),
                   ['Hb','stdHb','[OIII]','std[OIII]','[OII]+','std[OII]+',
                    '[NII]','std[NII]','[SII]+','std[SII]+','Ha','stdHa'],
                   ['[NII]/[OII]+;[OIII]/[SII]+'],
                   ids = ['NGC_5678'],
                   srs = my_srs,
                   KDE_pickle_loc = './examples/',
                   KDE_method = 'multiv',
                   KDE_qz_sampling=201j,
                   struct='pp',
                   sampling=1)
# And use pyqz_plots.plot_global_qz() to display the result
import glob
fn = glob.glob('./examples/*NGC_5678*.pkl')
pyqzp.plot_global_qz(fn[0], show_plots = True, save_loc = './examples', do_all_diags = False)
--> Received 1 spectrum ...
--> Dealing with them one at a time ... be patient now !
(no status update until I am done ...)
All done in 0:00:00.312550
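To make the role of srs more concrete, here is a minimal sketch of how a random sample of line fluxes could be drawn from the measured fluxes and their errors. It is not the code used inside pyqz: the helper sample_line_fluxes is hypothetical, and the errors are assumed to be independent and Gaussian. The fluxes and errors are those of the example above.

```python
import numpy as np

def sample_line_fluxes(fluxes, stds, srs=400, seed=0):
    """Hypothetical helper (not part of pyqz): draw srs realisations of the
    line fluxes, assuming independent Gaussian errors."""
    rng = np.random.default_rng(seed)
    fluxes = np.asarray(fluxes, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # One row per realisation, one column per emission line
    return rng.normal(loc=fluxes, scale=stds, size=(srs, fluxes.size))

# [NII], [OII]+, [OIII] and [SII]+ fluxes with their 5% errors
fluxes = [5.67e-01, 5.07e+00, 2.38e+00, 5.11e-01]
stds = [2.84e-02, 2.53e-01, 1.19e-01, 2.55e-02]

my_srs = 800
sample = sample_line_fluxes(fluxes, stds, srs=my_srs)

# Each realisation yields one point in a line ratio plane,
# e.g. log([NII]/[OII]+) vs. log([OIII]/[SII]+)
log_n2o2 = np.log10(sample[:, 0] / sample[:, 1])
log_o3s2 = np.log10(sample[:, 2] / sample[:, 3])

# With n diagnostic grids, n*srs points enter the joint KDE
n_grids = 2
n_points_joint = n_grids * my_srs
print(sample.shape, n_points_joint)
```

Each of the srs rows corresponds to one black dot per diagnostic grid in the KDE diagram, which is why doubling srs doubles the number of dots.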
Parameter 2: KDE_method
This keyword specifies the Kernel Density Estimation routine used to
reconstruct the individual and joint probability density functions in
the {LogQ vs. Tot[O]+12} plane. It can be either gauss to use
gaussian_kde from the scipy.stats module, or multiv to use
KDEMultivariate from the statsmodels package.
The former option is 10-100x faster, but usually gives less accurate
results if different diagnostic grids disagree. The underlying reason is
that with gaussian_kde, the kernel bandwidth cannot be explicitly set
individually for the LogQ and Tot[O]+12 directions, so that the function
tends to over-smooth the distribution. KDEMultivariate should be
preferred, as the bandwidth of the kernel is set individually for both
the LogQ and Tot[O]+12 directions using Scott’s rule, scaled by the
standard deviation of the distribution along these directions.
In the example below, we insert some error in the [OII] line flux,
thereby creating a mismatch between the different line ratio space
estimates. Switch my_method from 'gauss' to 'multiv', and watch how much
better the joint PDF (shown as shades of gray) traces the distribution
of black dots.
In [11]:
my_method = 'gauss'
pyqz.get_global_qz(np.array([[ 1.00e+00, 5.00e-02, 2.38e+00, 1.19e-01, 2.07e+00, 2.53e-01,
                               5.67e-01, 2.84e-02, 5.11e-01, 2.55e-02, 2.88e+00, 1.44e-01]]),
                   ['Hb','stdHb','[OIII]','std[OIII]','[OII]+','std[OII]+',
                    '[NII]','std[NII]','[SII]+','std[SII]+','Ha','stdHa'],
                   ['[NII]/[SII]+;[OIII]/[SII]+','[NII]/[OII]+;[OIII]/[OII]+'],
                   ids = ['NGC_09'],
                   srs = 400,
                   KDE_pickle_loc = './examples/',
                   KDE_method = my_method,
                   KDE_qz_sampling=201j,
                   struct='pp',
                   sampling=1)
# And use pyqz_plots.plot_global_qz() to display the result
import glob
fn = glob.glob('./examples/*NGC_09*%s*.pkl' % my_method)
pyqzp.plot_global_qz(fn[0], show_plots = True, save_loc = './examples', do_all_diags = False)
--> Received 1 spectrum ...
--> Dealing with them one at a time ... be patient now !
(no status update until I am done ...)
All done in 0:00:00.561540
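The idea of a per-direction bandwidth can be sketched with numpy alone. The snippet below is a hand-rolled stand-in, not the statsmodels or pyqz implementation: scott_bandwidths and product_gaussian_kde are hypothetical helpers, the mock points and plane limits are made up, and the bandwidth is assumed to follow Scott’s rule in the form \(\sigma_i \cdot n^{-1/(d+4)}\) for \(n\) points in \(d\) dimensions.

```python
import numpy as np

def scott_bandwidths(points):
    """Per-axis Scott's rule, sigma_i * n**(-1/(d+4)), for an (n, d) sample.
    Hypothetical helper, not part of pyqz."""
    n, d = points.shape
    return points.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))

def product_gaussian_kde(points, grid_x, grid_y, bw):
    """Evaluate a 2-D KDE with an axis-aligned Gaussian kernel on a mesh."""
    # Scaled offsets between every mesh node and every sample point
    ux = (grid_x[..., None] - points[:, 0]) / bw[0]
    uy = (grid_y[..., None] - points[:, 1]) / bw[1]
    kernel = np.exp(-0.5 * (ux**2 + uy**2)) / (2.0 * np.pi * bw[0] * bw[1])
    # Average the kernels centred on each sample point
    return kernel.mean(axis=-1)

rng = np.random.default_rng(1)
# Two clumps of mock (LogQ, Tot[O]+12) points, mimicking two diagnostic
# grids that disagree (made-up values, for illustration only)
clump_a = rng.normal([7.2, 8.6], [0.05, 0.02], size=(200, 2))
clump_b = rng.normal([7.6, 8.9], [0.05, 0.02], size=(200, 2))
points = np.vstack([clump_a, clump_b])

bw = scott_bandwidths(points)  # one bandwidth per direction

# Mock plane limits (assumed), sampled with 51 nodes per axis
gx, gy = np.mgrid[6.5:8.5:51j, 8.0:9.5:51j]
pdf = product_gaussian_kde(points, gx, gy, bw)

# Sanity check: the PDF should integrate to ~1 over the plane
dx = gx[1, 0] - gx[0, 0]
dy = gy[0, 1] - gy[0, 0]
integral = pdf.sum() * dx * dy
print(bw, integral)
```

Because bw holds one value per axis, each direction of the plane is smoothed according to its own spread, rather than by a single scalar factor applied to both.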
Parameter 3: KDE_qz_sampling
This sets the sampling of the {LogQ vs. Tot[O]+12} plane, when
reconstructing the individual and global PDFs. It is set to 101j by
default (i.e. a grid with \(101\cdot101 = 10201\) sampling nodes).
Datasets with small errors (\(<\)5%) could benefit from using twice this
resolution for better results (i.e. KDE_qz_sampling=201j), at the cost
of a longer processing time. In the following example, the influence of
KDE_qz_sampling can be seen in the size of the resolution elements of
the joint PDF map, as well as the smoothness of the (orange) contour at
0.61%.
In [10]:
my_qz_sampling = 101j
pyqz.get_global_qz(np.array([[ 1.00e+00, 5.00e-02, 2.38e+00, 1.19e-01, 5.07e+00, 2.53e-01,
                               5.67e-01, 2.84e-02, 5.11e-01, 2.55e-02, 2.88e+00, 1.44e-01]]),
                   ['Hb','stdHb','[OIII]','std[OIII]','[OII]+','std[OII]+',
                    '[NII]','std[NII]','[SII]+','std[SII]+','Ha','stdHa'],
                   ['[NII]/[OII]+;[OIII]/[SII]+'],
                   ids = ['NGC_00'],
                   srs = 400,
                   KDE_pickle_loc = './examples/',
                   KDE_method = 'multiv',
                   KDE_qz_sampling=my_qz_sampling,
                   struct='pp',
                   sampling=1)
# And use pyqz_plots.plot_global_qz() to display the result
import glob
fn = glob.glob('./examples/*NGC_00*.pkl')
pyqzp.plot_global_qz(fn[0], show_plots = True, save_loc = './examples/', do_all_diags = False)
--> Received 1 spectrum ...
--> Dealing with them one at a time ... be patient now !
(no status update until I am done ...)
All done in 0:00:00.120194
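The complex-valued form of KDE_qz_sampling matches the step convention of numpy's mgrid, where a complex step gives a number of points instead of a spacing. Assuming the keyword is used this way (the text above does not state it explicitly), the sketch below shows how 101j and 201j translate into a number of sampling nodes. The plane limits are made up for illustration.

```python
import numpy as np

# Hypothetical limits of the {LogQ vs. Tot[O]+12} plane (illustration only)
logq_min, logq_max = 6.5, 8.5
z_min, z_max = 8.0, 9.5

for sampling in (101j, 201j):
    # With a complex step, np.mgrid reads the magnitude as a number of
    # points along the axis and includes both end values
    logq, z = np.mgrid[logq_min:logq_max:sampling, z_min:z_max:sampling]
    step = logq[1, 0] - logq[0, 0]
    print(int(abs(sampling)), logq.shape, logq.size, round(step, 3))
```

Doubling the sampling halves the size of each resolution element but roughly quadruples the number of nodes at which the PDF must be evaluated (40401 instead of 10201), hence the longer processing time.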
The other parameters
Most of the other parameters ought to be straightforward to understand
(e.g. verbose). To use the maximum number of cpus available when
running pyqz, set nproc = -1.
Check the page on the functions of pyqz in the docs for more details.