This notebook serves as a basic introduction to loading and viewing data released in associaton with the GWTC-2 Parameter Estimation Data Release.
The released data file can be read in using the h5py, PESummary, or astropy libraries*. For general instructions on how to manipulate the data file and/or read this data file with h5py, see the PESummary docs
In this notebook we use as an example the event GW190519_153544
. The tar file containing the data that is used can be found here: https://dcc.ligo.org/LIGO-P2000223/public . We assume the tar file is unarchived in the same directory as this notebook.
The tar file contains several files, which includes two h5
files that contain PE samples. The difference between them is the distance prior used. Files with names of the form event_name.h5
contain samples from parameter estimation performed with $d_L^{2}$ prior, while those with event_name_comoving.h5
have samples that have been reweighted to a prior corresponding to a uniform merger rate per comoving volume in the rest frame of the source. Following the discussion in GWTC-2 paper, we use the results with the comoving prior in this notebook.
* We do not guarantee that the data release files can be read in with other packages.
First we import the key python modules
import matplotlib.pyplot as plt
import pesummary
from pesummary.io import read
print(pesummary.__version__)
import h5py
%matplotlib inline
%config InlineBackend.figure_format='retina'
0.11.0
Note that for pesummary<=0.9.1
, seaborn<=0.10.1
is required.
The samples for each event is stored in the corresponding h5
file. This data file can be read either using h5py
or using in using the pesummary
read
function. Each analysis file will contain several datasets. For a detailed description of what the names mean, see Table III and Table VIII of https://dcc.ligo.org/LIGO-P2000061/public.
file_name = './GW190519_153544/GW190519_153544_comoving.h5'
# Using h5py
with h5py.File(file_name, 'r') as f:
print('H5 datasets:')
print(list(f))
H5 datasets: ['C01:IMRPhenomD', 'C01:IMRPhenomPv2', 'C01:NRSur7dq4', 'C01:SEOBNRv4P', 'C01:SEOBNRv4PHM', 'C01:SEOBNRv4P_nonevol', 'PrecessingSpinIMR', 'PrecessingSpinIMRHM', 'PublicationSamples', 'ZeroSpinIMR', 'history', 'version']
# Using pesummary
data = read(file_name)
print('Found run labels:')
print(data.labels)
Found run labels: ['C01:IMRPhenomD', 'C01:IMRPhenomPv2', 'C01:NRSur7dq4', 'C01:SEOBNRv4P', 'C01:SEOBNRv4PHM', 'C01:SEOBNRv4P_nonevol', 'PrecessingSpinIMR', 'PrecessingSpinIMRHM', 'PublicationSamples', 'ZeroSpinIMR']
See the end of this notebook for more information about the different data sets.
For the remainder of the notebook, we demonstrate how to use pesummary to access and plot various aspects of the analysis.
The posterior samples can be extracted through the samples_dict
property. These posterior samples are stored in a custom table structure. Below we load a particular dataset and show which parameters are available. For a detailed description of the meaning of most parameters, see definition of standard parameters
samples_dict = data.samples_dict
posterior_samples = samples_dict['PrecessingSpinIMRHM']
parameters = sorted(list(posterior_samples.keys()))
print(parameters)
['Npts', 'a_1', 'a_2', 'chi_eff', 'chi_p', 'chirp_mass', 'chirp_mass_source', 'comoving_distance', 'cos_iota', 'cos_theta_jn', 'cos_tilt_1', 'cos_tilt_2', 'dec', 'final_mass', 'final_mass_non_evolved', 'final_mass_source', 'final_mass_source_non_evolved', 'final_spin', 'final_spin_non_evolved', 'geocent_time', 'inverted_mass_ratio', 'iota', 'log_likelihood', 'luminosity_distance', 'mass_1', 'mass_1_source', 'mass_2', 'mass_2_source', 'mass_ratio', 'neff', 'p', 'peak_luminosity', 'peak_luminosity_non_evolved', 'phase', 'phi_1', 'phi_12', 'phi_2', 'phi_jl', 'ps', 'psi', 'psiJ', 'ra', 'radiated_energy', 'radiated_energy_non_evolved', 'redshift', 'spin_1x', 'spin_1y', 'spin_1z', 'spin_2x', 'spin_2y', 'spin_2z', 'symmetric_mass_ratio', 'theta_jn', 'tilt_1', 'tilt_2', 'total_mass', 'total_mass_source']
pesummary
allows for the user to easily make plots. As an example, we show the posterior distribution for chirp_mass_source
plotted as a histogram and as a KDE.
fig = posterior_samples.plot('chirp_mass_source', type='hist')
_ = posterior_samples.plot('chirp_mass_source', type='hist',fig=fig, kde=True)
plt.show()
We may also easily generate a spin disk, showing the most probable direction of the spin vectors
fig = posterior_samples.plot(type='spin_disk', colorbar=True, annotate=False,
show_label=True, cmap='Blues')
Corner plots are very useful for spotting degeneracies between parameters. A corner plot can easily be generated using 'pesummary'
fig = posterior_samples.plot(type='corner',
parameters=['mass_1', 'mass_2', 'luminosity_distance', 'iota'])
In this example, we compare results from 3 diffrent waveforms: IMRPhenomPv2
,SEOBNRv4P
and NRSur7dq4
.
# Set a consistent color scheme
cp = ['#1b9e77','#d95f02','#7570b3']
labels_of_interest = ['C01:IMRPhenomPv2','C01:SEOBNRv4P','C01:NRSur7dq4']
fig=samples_dict.plot('chirp_mass', type='hist', kde=True,labels=labels_of_interest,colors=cp)
A comparison histogram is not the only way to display this data. We may also generate a violin plot showing the posterior distribution for each analysis
fig = samples_dict.plot('chirp_mass', type='violin',labels=labels_of_interest,palette=cp)
Here is an example of generating a triangle plot:
fig = samples_dict.plot(['mass_1', 'mass_2'], type='reverse_triangle',
grid=False,labels=labels_of_interest,colors=cp)
It is also useful to see how degeneracies between certain parameters change for different analysis. This can be investigated by generating a comparison corner plot
fig = samples_dict.plot(type='corner',
parameters=['mass_1', 'mass_2', 'luminosity_distance', 'iota'],
labels=labels_of_interest,colors=cp
)
The 'pesummary' file also stores the PSD that was used for each analysis. This can be extracted and plotted
psd = data.psd['C01:IMRPhenomD']
fig = psd.plot(fmin=20)
The skymaps are stored in 2 different ways for convenience. They are available inside each h5
result file and as a separate fits
file. The first example below shows the automatic plotting of the skymap stored inside the h5
file. The second loads the fits file directly.
Please note that the ligo.skymap
package is needed for plotting the skymaps in the cells below. Note that the fits
files are always generated from samples that use the comoving prior.
import matplotlib
matplotlib.rcParams['text.usetex'] = False
import matplotlib.pyplot as plt
fig = data.skymap['C01:IMRPhenomPv2'].plot(contour=[50, 90])
from ligo.skymap.io import fits
from ligo.skymap import postprocess
fits_file = "./GW190519_153544/GW190519_153544_C01:IMRPhenomPv2.fits"
contour_levels = [50,90]
skymap, metadata = fits.read_sky_map(fits_file, nest=None)
cls = 100 * postprocess.find_greedy_credible_levels(skymap)
ax = plt.axes(projection=('astro hours mollweide'))
cs = ax.contour_hpx(
(cls, 'ICRS'), nested=metadata['nest'],
colors='red', linewidths=0.5, levels=contour_levels)
A comprehensive explanation of the various datasets present in the PE samples can be found in Table III of GWTC-2 paper. In addition, each h5
samples file also contains a PublicationSamples
dataset which represents the data that was actually used to create the results GWTC-2 paper (and contains data as defined explicitly by Table VIII of GWTC-2 paper).
Finally, some events have datasets ending in _nonevol
. These are associated with the SEOBNRv4P(HM) family of waveforms, which for technical reasons must have a reference frequency different from the fiducial reference frequency of 20 Hz for some events. To ensure they can be combined consistently with samples from other waveforms, these are then evolved to the fiducial reference frequency. The results are presented in C01:SEOBNRv4P(HM)
datasets. The _nonevol
datasets are included only for completeness.