Discovery is a next-generation pulsar-timing-array data-analysis package, built for speed on a JAX backend that supports GPU execution and autodifferentiation. If Enterprise is Spock, logical and elegant, Discovery is all Scotty, fast, efficient, and not above a hack if it gets you to warp speed.
Discovery needs a modern Python with numpy, scipy, jax, pyarrow. It will be happier running on an Nvidia GPU with CUDA-enabled JAX.
Discovery's subpackages (such as discovery.flow and the packages under discovery.samplers) require additional dependencies.
The folder examples contains a growing set of usage examples.
The Discovery data model consists of Kernel objects (think of a noise matrix N, which can be inverted and applied to a vector, N^{-1} y, or even sandwiched with it, y^T N^{-1} y) and of GP objects, consisting of a basis F (sized ntoas x ngp) and a prior/kernel Phi. The kernel WoodburyKernel(N, F, P) combines a noise kernel and a GP.
Discovery uses lightweight Pulsar objects saved as Arrow Feather files. To convert an Enterprise Pulsar objects use discovery.Pulsar.save_feather(psr, filename, noisedict), where noisedict is an optional dictionary of default parameter values. Some example datasets (based on NG 15yr) are included in the data folder.
-
residuals(psr): just anumpyarray of residuals for pulsarpsr. -
makenoise_measurement(psr, [noisedict]): returns aKernelobject that implements EFAC + EQUAD measurement noise for pulsarpsr. Parameters are multiplexed to pulsar and backend (e.g.,B1855+09_430_ASP_efac), Enterprise-style. If those parameters appear isnoisedict, their values will be frozen to those specifications. -
makenoise_measurement_simple(psr, [noisedict]): same, but no backends. -
makegp_ecorr(psr, [noisedict]): returns aGPobject that implements ECORR measurement noise for pulsarpsr. The quantization style. Parameters are multiplexed to pulsar and backend, and frozen if included in noisedict. -
makegp_ecorr_simple(psr, [noisedict]): same, but no backends. -
makegp_fourier(psr, prior, components, [T, fourierbasis, common, name]): returns aGPobject that implements a finite GP over a vector basis. Herepriormust be a JAX-ready function with signaturepriorfunc(f, df, arg1, [...]), wherefis a vector of frequencies, anddfa vector of integration weights associated with the frequencies. The resulting GP parameters are named{psrname}_{name}_{argX}, unless they are included in the listcommon. The Fourier basis is generated by callingfourierbasis(psr, components, T)(see below), which must return(f, df, F). In this functionTdefaults to the pulsar span. -
makegp_fourier_global(psrs, prior, orf, components, T, [fourierbasis, name]): returns aGlobalVariableGPobject that implements a GP for all pulsars together, with a matched Fourier basis. Hereprioris the same as formakegp_fourier(with GP parameters named{name}_{argX}), andorfmust be a functionorf(pos1, pos2)of two numpy vectors, such ashd_orf,monopole_orf,dipole_orf, and the diagonaluncorrelated_orf. The Fourier basis is generated by callingfourierbasis(psr, components, T)(see below), which must return(f, df, F). In this functionTdoes not have a default (but seegetspan). Ifpriorandorfare given as equal-length lists of functions, the object implements the composite processsum_i GP(prior[i], orf[i])(with GP parameters named{name}_{orf[i].__name__}_{argX}).
makegp_improper(psr, fmat, [constant, name]): returns aGPobject with improper prior (formally, a constant vector equal toconstant) and basis matrixfmat. Hereconstantdefaults to1e40.makegp_timing(psr, [constant], [variance]): convenience function to callmakegp_improperwith the pulsar design matrixpsr.Mmatand prior diagonal valueconstant. Design-matrix columns are normalized. To obtain "physical" priors, usevarianceto a value in s^2 (e.g.,1e-12``) instead of settingconstant`.
-
fourierbasis(psr, components, [T]): returns(f, df, F)for a basis of interleaved sines and cosines evaluated overpsrTOAs with frequenciesk/T, withk = 1, ..., components. AgainTdefaults to the pulsar span. -
dmfourierbasis(psr, components, [T, fref]): same, but rescales the basis by(fref / psr.freqs)**2, useful to define DMGP. HereTdefaults to the pulsar span andfrefto 1400. -
getspan(psr or psrs): returns the TOA span of the pulsar or iterable of pulsars. Useful formakegp_fourierandmakegp_fourier_global. -
powerlaw(f, df, log10_A, gamma): implements the standard red-noise/GW spectrum10^(2 log10_A) f^(-gamma) (yr^(gamma - 3) / pi^2 / 12) df. This is a JAX-ready function, so one would passpowerlawtomakegp_fourier. -
freespectrum(f, df, log10_rho: typing.Sequence): implements10^(2 log10_rho). Note thatmakegp_fourieruses thelog10_rhoannotation to treat it as a vector; the resulting parameter name would be, e.g.,B1855+09_fourierGP_log10_rho(10)ifcomponents = 10. -
makepowerlaw_crn(components): makes the prior for a combined red-noise/common-process GP, limiting the common-process tocomponentsfrequencies. Returns a function with the signaturepowerlaw_crn(f, df, log10_A, gamma, crn_log10_A, crn_gamma); callingmakegp_fourier(..., common=['crn_log10_A, crn_gamma'], name='rednoise')would then result in parameter names['B1855+09_rednoise_log10_A', 'B1855+09_rednoise_gamma', ..., 'crn_log10_A', 'crn_gamma'].
makedelay(psr, delayfunc, [common, name]): returns a JAX-ready function that implements a deterministic delay forpsr. Heredelayfuncmust be a JAX-ready function with signaturedelayfunc(arg1, ...). The resulting delay parameters are named{psrname}_{name}_{argX}, unless they are included in the listcommon. If the first few arguments are defined attributes ofpsr(e.g.,toasorfreqs), they are automatically passed to the function and excluded fromparams; however they must come before all the other variable parameters.make_solardm(psr)[insolar.py]: calculates the DM delay in a 1/r^2 solar wind density model. Returns a function with signaturesolardm(n_earth).make_chromaticdecay(psr)[insolar.py]: calculates a chromatic exponential-dip delay. Returns a function with signaturedecay(t0, log10_Amp, log10_tau, idx).
-
PulsarLikelihood(signals, concat=True): returns aPulsarLikelihoodobject, with alogLproperty that implements the single-pulsar likelihood as a JAX-ready function. The likelihood takes as a single argument a dictionary of parameters named as discussed above. Heresignalsis an iterable that must contain exactly one residual vector, exactly one noiseKernel, any number ofGPobjects, and any number of deterministic delays (any callable). Ifconcat=False, the likelihood is built by nestingWoodburyKernels, first consumingConstantGPobjects (those without parameters) and thenVariableGP, but otherwise respecting the order insignals. Ifconcat=True, theConstantGPs andVariableGPs are separately concatenated, and then nested. -
GlobalLikelihood(psls, globalgp=None): returnsGlobalLikelihoodobject, with alogLproperty that implements the multi-pulsar likelihood as a JAX-ready function. The likelihood takes as a single argument a dictionary of parameters. Herepslsis an iterable that may contain any number ofPulsarLikelihoodobjects, andglobalgpis aGlobalVariableGPobject, such as returned bymakegp_fourier_global, encoding a joint GP for all pulsars. -
The two likelihood objects have a
sampleproperty—a JAX-ready function that generates a random realization of the data if given a JAX key and a dictionary of parameters. (According to JAX protocol,sampleactually returns a tuple consisting of the "split" key and the data realization.) -
The
PulsarLikelihoodandGlobalLikelihoodobjects have asample_conditional(key, params) -> (key, dict)property–a JAX-ready function that returns a dictionary of GP coefficients drawn from the conditional (normal) probability distribution of the parameter-dependent GPs (forPulsarLikelihood) or of theglobalgp(s), given theparams. This function callsGlobalLikelihood.conditional(params), which returns a concatenated vectormuof the mean GP coefficients and the resultcfof runningscipy.linalg.cho_factoron the matrixPhi^-1 + T^t K T, wherePhiandTare prior and basis of the relevant GP.
makelogprior_uniform(params, [priordict]): returns a functionlogprior(params)that evaluates the total log prior according topriordict(given, e.g., as{'FourierGP_log10_A': [-18, -11]'}). Some standardenterprise_extensionspriors are included by default (e.g.,crn_log10_A, crn_gamma, gw_log10_A, gw_gamma, ...). Parameters that are in the listparamsbut are not inpriordictand have no default are not included in the computation.sample_uniform(params, [priordict]): creates a dictionary of random values for the parameters in the listparams, using the uniform priors inpriordictor in the system default. Fails if a parameter inparamshas no specification.
OS(gbl): creates an optimal statistic object, wheregblis aGlobalLikelihoodin which eachPulsarLikelihoodhas amakegp_fouriercomponent namedgwwith at least the common parametergw_log10_A. The prior and Fourier basis from that GP will be used to build the OS. If theGlobalLikelihoodhas aglobalgp, it is unused.OS.os(params, [orfa]): returns the dictionary{'os': ..., 'os_sigma': ..., 'snr': ..., 'log10_A': ...}as computed forparams.orfadefaults tohd_orfabut can also bedipole_orfaormonopole_orfa—these are all functions of the cosine between pulsars.OS.oscan be jitted and vmapped over params. (If one wishes to use theorfaargument, the jitting/vmapping syntax is more complicated:jax.jit(os.os, static_argnums=1)andjax.vmap(os.os, in_axes=(0, None)).)OS.scramble(params, pos, [orfa]): returns the same dictionary asOS.os, using the pulsar positions inpos(either a list of three-vectors or annpsr x 3array).OS.scramblecan be jitted and vmapped over params (jax.vmap(os.scramble, (0, None))) or positions (jax.vmap(os.scramble, (None, 0))).OS.shift(params, phases, [orfa]): returns the same dictionary asOS.os, shifting the GW basis vectors by the phases inphases(annpsr x ngwarray of floats in [0,2*pi]).OS.shiftcan be jitted and vmapped over params or phases.OS.gx2cdf(params, xs, cutoff=1e-6, limit=100, epsabs=1e-6): returns a vector of the GX2 OS CDF evaluated at the SNRsxs. The CDF is computed by considering only GPs (i.e., not white noise), using Imhof's method. The parameterslimitandepsabsare passed toscipy.integrate.quad. Ifcutoffis a float, GX2 eigenvalues are limited to those larger thancutoff; ifcutoffis an integer, only thecutofflargest eigenvalues are used. Currently this function cannot be jitted or vmapped.
ArrayLikelihood(psls, [commongp=], [globalgp=]): returns anArrayLikelihoodobject analog toGlobalLikelihood, but set up especially for "batched" GPs (given ascommongp) that implement the same process for all pulsars (i.e., red noise with the same number of components and the same power-law prior, but different amplitude and gamma parameters). Herepslsis an iterable that may contain any number ofPulsarLikelihoodobjects,commongpis a vectorized GP (as returned, e.g., bymakecommongp_fourier) or a list of vectorized GPs, andglobalgpis aGlobalVariableGPobject as forGlobalLikelihood. In addition tologL,ArrayLikelihoodimplements a hierarchicalclogLthat takes thecommongpandglobalgpcoefficients as additional parameters (e.g.,'B1855+09_red_noise_coefficients(60)','B1937+21_gw_coefficients(28)', etc.).makecommongp_fourier(psrs, prior, components, [T, fourierbasis, common, name)]): returns a vector GP object that implements a finite GP over a vector basis for a set of pulsars. This function is analog tomakegp_fourierbut takes a list of pulsars, and is meant to be used withArrayLikelihood.
The following implements a standard NANOGrav likelihood + prior for pulsar psr, with free parameters ['{psrname}_rednoise_gamma', '{psrname}_rednoise_log10_A', 'crn_gamma', 'crn_log10_A']:
import discovery as ds
psl = ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, noisedict),
ds.makegp_ecorr(psr, noisedict),
ds.makegp_timing(psr),
ds.makegp_fourier(psr, ds.powerlaw, 30, name='rednoise'),
ds.makegp_fourier(psr, ds.powerlaw, 14, common=['crn_log10_A', 'crn_gamma'], name='crn')])
logl = psl.logL
logp = dp.makelogprior_uniform(logl.params)
Then you would use jax.jit(logl) (and perhaps jax.jit(jax.grad(logl)))) and jax.jit(logp) in inference.
The following implements a standard NANOGrav HD likelihood:
Tspan = ds.getspan(psrs)
gbl = ds.GlobalLikelihood((ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, noisedict),
ds.makegp_ecorr(psr, noisedict),
ds.makegp_timing(psr),
ds.makegp_fourier(psr, ds.powerlaw, 30, T=Tspan, name='rednoise')
]) for psr in psrs),
ds.makegp_fourier_global(psrs, ds.powerlaw, ds.hd_orf, 14, T=Tspan, name='gw'))
logl = gbl.logL
logp = dp.makelogprior_uniform(logl.params)
Here are ArrayLikelihood versions, which will be especially fast on a GPU:
curn = ds.ArrayLikelihood([ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, psr.noisedict),
ds.makegp_ecorr(psr, psr.noisedict),
ds.makegp_timing(psr, svd=True, constant=1e-6)]) for psr in psrs],
ds.makecommongp_fourier(psrs, ds.makepowerlaw_crn(14), 30, T=Tspan, common=['crn_log10_A', 'crn_gamma'], name='red_noise'))
hd = ds.ArrayLikelihood([ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, psr.noisedict),
ds.makegp_ecorr(psr, psr.noisedict),
ds.makegp_timing(psr, svd=True, constant=1e-6)]) for psr in psrs],
ds.makecommongp_fourier(psrs, ds.powerlaw, 30, T=Tspan, name='red_noise'),
ds.makegp_fourier_global(psrs, ds.powerlaw, ds.hd_orf, 14, T=Tspan, name='gw'))
To create a random realization of a model (e.g., gbl) as a function of its parameters:
sampler = gbl.sampler
p0 = ds.sample_uniform(sampler.params)
key = jax.random.PRNGKey(42)
key, y = sampler(key, p0)
To simulate residuals for a pulsar based on an existing dataset:
psr = ds.Pulsar.read_feather('data/NG15yr-B1855+09.feather')
psl = ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, psr.noisedict),
ds.makegp_ecorr(psr, psr.noisedict),
ds.makegp_timing(psr, variance=1e-12),
ds.makegp_fourier(psr, ds.powerlaw, 30, name='rednoise'),
ds.makegp_fourier(psr, ds.powerlaw, 14, common=['crn_log10_A', 'crn_gamma'], name='crn')
])
sampler = psl.sample
# fix parameters to reasonable value
noisedict_red = {p: v for p,v in psr.noisedict.items() if 'rednoise' in p}
params = {**ds.sample_uniform(sampler.params), **noisedict_red, 'crn_gamma': 4.3, 'crn_log10_A': -14.5}
# should jit the sampler if simulating many datasets
key = ds.rngkey(42)
key, res = sampler(key, params)
# update Pulsar object if desired (make PulsarLikelihood anew to analyze)
psr.residuals = res
To simulate residuals for an array:
import glob
psrs = [ds.Pulsar.read_feather(f) for f in glob.glob('data/*.feather')[:5]]
Tspan = ds.getspan(psrs)
gbl = ds.GlobalLikelihood((ds.PulsarLikelihood([psr.residuals,
ds.makenoise_measurement(psr, psr.noisedict),
ds.makegp_ecorr(psr, psr.noisedict),
ds.makegp_timing(psr, variance=1e-14),
ds.makegp_fourier(psr, ds.powerlaw, 30, T=Tspan, name='rednoise')
]) for psr in psrs),
ds.makegp_fourier_global(psrs, ds.powerlaw, ds.hd_orf, 14, T=Tspan, name='gw'))
sampler = gbl.sample
noisedict_red = {p: v for psr in psrs for p,v in psr.noisedict.items() if 'rednoise' in p}
params = {**ds.sample_uniform(sampler.params), **noisedict_red, 'crn_gamma': 4.3, 'crn_log10_A': -14.5}
key = ds.rngkey(43)
key, res = sampler(key, params)
To load and save PTMCMC chain files into Pandas tables:
chain = ds.read_chain('PTMCMC_example_output')
ds.save_chain(chain, 'example_chain.feather')
chain = ds.read_chain('example_chain.feather')
Note that the resulting Pandas tables will contain useful attributes chain.attr['noisedict'] (a Python dict), chain.attr['priors'] (a list of strings), and chain.attr['runtime_info'] (the runtime_info.txt file rendered as a list of strings).
