Absorption lines in the spectra of distant quasi-stellar objects (QSOs) are our main source of information about the physical conditions and composition of matter in the intergalactic medium. However, manual analysis of such spectra is time-consuming and error-prone, in part because neither the redshifts of the absorption lines nor the number of intervening clouds is known in advance. Automatic analysis has also proved challenging: observational uncertainties make it difficult to use combinatorial search techniques, and approaches based on probabilistic inference typically require prior knowledge of the number of clouds.
We present a new approach that supports a variety of QSO spectra studies while accounting for observational uncertainties as well as the unknown numbers and types of intervening clouds. We use a probabilistic programming language, Bayesian Logic (BLOG), that extends first-order logic semantics with probability theory and allows efficient specification of physics-based probabilistic models. We show how our system would use the equivalent widths and positions of absorption lines in the spectra as observations. The system also provides a rich query language for investigating properties of the possible intervening clouds. Answers to the resulting queries would be consistent with (and supported by) the input physics models. We will evaluate this representational approach using techniques for approximate probabilistic inference on Hubble Space Telescope (HST) Cosmic Origins Spectrograph (COS) data that cover a range of background QSOs with redshifts up to z = 1.
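As a rough illustration of inference over an unknown number of clouds, the following is a minimal sketch in Python rather than BLOG, and it is not the model described above. It assumes a toy setup in which the number of clouds is Poisson-distributed and each cloud independently yields a detectable Lyman-alpha line with a fixed probability. The function names, the prior mean, and the detection probability are all hypothetical choices made for the example. The posterior over the cloud count is estimated by likelihood weighting, one simple form of approximate probabilistic inference.

```python
import math
import random

LYA_REST_ANGSTROM = 1215.67  # rest wavelength of H I Lyman-alpha


def redshift_from_line(observed_angstrom, rest_angstrom=LYA_REST_ANGSTROM):
    """Redshift implied by identifying an observed line with a rest wavelength."""
    return observed_angstrom / rest_angstrom - 1.0


def sample_poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def posterior_cloud_count(n_detected, prior_mean, p_detect,
                          n_samples=20000, seed=0):
    """Estimate P(number of clouds | number of detected lines).

    Toy assumptions (not from the source): clouds ~ Poisson(prior_mean),
    and each cloud is detected independently with probability p_detect.
    Each prior sample is weighted by the binomial likelihood of the data.
    """
    rng = random.Random(seed)
    weights = {}
    for _ in range(n_samples):
        n_clouds = sample_poisson(rng, prior_mean)
        if n_clouds < n_detected:
            continue  # fewer clouds than detected lines: zero likelihood
        weight = (math.comb(n_clouds, n_detected)
                  * p_detect ** n_detected
                  * (1.0 - p_detect) ** (n_clouds - n_detected))
        weights[n_clouds] = weights.get(n_clouds, 0.0) + weight
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no prior sample was consistent with the data")
    return {n: w / total for n, w in sorted(weights.items())}


if __name__ == "__main__":
    # A Lyman-alpha line observed at 1823.505 Angstrom implies z of about 0.5.
    print(redshift_from_line(1823.505))
    posterior = posterior_cloud_count(n_detected=2, prior_mean=3.0, p_detect=0.6)
    print(sum(n * p for n, p in posterior.items()))  # posterior mean cloud count
```

In this toy model the undetected clouds follow a Poisson distribution with mean prior_mean * (1 - p_detect), so the exact posterior mean is n_detected + prior_mean * (1 - p_detect), which gives a simple check on the sampled estimate. A realistic model would also condition on line positions and equivalent widths, which this sketch leaves out.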
The flexibility and robustness of this analytical approach would be invaluable in evaluating scientific hypotheses with large datasets such as those from HST as well as with data expected from forthcoming large observatories such as GMT, ELT, and TMT.