
In his famous The Feynman Lectures on Physics, Richard Feynman says, "If our small minds, for some convenience, divide this glass of wine, this universe, into parts—physics, biology, geology, astronomy, psychology, and so on—remember that nature does not know it! So let us put it all back together, not forgetting ultimately what it is for. Let it give us one more final pleasure: drink it and forget it all!" In this beautiful and poetic manner, Feynman reminds us that the division of science into various disciplines is artificial. Sooner or later, each discipline is bound to reach a stage where nothing useful can be done without blurring these artificial boundaries.


In this spirit, the International Centre for Theoretical Sciences (ICTS) and the Statistical and Applied Mathematical Sciences Institute (SAMSI) organized a joint workshop titled "Time Series Analysis for Synoptic Surveys and Gravitational Wave Astronomy", which brought together experts from three major disciplines: gravitational wave astronomy, time domain astronomy and statistics. The aim of the workshop was to initiate a dialogue between experts from the three domains so that each discipline's strengths could be leveraged to solve the most difficult problems faced by the others.

Time domain astronomy has taken major strides in the last decade. With the help of robotic telescopes and advancements in image processing and database technologies, astronomers have organized several systematic surveys of the sky in search of transient phenomena. These surveys scan the sky at regular intervals, set by the scientific objectives of the survey. In each scan, an image of the same region of the sky is captured and analyzed by sophisticated computer algorithms. These programs automatically detect all the objects captured in a given image and measure their brightness. In this way, one obtains a time series of brightness measurements for millions of objects in the sky.
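To make this detect-and-measure loop concrete, here is a minimal sketch in Python, using simulated images and a crude threshold detector rather than any survey's actual pipeline: each "scan" produces an image, a bright source is found and measured, and the measurements accumulate into a light curve.

```python
import numpy as np

rng = np.random.default_rng(42)

def detect_and_measure(image, threshold):
    """Flag pixels well above the background and return their summed flux.

    Real pipelines use far more careful source detection and photometry;
    this just illustrates the repeated detect-and-measure step.
    """
    mask = image > threshold
    return image[mask].sum()

# Simulate repeated scans of the same patch of sky: constant background
# noise plus one variable source whose brightness changes sinusoidally.
n_epochs = 50
light_curve = []
for epoch in range(n_epochs):
    image = rng.normal(100.0, 5.0, size=(64, 64))  # sky background
    image[32, 32] += 500.0 + 200.0 * np.sin(2 * np.pi * epoch / 10)  # variable star
    light_curve.append(detect_and_measure(image, threshold=130.0))

# The result is a time series of brightness measurements, one per scan.
print(light_curve[:5])
```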

Some of these objects are interesting: supernovae, which are stars ending their lives literally with a bang; variable stars, whose brightness changes over hours, days or weeks, slavishly obeying the not-yet-fully-understood laws of physics; and many others. To fully understand these objects, however, images alone are not enough: one needs spectra. Unlike images, spectra give not just the total light coming from an object but the variation of that light as a function of wavelength. This information allows us to measure the composition of these objects and their surrounding media. However, spectra are very 'expensive' to obtain. By splitting the light into many wavelengths, we place great demands on our light-capturing devices, which must now register much fainter levels of light. This in turn requires that telescopes collect light for longer durations (technically called the exposure time). Owing to this, only a handful of objects can be studied in great detail.
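The 'expense' of spectra comes down to photon counting. As a rough back-of-the-envelope sketch (with made-up photon rates, and assuming pure photon noise), spreading light over a thousand wavelength bins leaves each bin a thousand times fainter, and since the signal-to-noise ratio grows only as the square root of the exposure time, matching the quality of a one-minute image can cost hours of telescope time:

```python
import numpy as np

def snr(photon_rate, exposure_time):
    # In the photon-noise limit, N detected photons carry noise sqrt(N),
    # so S/N = N / sqrt(N) = sqrt(N) = sqrt(rate * time).
    return np.sqrt(photon_rate * exposure_time)

rate_imaging = 1000.0  # photons/s collected in a broadband image (made up)
n_bins = 1000          # wavelength bins the spectrograph spreads light over
rate_per_bin = rate_imaging / n_bins

t = 60.0               # a one-minute exposure
print(f"imaging S/N in {t:.0f}s: {snr(rate_imaging, t):.1f}")
print(f"per-bin spectral S/N in {t:.0f}s: {snr(rate_per_bin, t):.1f}")

# To match the imaging S/N in every wavelength bin, the exposure must
# grow by the number of bins: t_needed = n_bins * t.
print(f"exposure needed for same per-bin S/N: {n_bins * t / 3600:.1f} hours")
```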


Clearly then, it is not wise to study every varying object indiscriminately; only those exhibiting certain patterns should be singled out. But with millions of objects detected every night, it is impossible for humans to sift through these data. This is where Machine Learning, a family of techniques in which a machine mimics human learning through mathematical models, becomes very important. Experts at the workshop explained the difficulties faced in applying Machine Learning techniques to time domain data to identify interesting and important phenomena among millions of objects. These problems include questions such as "What information is useful for a machine and what isn't?", "Which model works best?" and "How do I help a machine distinguish bad data (technically, data with large error bars) from good data (with smaller error bars)?"
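As an illustration of that last question, here is a minimal sketch (using scikit-learn and invented light-curve features; it shows one simple approach, not the methods presented at the workshop): measurements with large error bars are down-weighted via inverse-variance sample weights, so that noisy objects influence the model less.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Invented features summarizing each object's light curve, e.g.
# amplitude, best-fit period, skewness of the brightness distribution.
n_objects = 2000
X = rng.normal(size=(n_objects, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_objects) > 0).astype(int)

# Per-object measurement uncertainty (the "error bars"); objects with
# noisier light curves get smaller weights.
sigma = rng.uniform(0.1, 2.0, size=n_objects)
weights = 1.0 / sigma**2

X_tr, X_te, y_tr, y_te, w_tr, _ = train_test_split(
    X, y, weights, test_size=0.3, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr, sample_weight=w_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```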

Experts from gravitational wave astronomy highlighted similar problems. Gravitational waves are perturbations in spacetime itself, created by massive objects such as black holes moving at immense speeds. The Laser Interferometer Gravitational-Wave Observatory (LIGO) uses a very special technique known as interferometry to detect these waves. Finding the true signal in the data is like finding a needle in a haystack; maybe even worse, since the needle has the same color as the hay! Glitches arising in the detectors can mimic signals. The very process of detection involves matching the data against a whole collection of 'signal patterns', computed painstakingly by solving the equations of General Relativity for a variety of astrophysical phenomena. Here too, it was pointed out that 'intelligent machines' can be useful in categorizing detections as probably false or probably true.
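This template-matching idea can be sketched in a few lines. The toy below (a made-up 'chirp' template and white noise, nothing like LIGO's actual analysis) slides the template across the noisy data stream and looks for the offset where the correlation peaks:

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "chirp" template: a sinusoid whose frequency rises with time,
# loosely imitating a compact-binary inspiral signal.
t = np.linspace(0.0, 1.0, 1000)
template = np.sin(2 * np.pi * (10 * t + 20 * t**2))

# Bury the template in noise with twice its peak amplitude, so it is
# invisible to the eye in any single sample.
data = rng.normal(scale=2.0, size=5000)
injected_at = 3000
data[injected_at:injected_at + template.size] += template

# Matched filter: correlate the data stream against the unit-norm
# template and look for the offset where the correlation peaks.
template_unit = template / np.linalg.norm(template)
correlation = np.correlate(data, template_unit, mode="valid")

recovered_at = int(np.argmax(np.abs(correlation)))
print(f"injected at sample {injected_at}, recovered at sample {recovered_at}")
```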

Should a gravitational wave signal be detected at the end of this painstaking search, fully understanding the source requires studying it with traditional astronomical methods. But which source do you study? The error margin in the reconstructed position of a gravitational wave source is so large that a telescope will find thousands of sources in the error region. Some of these sources vary so little that they are unlikely to be the gravitational wave source. But how does one know how a source varies without access to its past measurements? This is where time domain astronomers can offer insights, by helping develop routines that leverage time series data from multiple surveys and employ variability detection methods.
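One standard variability check that archival survey data enables, shown here as a minimal sketch rather than any particular survey's method, is to test whether an object's past light curve is consistent with constant brightness using a chi-squared statistic; steady sources can then be deprioritized in the counterpart search.

```python
import numpy as np
from scipy import stats

def variability_p_value(flux, flux_err):
    """Chi-squared test of the hypothesis 'this source is constant'.

    A small p-value means the archival light curve shows real
    variability relative to its error bars.
    """
    weights = 1.0 / flux_err**2
    mean_flux = np.sum(weights * flux) / np.sum(weights)  # weighted mean
    chi2 = np.sum(((flux - mean_flux) / flux_err) ** 2)
    dof = flux.size - 1
    return stats.chi2.sf(chi2, dof)

rng = np.random.default_rng(2)
err = np.full(100, 0.05)
steady = rng.normal(1.0, 0.05, size=100)                 # constant source
variable = steady + 0.3 * np.sin(np.arange(100) / 5.0)   # variable source

print(f"steady source p = {variability_p_value(steady, err):.3f}")
print(f"variable source p = {variability_p_value(variable, err):.2e}")
```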

Whether it is time domain astronomy or gravitational wave astronomy, domain experts have one major entity to deal with: data! Which data are good? Which are not? What are the patterns in the data? What is the chance that a given pattern is not real? These are questions that statisticians spend entire lifetimes battling. While astronomers do know how to handle these questions, they use conventional approaches that make assumptions about the data; for example, that the noise in the data can be described by a Gaussian, or normal, distribution. But increasingly, it has been realized that such assumptions are often incorrect and thus present roadblocks to extracting science from the data.
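A toy example of how the Gaussian assumption can mislead (simulated data; a sketch of the general point, not a technique from the workshop): when a small fraction of measurements are contaminated by large outliers, the mean, which is the optimal estimate under Gaussian noise, is biased, while an estimator that drops the assumption holds up.

```python
import numpy as np

rng = np.random.default_rng(3)

true_brightness = 10.0
n = 500

# Mostly well-behaved Gaussian noise...
noise = rng.normal(0.0, 0.5, size=n)

# ...but 5% of measurements are hit by large positive outliers
# (think cosmic-ray hits adding spurious flux).
hit = rng.random(n) < 0.05
noise[hit] += rng.exponential(scale=20.0, size=hit.sum())

data = true_brightness + noise

# Under a Gaussian noise model the mean is optimal; with heavy-tailed
# contamination it is biased, while the median is barely affected.
print(f"true value: {true_brightness:.2f}")
print(f"mean:       {data.mean():.2f}  (optimal only if noise is Gaussian)")
print(f"median:     {np.median(data):.2f}  (robust to the heavy tail)")
```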


At the workshop, statisticians described how they have recently begun working with astronomers to employ newer techniques in which such assumptions are replaced with more realistic ones. It is clear that many problems in astronomy can be solved only if such synergies become more commonplace. But these synergies are not straightforward to achieve. Each domain has worked in isolation for so long that experts in different domains employ completely different jargon. Depending on who you are, you might say 'independent variable', 'descriptive feature', 'experimental variable' or 'predictor variable' and yet mean the same thing! Building synergy therefore requires constant communication between experts from different domains. This workshop helped form several collaborations among people from diverse backgrounds, collaborations that will likely prove fruitful and propel the progress of science.

Kaustubh Vaghmare

is a data scientist at the Inter-University Centre for Astronomy and Astrophysics (IUCAA), Pune, India.