Modeling Light Curves for Improved Classification of Astronomical Objects

Sky surveys provide a wide-angle view of the universe. Instead of focusing on a single star or galaxy, they provide a snapshot of a wider portion of the sky. Their primary purpose is to search for asteroids, so they take a sequence of images over a short timespan: minutes or hours. Asteroids reveal themselves by what has changed between the images. But meanwhile, much further away, other things are changing too. Most objects in the night sky do not change much on human timescales, but some of the most interesting stars and galaxies are performing for us. As the survey returns to the same location from time to time, we can see which objects have changed. But how do we detect and categorise these objects? Here is a plot of four such objects:

[Figure: light curves of four variable objects]

The first of the four is an example of an active galactic nucleus. The plot shows how the brightness of the object varies over time. The second plot shows a supernova – most of the time we see nothing because the galaxy is below the detection limit, but we see a spark of light for a few weeks. The third object is a flare star – it’s mostly quiet but bursts into life from time to time. The fourth object is like our sun – it varies only a little, but enough to be interesting (fortunately for us!).

These are examples of light curves. As technology improves, increasingly large numbers of these curves will be recorded. We need to quickly identify and categorise the interesting ones so that they can be rapidly followed up (before the show is over!). In “Modeling lightcurves for improved classification of astronomical objects”, we describe how this can be done. You can also read about it on arXiv and examine the software and data.

Confidence Bands for Smoothness

In What’s wrong with simultaneous confidence bands, I discussed the deficiencies of standard confidence bands, but how can we do better? Suppose we have a Gaussian process for the curve with some mean and a covariance kernel:

k(x,x^\prime) = \sigma^2_f \exp \left( - {1 \over 2l^2} (x-x^\prime)^2 \right) + \sigma^2_n \delta(x - x^\prime)

where δ is a delta function. The three hyperparameters, \sigma_f, \sigma_n and l, control the appearance of the curve. We can put priors on these parameters along with a prior on the mean function and calculate the posterior using MCMC methods. The most important parameter, l, controls the smoothness of the curve. We can view the uncertainty in this smoothness by computing a 95% credible interval for this parameter and plotting the two curves corresponding to the endpoints of this interval. The other parameters are set at their posterior means. An example of these bands can be seen in the figure:

[Figure: smoothness confidence bands]
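To make the kernel concrete, here is a minimal sketch of the covariance above. My own code for this work is in R; this is an illustrative pure-Python version, with the parameter names (`sigma_f`, `length`, `sigma_n`) chosen for readability rather than taken from the original software.

```python
import math

def sq_exp_kernel(x, x_prime, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Squared-exponential covariance with an additive noise term.

    sigma_f sets the signal variance, length (the l in the formula)
    controls the smoothness, and sigma_n is the observation noise.
    The delta-function term contributes only when x == x_prime.
    """
    k = sigma_f**2 * math.exp(-((x - x_prime) ** 2) / (2 * length**2))
    if x == x_prime:
        k += sigma_n**2
    return k
```

A larger `length` makes distant points more strongly correlated, which is exactly why this hyperparameter governs how smooth the fitted curve looks.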

The solid line corresponds to the lower end of the 95% interval for the smoothing parameter, giving a rougher fit, while the dashed line corresponds to the upper end of the interval, giving a smoother fit. We are fairly sure that the correct amount of smoothness lies between these two limits.

Notice this makes a big difference to how many peaks and valleys we see in the function. The uncertainty about these features is more interesting than any uncertainty about the vertical position of the curve expressed by the traditional bands.

You can read all the details in my article: Confidence bands for smoothness in nonparametric regression and also view the R code.

What’s wrong with simultaneous confidence bands

Here are some 95% confidence bands for curves fitted to the same data. The first uses spline smoothing from the mgcv package while the second uses a loess smoother from the ggplot2 package.

[Figure: spline (mgcv) and loess (ggplot2) fits with pointwise confidence bands]

If all you want is a graphical expression of the uncertainty in these two estimates, these are just fine. But suppose you want to check whether some proposed function fits entirely within the bands; then you will need to do more work. The bands above are pointwise, meaning that the confidence statement is true at any given point but not for the entire curve. For that, you will need simultaneous confidence bands. These are more work to produce and involve all sorts of intricate calculations. Hundreds of papers have been written on the topic because of the fascinating theoretical challenges it raises. I’m responsible for a few of these papers myself.
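To see why simultaneous bands must be wider, consider the crudest possible construction: a Bonferroni correction over the grid points where the curve is evaluated. This is far simpler than the sharper constructions in the literature, but it illustrates the pointwise/simultaneous gap. A sketch (function name and grid-based setup are my own, for illustration):

```python
from statistics import NormalDist

def band_multipliers(m, level=0.95):
    """Critical values for 95% bands evaluated at m grid points.

    The pointwise multiplier ignores multiplicity; the Bonferroni
    simultaneous multiplier splits the error rate across the m
    points, so it grows with m and the band widens accordingly.
    """
    alpha = 1 - level
    z_pointwise = NormalDist().inv_cdf(1 - alpha / 2)
    z_simultaneous = NormalDist().inv_cdf(1 - alpha / (2 * m))
    return z_pointwise, z_simultaneous
```

With 100 grid points the simultaneous multiplier is roughly 3.5 standard errors against the familiar 1.96 – a substantially wider band for a confidence statement of debatable practical value.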

But are these bands really useful? Properly constructed, they may tell us there’s a 95% chance that the bands capture the true curve. But what is the value in that? There will be some users who are interested in replacing the smooth fit with some particular parametric form, say a quadratic. Such users would be better off embedding their search within a family of increasingly complex alternatives and choosing accordingly. SCBs would not be an efficient strategy.

In SCB papers that include a data example, the usual motivation is that the user is interested in whether some particular feature exists, say a secondary maximum. But users looking for particular features are better off with a method designed to look for those features, such as SiZer.

The greatest source of uncertainty is revealed by comparing the two figures above: how much smoothing should be applied? This makes a crucial difference to our interpretation. In the typical SCB construction, the amount of smoothing is chosen by some algorithm and the bands only reflect the uncertainty in the amplitude of the curves.

We need bands that tell us about the uncertainty in the smoothing. I will explain how to do this in the next blog post.