One man's trash is another man's treasure, or so the old saying goes. And in astronomy, one man's signal is another man's noise.
The classic example of this is dust in the Milky Way. If you're interested in dust, then you're a deeply weird person... or just interested in star formation. Dust "grains", which are actually about the size of smoke particles, are thought to be critical sites for star formation because they let atomic gas lose energy and cool, allowing the atoms to collide and combine into molecular gas. This is much denser than atomic gas, so eventually this can lead to the cloud collapsing to form a star.
But if you're less of a weirdo, dust just gets in the way. It ruins our majestic sky by blocking our view of all the stars, especially along the plane of the disc of the Galaxy. Some regions are much worse than others, but it's present at some level pretty much everywhere across the sky.
In radio astronomy we have a much more subtle and interesting problem. Of course we always want our observations to be as deep and sensitive as possible. But sometimes, it turns out, the noise in our data can actually be to our advantage – though it comes at a price.
Consider a typical spectrum of a galaxy as detected in the HI line. If you aren't familiar with this, take a look at my webpage for details. Basically, it shows us how brightly the gas in a galaxy is emitting (that is, how dense it is) at any particular velocity. Even without knowing this rudimentary bit of information though, you can probably immediately identify the feature of interest in the signal :
[Figure : an example HI spectrum of a galaxy. All the spectra shown in this post are artificial, generated with a simple online code you can use yourself here.]
We don't need to worry here about why the signal from the galaxy has the particular structure that it does. No, what I want to talk about today is the noise. That's those random variations outside the big bright bit in the middle.
This example shows a pretty nice detection. It's easy to see exactly where the profile of the galaxy ends and the noise begins. But even within the galaxy, you can see those variations are still present : they're just lifted up to higher values by the flux from the galaxy. Basically the galaxy's signal is simply added to the noise.
Now if you still have an analogue radio, you'll know that if you don't get the tuning just right, you'll hear the sounds from your station but only against a loud and annoying background hiss. The worse the tuning, the worse the noise. So you might well think that the following claim is more than a little dubious :
Fainter signals can be easier to detect in noisier data.
So counterintuitive is this that one referee said it "makes no sense whatsoever", doubling down to label it "bizarre" and "not just counterintuitive, but nonsensical".
The referee was wrong. I'll point out that the claim comes not from me but from Virginia Kilborn's PhD thesis (she's now a senior professor). So how does it work ?
The answer is actually very simple : signal is added to noise. That is, regardless of how noisy the data is, the signal from the galaxy is still there. Let me try and do this one illustratively. Suppose we have a pure signal, completely devoid of noise, and for argument's sake we'll give it a top-hat profile (about a quarter of galaxies have this shape, so this isn't anything unusual) :
[Figure : the pure, noise-free top-hat signal. The "S/N" axis measures the signal to noise, a measure of how bright things look given the sensitivity of the data. The numbers in this case are garbage because I set the noise to zero.]
Now let's add it to two different sets of noise, purely random (Gaussian), of exactly the same statistical strength but just different in their exact channel-to-channel values.
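If you want to try this yourself, here's a minimal sketch in Python of that exact step. Everything in it is my own arbitrary choice for illustration (256 channels, a top-hat at twice the noise rms), not the parameters behind the post's actual figures :

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()

# A pure top-hat signal : flat-topped, zero everywhere else.
n_channels = 256
signal = np.zeros(n_channels)
signal[100:140] = 2.0   # arbitrary brightness, in units of the noise rms

# Two independent Gaussian noise realisations : same statistical
# strength (rms = 1), different channel-to-channel values.
noise_a = rng.normal(0.0, 1.0, n_channels)
noise_b = rng.normal(0.0, 1.0, n_channels)

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(10, 4))
for ax, noise in zip(axes, (noise_a, noise_b)):
    ax.plot(signal + noise, drawstyle="steps-mid")  # the signal is simply added to the noise
    ax.set_xlabel("Channel")
axes[0].set_ylabel("S/N")
plt.show()
```

Run it a few times : the same underlying signal looks noticeably brighter or fainter depending purely on which noise it happens to land on.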
Noise is typically random. That means some parts of it will be a bit brighter while other parts will be a bit fainter. If we add our signal on top of the brighter bits, the total apparent flux of our source gets higher. That is, the real flux of our source obviously doesn't change, but what we measure will be greater than if the noise wasn't there. And of course the opposite can happen too : noise dimming, where the signal lands on a faint patch of noise, making the source harder to detect.
Noise boosting (shown in the carefully-chosen example above) is no less important than dimming, but far less expected. Every once in a while, a faint signal will happen to align with some bright parts of the noise, turning a marginal signal into a clearer one. This doesn't really work for the sorts of audio signals you get on a household radio set, as these are much too complex, but the signals of a galaxy are a good deal simpler. And all we need to detect them is (for this basic example at least) pure flux, which the noise can readily provide.
(As an aside, you might notice that the actual peak levels in these cases aren't much different, though the average level inside the source profile is higher in the second case. While peak levels most certainly can be affected by noise boosting and dimming, what's absolutely crucial here is which detection method we're using to find the signals. I'll return to this below.)
Of course, there are limits to this : it will only work for signals which are comparable in strength to the noise. As the noise level gets higher, the random variations will increasingly tend to "wash out" our signal. Now the flux levels of the signals we can receive vary hugely depending on the nature of our data set, but signal-to-noise ratios (S/N, or sometimes SNR) mean the same thing in any of them. That is, a signal which is ten times the typical noise value (the rms) has the same statistical significance in any data set, even though the actual flux value it corresponds to can be totally different. Expressing signal strength in terms of the noise level makes things very convenient : a five sigma (5σ) source just means something that's five times brighter than the typical noise level.
So suppose we have a 2σ source which happens to align with a 3σ peak in the noise : bam, we've got ourselves a quite respectable 5σ detection*. But if we keep the flux level of the signal the same and increase the noise level, the signal's S/N will go down. Instead of adding 2σ to 3σ, we'll be adding ever lower and lower values : the "sigma level" of the noise peaks won't change, but that of the signal certainly will. Pretty quickly we won't be shifting that 3σ peak from the noise by any appreciable degree. We'll be adding the same flux value, but to an ever-greater starting level.
* Sometimes five sigma is quoted as a sort of scientifically universal gold-standard discovery threshold. This is simply not true at all, because if you have enough data, you'll get that level of signal just by chance alone. Far more importantly, the noise in real data is often far from being purely random, so choosing a robust discovery threshold requires a good knowledge of the characteristics of the data set.
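That first point is easy to check for yourself, assuming purely Gaussian noise (which, again, real data rarely is). The billion-sample figure below is just my stand-in for a modern survey-sized data cube :

```python
from scipy.stats import norm

# One-sided probability of a single Gaussian value exceeding +5 sigma.
p = norm.sf(5.0)   # about 2.9e-7

# Expected number of spurious >5 sigma peaks in, say, a billion
# independent measurements (not unusual for a modern data cube).
print(f"P(>5 sigma) per sample : {p:.1e}")
print(f"Expected false peaks per billion samples : {p * 1e9:.0f}")
```

That's a few hundred perfectly respectable-looking 5σ peaks, from noise alone.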
As a possibly pointless analogy, consider lions. If you go from having no lions to one lion, you've just put yourself in infinitely more danger. If you add a second lion you're in even more trouble. But if you've got ten lions and add one more, you won't really notice the difference.
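To put rough numbers on the boosting itself, here's a toy simulation, entirely my own : a flat 2σ source in forty channels of unit-rms Gaussian noise, counting how often random alignments push its peak above 5σ.

```python
import numpy as np

rng = np.random.default_rng(42)

n_trials, n_channels = 100000, 40
# A flat 2-sigma signal in every channel, plus unit-rms Gaussian noise.
peaks = (2.0 + rng.normal(0.0, 1.0, (n_trials, n_channels))).max(axis=1)

print(f"Realisations with a peak above 5 sigma : {(peaks > 5.0).mean():.1%}")
```

In my runs that comes out at around five percent. Keep the signal's flux fixed but double the noise rms, and the same calculation drops to well under one percent : the noise peaks are as common as ever, but the signal no longer adds enough sigmas to matter.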
"But hang on," you might say, "surely that means that your earlier claim that fainter signals are more detectable in nosier data can't possibly be correct ?". A perfectly valid question ! The answer is that it depends on how we go about detecting the signals. The details are endless, but two basic techniques are to search for either a S/N ratio or a simple flux threshold. These can give very different results to each other.
Now if you use S/N, which is generally a good idea, then indeed signals of lower flux levels generally don't do well in increasingly noisier data, because of the ever-smaller relative boost the signal provides. And of course, the chance of the noise not merely obscuring but actually suppressing the signal gets ever greater, since there's just as much chance of aligning with a low-value region of the noise as a high-value one.
But S/N is not the only way of detecting signals. You might opt to use a simple flux threshold instead : it's computationally cheaper, easier to program, and most importantly of all it gives you more physically meaningful results. If you do it that way, then it's a different story. When you add a signal to noise, the flux level always increases, making it much easier to push the flux above your detection threshold by this method. Which makes noise boosting very much easier to explain.
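Here's a sketch of the two criteria side by side, again with my own toy numbers rather than any real source-finder : the same faint top-hat source, searched in data with two different noise levels.

```python
import numpy as np

rng = np.random.default_rng(1)

flux, n_channels, n_trials = 2.0, 40, 20000
for rms in (1.0, 2.0):
    # The same source flux, observed in quieter and then noisier data.
    spectra = flux + rng.normal(0.0, rms, (n_trials, n_channels))
    peaks = spectra.max(axis=1)
    sn_found = (peaks / rms > 4.0).mean()   # signal-to-noise threshold
    flux_found = (peaks > 4.0).mean()       # plain flux threshold ; coincides with the S/N cut when rms = 1
    print(f"rms = {rms} : S/N cut finds {sn_found:.1%}, flux cut finds {flux_found:.1%}")
```

With the S/N cut, doubling the noise slashes the detection rate ; with the fixed flux cut, the noisier data actually yield more "detections" of the same faint source... though pure noise would also sail over that flux threshold far more often, which is exactly the spurious-detection trap I'll come back to below.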
Even more interesting is the so-called Eddington bias. What this means is that any survey will tend to overestimate the fluxes of its weakest signals : those signals which are so faint that they can only be detected at all thanks to chance alignments with the noise. The result is that when someone comes along and does a deeper survey, they'll often find that those sources have less flux than reported in the earlier, less sensitive data.
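A quick way to see the Eddington bias in action, under the simplest assumptions I could pick (a population of identical faint sources, unit-rms Gaussian noise, and a hard 2.5σ detection cut) :

```python
import numpy as np

rng = np.random.default_rng(7)

# A population of identical faint sources, each measured once with unit-rms noise.
true_flux = 1.0
measured = true_flux + rng.normal(0.0, 1.0, 100000)

# Keep only the "detections" : measurements above a 2.5-sigma threshold.
detected = measured[measured > 2.5]

print(f"True flux of every source : {true_flux}")
print(f"Mean flux of detections   : {detected.mean():.2f}")
```

The detected sources average nearly three times their true flux here, purely because only the lucky, noise-boosted measurements made the cut : exactly what a deeper survey would later revise downwards.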
There are plenty of other subtleties. The signal might not even need to be perfectly superimposed on a noise peak : if it's merely adjacent to one, that can create the appearance of a wider, brighter signal which can be easier for some algorithms (and people !) to detect. And of course, while we'd like noise to be perfectly uniform and random, this isn't always the case. Importantly, the rms value doesn't tell us anything at all about the coherency of structures in the data, as so powerfully shown by the ferocious Datasaurus.
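The rms point deserves a tiny demonstration of its own : perfectly coherent structure and pure noise can have identical rms values. The sine wave below is just my stand-in for "coherent structure".

```python
import numpy as np

rng = np.random.default_rng(3)

n = 1000
random_noise = rng.normal(0.0, 1.0, n)
# A sine wave scaled so its rms is exactly 1 (the rms of sin is 1/sqrt(2)).
coherent_wave = np.sqrt(2.0) * np.sin(2.0 * np.pi * np.arange(n) / 100.0)

print(f"rms of random noise  : {random_noise.std():.3f}")
print(f"rms of coherent wave : {coherent_wave.std():.3f}")
```

Same rms, utterly different data.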
For the eye, the effects of this can be extremely complex, and they're poorly understood in astronomy. If you have very few coherent noise structures, for example, you might think this nice clean background would make fainter structures easier to spot. But actually, I have some tentative evidence that this isn't always the case : the eye can be lulled into a false sense of emptiness, whereas if there are a few obvious structures to attract attention, you start to believe things really are present, so you're more likely to identify them. No doubt if your data were dominated by structures, the eye would in effect perceive them as background noise again and the effect would diminish, but this is something that needs more investigation. My guess is there's a zone in which you have enough structures to encourage a search, but few enough that they don't obscure the view.
Ultimately, getting deeper data is always the better option. For every source noise-boosted to detectability, there'll be another which is suppressed and hidden. But this explains very neatly that supposedly "nonsensical" result*, especially if your source-finding routine is based on peak flux : of course, if you add signal to noise and keep the same flux threshold in your search, you're more likely to find the fainter signals in the noisier data... up to a point. Set your threshold too low and you'll just find spurious detections galore, but hit the sweet spot and you'll find signals you otherwise couldn't.
* Kudos to the referee for accepting the explanation; I hadn't heard of any of this until a couple of years ago either. Extragalactic astronomy is full of stuff which isn't that difficult, but seems feckin' confusing when you first encounter it because it isn't formally taught in any lectures !
There's nothing weird about noise boosting then. Mathematically it makes complete sense. But when you first hear about it, it sounds perplexing, which just goes to show how deceptively simple radio astronomy can be. Noise boosting, at least at a basic level, is quite simple, but simple isn't the same as intuitive.