The correct estimate of the probability of false detection of the matched filter in weaksignal detection problems
II. Further results with application to a set of ALMA and ATCA data
^{1} Chip Computers Consulting s.r.l., Viale Don L. Sturzo 82, S. Liberale di Marcon, 30020 Venice, Italy
email: robertovio@tin.it
^{2} ESO, Karl Schwarzschild strasse 2, 85748 Garching, Germany
email: pandrean@eso.org
^{3} École polytechnique, Route de Saclay, 91120 Palaiseau, France
Received: 17 July 2016
Accepted: 6 May 2017
The matched filter (MF) is one of the most popular and reliable techniques to the detect signals of known structure and amplitude smaller than the level of the contaminating noise. Under the assumption of stationary Gaussian noise, MF maximizes the probability of detection subject to a constant probability of false detection or false alarm (PFA). This property relies upon a priori knowledge of the position of the searched signals, which is usually not available. Recently, it has been shown that when applied in its standard form, MF may severely underestimate the PFA. As a consequence the statistical significance of features that belong to noise is overestimated and the resulting detections are actually spurious. For this reason, an alternative method of computing the PFA has been proposed that is based on the probability density function (PDF) of the peaks of an isotropic Gaussian random field. In this paper we further develop this method. In particular, we discuss the statistical meaning of the PFA and show that, although useful as a preliminary step in a detection procedure, it is not able to quantify the actual reliability of a specific detection. For this reason, a new quantity is introduced called the specific probability of false alarm (SPFA), which is able to carry out this computation. We show how this method works in targeted simulations and apply it to a few interferometric maps taken with the Atacama Large Millimeter/submillimeter Array (ALMA) and the Australia Telescope Compact Array (ATCA). We select a few potential new point sources and assign an accurate detection reliability to these sources.
Key words: methods: data analysis / methods: statistical / methods: numerical
© ESO, 2017
1. Introduction
In many research and engineering areas the matched filter (MF) represents the standard tool for the detection of signals with amplitude smaller than the level of the contaminating noise (Kay 1998; Tuzlukov 2001; Levy 2008; Macmillan & Creelman 2005). The MF is a linear filter that optimally filters out the Fourier frequencies where the noise is predominant, preserving those where the searched signal gives a greater contribution. In many situations, the MF displays the best performance, providing the greatest probability of true detection subject to a constant probability of false detection or false alarm (PFA). The widespread success of the MF is due to its reliability and resulting robustness. The main limitation of this approach lies in the implicit assumption that the position of the signal within a sequence of data (e.g., an emission line in a spectrum) or on an image (e.g., a point source in an astronomical map) is known. In most practical applications this is not the case. For this reason, the MF is used assuming that, if present, the position of a signal corresponds to a peak of the filtered data. In a recent paper, Vio & Andreani (2016) have shown that, when based on the standard but wrong assumption that the probability density function (PDF) of the peaks of a Gaussian noise process is a Gaussian, this approach may lead to a severe underestimation of the PFA. The same authors have provided an alternative procedure to correctly compute this quantity.
The PFA is a useful tool to fix a preliminary detection threshold, but with this quantity alone it is not possible to assess the reliability of a specific detection. This is because it does not provide the probability that a given detection is spurious but rather the probability that a generic peak due to the noise in the filtered signal can exceed, by chance, a fixed detection threshold. However, there are often cases in which the knowledge of the probability for a peak to be a source is crucial to plan followup observations to identify the source itself. For this reason, in this work we introduce the socalled specific probability of false alarm (SPFA), which is able to provide a precise (within a few percent) quantification of the reliability of a detection.
In Sect. 2, the main characteristics of MF for onedimensional signals are reviewed as well as the reason why it underestimates the PFA. In the same section the method suggested by Vio & Andreani (2016) to correctly compute this quantity is briefly reconsidered. In Sect. 3 we address the question of the statistical meaning of the PFA and introduce the SPFA. The arguments are extended to the twodimensional signals in Sect. 4. Finally, in Sect. 5 the procedure is applied to a simulated map, whereas in Sects. 6 and 7 it is applied to two interferometric maps obtained with the Atacama Large Millimeter/submillimeter Array (ALMA) and to one map obtained with the Australia Telescope Compact Array (ATCA). The final remarks are deferred to Sect. 8.
2. Matched filter for onedimensional signals
2.1. The basics
Given a discrete observed signal x = [ x(0),x(1),...,x(N − 1) ] ^{T} of length N, the model assumed in the MF approach is x = s + n, where s is the deterministic signal to detect and n a zeromean Gaussian stationary noise with known covariance matrix (1)Here, symbols E [ . ] and ^{T} denote the expectation operator and the vector or matrix transpose, respectively.
Under these conditions, according to the NeymanPearson theorem (Kay 1998; Vio & Andreani 2016), a detection is claimed when (2)with (3)Here f_{s}, an N × 1 array, represents the matched filter. The main characteristic of the MF is that it maximizes of the probability of detection (PD) under the constraint of a fixed PFA.
2.2. Matched filter in practical applications
The MF has been obtained under two assumptions: first, signal s is known and, second, x and s have the same length N. This last point implicitly means that the position of s within x is known.
Very often only the shape g of the signal s = ag is known but not its amplitude a. In this case, the relaxation of the first assumption does not have important consequences given that the test in Eq. (2) can be rewritten in the form (4)where γ′ = γ/a and the MF becomes (5)Hence, the resulting is a statistic that is independent of a. This means that the unavailability of a does not affect the PFA but only the PD.
If the second assumption is loosened, the signal s has a length N_{s} that is smaller than the length of the observed data (e.g. an emission line in an experimental spectrum). If the amplitude a is also unknown, the standard approach is to claim a detection when (6)where (7)where î_{0} is the estimate of the unknown position i_{0} of the signal, u a value typically in the range [ 3,5 ], and the standard deviation of the sequence (8)Namely, the observed signal x is crosscorrelated with the MF (5); see Eq. (8). The greatest peak detected in ; see Eq. (7). Finally this latter is tested if it exceeds a threshold set to u times the standard deviation of the sequence ; see Eq. (6). In the affirmative case the peak corresponds to a detection, otherwise it is assumed to be due to the noise. If the number of signals s present in x is also unknown, this procedure has to be applied to all the peaks in .
It is widespread practice that the PFA corresponding to the statistics (6)is given by (9)where Φ_{c}(u) = 1 − Φ(u) and Φ(u) is the standard Gaussian cumulative distribution function^{1}. However, as shown by Vio & Andreani (2016), such practice can lead to underestimate this quantity severely. The point is that the PDF of the peaks of a stationary Gaussian random signal is not a Gaussian as implicitly assumed in Eq. (9). For this reason, the correct PFA has to be estimated by means of (10)where (11)with (12)and (13)providing the PDF^{2} of the local maxima of a zeromean, unitvariance, smooth stationary onedimensional Gaussian random field (Cheng & Schwartzman 2015a,b). Here, (14)where ρ′(0) and ρ′′(0) are, respectively, the first and second derivative with respect to r^{2} of the twopoint correlation function ρ(r) at r = 0, where r is the interpoint distance. When κ = 1, the functional form of the twopoint correlation function is a Gaussian. The condition of smoothness for the random fields requires that ρ(r) be differentiable at least six times with respect to r. In a recent work, Cheng & Schwartzman (2016) have shown that, contrary to what is claimed in Cheng & Schwartzman (2015a,b) and in Vio & Andreani (2016), the constraint κ ≤ 1 actually is not necessary for the validity of the PDF (13).
The main problem in using Eq. (10)is the estimation of parameter κ. One possibility, suggested by Vio & Andreani (2016), is to obtain such a quantity from the fit of the discrete sample twopoint correlation function of with an appropriate analytical function. The reason is that the estimation of the correlation function of the noise is also required by the MF and, therefore, it is not an additional condition of the procedure. However, a reliable estimate of ρ′(0) and ρ′′(0) is a very delicate issue. A robust alternative is to estimate κ through a maximum likelihood approach (15)where { z_{i} }, i = 1,2,...,N_{p}, are the local maxima of . This approach cannot be adopted if the number and/or the amplitude of the point sources modify the statistical characteristics of the noise background. The latter case is simple to deal with since it is sufficient to mask all the sources that clearly stand out from the noise. Conversely, in the first case the approach is unfeasible. A comparison of the histograms H(x) of the pixel values and H(z) of the peak amplitudes with the respective PDFs φ(x) and ψ(z) permits us to check if this condition is fulfilled (see more in Sects. 5–7).
Once and N_{p} are fixed, an additional benefit of the maximum likelihood approach is that by means of Eq. (14)and of the expected number of peaks per unit length (Cheng & Schwartzman 2015b), (16)it is possible to obtain an estimate of ρ′(0) and ρ′′(0) as, In turn, by means of the Taylor expansion, (19)it is possible to evaluate the form of ρ(r) for r ∈ (0,1), an interval that in the case of discrete signals where r can take only integer values is not computable by means of the classical estimators of the correlation function.
2.3. Morphological analysis of the peaks
It is a common believe that it is possible to improve the rejection of spurious detections by means of a morphological analysis of the shape of the peaks in the filtered map . Such a conviction is based on the assumption that a peak due to a signal s looks different from a peak due to noise. Unfortunately, the situation is more complex.
Let us assume, for the moment, that n is a standard Gaussian white noise. In the case of no signal, is obtained through the correlation of n with f = g. As a consequence, it is a Gaussian stochastic process with autocorrelation function ρ(τ) given by (20)where g_{⋆} = g ⊗ g with symbol ⊗ denoting the correlation operator. Here, the point is that g_{⋆} is also the shape of the template g after the matched filtering. Moreover, the conditional expectation of given x [ i_{p} ], with i_{p} the position of the peak, is (21)where is the peak value. This means that in the expected shape of a peak due to the noise is identical to that of the signal s after the matched filtering.
Something similar holds when the noise is of nonwhite type. Indeed, in the case of no signal, the quantity in Eq. (4)can be written in the form where y^{T} = x^{T}C^{− 1 / 2} and h = C^{− 1 / 2}g. Now, E [ yy^{T} ] = E [ C^{− 1 / 2}xx^{T}C^{− 1 / 2} ] = C^{− 1 / 2}E [ xx^{T} ] C^{− 1 / 2} = C^{− 1 / 2}CC^{− 1 / 2} = I. This means that the nonwhite case can be brought back to a white case where now the matched filter takes the form f = h.
2.4. Previous similar works on this issue
To the best of our knowledge to date, the beforehand detection procedure is the first that makes use of the statistical characteristics of the peaks of a Gaussian noise. In the past, LópezCaniego et al. (2005a,b) have proposed two modifications of the MF, known as the scaleadaptive filter (SAF) and the biparametric scaleadaptive filter (BSAF), which are claimed to be able to outperform the MF in the sense of maximizing the probability of detection subject to a constant probability of false detection. Contrary to the MF, these filters work not only with the amplitude of the peaks but also with the corresponding curvature. However, such a claim is incompatible with the NeymanPearson theorem given that, as seen in Sect. 2.1, in the case of stationary Gaussian noise and conditioned on the a priori knowledge of the true position of the signal, the filter that maximizes the probability of detection subject to a constant probability of false detection is the MF. This means that, under the same conditions, no other filter can outperform it. A careful reading of the abovementioned works permits to realize that the optimal properties of SAF and BSAF hold only under an additional condition with respect to the MF, namely that the true position of the source coincides with a peak of the filtered noise. It is worth noting that this is an unrealistic condition. In fact, since the arguments supporting SAF and BSAF are developed in the framework of continuous signals, such a coincidence represents an event with probability zero. As a consequence, the plain and uncritical extension of SAF and BSAF to the case of discrete signals operated by LópezCaniego et al. (2005a,b) is questionable. The same also applies to their numerical simulations since the comparison of the performance of the SAF, BSAF, and MF should have been based on all the sources present in the simulated signals and not only on the subset of the sources whose true position, after the filtering, coincide with that of an observed peak.
3. Specific probability of false alarm (SFPA)
Contrary to what one could believe at first glance, the PFA given by Eq. (10)does not provide the probability α that a specific detection is spurious but the probability that a generic peak due to the noise in can exceed, by chance, the threshold u. If N_{p} peaks due to the noise are present in , then a number α × N_{p} among them is expected to exceed the prefixed detection threshold. For example, if in there are 1000 peaks, then there is a high probability that a detection with a PFA equal to 10^{3} is spurious. As a consequence, in spite of the low PFA, the reliability of the detection is actually small. A possible strategy to avoid this problem is to fix a threshold u such as α × N_{p} ≪ 1. However, in this way there is the concrete risk to be too conservative and miss some true detections. In literature some procedures are available to alleviate this kind of problem. They are essentially of nonparametric type (i.e. they are not able to exploit all the available information). A popular approach is represented by the false discovery rate (FDR) (Miller et al. 2001; Hopkins et al. 2002). Here, we propose a parametric solution that consists in a preselection based on the PFA and then in the computation of the probability of false detection for each specific detection. We call this specific probability of false alarm (SPFA). This quantity can be computed by means of the order statistics, in particular by exploiting the statistical characteristics of the greatest value of a finite sample of identical and independently distributed (iid) random variable from a given PDF (Hogg et al. 2013). Under the iid condition, the PDF g(z_{max}) of the largest value among a set of N_{p} peaks { z_{i} } is given by (22)Hence, the SPFA can be evaluated by means of (23)Actually, since the peaks of a generic isotropic Gaussian random field tend to cluster, the iid condition is not necessarily valid. However, in situations where the twopoint correlation function ρ(r) of the noise is narrow with respect the area spanned by the data (a basic situation for the application of the MF), this condition can be expected to hold with good accuracy. The rationale is that two points with a distance r such that ρ(r) ≈ 0 are essentially independent. The same holds for two generic peaks. Hence, most of the peaks can be expected to be approximately iid and Eq. (22)is still applicable but possibly with an effective number (see Sect. 6 in Majumdar & Comtet 2005). This last point is due to the dependence among a set of random variable which lowers its number of degrees of freedom^{3}.
Fig. 1 PDF g(z_{max}) of the greatest value of a finite sample of N_{p} = 10^{2},10^{3}, and 10^{4} identical and independently distributed random variable from the PDF ψ(z) of the peaks of a onedimensional zeromean unitvariance stationary Gaussian process field with κ = 1. The colorfilled areas provide the respective SPFA for a detection threshold u (dashed line) corresponding to a PFA (10)equal to 10^{4}. It is evident that a detection threshold independent of N_{p} is not able to quantify the risk of a false detection. 

Open with DEXTER 
Fig. 2 PDF g(z_{max}) of the greatest value of a finite sample of N_{p} = 10^{2},10^{3}, and 10^{4} identical and independently distributed random variable from the PDF ψ(z) of the peaks of a twodimensional zeromean unitvariance isotropic Gaussian random field with κ = 1. The colorfilled areas provide the respective SPFA for a detection threshold u (dashed line) corresponding to a PFA (10)equal to 10^{4}. It is evident that a detection threshold independent of N_{p} is not able to quantify the risk of a false detection. 

Open with DEXTER 
Fig. 3 a) Simulated 30 point sources on a map of 300 × 300 pixels; b) same as in panel (a) with the addition of a Gaussian whitetype noise. The amplitude of the sources is well below the noise level; c) zeromean unitvariance version of the noise map after the application of the MF; d) zeromean unitvariance version of the map obtained after the application of the MF to the map in panel (b). 

Open with DEXTER 
Fig. 4 a) Histogram H(x) of the pixel values of the map in Fig. 3d. For reference, the standard Gaussian PDF φ(x) is also plotted; b) histogram H(z) of the values of the peaks in the same map vs. the corresponding PDF ψ(z) is shown. There is good agreement between φ(x) and ψ(x) with the corresponding histograms. 

Open with DEXTER 
Fig. 5 Twopoint correlation function ρ_{p}(r) for the peaks in Fig. 3d. The interpoint distance r is in pixels units. The two red lines define the 95% confidence band. They were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal but whose values were randomly permuted. 

Open with DEXTER 
Fig. 6 Detection results for the peaks in Fig. 3d. The black circles highlight the peaks with a PFA smaller than 2.55 × 10^{3}. Each of such peaks is indexed with an increasing number according to its amplitude. The black crosses highlight the true position of the point sources. 

Open with DEXTER 
Fig. 7 a) Probability distribution function (PDF) of the number N_{FD} of false detections among the peaks #1 – #11 in Fig 6; b) cumulative distribution function (CDF) is shown corresponding to the PDF in the previous panel; c) PDF is shown of the number N_{FD} of false detection among the peaks #8 – #11 in Fig 6; d) CDF is shown corresponding to the PDF in the previous panel. 

Open with DEXTER 
Fig. 8 Rank fraction vs. pvalues (left axis) and peak amplitudes (right axis) of the peaks in Fig. 3d (see text). Here the pvalues correspond to the PFA of the peaks. The dashed lines provide the detection threshold for the corresponding false discovery rate α_{FDR}. Any cross below a detection threshold corresponds to a detection. The detection thresholds are determined by the greatest pvalue below the corresponding continuous line (for details, see Miller et al. 2001). 

Open with DEXTER 
A way to measure the degree of dependence of the peaks is the twopoint correlation function ρ_{p} [ d ]. This discrete function is computed on a set of nonoverlapping and contiguous distance bins of size Δd(24)with N_{d} the number of peak pairs with a distance within the range (d − Δd/ 2,d + Δd/ 2 ]. It measures the tendency of two peaks with similar value to be next to each other.
The numerical evaluation of integral (23)does not present particular difficulties since If the number of signals present in is unknown, the above procedure can be applied, in order of decreasing amplitude, to all peaks with a PFA smaller than a prefixed α, and reducing N_{p} by one unit after any confirmed detection. The last step is based on the rationale that if a peak can be assigned to a signal s in x, it can be removed from the set of the peaks related to the noise.
The importance of the SPFA is demonstrated by Fig. 1 where the PDF g(z_{max}), corresponding to the PDF ψ(z) of the peaks of a stationary zeromean unitvariance Gaussian random process with κ = 1, is plotted for three different values of the sample size N_{p}, i.e., 10^{2}, 10^{3}, and 10^{4}. The colorfilled areas provide the respective SPFA for a detection threshold u corresponding to a PFA equal to 10^{4}. It is evident that a detection threshold independent of N_{p} is not able to quantify the risk of a false detection. This figure also shows that the determination of the number is not a critical operation since g(z_{max}) is a slow changing function of N_{p} and for weakly dependent peaks it is . The SPFA is also useful because, by means of the Poissonbinomial distribution^{4}, it is possible to estimate the probability that the number of false detection N_{FD} is equal or less than a given integer k (see below).
4. Extension of matched filter to the twodimensional case
In principle, the extension of MF to the twodimensional signals , , and does not present particular difficulties. Indeed, it is sufficient to set where VEC [ ℰ ] is the operator that transforms a matrix ℰ into a column array by stacking its columns one underneath the other, to obtain a problem similar to that in Sect. 2. There are only two differences. The first is the structure of matrix C. In the onedimensional case matrix C is of Toeplitz type, i.e., a matrix in which each descending diagonal from left to right is constant. In the twodimensional case, however, it becomes a blockToeplitz with Toeplitzblock (BTTB) type, i.e., a matrix that contains Toeplitz blocks that are repeated down the diagonals of the matrix, as a Toeplitz matrix has elements repeated down the diagonal. The second one concerns the PDF of the local maxima and their expected number per unit area that, in the case of a zeromean unitvariance Gaussian isotropic noise, are given by (Cheng & Schwartzman 2015a,b) (31)and (32)respectively. However, a computational problem arises because, even for maps of moderate size, the covariance matrix C becomes rapidly huge. Hence, some efficient numerical methods based on a Fourier approach have to be used as in Vogel (2002, Chap. 5), Jain (1989, page 145, ElSamie et al. (2013, Appendix B), and Lagendijk & Biemond (1991, Appendix A).
In comparison with Fig. 1, Fig. 2 also shows the importance of the SPFA for the twodimensional case. Here, the PDF g(z_{max}), corresponding to the PDF ψ(z) of the peaks of an isotropic twodimensional zeromean unitvariance Gaussian random field with κ = 1, is plotted for three different values of the sample size N_{p}, 10^{2}, 10^{3}, and 10^{4}. Again, the colorfilled areas provide the respective SPFA for a detection threshold u corresponding to a PFA equal to 10^{4}.
A final note concerns the condition of isotropy of the noise. There are situations where such condition can be relaxed. In particular this happens with correlation functions of type (33)where f(.) is a real function and A = B^{T}B with B a nondegenerated matrix. The random fields corresponding to this correlation function are not isotropic but only homogeneous (i.e., the statistical characteristics are constant along a given direction). The arguments presented above can also be applied to this case given that an opportune rotation and rescaling of the axes can convert r^{T}Ar into r^{T}r. Since these operations on the coordinate system do not modify the values of the random field, the PDF of the peaks does not change^{5}.
5. A numerical experiment
To illustrate the usefulness of the proposed method, this is applied to a simulated 300 × 300 pixels map where 30 point sources, with a uniform random spatial distribution and Gaussian profile with standard deviation along the horizontal and vertical directions set to 1.3 and 1.8 pixels, respectively, (see Fig. 3a), are embedded in a Gaussian white noise. All the point sources have the same amplitude set to the standard deviation of the noise. This experiment simulates a very difficult situation where the amplitude of the point sources is well below the level of the noise. Indeed, in Fig. 3b, which shows the observed map, the sources are not even visible. In situations like this a matched filtering operation is unavoidable. As seen above, the MF represents an optimal solution. However, a comparison of Figs. 3c and d, which show the zeromean unitvariance matched filtered versions of the noise component and observed map, respectively, indicates that also after the MF operation the detection of the point sources is still a problematic issue since no strong peaks are evident. Moreover, as discussed in Sect. 2.3, after the matched filtering, the blobshaped structures due to the noise have a shape similar to that of the filtered point sources, i.e., the morphological analysis cannot be used to detect the searched point sources.
Because of the asymmetric shape of the point sources, the noise background is nonisotropic but simply homogeneous. This does not represent a problem given that the corresponding twopoint autocorrelation function is a twodimensional Gaussian function, i.e., of the type ρ(r) = f(r^{T}Ar) as in Sect. 4. A situation like this one reflects the results found in ALMA observations (see below).
The histogram of the pixel values of the matched filtered map in Fig. 4a is clearly compatible with a Gaussian PDF φ(x). The number of identified peaks is 1601 and the estimated is about 1. Figure 4b shows that the corresponding PDF ψ(z) is in good agreement with the histogram H(z) of the peak values. The iid condition for the peaks, which is necessary for the computation of the PSFA, is demonstrated in Fig. 5, which shows the twopoint correlation function of the peaks. As a preliminary detection threshold, a value of u = 3.72 was chosen. Eleven peaks exceed it. In Figs. 6 they are highlighted and indexed with an increasing number according to the amplitude. Among these, only five ( #4, #6, #8, #10, and #11) correspond to a point source. In a situation like this one, an accurate quantification of the detection reliability is critical.
Since for the PFA (9)it is Φ_{c}(3.72) ≈ 10^{4}, according to the standard procedure all the peaks should be considered true detections with a high confidence level. To a minor extent, the same also holds for the PFA (10)since Ψ_{c}(3.72) ≈ 2.55 × 10^{3}. On the other hand, with u = 5, an often used threshold, no peak should have been selected. This supports the unreliability of a detection based only on the threshold u. The situation with SPFA is different. From Fig. 7a, which shows the Poissonbinomial distribution corresponding to the SPFAs of the selected peaks, it appears that the most probable number N_{FD} of false detections is N_{FD} = 7. Moreover, Fig. 7b shows that the probability that N_{FD} ≤ 7 is about 0.54. This result is in good agreement with the true N_{FD} = 6. If only the four highest peaks #8 – #11 are considered, with values of the SPFA equal to 0.60, 0.42, 0.40, and 0.24, the most probable N_{FD} is 2 and the probability that N_{FD} ≤ 2 is about 0.81 (see Figs. 7c–d). Among the four peaks, only the peak #9 corresponds to a false detection, again in good agreement with the expected number.
It is useful to compare the proposed method with the popular approach of the leastsquares fit, on a small submap centered on a given peak, of a model constituted by the template g superimposed on a twodimensional lowdegree polynomial . According to this approach, the reliability of the detection is tested by comparing the estimated amplitude with a multiple of the standard deviation σ_{r} of the residuals. When applied to the 11 peaks selected above, assuming a twodimensional polynomial of first degree and using a submap of size 11 × 15 pixels, in all of the cases the estimated amplitude is greater than the threshold set to 5σ_{r}. As a consequence, all of the peaks should be considered true detections with an high degree of reliability. Two more true point sources have been detected, but at the cost of six false detections.
Fig. 9 uvplane coverage of the ALMA band 3 observations of the radio source field PKS074519. 

Open with DEXTER 
Another useful comparison is with the FDR technique mentioned in Sect. 3. With this approach it is possible to control the expected number of detections falsely declared significant as a proportion α_{FDR} of the number of all tests declared significant. Figure 8 shows that with α_{FDR} = 0.2 and 0.4 the number of detected point sources is 9 and 12, respectively. However, in the first case the number of false detection is 4, whereas it is 7 in the second case. In both cases more true point sources have been detected than the methodology based on the PDF g(z_{max}), but at the cost of a larger fraction of false detections. Moreover, also here, no quantification is possible for the reliability of each specific detection. Finally, it is worth noticing that the actual fraction of false detections is 0.44 and 0.58, which is well above the corresponding expected value α_{FDR}.
The conclusion is that only the SPFA is able to really quantify the true reliability of a given detection and should be considered before followup observations to identify the source itself.
Detected sources.
6. Application to real ALMA maps
We apply the aforementioned procedure to two interferometric maps, obtained at two frequencies with ALMA, with the aim to detect the faint point sources in the field of the radio source PKS0745191. These maps are characterized by an excellent uvplane coverage (see Fig. 9). For this reason, the noise is expected to have a uniform spatial distribution over the entire area of interest that is the basic condition for the proposed method to work. The first map, (hereafter M1) was obtained at a frequency of 100 GHz (3 mm, ALMA band 3), whereas the second map (hereafter M2) was obtained at a frequency of 340 GHz (0.87 mm, ALMA band 7). Both maps, with a size of 256 × 256 pixels, are centered on the bright cluster galaxy (BCG) of PKS074519 (RA 07:47:31.3, Dec −19:17:39.94, J2000), which was the target of the observations we used (ALMA ID = 2012.1.00837.S, PI R. McNamara). The data reduction was carried out with CASA version 4.2.2 and the ALMA reduction scripts (McMullin et al. 2007). To produce the continuum maps, we selected and averaged only the channels free of the CO line emission, which was the original observation target. The images were reconstructed via the CASA task clean with Briggs weighting and a robust parameter of 1.5 in band 3 and 2 in band 7. The beam size is 1.9′′ × 1.4′′ in band 3, and 0.27′′× 0.19′′ in band 7.
In Figs. 10a and d a bright source is visible in the central position of both maps, which corresponds to the BCG. Figures 10e and b show that, when this is masked in M2 no additional sources become visible whereas a bright source appears in M1. This additional source has been identified as an IR source already known in the cluster, i.e., 2MASX J074730021917503 (RA 17:47:30.130, Dec –19:17:50.60, J2000). If this is also masked in M1 no additional sources are obviously visible in Fig. 10c. Since we are interested in the detection of point sources with amplitudes comparable to the noise level, the pixels in the masked areas are not used in the analysis.
Most of the structures visible in Figs. 10c and e are certainly not due to physical emission but to the noise. In Figs. 11a–b, the histograms H(x) of the pixel values of these maps, standardized to zero mean and unit variance, indicate that the noise is Gaussian but, as shown by the twopoint autocorrelation functions in Figs. 12a–b, not white.
As a first step, the two maps were independently analyzed. As already underlined by Vio & Andreani (2016), the use of the MF with these kinds of ALMA maps is unfeasible because the MF filters out the Fourier frequencies where the noise is predominant, preserving those where the searched signals give a greater contribution. However, the unresolved point sources and the blobshaped structures due to the noise have similar appearances and the process of filtering cannot work. In this case and the detection test becomes a thresholding test according to which a peak in the map can be attributed to a point source if it exceeds a given threshold. To apply the procedure introduced in Sect. 4 it is necessary to test the isotropy of the noise field. From Figs. 12a,b this condition appears to be approximately satisfied. The small differences between the autocorrelation functions along the vertical and horizontal directions is due to the elliptical shape of the ALMA beam.
We identified 328 peaks in M1 and 948 in M2. The iid condition for the peaks is supported by the twopoint correlation functions ρ_{p}(r) in Figs. 13a and b, which are almost completely contained in their 95% confidence band. We obtained the confidence band by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions computed from 1000 resampled sets of peaks with the same spatial coordinates as in the original map but whose values have been randomly permuted. The reliability of the iid condition is confirmed by Fig. 14, which shows the PDF of the largest peak value from a set of 256 × 256 pixels Gaussian random fields obtained by filtering 5000 discrete white noise maps by means of a Gaussian filter with dispersion of 3.7 and 2.7 pixels in such a way to approximately reproduce the noise in M1 and M2, respectively.
The maximum likelihood estimate (15)provides for both M1 and M2, which is a value that is typical of Gaussian twopoint autocorrelations functions. This is an expected result for ALMA given that, independently of the frequency, its PSF can be well approximated by bivariate Gaussian functions. The leastsquares fit of the twopoint correlation function ρ(r) in Figs. 12a,b with a Gaussian function supports this circumstance. As seen at the end of Sect. 4, this fact makes even more irrelevant the small anisotropy of noise background observed above. The PDFs ψ(z) are shown in Figs. 15c and d. The agreement with the respective histograms H(z) is good. A threshold u ≈ 3.98, corresponding to a PFA (10)equal to 10^{3}, provides two candidate point sources in M1 and four in M2. All these detections are highlighted in Figs. 15a and b. They are indexed with an increasing number according to the source amplitude.
Even though the PFA is identical for all the sources, their detection reliability is different. Indeed, in M1 the SPFA is 8.8 × 10^{2} and 2.0 × 10^{2} for the sources 1a and 2a, whereas in M2 it is 5.7 × 10^{1}, 5.0 × 10^{1}, 1.4 × 10^{1}, and 1.3 × 10^{1} for the sources 1b4b, respectively (see also Fig. 16). The Poissonbinomial distribution corresponding to these SPFAs indicates that for M1 the probability of N_{FD} = 0 is about 0.89, whereas for M2 the probability that N_{FD} ≤ 1 is 0.58 and becomes 0.92 for N_{FD} ≤ 2.
As a final comment, we note that if the standard PFA (9)based on the Gaussian PDF φ(z) had been used, the chosen detection threshold u should have produced 104 detections in M1 and 294 in M2.
Source fluxes computed on the original images.
Source 1a in M1 was identified as an already known object USNO B1.0 070710151219 (RA 07:47:30.817, Dec –19:17:18.48, J2000). On the map M2, the two peaks 1b and 3b look like a single extended source or two very close sources. However, the separation of the two peaks is only 0.022′′ × 0.04′′, which is smaller than the beam size and the two sources cannot be resolved. Therefore, these two peaks are probably due to a single extended source. Table 1 summarizes our results.
Fig. 10 a) Original ALMA map M1 is shown; b) map M1 with the brightest sources masked is shown; c) map M1 with both brightest sources masked is shown; d) original ALMA map M2 is shown; e) map M2 with the brightest sources masked is shown; f) map M3, obtained by composing M1 and M2 with the Feather algorithm implemented in CASA, is shown. 

Open with DEXTER 
Fig. 11 a) Histogram H(x) of the pixel values of the map M1, b) M2 and c) M3. For reference, the standard Gaussian PDF φ(x) is also plotted. 

Open with DEXTER 
Fig. 12 a) Autocorrelation function along the vertical and horizontal directions, twopoint correlation function, and the corresponding leastsquares fit with a Gaussian function of the map M1 are shown. The interpixel distance r is in pixels units; b)–c) the same for the maps M2 and M3. 

Open with DEXTER 
Fig. 13 a) Twopoint correlation function ρ_{p}(r) for the peaks in the map M1 is shown. The interpoint distance r is in pixels units. b)–c) The same for the maps M2 and M3. The two red lines define the 95% confidence band. These were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal but whose values were randomly permuted. 

Open with DEXTER 
Fig. 14 a)–b) Histograms vs. the PDFs g(z_{max}) of the largest peak value from 5000 Gaussian random fields obtained by filtering a 256 × 256 discrete white noise map by means of a Gaussian filter with dispersion of 3.7 and 2.7 pixels in such a way as to reproduce approximately the noise in M1 and M2, respectively. Since each simulated map is characterized by a different number of peaks, the corresponding PDFs are slightly different from one another. For this reason, the displayed g(z_{max}) plotted in green corresponds to the mean number of peaks which are 384 and 960, respectively. In panel b the resulting PDF is in reasonably good agreement with the corresponding histogram, but a better result is obtainable if an effective number is used. To stress the fact that, although is about 17% smaller than N_{p}, the resulting PDF is only slightly different. All this is in accordance with the arguments in Sect. 3. 

Open with DEXTER 
Fig. 15 a)–c) Detected sources with a probability of false detection smaller than 10^{3} on the maps M1, M2 and M3, respectively; e)–f) corresponding histograms H(z) and PDFs ψ(z) of the peaks are shown. For reference, the Gaussian PDF φ(z) is also plotted. 

Open with DEXTER 
Fig. 16 Graphical representation (redfilled areas) of the SPFA for the detections with PFA ≤ 10^{3} in M1, M2, and M3. 

Open with DEXTER 
Fig. 17 a) Original ATCA 500 × 500 pixels subimage, standardized to zero mean and unit variance, used for the detection of the point sources; b) same image with the brightest sources masked (blue circles) and the detected point sources are indicated with white circles and indexed with an increasing number according to the source amplitude. 

Open with DEXTER 
Fig. 18 a) Histogram H(x) of the pixel values of the map in Fig. 17b. For reference, the standard Gaussian PDF φ(x) is also plotted; b) histogram H(z) is shown of the values of the peaks in the map in Fig. 17b vs. the corresponding PDF ψ(z). 

Open with DEXTER 
Fig. 19 Autocorrelation function along the vertical and horizontal directions, twopoint correlation function, and the corresponding leastsquares fit with a Gaussian function of the ATCA map. The interpixel distance r is in pixels units. 

Open with DEXTER 
We computed the fluxes with the CASA statistics tool. Since our sources are not resolved, to estimate the flux we measured the integrated flux over the PSF, which is assumed to have a Gaussian shape. For each source, we assign a root mean square (RMS) value computed over the pixels around the source, which is a rough indication of the noise level. Results are presented in Table 2. It is impossible to compute an integrated flux over sources 1b+3b as a single extended source because the shape of the resulting source is neither circular nor elliptic. The fluxes measurement are therefore subject to a large uncertainty that depends on the chosen source shape.
The correlation coefficient between maps M1 and M2 is only 2.1 × 10^{2}, hence these maps can be considered statistically independent. Therefore, they can be combined in a map (hereafter M3) where the noise level is reduced, although this does not mean the improvement of the S/N for each point source. Such operation was carried out by the algorithm Feather implemented in CASA (McMullin et al. 2007). This algorithm performs a weighted addition in the Fourier domain of M1 and M2 in such a way as to obtain a map with the highest resolution among the two.
The zeromean unitvariance version of M3 is shown in Fig. 10f. Given the different areas covered by M1 and M2, M3 shows an area of size that is similar but slightly smaller than that of M2. The corresponding histogram H(x) of the pixel values is shown in Fig. 11c, whereas the twopoint correlation function is given in Fig. 12c. In this map 948 peaks are identified which, as shown by ρ_{p}(r) in Fig. 13c, can be assumed to be independent and identically distributed with good accuracy. These peaks provide a maximum likelihood estimate . However, only three of these peaks have a PFA smaller than 10^{3}. They are highlighted in Fig. 15f. The SPFA is equal to 3.2 × 10^{1}, 2.7 × 10^{1} and 4.0 × 10^{3} for the detections 1c3c, respectively. The corresponding probability to have N_{FD} ≤ 1 is about 0.90. As before, if the standard PFA (9)had been adopted, the number of detections should have been 326. Two additional sources are detected in M3, while the source 3c is the same source as 3b (or 1b+3b) detected in M2. The positions and size of the detected source in M3 are reported in the bottom rows of Table 1.
The outcome of our investigation implies the following: the sources detected in M2 are not visible in M1 because of the lower spatial resolution in band 3. Source 3c in M3 is the same as source 3b (1b+3b) in M2, while the sources 1c and 2c in M3 are undetected both in M1 and M2, even if they are visible in M2 at the same spatial position. Moreover, contrary to the PFA, the SPFA permits us to quantify the real risk of a detection claim. Although the PFA of the sources 1b2b in M2 is ≈8.9 × 10^{4} and ≈7.4 × 10^{4}, respectively, the SPFA indicates that these detections have a confidence level of 43% and 50% (see also Fig. 16).
7. Application to a real ATCA map
We apply the above method to quantify the detection reliability of point sources extracted from a 500 × 500 pixels map cropped from a radio image taken with the ATCA array toward the Large Magellanic Cloud at a frequency of 4.8 GHz by Dickel et al. (2005). This map is shown in Fig. 17a. The same map, standardized to zero mean and unit variance and with some bright sources masked, is shown in Fig. 17b.
The uvplane coverage of this instrument is less homogeneous than that of ALMA (see Fig. 2 in Dickel et al. 2005 for comparison with Fig. 9), hence the noise background is expected to be less uniform. This is also visible in the map itself, which shows artificial structures introduced by the gridding algorithm. In spite of this, as Fig. 18a shows, the histogram H(x) of the value of the pixels indicates that the noise is Gaussian. In the map there are 2580 peaks whose histogram H(z) is shown in Fig. 18b. The isotropy of the noise background is supported by Fig. 19, which compares the autocorrelation functions along the vertical and horizontal directions with the twopoint correlations function ρ(r). The twopoint correlation function ρ_{p}(r) of the peaks is shown in Fig. 20. A weak correlation, probably due to the weighting method applied to the map to fill the gaps in the uvplane coverage, is present only for small interpoint distances. Hence, also the iid condition for the peaks is approximately satisfied. The estimated is ≈1 with the corresponding PDFs ψ(z) shown in Fig. 18b. All these results, together with the good agreement of H(z) with ψ(z), indicate that in spite of the poorer uvplane coverage of the ATCA map the method can be applied to this map too.
Fig. 20 Twopoint correlation function ρ_{p}(r) for the peaks in the map in Fig. 17b. The interpoint distance r is in pixels units. The two red lines define the 95% confidence band. These were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal, but whose values were randomly permuted. 

Open with DEXTER 
With a threshold u set to ≈4.5, corresponding to a PFA equal to 1.25 × 10^{4}, 11 candidate point sources are detected with a SPFA of 0.20, 0.13, 0.10, 0.02, and 0.01, respectively, and the remaining with values well below 10^{3}. These value correspond to a small risk of false detection given that that probability of N_{FD} = 0 is about 0.6, which becomes 0.93 for N_{FD} ≤ 1. As a consequence, these peaks are good candidate for a followup observations. The sources #1, #3, #6, and #7 can be identified with the SSTSL2 sources reported by Mauch et al. (2003). The corresponding identifiers are SUMSS J051432685446, SSTSL2 J051230.46682802.0, SSTSL2 J051555.19690746.2, and SSTSL2 J051230.46682802.0.
8. Final remarks
In this paper we have reconsidered the procedure suggested by Vio & Andreani (2016) for the correct computation of the probability of false detection (PFA) of the matched filter (MF) to the case of weak signals with unknown position. In particular, we showed that although the PFA is useful for a preliminary selection of the candidate detections, it is not able to quantify the real risk in claiming a specific detection. For this reason we introduced a new quantity called the specific probability of false alarm, which can provide this kind of information.
We applied this procedure to two ALMA maps at two different frequencies and we highlighted the presence of 7 potential new point sources (2 in M1, 3 in M2, and 2 in M3). The same procedure applied to an ATCA map provided 11 potential point sources.
In Vio & Andreani (2016)Φ_{c}(.) is denoted as Φ(.).
In Vio & Andreani (2016)Ψ_{c}(.) is denoted as Ψ(.) and in Eqs. (24), (25) the function Φ(.) has to be intended as the Gaussian cumulative distribution function and not its complementary function as it erroneously appears.
We recall that the Poissonbinomial distribution is the probability distribution of the number of successes in a sequence of N independent experiments having only two possible outcomes (yes/no) with success probabilities p_{1},p_{2},...,p_{N}. When p_{1} = p_{2} = ... = p_{N}, it coincides with the binomial distribution.
Acknowledgments
This research has been supported by a ESO DGDF Grant 2014 and R.V. thanks ESO for hospitality. This paper makes use of the following ALMA data: ADS/JAO.ALMA#2012.1.00837.S. ALMA is a partnership of ESO (representing its member states), NSF (USA) and NINS (Japan), together with NRC (Canada) and NSC and ASIAA (Taiwan) and KASI (Republic of Korea), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, AUI/NRAO and NAOJ.
References
 Cheng, D., & Schwartzman, A. 2015a, Extremes, 18, 213 [CrossRef] (In the text)
 Cheng, D., & Schwartzman, A. 2015b, ArXiv eprints [arXiv:1503.01328] (In the text)
 Cheng, D., & Schwartzman, A. 2016, ArXiv eprints [arXiv:1511.06835] (In the text)
 Dickel, J., Gruendl, R., McIntyre, V., & Milne, D. K. 2005, AJ, 129, 790 [NASA ADS] [CrossRef] (In the text)
 ElSamie, F. E. A., Hadhoud, M. M., & ElKhamy, S. E. 2013, Image SuperResolution and Applications (London: CRC Press) (In the text)
 Galatsanos, N. P., Wernick, M. N., Katsaggelos, A. K., & Molina, R. 2005, in Handbook of Image & Video processing, ed. A. Bovik (New York: Academic Press), 203 (In the text)
 Hogg, R. V., McKean, J. W., & Craig, A. T. 2013, Introduction to Mathematical Statistics (New York: Pearson) (In the text)
 Hopkins, A. M., Miller, C. J., & Connolly, A. J. 2002, AJ, 123, 1086 [NASA ADS] [CrossRef] (In the text)
 Katsaggelos, A. K., Lay, K. T., & Galatsanos, N. P. 1993, IEEE Transaction On Image Processing, 2, 417 [NASA ADS] [CrossRef] (In the text)
 Kay, S. M. 1998, Fundamentals of Statistical Signal Processing: Detection Theory (London: Prentice Hall) (In the text)
 Jain, A. K. 1989, Fundamentals of Digital Image Processing (London: Prentice Hall) (In the text)
 Lagendijk, R. L, & Biemond, J. 1991, Iterative Identification and Restoration of Images (New York: Springer Science+ Business Media) (In the text)
 Levy, B.C. 2008, Principles of Signal Detection and Parameter Estimation (New York: Springer Science + Business Media) (In the text)
 LópezCaniego, M., Herranz, D., Barreiro, R. B., & Sanz, J. L. 2005a, MNRAS, 2005, 359, 993 (In the text)
 LópezCaniego, M., Herranz, D., Barreiro, R. B., & Sanz, J. L. 2005b, J. Applied Signal Processing, 15, 2426 [NASA ADS] (In the text)
 Macmillan, N. A., & Creelman, C. D. 2005, Detection Theory: a User’s Guide (Mahwah: Lawrence Erlbaum Associates) (In the text)
 Majumdar, S. N., & Comtet, A. 2005, J. Stat. Phys., 119, 777 [NASA ADS] [CrossRef] (In the text)
 Mauch, T., Murphy, T., Buttery, H. J., et al. 2003, MNRAS, 342, 1117 [NASA ADS] [CrossRef] (In the text)
 McMullin, J. P., Waters, B., Schiebel, D., Young, W., & Golap, K. 2007, Astronomical Data Analysis Software and Systems XVI, eds. R. A. Shaw, F. Hill, & D. J. Bell (San Francisco: ASP), ASP Conf. Ser., 376, 127 (In the text)
 Miller, J. C., Genovese, C., Nichol, R. C., et al. 2001, AJ, 122, 3492 [NASA ADS] [CrossRef] (In the text)
 Tuzlukov, V. P. 2001, Signal Detection Theory (New York: Springer Science + Business Media) (In the text)
 Vio, R., & Andreani, P. 2016, A&A, 589, A20 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Vio, R., Tenorio, L., & Wamsteker, W. 2002, A&A, 391, 789 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Vio, R., Nagy, J. G., Tenorio, L., et al. 2003, A&A, 401, 389 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Vio, R., Andreani, P., & Wamsteker, W. 2004, A&A, 414, 17 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Vogel, C. R. 2002, Computational Methods for Inverse Problems (Philadelphia: SIAM) (In the text)
All Tables
All Figures
Fig. 1 PDF g(z_{max}) of the greatest value of a finite sample of N_{p} = 10^{2},10^{3}, and 10^{4} identical and independently distributed random variable from the PDF ψ(z) of the peaks of a onedimensional zeromean unitvariance stationary Gaussian process field with κ = 1. The colorfilled areas provide the respective SPFA for a detection threshold u (dashed line) corresponding to a PFA (10)equal to 10^{4}. It is evident that a detection threshold independent of N_{p} is not able to quantify the risk of a false detection. 

Open with DEXTER  
In the text 
Fig. 2 PDF g(z_{max}) of the greatest value of a finite sample of N_{p} = 10^{2},10^{3}, and 10^{4} identical and independently distributed random variable from the PDF ψ(z) of the peaks of a twodimensional zeromean unitvariance isotropic Gaussian random field with κ = 1. The colorfilled areas provide the respective SPFA for a detection threshold u (dashed line) corresponding to a PFA (10)equal to 10^{4}. It is evident that a detection threshold independent of N_{p} is not able to quantify the risk of a false detection. 

Open with DEXTER  
In the text 
Fig. 3 a) Simulated 30 point sources on a map of 300 × 300 pixels; b) same as in panel (a) with the addition of a Gaussian whitetype noise. The amplitude of the sources is well below the noise level; c) zeromean unitvariance version of the noise map after the application of the MF; d) zeromean unitvariance version of the map obtained after the application of the MF to the map in panel (b). 

Open with DEXTER  
In the text 
Fig. 4 a) Histogram H(x) of the pixel values of the map in Fig. 3d. For reference, the standard Gaussian PDF φ(x) is also plotted; b) histogram H(z) of the values of the peaks in the same map vs. the corresponding PDF ψ(z) is shown. There is good agreement between φ(x) and ψ(x) with the corresponding histograms. 

Open with DEXTER  
In the text 
Fig. 5 Twopoint correlation function ρ_{p}(r) for the peaks in Fig. 3d. The interpoint distance r is in pixels units. The two red lines define the 95% confidence band. They were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal but whose values were randomly permuted. 

Open with DEXTER  
In the text 
Fig. 6 Detection results for the peaks in Fig. 3d. The black circles highlight the peaks with a PFA smaller than 2.55 × 10^{3}. Each of such peaks is indexed with an increasing number according to its amplitude. The black crosses highlight the true position of the point sources. 

Open with DEXTER  
In the text 
Fig. 7 a) Probability distribution function (PDF) of the number N_{FD} of false detections among the peaks #1 – #11 in Fig 6; b) cumulative distribution function (CDF) is shown corresponding to the PDF in the previous panel; c) PDF is shown of the number N_{FD} of false detection among the peaks #8 – #11 in Fig 6; d) CDF is shown corresponding to the PDF in the previous panel. 

Open with DEXTER  
In the text 
Fig. 8 Rank fraction vs. pvalues (left axis) and peak amplitudes (right axis) of the peaks in Fig. 3d (see text). Here the pvalues correspond to the PFA of the peaks. The dashed lines provide the detection threshold for the corresponding false discovery rate α_{FDR}. Any cross below a detection threshold corresponds to a detection. The detection thresholds are determined by the greatest pvalue below the corresponding continuous line (for details, see Miller et al. 2001). 

Open with DEXTER  
In the text 
Fig. 9 uvplane coverage of the ALMA band 3 observations of the radio source field PKS074519. 

Open with DEXTER  
In the text 
Fig. 10 a) Original ALMA map M1 is shown; b) map M1 with the brightest sources masked is shown; c) map M1 with both brightest sources masked is shown; d) original ALMA map M2 is shown; e) map M2 with the brightest sources masked is shown; f) map M3, obtained by composing M1 and M2 with the Feather algorithm implemented in CASA, is shown. 

Open with DEXTER  
In the text 
Fig. 11 a) Histogram H(x) of the pixel values of the map M1, b) M2 and c) M3. For reference, the standard Gaussian PDF φ(x) is also plotted. 

Open with DEXTER  
In the text 
Fig. 12 a) Autocorrelation function along the vertical and horizontal directions, twopoint correlation function, and the corresponding leastsquares fit with a Gaussian function of the map M1 are shown. The interpixel distance r is in pixels units; b)–c) the same for the maps M2 and M3. 

Open with DEXTER  
In the text 
Fig. 13 a) Twopoint correlation function ρ_{p}(r) for the peaks in the map M1 is shown. The interpoint distance r is in pixels units. b)–c) The same for the maps M2 and M3. The two red lines define the 95% confidence band. These were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal but whose values were randomly permuted. 

Open with DEXTER  
In the text 
Fig. 14 a)–b) Histograms vs. the PDFs g(z_{max}) of the largest peak value from 5000 Gaussian random fields obtained by filtering a 256 × 256 discrete white noise map by means of a Gaussian filter with dispersion of 3.7 and 2.7 pixels in such a way as to reproduce approximately the noise in M1 and M2, respectively. Since each simulated map is characterized by a different number of peaks, the corresponding PDFs are slightly different from one another. For this reason, the displayed g(z_{max}) plotted in green corresponds to the mean number of peaks which are 384 and 960, respectively. In panel b the resulting PDF is in reasonably good agreement with the corresponding histogram, but a better result is obtainable if an effective number is used. To stress the fact that, although is about 17% smaller than N_{p}, the resulting PDF is only slightly different. All this is in accordance with the arguments in Sect. 3. 

Open with DEXTER  
In the text 
Fig. 15 a)–c) Detected sources with a probability of false detection smaller than 10^{3} on the maps M1, M2 and M3, respectively; e)–f) corresponding histograms H(z) and PDFs ψ(z) of the peaks are shown. For reference, the Gaussian PDF φ(z) is also plotted. 

Open with DEXTER  
In the text 
Fig. 16 Graphical representation (redfilled areas) of the SPFA for the detections with PFA ≤ 10^{3} in M1, M2, and M3. 

Open with DEXTER  
In the text 
Fig. 17 a) Original ATCA 500 × 500 pixels subimage, standardized to zero mean and unit variance, used for the detection of the point sources; b) same image with the brightest sources masked (blue circles) and the detected point sources are indicated with white circles and indexed with an increasing number according to the source amplitude. 

Open with DEXTER  
In the text 
Fig. 18 a) Histogram H(x) of the pixel values of the map in Fig. 17b. For reference, the standard Gaussian PDF φ(x) is also plotted; b) histogram H(z) is shown of the values of the peaks in the map in Fig. 17b vs. the corresponding PDF ψ(z). 

Open with DEXTER  
In the text 
Fig. 19 Autocorrelation function along the vertical and horizontal directions, twopoint correlation function, and the corresponding leastsquares fit with a Gaussian function of the ATCA map. The interpixel distance r is in pixels units. 

Open with DEXTER  
In the text 
Fig. 20 Twopoint correlation function ρ_{p}(r) for the peaks in the map in Fig. 17b. The interpoint distance r is in pixels units. The two red lines define the 95% confidence band. These were obtained by means of a bootstrap method based on the 95% percentile envelopes of the twopoint correlation functions obtained from 1000 resampled sets of peaks with the same spatial coordinates as in the original signal, but whose values were randomly permuted. 

Open with DEXTER  
In the text 