EDP Sciences
Open Access
Issue: A&A, Volume 617, September 2018
Article Number: A92
Number of page(s): 12
Section: Cosmology (including clusters of galaxies)
DOI: https://doi.org/10.1051/0004-6361/201732119
Published online: 25 September 2018

© ESO 2018

Licence Creative Commons
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Clusters of galaxies are the most massive matter halos. They formed last in the history of the Universe by a hierarchical growth of structures in the Hubble expansion flow. Their presence, observed space density, and mass distributions confirm the standard cosmological model (e.g. Hasselfield et al. 2013; Mantz et al. 2014; Planck Collaboration XXIV 2016; de Haan et al. 2016), making galaxy clusters powerful probes of cosmological parameters, such as the dark energy content and its equation of state (e.g. Vikhlinin et al. 2009); see also Allen et al. (2011) for a review. The identification and study of the different components of galaxy clusters (dark matter halo, intra-cluster medium, galaxies, and relativistic particles) require the use of several different observational techniques. Among such techniques, X-ray observations stand out, since clusters of galaxies are the most luminous extended sources in the extra-galactic X-ray sky, and therefore are easily detectable in large surveys. The importance of galaxy clusters in a cosmological context has been realized since the pioneering surveys undertaken with the Einstein observatory (e.g. Forman & Jones 1982; Gioia et al. 1990), followed by studies with the ROSAT all-sky survey (e.g. Ebeling et al. 2000; Borgani et al. 2001; Böhringer et al. 2004, 2017; Henry et al. 2009; Klein et al. 2018). By simply counting the number of observed galaxy clusters one can confront cosmological model predictions and survey observations. However, it has been established that observational selection effects play a crucial role and must be controlled accurately when pursuing the goal of precision cosmology (e.g. Vikhlinin et al. 2009; Mantz et al. 2010b; Allen et al. 2011; Pacaud et al. 2016).

X-ray astronomy will enter a new era with the extended ROentgen Survey with an Imaging Telescope Array (eROSITA, Predehl 2017). This telescope is the primary instrument of the Russian/German Spektrum-Roentgen-Gamma (SRG) observatory, expected to be launched in 2019 (P. Predehl, priv. comm.). eROSITA will possess unprecedented sensitivity and imaging capabilities for extended source emission (Merloni et al. 2012), and allow the detection of approximately 10^5 galaxy clusters (Pillepich et al. 2012). In order to detect this huge number of galaxy clusters, eROSITA will scan the entire sky for four years, making it the second imaging X-ray all-sky survey ever made after ROSAT in the soft band (0.5–2 keV), and the first ever imaging survey in the hard band (2–8 keV). The promising capabilities of eROSITA raise great expectations for constraining dark matter and dark energy models through galaxy cluster science.

The derivation of a selection function for extended X-ray sources involves first their detection and then their classification as extended objects. Because extended objects are defined in contrast to point-like sources, this paper also focuses on the simulation and selection of point-like sources in the eROSITA All-Sky Survey (eRASS).

A reliable detection probability function of point-sources is crucial for assessing the completeness of samples, understanding the X-ray background, evaluating clustering studies, and so on. Given the simple morphology of point-sources, detection probabilities may rely on knowledge of the local exposure time and background levels in a given observation (e.g. Georgakakis et al. 2008). An alternative and common approach consists in simulating mock observations accounting for a range of instrumental and astrophysical effects. Although this method is more computationally demanding, it embraces the entire chain from light emission to source detection and cataloguing, and this is the approach adopted in this work.

As mentioned previously, a selection function for extended sources is a critical ingredient in almost all studies of the X-ray galaxy cluster population, including cosmological studies, scaling relation works (Stanek et al. 2006; Pacaud et al. 2007; Mantz et al. 2010a; Giodini et al. 2013; Lovisari et al. 2015; Andreon et al. 2016), and detailed studies of the evolution of the intra-cluster medium physics and chemistry (see Böhringer & Werner 2010 for a review). The morphological complexity and diversity of the X-ray cluster population makes it more difficult to accurately describe selection effects. Comparison between samples detected at different wavelengths (e.g. Wen et al. 2012; Rozo et al. 2014; Sadibekova et al. 2014; Nurgaliev et al. 2017) allows an understanding of potential selection biases, but does not a priori provide a truth table for source detection. Therefore, Monte-Carlo simulations play an essential role in understanding the entire process leading to a validated galaxy cluster catalog. Reducing the diversity of cluster shapes to a sensible and reduced set of parameters sets limits on the computational demand, and, importantly, allows for a link between theoretical (e.g. mass, redshift, etc.) and observational quantities. Cluster fluxes and apparent sizes are among the most relevant of these observables (Böhringer et al. 2000; Pacaud et al. 2006; Burenin et al. 2007).

Such synthetic simulations are not the only route to address the selection function of clusters and active galactic nuclei (AGNs). Numerical N-body and hydrodynamic simulations play an increasingly important role in this debate. Indeed, as they become more and more realistic in reproducing the observed sky at multiple wavelengths (e.g. Ragagnin et al. 2016), they offer invaluable support in the understanding of selection biases. However, their still large computational requirements limit their usage for statistical studies.

The aim of this work is to forecast and illustrate realistic selection functions for the eRASS cluster and point-source population. It relies on multiple realisations of selected areas of the eROSITA sky, with X-ray-emitting sources described by controlled parametric inputs. For instance, the galaxy cluster population is uniquely described by its apparent flux and size on the sky. We make a special effort to reproduce the main spectro-photometric features of the extragalactic point-source population (AGNs). For the first time, we process eRASS simulation fields with the eROSITA source-detection software (preliminary version). We derive realistic detection lists, similar to the real detection lists expected for scientific use. In particular, we explore thresholds needed to distinguish between spurious, point-like, and extended sources and provide, given a chosen set of cuts, a first series of selection functions for point-like and extended sources. We demonstrate their practical usability with a prediction of the distribution of galaxy clusters in the eROSITA sky by means of a forward-modelling approach.

This paper is constructed as follows. We first describe the built-in components of the simulations in Sect. 2, and then we describe the simulation engine at the core of the analysis in Sect. 3. We describe our selected simulation and instrumental setup, as well as our choice of fields in Sect. 4. In Sect. 5 we show the source detection results, in particular the selection functions. We discuss the impact of our important assumptions in Sect. 6 and bring perspectives in Sect. 7.

Throughout the paper and unless stated otherwise, we assume a Λ cold dark matter (CDM) cosmology with Ωm = 0.3, ΩΛ = 0.7 and H0 = 70 km s−1 Mpc−1.

2. Simulated components

This section presents the main expected components in a typical blank field of our simulations of the extragalactic eROSITA sky.

2.1. AGN and cosmic X-ray background

We attempt to accurately reproduce the observed distribution of spectro-photometric properties of X-ray-emitting AGNs. A list of spectra and positions, each corresponding to an individual source, is produced down to extremely low fluxes. The integration of the low-flux tail of the distribution provides a model for the unresolved X-ray background component up to the limit at which we simulate sources individually.

2.1.1. Spectral models

We rely on a custom implementation of the formalism by Gilli et al. (2007) to generate spectral models on a log-spaced grid of energies in the range [0.1, 100] keV using XSPEC v12.7.0u (Arnaud 1996). Parameters governing the spectral shape of a source are a power-law photon index, Γ, the absorbing column density, NH, the source redshift, z, and the (unabsorbed) luminosity, LX, of the object in a given rest-frame 2–10 keV band. A critical parameter governing the choice of spectral model is the intrinsic absorption NH. We call unobscured those sources with log10 NH < 21, Compton-thin those showing 21 < log10 NH < 24, Compton-thick mild those with 24 < log10 NH < 25, and Compton-thick heavy those that have log10 NH > 25. For a given obscuration class, two regimes are considered, Seyfert or QSO, depending on whether the 0.5–2 keV rest-frame luminosity of the source is lower or greater than LX = 10^46 erg s−1. We refer to Gilli et al. (2007) for details on the modelling of the spectral energy distribution (SED) for each of these classes. The energy range and level of detail in the SED were chosen to match the expected detector performance of eROSITA. Depending on source class, they include a (cut-off) power-law with index Γ, and a 6.4 keV iron line with various equivalent widths (Gilli et al. 1999), possibly modulated by a reflection component. Compton-thick mild sources have their cut-off power-law replaced by a more complex plcabs model (Yaqoob et al. 1997). The source is redshifted before applying an additional absorption by the Galaxy, which depends on the location of the source on the sky. Finally, the flux of a source is obtained by integration of its SED, accounting for the luminosity distance computed in our reference cosmology.
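The last step, converting a luminosity into a flux through the luminosity distance, can be illustrated with a minimal sketch. It assumes the flat ΛCDM reference cosmology quoted in the Introduction and deliberately ignores the band-pass dependence (K-correction and absorption), which the procedure above handles by integrating the redshifted SED. The function names are hypothetical and are not part of any eROSITA or XSPEC software.

```python
import numpy as np

# Reference cosmology quoted in the Introduction (flat LCDM).
H0 = 70.0             # km s^-1 Mpc^-1
OMEGA_M = 0.3
OMEGA_L = 0.7
C_KM_S = 299792.458   # speed of light in km s^-1
MPC_CM = 3.0856775814913673e24  # centimetres per megaparsec


def luminosity_distance_mpc(z, n_steps=10001):
    """Luminosity distance in Mpc, from a trapezoidal integral of 1/E(z)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)
    dz = zz[1] - zz[0]
    comoving = (C_KM_S / H0) * np.sum(0.5 * (inv_e[1:] + inv_e[:-1])) * dz
    return (1.0 + z) * comoving


def flux_from_luminosity(lum_erg_s, z):
    """Flux in erg s^-1 cm^-2, neglecting any K-correction (assumption)."""
    d_cm = luminosity_distance_mpc(z) * MPC_CM
    return lum_erg_s / (4.0 * np.pi * d_cm ** 2)
```

For instance, `flux_from_luminosity(1e44, 1.0)` returns a value close to 2 × 10−14 erg s−1 cm−2, which is of the order of the eRASS point-source sensitivity discussed in Sect. 5.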

2.1.2. Sampling the luminosity functions

Similarly to Gilli et al. (2007) we describe the luminosity function of unobscured AGN sources with the luminosity-dependent density evolution (LDDE) model of Hasinger et al. (2005). Obscured sources are sampled from the LDDE modulated by a multiplicative factor, ranging from four to one as the source intrinsic luminosity increases. Obscuration values are distributed following the prescription by Gilli et al. (2007), while power-law index parameters are drawn from a normal distribution of mean 〈Γ〉 = 1.9 and spread 0.2 regardless of the source obscuration level. Source luminosities range from 10^42 erg s−1 and redshifts span the 0 < z < 5 interval. After accounting for the cosmological volume, we compute the sky density n(Γ, NH, z, LX) (units deg−2) and random-sample this distribution in order to obtain a discrete list of sources. Figure 1 represents the density of one such source list in the luminosity-redshift plane. Each source is then assigned an SED as described in the previous section. Sky positions are uniformly distributed in a field, as we do not aim to accurately model the spatial distribution of sources in this work (see Paper II, Ramos-Ceja et al. for a more detailed treatment).
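Two ingredients of this sampling, the normal distribution of photon indices and the uniform sky positions, can be sketched as follows. This is an illustration only: the Poisson draw of the number of sources, the flat-sky square geometry, and the function and parameter names are assumptions, and the sampling of NH, z, and LX from the LDDE model is not reproduced.

```python
import numpy as np


def sample_source_list(density_per_deg2, field_side_deg, rng):
    """Draw a mock source list for a square, flat-sky field.

    density_per_deg2 stands in for the integral of n(Gamma, NH, z, LX);
    its value must come from the luminosity function model (not shown).
    """
    area_deg2 = field_side_deg ** 2
    # Assumption: the number of sources in the field is Poisson distributed.
    n_src = rng.poisson(density_per_deg2 * area_deg2)
    # Photon indices: normal distribution with mean 1.9 and spread 0.2.
    gamma = rng.normal(1.9, 0.2, size=n_src)
    # Positions: uniform over the field, with no spatial clustering.
    x_deg = rng.uniform(0.0, field_side_deg, size=n_src)
    y_deg = rng.uniform(0.0, field_side_deg, size=n_src)
    return {"gamma": gamma, "x_deg": x_deg, "y_deg": y_deg}
```

A call such as `sample_source_list(1.0e4, 3.6, np.random.default_rng(0))` uses an illustrative density, not a value taken from the model.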

Fig. 1.

Two-dimensional histogram distribution of simulated sources in one realisation of our X-ray AGN luminosity function sampling for a 22.7 deg2 area on the sky (253 297 sources in total). Each black contour encloses the fraction of sources indicated as a label. Each source is assigned one X-ray spectral model uniquely defined by the source luminosity, redshift, power-law index Γ and absorbing column density NH (Sect. 2.1).


We verified the validity of our sampling procedure by computing the flux distributions of the simulated sources in different bands. We compared our results to Gilli et al. (2007) and to published log N–log S: the agreement in the soft-band is excellent (see Fig. 2), while we predict twice as many heavily obscured sources (log10 NH > 24) in the 2–10 keV band in comparison to Gilli et al. (2007). We attribute this discrepancy for the rarest sources to our choices made in the high-energy modeling of the SED. This has practically no impact on this work which focuses on the soft-band characteristics of the eROSITA images.

Fig. 2.

Soft-band cumulative source number counts for one realisation of the X-ray AGN luminosity function sampling for a 22.7 deg2 area on the sky. Error bars are for each point. The parametrized log N–log S from Georgakakis et al. (2008) and Lehmer et al. (2012) are overplotted (lines) for comparison. Vertical dashed lines indicate the flux flim of the faintest source being simulated in each of the three fields (Sect. 4.3).


2.1.3. Constructing the unresolved X-ray background

The above-described procedure does not assume a lower limit on the flux of simulated sources. Sources well below the eROSITA detection limit are actually not simulated in order to save computation resources. A flux threshold flim is set depending on the exposure time of a simulated field (Sect. 4.3) and only sources with f > flim are individually simulated. The spectra of the remaining faint sources are stacked together and uniformly redistributed over a simulated patch of sky, thereby constituting one single “uniformly extended source” instead of many point-sources. By doing so, we ensure self-consistent and realistic modelling of the spectral emission of the X-ray background (XRB) generated by unresolved AGNs. As an illustration, the spectrum of the AGN background component in the equatorial field (flim = 3 × 10−15 erg s−1 cm−2) in the 0.5–2 keV band with galactic absorption is shown with a dashed line in Fig. 3. This figure also demonstrates the good agreement between the measurements of Lumb et al. (2002; derived from XMM-Newton observations with sources excised down to ∼10−14 erg s−1 cm−2 in the soft-band) and our unresolved XRB model with a similar flim.
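The split between individually simulated sources and the stacked background can be sketched as below. This is a schematic illustration, assuming that each source spectrum is stored as one row of an array sampled on a common energy grid; the function name and the normalisation by field area are assumptions, not the actual implementation.

```python
import numpy as np


def split_resolved_unresolved(fluxes, spectra, f_lim, area_deg2):
    """Separate sources above f_lim from the stacked faint population.

    fluxes  : (n_src,) band fluxes, in the same units as f_lim
    spectra : (n_src, n_energy) spectra on a common energy grid
    Returns the boolean mask of individually simulated sources and the
    stacked spectrum of the faint sources per unit sky area.
    """
    fluxes = np.asarray(fluxes, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    resolved = fluxes > f_lim
    # Faint sources are summed and spread uniformly over the field.
    background = spectra[~resolved].sum(axis=0) / area_deg2
    return resolved, background
```

Lowering `f_lim` moves flux from the uniform component to the list of individual sources, while the total emission of the field is conserved.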

Fig. 3.

Energy spectrum of the simulated cosmic X-ray background components. The thick black dashed and plain lines are obtained with our model for AGN sources by stacking individual spectra of faint sources below flim = 10−14 (plain) or 3 × 10−15 (dashed) erg s−1 cm−2. For comparison, the dot-dashed green line shows the model of Lumb et al. (2002) derived from XMM-Newton observations. Our emission model for the Galaxy (red and dotted lines) is described in the text.


2.2. Extended sources as β-models

Galaxy clusters are simulated in the simplest way using spherically symmetric β-models (Cavaliere & Fusco-Femiano 1978) with different fluxes and core radius values and β = 2/3. Our goal is indeed to derive selection functions that depend on a limited number of parameters. Sources representing galaxy clusters are randomly distributed across a simulated field, with a density of around 2 per deg2. Their spectral emission is rendered by an isothermal APEC model with 0.3 Z⊙ abundance, at temperature T ∈ {1, 5} keV and redshift z ∈ {0.3, 0.8}. Clusters have 0.5–2 keV fluxes chosen among discrete values ranging between 2 × 10−15 and 5 × 10−13 erg s−1 cm−2; core radii are also picked among discrete values ranging between 10 and 80 arcsec. The redshift and temperature of the spectral models have practically no impact on the 0.5–2 keV energy conversion factors transforming fluxes into count-rates, and therefore have no impact on the 0.5–2 keV detection tests, which are the core of this study.
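For reference, the projected β-model surface brightness is S(r) ∝ [1 + (r/rc)²]^(−3β + 1/2), which reduces to an exponent of −3/2 for β = 2/3. In that case the total flux converges and the fraction enclosed within a projected radius R has the closed form 1 − [1 + (R/rc)²]^(−1/2). The short sketch below evaluates both expressions; the function names are hypothetical.

```python
import numpy as np


def beta_profile(r, r_core, beta=2.0 / 3.0, s0=1.0):
    """Projected beta-model surface brightness at radius r."""
    return s0 * (1.0 + (r / r_core) ** 2) ** (-3.0 * beta + 0.5)


def enclosed_flux_fraction(radius, r_core):
    """Flux fraction within a projected radius, valid for beta = 2/3 only."""
    return 1.0 - 1.0 / np.sqrt(1.0 + (radius / r_core) ** 2)
```

Half of the flux is enclosed within R = √3 rc, that is, about 17 arcsec for the smallest simulated core radius (10 arcsec) and about 139 arcsec for the largest (80 arcsec).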

2.3. Particle and galactic background components

In addition to the X-ray background originating from unresolved AGNs in the field, two other main background components were added to our set of simulations. The contribution of unresolved galaxy clusters and groups to the eROSITA soft-band background is neglected, since it is a small component in the energy and sensitivity regimes relevant to this study (e.g. Gilli et al. 1999, 2007; Kolodzig et al. 2017).

Following Lumb et al. (2002), the emission of the Galaxy is modelled with a double MEKAL model of temperatures 0.21 and 0.081 keV and solar abundance, representing the emission of the hot plasma located in the Galactic disk and halo. We assume a local photo-absorbing column density equivalent to that of the field under consideration. We neglect here any spatially dependent contribution to the Galactic background such as emission from the Hot Local Bubble.

Particle background is sampled from a list of events drawn from a GEANT4 simulation designed to reproduce the expected radiation environment at the Lagrange point L2 (Tenzer et al. 2010). We assume this background component is not focused by the telescope mirror systems, and therefore it is not vignetted and impacts the detectors uniformly. Soft proton flares can create rapid enhancement of the level of unvignetted background. However, we limit our present study to the case of nominal particle background level and defer the analysis of the flare-induced background to further work. Therefore, the exposure assumptions in this work are on the optimistic side.

3. The eROSITA simulation engine

The simulations presented in this paper result in realistic eROSITA-calibrated event lists, similar to those expected to be delivered by the eROSITA ground segment. Such event lists contain the arrival time and CCD coordinates of the incoming events (photons or particles), as well as a reconstruction of their sky location and absolute energy. We reconstruct these characteristics assuming perfect knowledge of the calibration and spacecraft attitude. We make use of the Monte-Carlo simulator SIXTE. This simulator virtually implements a realistic transfer function converting sky photons into detector events, accurately accounting for CCD characteristics (including response functions and clocking) and telescope mirror behaviour. In order to save computation time, some parts of the telescope+instrument transfer function are modelled statistically, thus deviating from a pure ray-tracing simulator. These simplifications show notably at the mirror (point-spread and vignetting functions) and the CCD (response function) stages. We refer to Schmid (2012) for a detailed description of SIXTE and its implementation in the context of eROSITA.

The detectors were simulated assuming an integration time of 50 ms and a finite readout time of the 384 CCD lines (pile-up effects are not relevant in this work). Response matrices are taken from rescaled EPIC-pn response matrices; those are of sufficient accuracy here, as we are focusing on broad-band properties. The field-of-view of each of the seven detectors is circular with a diameter of 1.02 deg, corresponding to the extent of the 384 × 384 pixel cameras with pixel size 9.6″.

4. Instrumental and observational setup

4.1. Exposure maps and attitude files

A simple scanning strategy for the four-year survey is assumed in this work, with the spacecraft scanning axis always pointing towards the Sun. The actual spacecraft law will be subject to subtle changes in the scanning pattern in order to fulfill angular constraints linked to, for example, the solar panels or stray-light requirements. Those ultimately lead to less uniform all-sky exposure maps, as discussed in Merloni et al. (2012). Since the present paper focuses on small patches of sky sufficiently far away from the ecliptic poles, these differences are neglected. Extrapolation of our results to the all-sky survey needs, in principle, a proper treatment of these exposure variations. The corresponding attitude files describing the coordinates of the scanning axis in steps of 60 s serve as input to the simulator. We assumed no gaps or jumps over the full duration of the survey, as well as ideal reconstruction of the attitude from the on-board star trackers.

4.2. Point-spread function and vignetting

During the simulation procedure, photons originating from a source at infinite distance are redistributed using synthetic point-spread functions simulated with a ray-tracing procedure (P. Friedrich, priv. comm.). This accurately reproduces an eROSITA ideal mirror system made of 54 nested shells (Wolter-I configuration), including the spokes and the presence of an X-ray baffle. Such simulations were performed assuming a focal length of 1.6 m and a 0.4 mm intra-focal shift of the detector relative to the best on-axis focal point. This small shift was found to optimize the overall survey PSF size, at the cost of degrading the on-axis PSF. We note that the actual point-spread function will be measured on the sky when the instrument operates and will be compared to ray-tracing simulations and ground measurements (e.g. as done at the PANTER facility).

The PSF we used is described as a tabulated series of images in steps of 1′ off-axis angles ranging from 0 to 30′ and for energies E = {1,2,3,4,7} keV (see Fig. 5). We assume constant PSF shape as a function of azimuthal angle, as we consider only axial rotation, as is usual with the Wolter-I telescope symmetry. Because it is counting photons individually, the ray-tracing simulation additionally provides an estimate of the vignetting factor on a grid of energies and off-axis angles. It is used to compute the ratio of flux between double-reflected photons and all photons emitted by a source located at a given off-axis angle, and usually expressed relative to the on-axis position. Figure 6 shows the combined effect of vignetting and PSF distortion on a bright point source passing about 50 times through the eROSITA field-of-view during the four-year scan duration.

Fig. 5.

Ray-tracing simulated telescope point-spread function used in this paper. The images show the response of one eROSITA mirror module to a point-source at different incoming photon energies (from top to bottom: 1, 3 and 7 keV) and different angular distances from the optical axis (from left to right: on-axis, 15′, 25′). The colour scale in each panel is linear and encompasses the tenfold increase between the minimal (light red) and maximal (black) intensity, thereby emphasising the typical shape distortions due to Wolter optics.


Fig. 6.

Top panel: simulation of a bright point-source with flux 10−11 erg s−1 cm−2 in a four-year eROSITA equatorial region (∼2 ks exposure time). The image shows the sky projection of the 0.5–2 keV source events collected by the seven CCDs, binned with 4″ pixels. Left: “Survey PSF”, including all events. Middle: selecting only low off-axis events (θ < 16.5′, 40% of the total number of events). Right: selecting only large off-axis events (θ > 16.5′, 60%). The circle has a radius of 30″, slightly larger than the half-energy width of the survey PSF. Bottom panel: corresponding radial profiles in 4″ bins (error bars are only shown for the top curve).


4.3. Simulated fields

We selected three fields at specific locations in the eROSITA sky (see Fig. 4). A field corresponds to an elementary region of the eROSITA sky tiling pattern, and shows as a 3.6 deg × 3.6 deg square in tangential projection. In the following we name these fields: Equatorial (∼2 ks exposure time, uniform), Intermediate (∼4 ks, less uniform), and Deep (∼10 ks, larger exposure gradient). Table 1 provides key parameters relevant to these simulated fields.

Fig. 4.

Simulated eROSITA all-sky four-year exposure map in equatorial coordinates used in this work, with the location of the three relevant simulated fields: equatorial (∼2 ks exposure time, uniform), intermediate (∼4 ks, slight gradient), and deep (∼10 ks, larger gradient). The colour bar (logarithmic scale) is in units of seconds.


Table 1.

Global parameters for the three types of fields simulated in this study.

4.4. Images of the synthetic fields

Figure 7 shows a 15′ × 15′ excerpt of the images created out of the simulated event lists of blank fields, that is, fields without galaxy clusters. The distribution of point-like sources is uniform over the sky: any slight apparent gradient in source concentration is an effect of varying exposure times across the fields. The increase in sensitivity clearly makes more sources visible by eye; this figure also outlines the excellent angular resolution of eROSITA, well-adapted to beat confusion effects over most of the survey area, even in deep fields.

Fig. 7.

Zoom over three simulated eROSITA extragalactic survey fields (equatorial, intermediate and deep, from left to right) in the 0.5–2 keV band, free from galaxy clusters (i.e. containing only backgrounds and AGNs as point-sources). North is up and east left; each dashed square has sides of 15′ in length. The blue circles have a radius of 0.5′ and show the position of the detected sources. The pixel scale is 4″ and identical grey scales are applied to each image to emphasise the differences in sensitivity.


5. Results

5.1. Source detection and characterisation

The source detection and characterisation procedure used in this work is a preliminary version of the source detection tool in the eROSITA Science Analysis Software System (eSASS) package. It builds upon the source detection algorithm used in the XMM-Newton Science Analysis System (XMM-SAS) with several revisions and upgrades. The detection procedure is based on the sliding-cell method. As a first step, this algorithm scans an X-ray image with a sliding square box, and if the signal-to-noise ratio (S/N) in the box is greater than a specified threshold value it is marked as a source candidate. The signal is calculated from the pixel values inside the cell, and the background is estimated from the neighbouring pixels. Subsequently, the candidate objects are removed from the image creating a source-free image which is interpolated by a spline function to create a smooth background map. The algorithm convolves the input image with a 9 × 9 pixel (36 × 36 arcsec) kernel described by a β = 2/3-profile with rc = 15 arcsec, which roughly matches the survey PSF. The convolved image and the corresponding background map are then used to calculate an S/N map, in which the significant peaks are the positions of the detected sources. In order to increase the sensitivity for large extended sources, this procedure is repeated for 2 × 2 and 4 × 4 rebinned images corresponding to kernels with rc = 30 and rc = 60 arcsec, respectively.
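The filtering stage of this procedure can be sketched with a toy example. The code below is not the eSASS implementation: it builds a 9 × 9 kernel following a β = 2/3 profile with rc = 15 arcsec on 4 arcsec pixels, filters a count image and a background map, and forms an S/N map. The noise estimate (Poisson variance of the background propagated through the kernel), the edge padding, and all function names are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

PIXEL_ARCSEC = 4.0  # image pixel scale quoted in the text


def beta_kernel(size=9, r_core_arcsec=15.0, beta=2.0 / 3.0):
    """Normalised square kernel following a beta-profile."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1] * PIXEL_ARCSEC
    k = (1.0 + (x ** 2 + y ** 2) / r_core_arcsec ** 2) ** (-3.0 * beta + 0.5)
    return k / k.sum()


def filter2d(image, kernel):
    """Filter an image with a symmetric kernel (edge values replicated)."""
    half = kernel.shape[0] // 2
    padded = np.pad(image, half, mode="edge")
    windows = sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def snr_map(counts, background, kernel):
    """Assumed S/N definition: filtered excess over propagated Poisson noise."""
    signal = filter2d(counts, kernel) - filter2d(background, kernel)
    noise = np.sqrt(filter2d(background, kernel ** 2))
    return signal / noise


def rebin(image, factor):
    """Sum pixels in factor x factor blocks (extra edge rows/columns dropped)."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.sum(axis=(1, 3))
```

Applying the same pixel-space kernel to `rebin(image, 2)` or `rebin(image, 4)` corresponds to the larger effective core radii of 30 and 60 arcsec mentioned above.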

Each source candidate identified by the sliding cell algorithm is further analysed by a maximum likelihood fitting method. This technique compares the spatial distribution of the input sources with a PSF convolved with a source extent model (β-profile). The final log-likelihood is calculated by varying the input source parameters, i.e. position, counts, and extent. A multi-PSF fit is also implemented which helps in deblending and reconstructing the parameters of close-by sources. In the output list, only sources with a log-likelihood above a given threshold are kept.

Among the output parameters of the maximum likelihood fit, those of interest are: i) detection log-likelihood, which gives the significance of the detection; ii) extent, which is the apparent extension of the best fitting β-model in pixel units; and iii) extension log-likelihood, which compares the significance of the extended model and the point-like model. This last parameter classifies the detected sources as point-like (value equal to zero) or as extended (value greater than zero).

Given that the PSF fitting of the maximum likelihood fitting method is more sensitive to the core of the PSF when on- and off-axis photons are separated, two images from the same simulation and covering the same sky region are produced with photons chosen according to their position on the FoV. The photons are split into inner photons (<16.5′) and outer photons (>16.5′). In this way, the source detection pipeline runs simultaneously over two images (see Fig. 6).

All simulated images were analysed with the method described above. The detected sources were cross-identified with the simulation inputs using a matching radius of 28 arcsec for point-like sources and 80 arcsec for extended ones.
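The cross-identification step can be illustrated by a nearest-neighbour match within a fixed radius. The sketch below assumes flat-sky coordinates already expressed in arcsec, which is adequate for small fields, and does not enforce a one-to-one association; the exact matching rules of the actual analysis are not described here, so this is only one plausible reading. The function name is hypothetical.

```python
import numpy as np


def cross_match(det_xy, inp_xy, radius_arcsec):
    """Index of the nearest input source for each detection, or -1 if none
    lies within radius_arcsec. Coordinates are flat-sky (x, y) in arcsec."""
    det_xy = np.asarray(det_xy, dtype=float).reshape(-1, 2)
    inp_xy = np.asarray(inp_xy, dtype=float).reshape(-1, 2)
    match = np.full(len(det_xy), -1, dtype=int)
    if len(inp_xy) == 0:
        return match
    # Pairwise separations between detections (rows) and inputs (columns).
    sep = np.linalg.norm(det_xy[:, None, :] - inp_xy[None, :, :], axis=2)
    nearest = sep.argmin(axis=1)
    within = sep[np.arange(len(det_xy)), nearest] <= radius_arcsec
    match[within] = nearest[within]
    return match
```

With the radii quoted above, the function would be called with `radius_arcsec=28.0` for point-like inputs and `radius_arcsec=80.0` for extended ones; detections left at -1 are candidates for the spurious category.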

5.2. Source classification

A trade-off between sample completeness and contamination is inevitable when the source selection function in surveys is estimated. Following a methodology introduced in Pacaud et al. (2006), we explore the output parameter space of the maximum likelihood fitting method by means of our simulations in order to set point-like and extended source classification criteria and to estimate their contamination by spurious and misclassified sources. We define spurious detections as those that cannot be identified with any input source within the search radius, and misclassified sources as those point-sources classified by the pipeline as extended sources or vice versa. We define false detections as a single concept that includes spurious and misclassified detections.
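These definitions translate into a simple labelling rule, sketched below. The detected class follows the extension log-likelihood convention of Sect. 5.1 (zero for point-like, positive for extended). The function name, the string labels, and the representation of an unmatched detection by `None` are illustrative choices.

```python
def label_detection(matched_input_type, ext_like):
    """Return (detected_class, status) for one detection.

    matched_input_type : "point", "extended", or None when no input
                         source lies within the search radius
    ext_like           : extension log-likelihood of the detection
    """
    detected_class = "extended" if ext_like > 0 else "point"
    if matched_input_type is None:
        return detected_class, "spurious"
    if matched_input_type != detected_class:
        return detected_class, "misclassified"
    return detected_class, "genuine"


def false_detection_fraction(labels):
    """Fraction of detections that are spurious or misclassified."""
    if not labels:
        return 0.0
    n_false = sum(1 for _, status in labels if status != "genuine")
    return n_false / len(labels)
```

False detections, as defined above, are then the detections whose status is either spurious or misclassified.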

5.2.1. Point-source selection functions

AGNs represent the dominant extra-galactic population at X-ray wavelengths. Although the goal of this work is to determine the galaxy cluster selection function, the estimation of the point-like detection efficiency and its contamination helps to control the systematics in the detection and characterisation of the extended source population.

First, we restrict ourselves to estimating the false detection rate based on the blank field simulations, that is, with point-like sources plus background only. We simulate each field 30 times. We find that a simple threshold in the source detection log-likelihood parameter removes most of the false point-like sources while maintaining a good detection efficiency. We choose a threshold value of 10, obtaining ∼0.1, ∼0.2, and ∼1.1 spurious sources per deg2 for the equatorial, intermediate, and deep fields, respectively. Such false detection numbers correspond to ∼0.1%, ∼0.2%, and ∼0.3% of the average detected sources per deg2 in their respective fields.

The resulting AGN detection efficiency as a function of input flux is shown in the top panel of Fig. 8. This efficiency is obtained by calculating the ratio of the cross-identified objects to the input sources. The displayed error is given by the standard deviation over the 30 simulations of each simulated field. For the equatorial field, the point-like sources have a 90% completeness at a flux limit of ∼1.7 × 10−14 erg s−1 cm−2, while for the intermediate field this flux limit is ∼9.7 × 10−15 erg s−1 cm−2, and for the deep field it is ∼6.5 × 10−15 erg s−1 cm−2. The 50% completeness is reached at ∼1.0 × 10−14 erg s−1 cm−2 for the equatorial field, ∼5.2 × 10−15 erg s−1 cm−2 for the intermediate field, and ∼3.1 × 10−15 erg s−1 cm−2 for the deep field. The large error bars in bright sources reflect mainly their lower number density, which is given by the AGN log N– log S distribution.
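The efficiency estimate and its error can be written compactly. The sketch below computes, for one realisation, the ratio of cross-identified to input sources in bins of input flux, and then the mean and standard deviation over realisations. The binning and the function names are assumptions, and bins without input sources are returned as NaN.

```python
import numpy as np


def detection_efficiency(input_fluxes, detected_mask, bin_edges):
    """Per-bin ratio of cross-identified to input sources."""
    input_fluxes = np.asarray(input_fluxes, dtype=float)
    detected_mask = np.asarray(detected_mask, dtype=bool)
    n_in, _ = np.histogram(input_fluxes, bins=bin_edges)
    n_det, _ = np.histogram(input_fluxes[detected_mask], bins=bin_edges)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(n_in > 0, n_det / n_in, np.nan)


def mean_and_scatter(efficiencies):
    """Mean and standard deviation over realisations (one row each)."""
    eff = np.asarray(efficiencies, dtype=float)
    return np.nanmean(eff, axis=0), np.nanstd(eff, axis=0)
```

Because bright sources are rare, their bins contain few objects per realisation, which is why the scatter grows at the bright end.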

Fig. 8.

Point-like source completeness analysis for all three simulated sky regions: Equatorial (red diamonds), Intermediate (green circles) and Deep (blue squares). The abscissa is the input source flux. Top panel: point-like detection efficiency. Discontinuous lines represent the parametrized models described in Appendix A. Middle panel: differential number counts. Bottom panel: integral number of point-like sources. In the middle and bottom panels, the solid line shows the input distribution. The error is given by the standard deviation over the simulations.


5.2.2. Cluster selection functions

The extended source classification is a complicated task since it must deal not only with spurious detections but also with misclassified point-like sources, that is, point-like sources characterised as extended. Moreover, extended sources usually have a low surface brightness, making their detection and characterisation difficult. Our goal is to find a region of the detection/characterisation parameter space where the majority of the simulated extended sources are recovered while keeping the contamination at a reasonable level. This is of special importance given that the goal of eROSITA is to use galaxy cluster counts to constrain the dark energy. We recall that, in contrast with the AGN population, which was simulated following a log N–log S distribution, sources representing galaxy clusters are randomly distributed across the simulated fields with a density of around 2 per deg² (see Sect. 2).

Besides the source detection log-likelihood values stated in the previous section, we scanned the source extent–extension log-likelihood parameter space to look for criteria that allow us to obtain a large and uncontaminated extended source sample while maintaining a high detection rate. For this, we use cluster fields, that is, simulations that contain X-ray background, point-like sources, and extended sources. Figure 9 shows the final selection process in the extent–extension log-likelihood plane for the Equatorial (top), Intermediate (middle), and Deep (bottom) fields.

Fig. 9.

Final selection criteria for extended sources, from the preliminary version of the eSASS pipeline, with optimal (low-contamination) parameters. The extent–extension log-likelihood plane is shown for the three simulated sky fields: equatorial (top), intermediate (middle), and deep (bottom). Left panels: simulated (and detected) clusters are displayed as green dots, spurious extended detections as red triangles, and AGNs classified as extended sources in blue squares. Middle panels: only input detected galaxy clusters are displayed (green diamonds in the left panels). The distinct colours show the different simulated core radii (in arcsec). Right panels: only input detected galaxy clusters are displayed. The different colours show the distinct simulated input fluxes (in units of erg s⁻¹ cm⁻²).

We specify that the maximum extent value that the algorithm should assign to a source is 30 pixels (120 arcsec), even if the algorithm drifts towards a larger value. The minimum requested extent value is 1.5 pixels (6 arcsec), and the threshold of the extension log-likelihood is 6. These thresholds ensure a low contamination by spurious sources, but the number of misclassified point-like sources varies in the different fields. For the equatorial field we obtain ∼0.5 false extended sources per deg². In the intermediate field we have ∼1.4 false extended sources per deg², and for the deep field we obtain ∼8.5 false extended sources per deg².
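The thresholds quoted above can be summarised in a small classification sketch. It is illustrative only and is not the eSASS implementation: the `Detection` container and `classify` function are hypothetical names, and treating the thresholds as inclusive lower bounds is an assumption.

```python
from dataclasses import dataclass

DET_LIKE_MIN = 10.0   # source detection log-likelihood threshold
EXT_LIKE_MIN = 6.0    # extension log-likelihood threshold
EXT_MIN_PIX = 1.5     # minimum extent, pixels (6 arcsec)
EXT_MAX_PIX = 30.0    # maximum extent, pixels (120 arcsec)
PIXEL_ARCSEC = 4.0    # pixel scale implied by 30 pixels = 120 arcsec


@dataclass
class Detection:
    det_like: float   # detection log-likelihood
    ext_pix: float    # fitted extent in pixels
    ext_like: float   # extension log-likelihood


def classify(d: Detection) -> str:
    """Return 'rejected', 'extended' or 'point-like' for one detection."""
    if d.det_like < DET_LIKE_MIN:
        return "rejected"
    extent = min(d.ext_pix, EXT_MAX_PIX)  # extent is capped at the maximum
    # Assumption: both extent and extension likelihood must pass (inclusive).
    if extent >= EXT_MIN_PIX and d.ext_like >= EXT_LIKE_MIN:
        return "extended"
    return "point-like"
```

Raising `EXT_LIKE_MIN` in such a scheme trades contamination for completeness, which is the compromise visible in Fig. 9.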

Table 2 shows in detail the fraction of spurious and misclassified sources in each simulated field. It is worth mentioning that similar numbers of spurious and misclassified sources are found in both the blank and cluster fields when using the same thresholds. In Sect. 6.2 we forecast the number of expected clusters assuming a survey with a depth equal to the Equatorial field all over the sky. We expect to detect ∼5.2 clusters per deg² plus 10% contamination from our false sources.

The middle and right panels of Fig. 9 show the extended sources colour-coded according to the input core radius and flux values, respectively. The middle panels display the distribution of the discrete values used for the core radius of the simulated clusters (see Sect. 2), while the right panels show that it is mainly high-flux sources that end up within the region of the plane defined by the selection criteria.

As seen in Fig. 9, one could apply more stringent criteria to obtain a non-contaminated cluster sample, for example by increasing the minimum value of the extension log-likelihood, but this would exclude a considerable number of extended sources, especially the faintest ones.

The normalized detection probabilities of extended sources for the three simulated fields are presented in Fig. 10, as a function of the input flux. In these plots, a detection efficiency equal to 1 means that 100% of the simulated sources have been detected and classified as extended. As expected, the deeper the observation, the fainter the recovered extended sources. Figure 11 shows the mean detection probability of extended sources as a function of input flux and input core radius. Similar to other works (e.g. Vikhlinin et al. 1998; Pacaud et al. 2006; Clerc et al. 2012a), we also found that the extended source detection efficiency is not a function of source flux only, especially for the shallower observations.

Fig. 10.

Extended source detection efficiency from the eSASS pipeline in the Equatorial (∼2 ks exposure, top), Intermediate (∼4 ks, middle) and Deep (∼10 ks, bottom) simulated fields as a function of input flux and for each simulated core radius value.

Fig. 11.

Extended source detection efficiency of the eSASS pipeline in the Equatorial (∼2 ks exposure, top), Intermediate (∼4 ks, middle) and Deep (∼10 ks, bottom) simulated fields as a function of input flux and core radius.

Table 2.

Number of spurious and misclassified extended sources (galaxy clusters) and point-like sources (AGN) in the cluster field simulations on the equatorial, intermediate, and deep fields.

6. Discussion

6.1. Effect of source classification criteria

One could argue that the number of false extended source detections, that is, spurious and misclassified detections, found in the different simulated fields (see Table 2) is high considering that eROSITA will perform an all-sky survey. However, most of the false extended detections are misclassified point-sources. Such sources might be close pairs of point-sources that cannot be disentangled by the detection algorithm and were therefore classified as a single extended source. One way to reduce the number of misclassified sources is a complete follow-up of the detected extended sources. Another way is to apply stricter thresholds in the source classification criteria, for instance by increasing the extent and extension log-likelihood thresholds (see Sect. 5.2 and Fig. 9). For example, using a threshold value of 20 in extension log-likelihood reduces the number of misclassified point-like sources in the three fields by ≳95%. Although such an approach gives a cleaner sample, many real extended sources are missed.

6.2. Relevance for cosmological forecasts

Uncertainties in the selection function of a sample of clusters can introduce biases to the cosmological constraints which are determined from them. In this section, we discuss the impact that incomplete knowledge of the selection has on the recovered cosmological constraints. For this test, we follow the methodology of Clerc et al. (2012a) and use the z-CR-HR method. We assume that the selection has eliminated all spurious clusters and misclassified AGNs.

6.2.1. The z-CR-HR method

The z-CR-HR method is based on the premise that the raw X-ray data of a galaxy cluster contain significant information about its redshift, luminosity, and temperature and that this information can be statistically extracted. The cosmological analysis is then simplified by basing it on only the cluster redshift and quantities that are directly observable in X-rays, namely the count-rate in the 0.5–2 keV band (CR) and the hardness ratio (HR), which is the ratio of the count-rates measured in the 1–2 keV and 0.5–1 keV bands. A particular advantage of this method is that it bypasses the need to derive individual cluster masses, X-ray luminosities, and temperatures and that the scaling relations between mass and its X-ray proxies can be constrained simultaneously with the cosmological parameters. The key steps in this procedure are as follows:

  • Compute the halo mass function.

  • Derive the 3D distributions of temperature, luminosity, and core radius using the M–T, L–T, and M–rc scaling relations, taking the relevant scatters into account.

  • Apply an instrumental model for eROSITA to obtain a theoretical distribution of clusters in the CR-HR plane for each slice in the redshift space.

  • Apply the selection function to obtain a synthetic observed distribution of clusters that one would expect eROSITA to detect (here the equatorial selection for nominal thresholds, Fig. 10, top).

  • Apply an error model to account for measurement errors of CR and HR.

6.2.2. Simulated eROSITA z-CR-HR catalogues

After following the procedure described in the previous section and with the unconvolved, error-free z-CR-HR distribution in hand, we randomly sampled the CR-HR plane for each redshift slice to obtain a catalogue of mock clusters, each with a redshift, count-rate, and hardness ratio. Gaussian random errors of 10% and 20% for CR and HR, respectively, are then added to each cluster in the mock catalogue (see footnote 3). Once the errors have been added, the catalogue is cut with the selection criteria. For this analysis, we apply cuts in CR in [0.002, 1] cts s⁻¹ (roughly corresponding to [0.28, 140] × 10⁻¹⁴ erg s⁻¹ cm⁻²) and in HR in [0.02, 2.0].
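The error and cut steps can be sketched as below. This is a minimal illustration under stated assumptions: the function name and tuple layout are hypothetical, and the 10% and 20% errors are interpreted as fractional Gaussian standard deviations on the true CR and HR, which the text does not state explicitly.

```python
import random

CR_RANGE = (0.002, 1.0)   # count-rate cut, cts/s
HR_RANGE = (0.02, 2.0)    # hardness-ratio cut


def perturb_and_select(clusters, rng, cr_frac_err=0.10, hr_frac_err=0.20):
    """Add Gaussian errors to (z, CR, HR) tuples, then apply the cuts.

    Assumption: errors are fractional, i.e. sigma = frac_err * true value.
    The cuts are applied to the perturbed values, after the errors are added.
    """
    selected = []
    for z, cr, hr in clusters:
        cr_obs = rng.gauss(cr, cr_frac_err * cr)
        hr_obs = rng.gauss(hr, hr_frac_err * hr)
        in_cr = CR_RANGE[0] <= cr_obs <= CR_RANGE[1]
        in_hr = HR_RANGE[0] <= hr_obs <= HR_RANGE[1]
        if in_cr and in_hr:
            selected.append((z, cr_obs, hr_obs))
    return selected
```

Applying the cuts after the perturbation means that clusters near the CR or HR limits can scatter in or out of the sample, an effect the error model in the forward-modelling procedure has to reproduce.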

6.2.3. Cosmological analysis of mocks

In order to recover the input cosmological parameters, we employ a maximum likelihood method and sample the cosmological parameters using a Markov chain Monte Carlo (MCMC) method. For the likelihood we use the unbinned Cash C-statistic (Cash 1979), which provides a useful way of determining how well a given set of data fits the expected distribution. The log-likelihood that we compute for each set of cosmological parameters is given by

(1)

where the sum in the above equation runs over all selected clusters and the integral (calculated over the cluster selection criteria) gives the number of clusters expected to be within the CR-HR region.
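For orientation, the description above matches the standard unbinned Poisson log-likelihood, written out below as a sketch. The notation n(z, CR, HR) for the model number density of selected clusters is introduced here for illustration and is an assumption; the exact form of Eq. (1) may differ by an additive constant or by the overall factor used in the Cash convention (C = −2 ln L).

```latex
\ln \mathcal{L} \;=\; \sum_{i=1}^{N_{\rm sel}}
    \ln\!\left[\, n\!\left(z_i, \mathrm{CR}_i, \mathrm{HR}_i\right) \right]
  \;-\; \iiint_{\rm sel} n\!\left(z, \mathrm{CR}, \mathrm{HR}\right)\,
    \mathrm{d}z \,\mathrm{d}\mathrm{CR}\, \mathrm{d}\mathrm{HR}
```

The first term rewards models that place a high density at the observed clusters, while the second penalises models that predict too many clusters within the selection cuts.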

For this work, we chose to use the publicly available Python package emcee (Foreman-Mackey et al. 2013), an affine invariant ensemble sampler.
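To illustrate what an affine invariant ensemble sampler does, the sketch below implements the Goodman & Weare stretch move, which is the default proposal in emcee, using only the Python standard library. It is a stand-in for emcee and not the code used in this work. The Gaussian `toy_log_prob` replaces the likelihood of Eq. (1), and its widths of 0.01 are arbitrary illustrative values.

```python
import math
import random


def toy_log_prob(theta, centre=(0.28, 0.82), sigma=(0.01, 0.01)):
    """Toy Gaussian log-probability in (Omega_m, sigma_8); illustrative only."""
    return -0.5 * sum(((t - c) / s) ** 2 for t, c, s in zip(theta, centre, sigma))


def stretch_move_sampler(log_prob, start, n_steps, rng, a=2.0):
    """Serial stretch-move ensemble sampler.

    start: list of initial walker positions (each a list of floats).
    Returns the flattened chain: n_steps * n_walkers positions.
    """
    walkers = [list(w) for w in start]
    n_walkers, n_dim = len(walkers), len(walkers[0])
    logp = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(n_walkers):
            # Pick a different walker j to define the stretch direction.
            j = rng.randrange(n_walkers - 1)
            if j >= k:
                j += 1
            # Draw z from g(z) proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = [walkers[j][d] + z * (walkers[k][d] - walkers[j][d])
                        for d in range(n_dim)]
            logp_new = log_prob(proposal)
            # Acceptance ratio includes the z^(n_dim - 1) volume factor.
            log_accept = (n_dim - 1) * math.log(z) + logp_new - logp[k]
            if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
                walkers[k], logp[k] = proposal, logp_new
        chain.extend(list(w) for w in walkers)
    return chain
```

A typical call starts a few tens of walkers in a small ball around a first guess, runs a few hundred steps, and discards the first part of the chain as burn-in before computing medians and confidence contours.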

For this analysis, we assume a ΛCDM cosmological model relying on the parameters calculated by Hinshaw et al. (2013), in particular with Ωm = 0.28, ΩΛ = 0.72, σ8 = 0.82 and H0 = 70 km s⁻¹ Mpc⁻¹. The scaling relations for M–T and L–T are those derived by the XXL collaboration (Pacaud et al. 2016; Giles et al. 2016; Lieu et al. 2016). We only fit for two cosmological parameters, Ωm and σ8, since we only wish to show that incomplete knowledge of the selection function results in a bias in the recovered parameters. As shown in Fig. 10, the eROSITA selection function is defined for a series of values of the core radius. Here we consider the effect of assuming a selection function that is defined only for a single core radius value of 35 arcsec. This core radius is obtained as a weighted average of the core radii (in arcminutes) of the X-CLASS sample of clusters (Clerc et al. 2012b; Ridl et al. 2017).

A total of 104 574 clusters were generated over a hypothetical survey of 20 000 square degrees to a uniform depth of 1.6 ks. The selection criteria for clusters entering the mock were 0.002 < CR < 1.0 cts s⁻¹ and 0.02 < HR < 2.0. The results obtained from the MCMC likelihood analysis are shown in Fig. 12. We see that very tight and unbiased constraints on both Ωm and σ8 are obtained when the selection function is precisely known, as illustrated by the black contours in Fig. 12. On the other hand, a significant bias (shown by the green contours) is observed for both of these parameters when a core-radius-independent selection function is assumed in the fit.
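As a quick consistency check, the mock size quoted above can be compared with the surface density of ∼5.2 clusters per deg² given in Sect. 5.2.2. The sketch below also scales the equatorial false-detection rate of ∼0.5 per deg² to the same area; applying that rate to this hypothetical survey is an assumption made here for illustration, and the function name is hypothetical.

```python
def survey_summary(n_clusters=104_574, area_deg2=20_000.0, false_per_deg2=0.5):
    """Cluster surface density, expected false detections, and their ratio."""
    density = n_clusters / area_deg2        # clusters per deg^2
    n_false = false_per_deg2 * area_deg2    # false extended sources over the area
    contamination = n_false / n_clusters    # false sources relative to clusters
    return density, n_false, contamination
```

With the default values this gives about 5.2 clusters per deg² and a contamination close to 10%, in line with the numbers quoted in the text.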

Fig. 12.

Bias introduced by the single core radius selection function. The black contours show the recovered constraints from the complete selection function while the green contours are the results obtained by fitting the cosmology assuming a single core radius in the selection function. The contours represent the 68% and 95% confidence intervals, respectively. The red lines indicate the position of the fiducial input values used in the creation of the mock catalogue and the values quoted above the plots indicate the median value recovered when using the incorrect selection function.

7. Conclusions

We have produced and analysed a set of realistic simulations for the eROSITA All-Sky Survey (eRASS) aiming towards precise selection functions for galaxy clusters. Our approach represents a trade-off between realism and tractability, capturing the essential (expected) instrumental and astrophysical features of the eRASS:

  • Fields of typical sizes in typical locations of the sky were selected and the exposure maps derived according to the spacecraft scanning law;

  • they are populated with AGNs following a realistic spectrophotometric distribution;

  • expected X-ray backgrounds (extragalactic and instrumental) are added;

  • the instrument is accurately modelled using the SIXTE simulator, combined with accurate ray-tracing PSF and vignetting models as well as a detailed detector model;

  • galaxy clusters are simulated with various fluxes and sizes following an average β-model profile.

Our main result is a revisited selection function for extended sources defined in the (flux, extent) parameter space. We show that such a selection function can be coupled to cosmological codes and we provide an example by forward-modelling the entire galaxy cluster population with the z-CR-HR method (Clerc et al. 2012a). Fitting cosmological parameters to a mock catalogue, we demonstrate that inaccurate knowledge of the selection function can lead to a significant bias in the derived cosmological parameters.

Such selection functions and results are valid to the extent of our current instrumental and astrophysical knowledge. Refined calibration and measurements (e.g. background, point-spread function, etc.), on-ground and in-orbit, will provide updated results, critically needed for statistical analyses based on the eROSITA all-sky survey. Different source-detection algorithms, possibly combining data from other wavelengths, may result in different quantitative selection functions; however the framework presented in this paper remains valid and can be used to quickly and efficiently assess their ability to provide constraints on cosmological models of structure formation.


2

This PSF is based on the ray-tracing simulations with 0.4 mm focus offset (see Sect. 4).

3

Although the characterisation of photometric measurements is beyond the scope of this paper, the same simulations as presented in this work can support derivation of such uncertainties.

Acknowledgments

The authors thank the anonymous referee for their suggestions and comments which clearly increased the quality of this paper. MERC acknowledges support by the German Aerospace Agency (DLR) with funds from the Ministry of Economy and Technology (BMWi) through grant 50 OR 1608. THR acknowledges support by the German Research Association (DFG) through grant RE 1462/6 and the Transregio 33 “The Dark Universe” sub-project B18. The authors thank S. Grandis for useful comments on a preliminary version of this paper.

References

Appendix A: Analytic fit to the point-like and extended source-selection curves

We provide analytic functions that represent the results obtained in Figs. 8 and 10. Due to the limited number of points sampling the curves in the steep transition region, we fitted functions that constitute a reasonable representation of the simulation.

For the extended source selection (galaxy clusters), we parametrize the completeness, dubbed c, as a function of 0.5–2 keV flux, exposure time (Texp) and core radius (rc) as follows:

where erf represents the error function,

For the point-like sources, the parametrization only depends on flux and exposure time:

(A.1)

In Figs. 8 and A.1 we show the models and their relatively good agreement with the data points extracted from the simulations. Such simple models cannot fully account for the details of the selection function curves, but they should be useful to provide ready-to-use estimates of completeness for various forecasts.
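To show how an error-function curve can serve as a ready-to-use completeness estimate, the sketch below anchors a generic erf profile in log-flux to the 50% and 90% completeness fluxes quoted for point-like sources in the equatorial field (Sect. 5.2.1). This is not the parametrization of this appendix: the functional form, the function names, and the width parameter are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def erf_width(f50, f90):
    """Width (in dex) such that the curve passes through 0.5 at f50 and 0.9 at f90."""
    # 0.5 * (1 + erf(x)) = 0.9  <=>  x = Phi^{-1}(0.9) / sqrt(2)
    x90 = NormalDist().inv_cdf(0.9) / math.sqrt(2.0)
    return (math.log10(f90) - math.log10(f50)) / x90


def completeness(flux, f50, width):
    """Generic erf completeness curve in log10(flux); illustrative form only."""
    x = (math.log10(flux) - math.log10(f50)) / width
    return 0.5 * (1.0 + math.erf(x))


# Equatorial field, point-like sources (fluxes in erg/s/cm^2 from Sect. 5.2.1).
F50_EQ, F90_EQ = 1.0e-14, 1.7e-14
WIDTH_EQ = erf_width(F50_EQ, F90_EQ)
```

For instance, `completeness(5e-15, F50_EQ, WIDTH_EQ)` returns the completeness this toy curve would assign to a source at half the 50% flux; the value depends entirely on the assumed functional form.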

Fig. A.1.

Similar to Fig. 10, where we superimposed the model lines computed according to formulas in Appendix A, for rc = 10, 40, 80″.

All Tables

Table 1.

Global parameters for the three types of fields simulated in this study.


All Figures

Fig. 1.

Two-dimensional histogram distribution of simulated sources in one realisation of our X-ray AGN luminosity function sampling for a 22.7 deg² area on the sky (253 297 sources in total). Each black contour encloses the fraction of sources indicated as a label. To each source belongs one X-ray spectral model uniquely defined by the source luminosity, redshift, power-law index Γ and absorbing column density NH (Sect. 2.1).

Fig. 2.

Soft-band cumulative source number counts for one realisation of the X-ray AGN luminosity function sampling for a 22.7 deg² area on the sky. Error bars are for each point. The parametrized log N–log S from Georgakakis et al. (2008) and Lehmer et al. (2012) are overplotted (lines) for comparison. Vertical dashed lines indicate the flux flim of the faintest source being simulated in each of the three fields (Sect. 4.3).

Fig. 3.

Energy spectrum of the simulated cosmic X-ray background components. The thick black dashed and plain lines are obtained with our model for AGN sources by stacking individual spectra of faint sources below flim = 10⁻¹⁴ (plain) or 3 × 10⁻¹⁵ (dashed) erg s⁻¹ cm⁻². For comparison, the dot-dashed green line shows the model of Lumb et al. (2002) derived from XMM-Newton observations. Our emission model for the Galaxy (red and dotted lines) is described in the text.

Fig. 5.

Ray-tracing simulated telescope point-spread function used in this paper. The images show the response of one eROSITA mirror module to a point-source at different incoming photon energies (from top to bottom: 1, 3 and 7 keV) and different angular distances from the optical axis (from left to right: on-axis, 15′, 25′). The colour scale in each panel is linear and encompasses the tenfold increase between the minimal (light red) and maximal (black) intensity, thereby emphasising the typical shape distortions due to Wolter optics.

Fig. 6.

Top panel: simulation of a bright point-source with flux 10⁻¹¹ erg s⁻¹ cm⁻² in a four-year eROSITA equatorial region (∼2 ks exposure time). The image shows the sky projection of the 0.5–2 keV source events collected by the seven CCDs, binned with 4″ pixels. Left: “Survey PSF”, including all events. Middle: selecting only low off-axis events (θ < 16.5′, 40% of the total number of events). Right: selecting only large off-axis events (θ > 16.5′, 60%). The circle has a radius of 30″, slightly larger than the half-energy width of the survey PSF. Bottom panel: corresponding radial profiles in 4″ bins (error bars are only shown for the top curve).

Fig. 4.

Simulated eROSITA all-sky four-year exposure map in equatorial coordinates used in this work, with the location of the three relevant simulated fields: equatorial (∼2 ks exposure time, uniform), intermediate (∼4 ks, slight gradient), and deep (∼10 ks, larger gradient). The colour bar (logarithmic scale) is in units of seconds.

Fig. 7.

Zoom over three simulated eROSITA extragalactic survey fields (equatorial, intermediate and deep, from left to right) in the 0.5–2 keV band, free from galaxy clusters (i.e. containing only backgrounds and AGNs as point-sources). North is up and east left; each dashed square has sides of 15′ in length. The blue circles have a radius of 0.5′ and show the position of the detected sources. The pixel scale is 4″ and identical grey scales are applied to each image to emphasise the differences in sensitivity.