ges logo
GES Home
Start Here
Data Overview
Known Issues
Data Releases
Release History
Schema browser
Data access
Login
Freeform SQL
Example Queries
Q&A
Glossary
Gallery
Publications
Downtime
Links
ges logo bottom
IFA     ROE
Home | Overview | Browser | Access | Login | Examples 
  ges logo

iDR5: Cross-matching the GES Science Archive with the Gaia DR1 release and other surveys

This document applies to release iDR5 and subsequent releases of the GES Science Archive and to release DR1 of the Gaia survey. Throughout 'Gaia' or 'Gaia release' should be understood to refer to the Gaia DR1 release.

This document describes how to cross-match, or join, tables in the GES Science Archive (hereafter GSA) with tables in other surveys hosted by WFAU, most notably a version of the Gaia DR1 data release.

Cross-matching the GES Science Archive

It is possible to cross-match or join the GES Science Archive (or GSA) with a local copy of the Gaia release or with several other external surveys hosted by WFAU (a list is given below). A target star in the GSA is considered to cross-match with objects in an external survey (that is, to potentially correspond to the same astronomical source) on the basis of the local star and external object having similar celestial coordinates (that is, the angular separation between their celestial coordinates is less than some pre-defined value). You can also include additional criteria to improve the reliability of the association. You have considerable flexibility both in customising the cross-match and in choosing the information to be listed from each of the surveys.

The large number of astronomical objects included in modern sky surveys means that it is impractical to cross-match surveys on-the-fly in response to a user query. Instead WFAU has consistently adopted the following approach in all the surveys that it hosts. For each survey a table of potential matches is created for each external survey with which it is to be cross-matched and the resulting set of tables provided as part of the survey. These tables are called cross-neighbour tables and the GSA has half-a-dozen or so of them. A cross-neighbour table comprises a list of all the objects in the remote survey to which it pertains that are within a pre-defined angular separation of any of the GES target stars. All these remote objects are potential cross-matches with a GES target star. The cross-neighbour table can be joined to other tables in both the GES and remote surveys, thus allowing:

  • Further criteria to be included in the selection to constrain and improve the selection of cross-matches; examples include a smaller maximum angular separation, consistency in brightness (or other parameters) between the surveys and the values of classification flags,

  • any of the tables in either survey to be included in the query and thus any of their columns to be listed.

An important constraint, however, is that it is only possible to find cross-matches out to the maximum angular separation specified when the cross-neighbour table was generated. Another flip-side is that the syntax of the query is somewhat opaque; it must include (at least) three tables: the cross-neighbour table and the source list in each survey from which it was generated. An alternative approach would have been to create a single static cross-match table with pre-computed matches and a subset of columns extracted from the remote survey, but this would have been much less flexible.

Note that in the GSA (and other WFAU databases) equatorial coordinates almost always have units of decimal degrees for both RA and Dec. Where prefered you can easily convert units in selections or WHERE clauses:

SELECT ra/15.0 AS raHours ...
       

or:

(ra/15.0) > 23.0 ...
       

Naming Convention for Tables and Columns

In GSA parlance, a table of pointers that associates a set of externally catalogued sources to the table of GES targets is called a cross-neighbour table. There is a basic naming convention for cross-neighbour tables, though in practice the actual names often deviate from it. This basic convention is as follows:

localTargetTable X remoteSurvey remoteTargetTable

Where appropriate the remoteSurvey name may include a data release number. All the GSA cross-neighbour tables have table Target as the localTargetTAble, so they all have names of the form:

TargetX remoteSurvey remoteTargetTable

and, for example, the cross-neighbour table for joining with the Gaia source catalogue is:

TargetXGAIADR1gaia_source

The actual names may differ somewhat from the basic convention (for example the database name may be omitted or the localTargetTable name may be abbreviated). However, all start with TargetX and they are always unambiguous about the names of the remote database and source catalogue to which they refer. Moreover, all the cross-neighbour tables are included amongst the tables listed in the schema browser, and the documentation in the schema browser for each cross-neighbour table explicitly tells you which remote database and table it corresponds to and hence which you should join it with. Every cross-neighbour table will contain at least the following three columns:

MasterObjId

The identifier of a row in the local target table that has matched with at least one row in the remote target table. In the case of the GSA MasterObjId will always be the TargetID of a star in table Target.

SlaveObjId

The identifier of a row in the remote target table which has matched with the GSA star corresponding to MasterObjId. The objects corresponding to MasterObjId and SlaveObjId will always have similar equatorial coordinates, with an angular separation less than some maximum specified when the table was created.

distanceMins

The actual angular separation, in minutes of arc, between the objects corresponding to MasterObjId and SlaveObjId.

Some cross-neighbour tables contain additional columns; you can use the schema browser to check which columns are present in individual tables.

Note that in principle, and in practice in crowded fields, a given star in the local target table might cross-match with multiple objects in the remote target table, and similarly a single object in the remote target table might cross-match with several stars in the local target table. Another effect that can lead to multiple matches (in either the local or remote table) is that the remote survey may have a higher or lower angular resolution than the star positions in the local survey. A further consideration is that if the local and remote surveys were conducted at different epochs a larger maximum cross-match separation will be used than is strictly required by the positional accuracies of the pairs of coordinates to allow for proper motion.

In any event, in all cases of multiple matches the local GES star and remote object pair with the smallest angular separation does not necessarily correspond to the correct identification of the same astronomical object in the two surveys. Additional columns in the surveys (perhaps a magnitude) can be used to establish the correct (or most likely) match.

External Surveys Cross-matched with the GSAs

The following table lists all the external surveys that have been cross-matched with the GSA. All the GSA cross-match tables generated for these surveys had a maximum cross-match angular separation of 10 seconds of arc.

Acronym External Survey Remote Survey Database Name Remote Target Table Description Number of Cross-Matches
Gaia Gaia Astrometric Survey GAIADR1 gaia_source Main Gaia source catalogue, with astrometry and photometry for each source. 277855
Gaia Gaia Astrometric Survey GAIADR1 tgas_source Tycho-Gaia Astrometric Solution catalogue. 613
ATLAS VST ATLAS Survey atlasDR3 atlasSource VST Southern hemisphere optical survey at high Galactic latitudes. 171709
SSA SuperCOSMOS Science Archive SSA Source The SuperCOSMOS Sky Survey as ingested into the SuperCOSMOS Science Archive. 214333
2MASS 2-Micron All-Sky Survey TWOMASS twomass_psc 2MASS point source catalogue. 133549
VHS Vista Hemisphere Survey VHSDR4 vhsSource Infrared southern hemisphere survey, as available in the Vista Science Archive. 124181

Example Cross-matches

  1. List the Gaia sources cross-matched with a given GES target star

    To list all the Gaia sources cross-matched with a given GES target star:

    SELECT
      tg.cname, tg.ra, tg.dec, tg.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax
    FROM
      Target tg,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE tg.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND tg.cName ='11053303-7700120'
    ORDER BY gdr1.source_id;
               

    Note:

    • The syntax for specifying a table in a remote database: the database name, followed by .. and then the table name; GaiaDR1..gaia_source gdr1 in the example.

    • The so-called table aliases tg, xgdr1 and gdr1 are a convenient short-hand for the table name.

    • The use of attribute prefixes (for example tg.cname) to distinguish between attributes in different tables that happen to have the same name.

    • The table joining condition in the WHERE clause, which selects associated rows. If you omit this condition, you will get every row in the remote table joined to every row in the input table. In database jargon this type of join is known as an 'outer join' and the resulting table is (usually) extremely large and particularly useless.

    • The ORDER BY clause forces the results to be listed in order of ascending Gaia source index. If this clause was omitted the results would listed in a random order.

    • All the Gaia sources selected are within the 10 seconds of arc, the cross-match radius specified when the cross-neighbour table was created.

    You can further refine the query by including only Gaia sources closer to the GES target than some angular separation as long as this separation is smaller than that used when the cross-neighbour table was generated:

    SELECT
      tg.cname, tg.ra, tg.dec, tg.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID, 
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax
    FROM
      Target tg,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE tg.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND tg.cName ='11053303-7700120'
      AND xgdr1.distanceMins < 0.1
    ORDER BY gdr1.source_id;
               

    Here only objects closer than 0.1 minutes of arc (that is, 6 seconds of arc) are being selected and the number of Gaia sources selected is reduced from 3 to 2.

  2. List the Gaia source closest to a given GES target star

  3. To list the single Gaia source that is closest to the GES target star:

    SELECT
      tg.cname, tg.ra, tg.dec, tg.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax
    FROM
      Target tg,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE tg.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND distanceMins IN (
        SELECT MIN(x2gdr1.distanceMins)
          FROM TargetXGAIADR1gaia_source x2gdr1
         WHERE x2gdr1.masterObjID=tg.targetID)
      AND tg.cName ='11053303-7700120';
               

    Note the use of a subquery to select the closest Gaia source. Selecting the closest neighbour can be combined wth setting your own cross-match radius within which Gaia objects are considered to be matches:

    SELECT
      tg.cname, tg.ra, tg.dec, tg.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax
    FROM
      Target tg,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE tg.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND distanceMins IN (
        SELECT MIN(x2gdr1.distanceMins)
          FROM TargetXGAIADR1gaia_source x2gdr1
         WHERE x2gdr1.masterObjID=tg.targetID)
      AND tg.cName ='11053303-7700120'
      AND xgdr1.distanceMins < 0.1;
               

    Here the closest source still falls within the specified angular separation of 6 seconds of arc and so it is still selected.

    Whether you prefer to retain all the possible matches or to just a single most likely match (either the closest or one determined by some other criteria, such as consistent brightness etc) will depend on what you are trying to do. For example, in the early stages of an investigation you may wish to retain all the matches, so that you can develop your criteria for identifying your best estimate of the corresponding astronomical source amongst the various matches. However, once you are satisfied with your criterion you might use it to generate a single match for each GES target, corresponding to (your best estimate of) the corresponding astronomical source, so that you can proceed to study the astrophysics of the chosen stars.

  4. List some recommended astrophysical parameters, together with Gaia positions, for a given GES target star

    A likely use of the GES Science Archive is to list recommended astrophysical parameters (tabulated in view RecommendedAstroAnalysis) cross-matched with details from the corresponding Gaia sources. This effect can be simply achieved by including view RecommendedAstroAnalysis in the query, suitably joined. For example, for a given star:

    SELECT
      tg.cname, tg.ra, tg.dec, tg.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      Target tg,
      RecommendedAstroAnalysis raa,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID = tg.targetID
      AND tg.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND raa.cName ='11053303-7700120'
    ORDER BY gdr1.source_id;
               

    However, this query can be simplified: because view RecommendedAstroAnalysis includes a TargetID column it can be joined directly to the cross-neighbour table:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND raa.cName ='11053303-7700120'
    ORDER BY gdr1.source_id;
               

    Obviously, similar queries can be constructed including any of the tables in the GSA.

  5. List all the Gaia sources for which the recommended astrophysical parameters of the corresponding GES stars satisfy some criteria

    To list the positions of all the Gaia sources that cross-match with the set of GES target stars whose recommended astrophysical parameters satisfy some criteria construct a query to select the required GES stars and join the resulting table with the Gaia source table. For example, to list the Gaia sources that cross-match with all the GES stars with an effective temperature greater than 25,000 K:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND raa.teff > 25000.0
    ORDER BY raa.cname, gdr1.source_id;
               

    Note that the results are listed in order of GES cname and then Gaia source index. Note that, again, each GES star will usually cross-match with several Gaia sources. Alternatively, to find just the Gaia source closest to each GES star:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND distanceMins IN (
        SELECT MIN(x2gdr1.distanceMins)
          FROM TargetXGAIADR1gaia_source x2gdr1
         WHERE x2gdr1.masterObjID=raa.targetID)
      AND raa.teff > 25000.0
    ORDER BY raa.cname;
               

    The results are now just listed in order of GES cname; there is no need to further order by Gaia source index.

  6. Cross-match the GSA with a survey other than Gaia

    All the previous examples cross-matched the GSA with the Gaia archive. The GSA can, of course, be cross-matched with any of the other surveys hosted by WFAU for which cross-neighbour tables have been generated (see the table above for the list). For example, to cross-match view RecommendedAstroAnalysis with the 2MASS survey:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xTwomass.masterObjID, xTwomass.slaveObjID,
      xTwomass.distanceMins,
      twomass.designation, twomass.ra, twomass.dec, twomass.j_m,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TWOMASS..twomass_psc twomass,
      TargetXtwomass_psc xTwomass
    WHERE raa.targetID=xTwomass.masterObjID
      AND xTwomass.slaveObjID=twomass.pts_key
      AND raa.cName ='11053303-7700120'
    ORDER BY twomass.designation;
               

    Where the GES target star cross-matches with two 2MASS sources. Similarly the GES stars can be cross-matched with more than one external survey:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xTwomass.masterObjID, xTwomass.slaveObjID,
      xTwomass.distanceMins,
      twomass.designation, twomass.ra, twomass.dec, twomass.j_m,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TWOMASS..twomass_psc twomass,
      TargetXtwomass_psc xTwomass,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID=xTwomass.masterObjID
      AND xTwomass.slaveObjID=twomass.pts_key
      AND raa.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND raa.cName ='11053303-7700120'
    ORDER BY twomass.designation, gdr1.source_id;
               

    Though note that the more surveys that you include the longer the query will take to run. Also, in the example, the single GES star matched with two 2MASS sources and three Gaia sources. Consequently in the results generated there is a row corresponding to every combination of the 2MASS and Gaia sources, leading to a results table with six entries. Such multiple rows can rapidly become confusing and when joining multiple surveys you may prefer to use a query that selects only a single match to the GES star from each external survey. For example:

    SELECT
      raa.cname, raa.objRa, raa.objDec, raa.targetID,
      xTwomass.masterObjID, xTwomass.slaveObjID,
      xTwomass.distanceMins,
      twomass.designation, twomass.ra, twomass.dec, twomass.j_m,
      xgdr1.masterObjID, xgdr1.slaveObjID,
      gdr1.source_id, xgdr1.distanceMins,
      gdr1.ra, gdr1.dec, gdr1.pmra, gdr1.pmdec, gdr1.parallax,
      raa.teff, raa.logg, raa.Feh
    FROM
      RecommendedAstroAnalysis raa,
      TWOMASS..twomass_psc twomass,
      TargetXtwomass_psc xTwomass,
      TargetXGAIADR1gaia_source xgdr1,
      GaiaDR1..gaia_source gdr1
    WHERE raa.targetID=xTwomass.masterObjID
      AND xTwomass.slaveObjID=twomass.pts_key
      AND raa.targetID = xgdr1.masterObjID
      AND xgdr1.slaveObjID = gdr1.source_id
      AND xTwomass.distanceMins IN (
        SELECT MIN(x2Twomass.distanceMins)
          FROM targetXtwomass_psc x2Twomass
         WHERE x2Twomass.masterObjID=raa.targetID)
      AND xgdr1.distanceMins IN (
        SELECT MIN(x2gdr1.distanceMins)
          FROM TargetXGAIADR1gaia_source x2gdr1
         WHERE x2gdr1.masterObjID=raa.targetID)
      AND raa.cName ='11053303-7700120';
    
               



Home | Overview | Browser | Access | Login | Examples

Links | Credits

WFAU, Institute for Astronomy,
Royal Observatory, Blackford Hill
Edinburgh, EH9 3HJ, UK
Tel +44 131 668 8356 (office)
or +44 131 668 8100 (switchboard)

ges-support@roe.ac.uk
11/1/2018