3. MET Data I/O

Data must often be preprocessed before it can be used for verification, and several MET tools exist for this purpose. In addition to the preprocessing tools, some plotting utilities for data checking are provided and described at the end of this section. Both the input and output file formats are described in this section. Section 3.1 and Section 3.2 are primarily concerned with re-formatting input files into the intermediate files required by some MET modules. These steps are represented by the first three columns in the MET flowchart depicted in Figure 1.1. Output data formats are described in Section 3.3. Common configuration file options are described in Section 3.5. Descriptions of the software modules used to reformat the data can be found in Section 6 and Section 7.

3.1. Input data formats

The MET package can handle multiple gridded input data formats: GRIB version 1, GRIB version 2, and NetCDF files that follow the Climate and Forecast (CF) conventions, contain WRF output post-processed using wrf_interp, or were produced by the MET tools themselves. MET supports standard NCEP, USAF, UK Met Office, and ECMWF GRIB tables along with custom, user-defined GRIB tables and the extended PDS including ensemble member metadata. See Section 3.5 for more information. Point observation files may be supplied in PrepBUFR, ASCII, or MADIS format. Note that MET does not require the Unified Post-Processor to be used, but does require that the input GRIB data be on a standard, de-staggered grid on pressure or regular levels in the vertical. While the Grid-Stat, Wavelet-Stat, MODE, and MTD tools can be run on a gridded field at virtually any level, the Point-Stat tool can only be used to verify forecasts at the surface or on pressure or height levels. MET does not interpolate between native model vertical levels.

When comparing two gridded fields with the Grid-Stat, Wavelet-Stat, Ensemble-Stat, MODE, MTD, or Series-Analysis tools, the input model and observation datasets must be on the same grid. If they are not, MET will regrid the files according to user-specified options. Alternatively, outside of MET, the copygb and wgrib2 utilities are recommended for re-gridding GRIB1 and GRIB2 files, respectively. To preserve characteristics of the observations, it is generally preferred to re-grid the model data to the observation grid, rather than vice versa.

Input point observation files in PrepBUFR format are available through NCEP. The PrepBUFR observation files contain a wide variety of point-based observation types in a single file in a standard format. However, some users may wish to use observations not included in the standard PrepBUFR files. For this reason, prior to performing the verification step in the Point-Stat tool, the PrepBUFR file is reformatted with the PB2NC tool. In this step, the user can select various ways of stratifying the observation data spatially, temporally, and by type. The remaining observations are reformatted into an intermediate NetCDF file. The ASCII2NC tool may be used to convert ASCII point observations that are not available in the PrepBUFR files into this common NetCDF point observation format. Several other MET tools, described below, are also provided to reformat point observations into this common NetCDF point observation format prior to passing them as input to the Point-Stat or Ensemble-Stat verification tools.
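As an illustration of preparing ASCII point observations for the ASCII2NC tool, the following Python sketch writes one observation as a whitespace-separated record. It assumes the eleven-column layout (message type, station ID, valid time as YYYYMMDD_HHMMSS, latitude, longitude, elevation, variable name, level, height, QC string, observation value); that layout is not defined in this section, so confirm it against the ASCII2NC documentation before relying on it. The function name format_obs, the station ID, and all values are hypothetical.

```python
from datetime import datetime


def format_obs(message_type, station_id, valid, lat, lon, elevation,
               variable, level, height, qc, value):
    """Return one whitespace-separated observation record.

    The column order is an assumption based on the ASCII2NC point
    observation layout; verify it against the ASCII2NC documentation.
    """
    fields = [
        message_type,
        station_id,
        valid.strftime("%Y%m%d_%H%M%S"),  # valid time
        f"{lat:.4f}",                     # degrees north
        f"{lon:.4f}",                     # degrees east
        f"{elevation:g}",
        variable,
        f"{level:g}",
        f"{height:g}",
        qc,
        f"{value:g}",
    ]
    return " ".join(fields)


# Hypothetical surface temperature observation, for illustration only.
line = format_obs("ADPSFC", "STN01", datetime(2024, 1, 15, 12),
                  39.75, -104.87, 1611.0, "TMP", 837.5, 2.0, "NA", 271.3)
print(line)
```

Records built this way can be written one per line to a text file, which would then be passed to ASCII2NC to produce the common NetCDF point observation format.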

Tropical cyclone forecasts and observations are typically provided in a specific ATCF (Automated Tropical Cyclone Forecasting) ASCII format, in A-deck, B-deck, and E-deck files.
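To give a feel for the ATCF layout, the Python sketch below parses the leading fields of one comma-separated track record. It assumes the commonly used field order (basin, cyclone number, initialization time as YYYYMMDDHH, technique number, technique name, forecast lead in hours, latitude and longitude in tenths of a degree with a hemisphere letter, maximum wind, minimum pressure). The sample record, the technique name XMPL, and the function names are made up for illustration; real records carry many more fields.

```python
from datetime import datetime, timedelta


def parse_tenths(token):
    """Convert a coordinate such as '250N' or '760W' to signed degrees."""
    token = token.strip()
    value = int(token[:-1]) / 10.0
    # South and west are negative in the signed convention used here.
    return -value if token[-1] in "SW" else value


def parse_atcf_line(line):
    """Parse the first ten fields of an ATCF-style record (assumed order)."""
    f = [x.strip() for x in line.split(",")]
    init = datetime.strptime(f[2], "%Y%m%d%H")
    lead = int(f[5])
    return {
        "basin": f[0],
        "cyclone": f[1],
        "init": init,
        "technique": f[4],
        "lead_hours": lead,
        "valid": init + timedelta(hours=lead),
        "lat": parse_tenths(f[6]),
        "lon": parse_tenths(f[7]),
        "vmax_kt": int(f[8]),
        "mslp_hpa": int(f[9]),
    }


# Made-up record for illustration only.
SAMPLE = "AL, 09, 2011082500, 03, XMPL,  24, 250N,  760W,  85,  960"
rec = parse_atcf_line(SAMPLE)
print(rec["valid"], rec["lat"], rec["lon"])
```

In practice the MET tropical cyclone tools read these files directly; a parser like this is only useful for quick inspection of a deck file.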

3.2. Intermediate data formats

MET uses NetCDF as an intermediate file format. The MET tools which write gridded output files write to a common gridded NetCDF file format. The MET tools which write point output files write to a common point observation NetCDF file format.

3.3. Output data formats

The MET package currently produces output in the following basic file formats: STAT files, ASCII files, NetCDF files, PostScript plots, and PNG plots from the Plot-Mode-Field utility.

The STAT format consists of tabular ASCII data that can be easily read by many analysis tools and software packages. MET produces STAT output for the Grid-Stat, Point-Stat, Ensemble-Stat, Wavelet-Stat, and TC-Gen tools. STAT is a specialized ASCII format containing one record on each line, and a single STAT file will typically contain multiple line types. Several header columns at the beginning of each line remain the same for every line type, but the remaining columns after the header change from one line type to the next. STAT files can therefore be difficult for a human to read, as the quantities represented by many columns change from line to line.

For this reason, ASCII output is also available as an alternative for these tools. The ASCII files contain exactly the same output as the STAT files but each STAT line type is grouped into a single ASCII file with a column header row making the output more human-readable. The configuration files control which line types are output and whether or not the optional ASCII files are generated.
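The relationship between a mixed STAT file and the per-line-type ASCII files can be sketched in a few lines of Python. The example below groups records by line type, much as the optional ASCII files do. It assumes that the first line of the file names the shared header columns and that one of them is called LINE_TYPE; the sample text is invented and heavily truncated, and the function name group_by_line_type is hypothetical.

```python
from collections import defaultdict

# Invented, truncated STAT-style text. Real files have many more header
# columns, and the meaning of the data columns depends on the line type.
SAMPLE_STAT = """\
VERSION MODEL LINE_TYPE
V0 WRF CTC 100 40 10 5 45
V0 WRF CNT 100 280.1 279.8
V0 WRF CTC 100 35 12 8 45
"""


def group_by_line_type(text):
    """Split STAT-style text into {line_type: [record, ...]}."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, records = lines[0], lines[1:]
    type_col = header.index("LINE_TYPE")
    n_header = len(header)
    groups = defaultdict(list)
    for fields in records:
        groups[fields[type_col]].append({
            # Shared header columns, identical in layout for every line type.
            "header": dict(zip(header, fields[:n_header])),
            # Remaining columns, whose meaning depends on the line type.
            "data": fields[n_header:],
        })
    return dict(groups)


groups = group_by_line_type(SAMPLE_STAT)
for line_type, rows in sorted(groups.items()):
    print(line_type, len(rows))
```

Interpreting the data columns still requires the column definitions for each line type, which is why the per-line-type ASCII files carry their own column header row.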

The MODE tool creates two ASCII output files as well (although they are not in a STAT format). It generates an ASCII file containing contingency table counts and statistics comparing the model and observation fields. The MODE tool also generates a second ASCII file containing all of the attributes for the single objects and pairs of objects. Each line in this file contains the same number of columns, and those columns not applicable to a given line type contain fill data. Similarly, the MTD tool writes one ASCII output file for 2D object attributes and four ASCII output files for 3D object attributes.

The TC-Pairs and TC-Stat utilities produce ASCII output, similar in style to the STAT files, but with TC relevant fields.

Many of the tools generate gridded NetCDF output. Generally, this output acts as input to other MET tools or plotting programs. The point observation preprocessing tools produce NetCDF output as input to the statistics tools. Full details of the contents of the NetCDF files are found in Section 3.4 below.

The MODE, Wavelet-Stat, and plotting tools produce PostScript plots summarizing the spatial approach used in the verification. The PostScript plots are generated using internal libraries and do not depend on an external plotting package. The MODE plots contain several summary pages at the beginning, but the total number of pages will depend on the merging options chosen. Additional pages will be created if merging is performed using the double thresholding or fuzzy engine merging techniques for the forecast and observation fields. The number of pages in the Wavelet-Stat plots depends on the number of masking tiles used and the dimension of those tiles. The first summary page is followed by plots for the wavelet decomposition of the forecast and observation fields. The generation of these PostScript output files can be disabled using command line options.

Users can use the optional plotting utilities Plot-Data-Plane, Plot-Point-Obs, and Plot-Mode-Field to produce graphics showing forecast, observation, and MODE object files.

3.4. Data format summary

The following is a summary of the input and output formats for each of the tools currently in MET. The output listed for each tool is the maximum set of output files it can produce. Generally, the type of output files generated can be controlled by the configuration files and/or the command line options:

  1. PB2NC Tool

    • Input: PrepBUFR point observation file(s) and one configuration file.

    • Output: One NetCDF file containing the observations that have been retained.

  2. ASCII2NC Tool

    • Input: ASCII point observation file(s) that has (have) been formatted as expected, and an optional configuration file.

    • Output: One NetCDF file containing the reformatted observations.

  3. MADIS2NC Tool

    • Input: MADIS point observation file(s) in NetCDF format.

    • Output: One NetCDF file containing the reformatted observations.

  4. LIDAR2NC Tool

    • Input: One CALIPSO satellite HDF file.

    • Output: One NetCDF file containing the reformatted observations.

  5. IODA2NC Tool

    • Input: IODA observation file(s) in NetCDF format.

    • Output: One NetCDF file containing the reformatted observations.

  6. Point2Grid Tool

    • Input: One NetCDF file in the common point observation format.

    • Output: One NetCDF file containing a gridded representation of the point observations.

  7. Pcp-Combine Tool

    • Input: Two or more gridded model or observation files (in GRIB format for “sum” command, or any gridded file for “add”, “subtract”, and “derive” commands) containing data (often accumulated precipitation) to be combined.

    • Output: One NetCDF file containing output for the requested operation(s).

  8. Regrid-Data-Plane Tool

    • Input: One gridded model or observation field and one gridded field to provide grid specification if desired.

    • Output: One NetCDF file containing the regridded data field(s).

  9. Shift-Data-Plane Tool

    • Input: One gridded model or observation field.

    • Output: One NetCDF file containing the shifted data field.

  10. MODIS-Regrid Tool

    • Input: One gridded model or observation field and one gridded field to provide grid specification.

    • Output: One NetCDF file containing the regridded data field.

  11. Gen-VX-Mask Tool

    • Input: One gridded model or observation file and one file defining the masking region (varies based on masking type).

    • Output: One NetCDF file containing a bitmap for the resulting masking region.

  12. Point-Stat Tool

    • Input: One gridded model file, at least one NetCDF file in the common point observation format, and one configuration file.

    • Output: One STAT file containing all of the requested line types and several ASCII files for each line type requested.

  13. Grid-Stat Tool

    • Input: One gridded model file, one gridded observation file, and one configuration file.

    • Output: One STAT file containing all of the requested line types, several ASCII files for each line type requested, and one NetCDF file containing the matched pair data and difference field for each verification region and variable type/level being verified.

  14. Ensemble-Stat Tool

    • Input: An arbitrary number of gridded model files, one or more gridded and/or point observation files, and one configuration file.

    • Output: One NetCDF file containing requested ensemble forecast information. If observations are provided, one STAT file containing all requested line types, several ASCII files for each line type requested, and one NetCDF file containing gridded observation ranks.

  15. Wavelet-Stat Tool

    • Input: One gridded model file, one gridded observation file, and one configuration file.

    • Output: One STAT file containing the “ISC” line type, one ASCII file containing intensity-scale information and statistics, one NetCDF file containing information about the wavelet decomposition of forecast and observed fields and their differences, and one PostScript file containing plots and summaries of the intensity-scale verification.

  16. GSID2MPR Tool

    • Input: One or more binary GSI diagnostic files (conventional or radiance) to be reformatted.

    • Output: One ASCII file in matched pair (MPR) format.

  17. GSID2ORANK Tool

    • Input: One or more binary GSI diagnostic files (conventional or radiance) to be reformatted.

    • Output: One ASCII file in observation rank (ORANK) format.

  18. Stat-Analysis Tool

    • Input: One or more STAT files output from the Point-Stat, Grid-Stat, Ensemble-Stat, Wavelet-Stat, or TC-Gen tools and, optionally, one configuration file containing specifications for the analysis job(s) to be run on the STAT data.

    • Output: ASCII output of the analysis jobs is printed to the screen unless redirected to a file using the “-out” option or redirected to a STAT output file using the “-out_stat” option.

  19. Series-Analysis Tool

    • Input: An arbitrary number of gridded model files and gridded observation files and one configuration file.

    • Output: One NetCDF file containing requested output statistics on the same grid as the input files.

  20. Grid-Diag Tool

    • Input: An arbitrary number of gridded data files and one configuration file.

    • Output: One NetCDF file containing individual and joint histograms of the requested data.

  21. MODE Tool

    • Input: One gridded model file, one gridded observation file, and one or two configuration files.

    • Output: One ASCII file containing contingency table counts and statistics, one ASCII file containing single and pair object attribute values, one NetCDF file containing object indices for the gridded simple and cluster object fields, and one PostScript plot containing a summary of the features-based verification performed.

  22. MODE-Analysis Tool

    • Input: One or more MODE object statistics files from the MODE tool and, optionally, one configuration file containing specifications for the analysis job(s) to be run on the object data.

    • Output: ASCII output of the analysis jobs will be printed to the screen unless redirected to a file using the “-out” option.

  23. MODE-TD Tool

    • Input: Two or more gridded model files, two or more gridded observation files, and one configuration file.

    • Output: One ASCII file containing 2D object attributes, four ASCII files containing 3D object attributes, and one NetCDF file containing object indices for the gridded simple and cluster object fields.

  24. TC-Dland Tool

    • Input: One or more files containing the longitude (Degrees East) and latitude (Degrees North) of all the coastlines and islands considered to be a significant landmass.

    • Output: One NetCDF format file containing a gridded field representing the distance to the nearest coastline or island, as specified in the input file.

  25. TC-Pairs Tool

    • Input: At least one A-deck or E-deck file and one B-deck ATCF format file containing output from a tropical cyclone tracker and one configuration file. The A-deck files contain forecast tracks, the E-deck files contain forecast probabilities, and the B-deck files are typically the NHC Best Track Analysis but could also be any ATCF format reference.

    • Output: ASCII output with the suffix .tcst.

  26. TC-Stat Tool

    • Input: One or more TCSTAT files output by the TC-Pairs tool and, optionally, one configuration file containing specifications for the analysis job(s) to be run on the TCSTAT data.

    • Output: ASCII output of the analysis jobs will be printed to the screen unless redirected to a file using the “-out” option.

  27. TC-Gen Tool

    • Input: One or more Tropical Cyclone genesis format files, one or more verifying operational and BEST track files in ATCF format, and one configuration file.

    • Output: One STAT file containing all of the requested line types, several ASCII files for each line type requested, and one gridded NetCDF file containing counts of track points.

  28. TC-RMW Tool

    • Input: One or more gridded data files, one ATCF track file defining the storm location, and one configuration file.

    • Output: One gridded NetCDF file containing the requested model fields transformed into cylindrical coordinates.

  29. RMW-Analysis Tool

    • Input: One or more NetCDF output files from the TC-RMW tool and one configuration file.

    • Output: One NetCDF file for results aggregated across the filtered set of input files.

  30. Plot-Point-Obs Tool

    • Input: One NetCDF file containing point observations from the ASCII2NC, PB2NC, MADIS2NC, or LIDAR2NC tool.

    • Output: One PostScript file containing a plot of the requested field.

  31. Plot-Data-Plane Tool

    • Input: One gridded data file to be plotted.

    • Output: One PostScript file containing a plot of the requested field.

  32. Plot-MODE-Field Tool

    • Input: One or more MODE output files to be used for plotting and one configuration file.

    • Output: One PNG file with the requested MODE objects plotted. Options for objects include raw, simple or cluster and forecast or observed objects.

  33. GIS-Util Tools

    • Input: ESRI shape files ending in .dbf, .shp, or .shx.

    • Output: ASCII description of their contents printed to the screen.

3.5. Configuration File Details

Part of the strength of MET is that capabilities are shared across tools. As a result, several configuration options are common to many of the tools.

Many of the MET tools use a configuration file to set parameters. This prevents the command line from becoming too long and cumbersome and makes the output easier to duplicate.

The configuration file details are described in Configuration File Overview and Tropical Cyclone Configuration Options.