25. TC-Diag Tool
25.1. Introduction
A diagnosis of the large-scale environment of tropical cyclones (TCs) is foundational for many prediction techniques, including statistical-dynamical forecast aids and techniques based on artificial intelligence. Such diagnostics can also be used by forecasters seeking to understand how a given model’s forecast will pan out. Finally, TC diagnostics can be useful in verification to stratify the performance of models in different environmental regimes over a longer period of time, thereby providing useful insights on model biases or deficiencies for model developers and forecasters.
These capabilities were originally developed by the Cooperative Institute for Research in the Atmosphere (CIRA) for the Statistical Hurricane Intensity Prediction Scheme (SHIPS), and later as a stand-alone package called ‘Model Diagnostics’. MET now integrates them into an extensible framework called the TC-Diag tool. This tool allows users to compute diagnostics for the large-scale environment of TCs using ATCF track and gridded model data inputs. The current version of the TC-Diag tool requires that the tracks and fields be self-consistent [i.e., the track should be the model’s (or ensemble’s) own predicted track(s)]. The reason is that the diagnostics are computed in a coordinate system centered on the model’s moving storm, and the current version of the tool does not yet include vortex removal. If the track is not consistent with the underlying fields, the output diagnostics are unlikely to be useful because the model’s simulated storm would contaminate the diagnostics calculations.
Note
A future version of the tool will include the capability to remove the model’s own vortex, which will allow the user to specify any arbitrary track (such as the operational center’s official forecast). Until then, users are advised that the track selected must be consistent with the model’s predicted track.
TC-Diag is run once for each initialization time to produce diagnostics for each user-specified combination of TC tracks and model fields. The user provides track data (such as one or more ATCF a-deck track files), along with track filtering criteria as needed, to select one or more tracks to be processed. The user also provides gridded model data from which diagnostics should be computed. Gridded data can be provided for multiple concurrent storms, multiple models, and/or multiple domains (i.e. parent and nest) in a single run.
TC-Diag first determines the list of valid times that appear in any one of the tracks. For each valid time, it processes all track points for that time. For each track point, it reads the gridded model fields requested in the configuration file and transforms the gridded data to a range-azimuth cylindrical coordinates grid, as described for the TC-RMW tool in Section 28. For each domain, it writes the range-azimuth data to a temporary NetCDF file, as described in Contributor's Guide Section 3.1.6.
Once the input data have been processed into the temporary NetCDF files, TC-Diag then calls one or more Python diagnostics scripts, as specified in the configuration file, to compute tropical cyclone diagnostic values. The computed diagnostics values are retrieved from the Python script and stored in memory.
After processing all valid times and all corresponding track points, the computed diagnostics are written to ASCII and/or NetCDF output files. If requested in the configuration file, the temporary range-azimuth cylindrical coordinates files are combined into a single NetCDF file and written to the output for each combination of model track and domain.
The default Python diagnostics scripts included with the MET release provide the standard set of CIRA diagnostics. However, users can copy/modify the logic in those scripts as they see fit to refine and/or add to the diagnostics computed.
25.2. Practical Information
25.2.1. tc_diag Usage
The following sections describe the usage statement, required arguments, and optional arguments for tc_diag.
Usage: tc_diag
-data domain tech_id_list [ file_1 ... file_n | data_file_list ]
-deck file
-config file
[-outdir path]
[-log file]
[-v level]
tc_diag has required arguments and can accept several optional arguments.
25.2.1.1. Required Arguments for tc_diag
The -data domain tech_id_list [ file_1 … file_n | data_file_list ] option specifies a domain name, a comma-separated list of ATCF tech IDs, and a list of gridded data files or an ASCII file containing a list of files to be used. Specify -data once for each gridded data source.
The -deck file option is the ATCF format track data source.
The -config file option is the TCDiagConfig file to be used. The contents of the configuration file are discussed below.
25.2.1.2. Optional Arguments for tc_diag
The -outdir path option overrides the default output directory (current working directory) with the output directory path provided.
The -log file option directs output and errors to the specified log file. All messages will be written to that file as well as standard out and error. Thus, users can save the messages without having to redirect the output on the command line. The default behavior is no logfile.
The -v level option indicates the desired level of verbosity. The contents of “level” will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.
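As an illustration of the usage statement above, the following minimal sketch assembles a tc_diag command line in Python. The helper name build_tc_diag_command and all file names are hypothetical placeholders, not part of MET; only the option names come from the usage statement.

```python
import shlex


def build_tc_diag_command(data_sources, deck, config, outdir=None, verbosity=None):
    """Assemble a tc_diag argument list from the documented options.

    data_sources is a list of (domain, tech_id_list, files) tuples.
    This helper is illustrative only and is not part of MET.
    """
    cmd = ["tc_diag"]
    for domain, tech_ids, files in data_sources:
        # One -data option per gridded data source
        cmd += ["-data", domain, ",".join(tech_ids), *files]
    cmd += ["-deck", deck, "-config", config]
    if outdir is not None:
        cmd += ["-outdir", outdir]
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]
    return cmd


# All file names below are hypothetical placeholders
cmd = build_tc_diag_command(
    data_sources=[
        ("parent", ["GFSO"], ["parent_f000.grb2", "parent_f006.grb2"]),
        ("nest", ["GFSO"], ["nest_file_list.txt"]),
    ],
    deck="adeck_example.dat",
    config="TCDiagConfig_custom",
    outdir="tc_diag_out",
    verbosity=3,
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run, or printed as above to check the command before running it.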
25.2.2. tc_diag Configuration File
The default configuration file for the TC-Diag tool, named TCDiagConfig_default, can be found in the installed share/met/config/ directory. Users are encouraged to copy this default file before modifying its contents. The contents of the configuration file are described in the subsections below.
25.2.2.1. Configuring Input Tracks and Time
model = [ "GFSO", "OFCL" ];
storm_id = "";
basin = "";
cyclone = "";
init_inc = "";
valid_beg = "";
valid_end = "";
valid_inc = [];
valid_exc = [];
valid_hour = [];
tmp_dir = "/tmp";
version = "VN.N";
The TC-Diag tool should be configured to filter the input track data (-deck) down to the subset of tracks that correspond to the gridded data files provided (-data). The filtered tracks should contain data for only one initialization time but may contain tracks for multiple models.
The configuration options listed above are used to filter the input track data down to those that should be processed in the current run. These options are common to multiple MET tools and are described in Section 6.
lead = [ "0", "6", "12", "18", "24",
"30", "36", "42", "48", "54",
"60", "66", "72", "78", "84",
"90", "96", "102", "108", "114",
"120", "126" ];
The lead entry is an array of strings specifying lead times in HH[MMSS] format. By default, diagnostics are computed every 6 hours out to 126 hours. Lead times for which no track point or gridded model data exist produce a warning message and diagnostics set to a missing data value.
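The default lead list is a regular 6-hourly sequence, so a custom list can be generated rather than typed by hand. The sketch below is only a convenience; the function name make_lead_list is hypothetical.

```python
def make_lead_list(max_hour=126, step_hour=6):
    """Return lead times as strings of hours, from 0 through max_hour."""
    return [str(hour) for hour in range(0, max_hour + 1, step_hour)]


leads = make_lead_list()

# Format the list as a configuration file entry
entry = "lead = [ " + ", ".join(f'"{lead}"' for lead in leads) + " ];"
print(entry)
```

With the default arguments this reproduces the 22 lead times shown above, from "0" through "126".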
25.2.2.2. Configuring Domain Information
domain_info = [
{
domain = "parent";
n_range = 150;
n_azimuth = 8;
delta_range_km = 10.0;
diag_script = [ "MET_BASE/python/tc_diag/compute_tc_diag.py MET_BASE/python/tc_diag/config/post_resample.yml MET_BASE/tc_data/v2023-04-07_gdland_table.dat" ];
override_diags = [];
},
{
domain = "nest";
n_range = 150;
n_azimuth = 8;
delta_range_km = 2.0;
diag_script = [ "MET_BASE/python/tc_diag/compute_tc_diag.py MET_BASE/python/tc_diag/config/post_resample_nest.yml MET_BASE/tc_data/v2023-04-07_gdland_table.dat" ];
override_diags = [ "RMW", "SST" ];
}
];
The domain_info entry is an array of dictionaries. Each dictionary consists of six entries. The domain entry is a user-specified string that provides a name for the domain. Each domain name must also appear in a -data command line option, and vice versa.
The n_range entry is an integer specifying the number of equally spaced range intervals in the range-azimuth grid to be used for this data source.
The n_azimuth entry is an integer specifying the number of equally spaced azimuth intervals in the range-azimuth grid to be used for this data source. The azimuthal grid spacing is 360 / n_azimuth degrees. Azimuths are defined by MET as degrees clockwise from due north. However, the TC-Diag Python code expects them as radians counter-clockwise from due east. The tc_diag_driver/post_resample_driver.py driver script performs the necessary rotation and conversion operations.
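The angle conversion can be sketched as follows, assuming input azimuths in degrees clockwise from due north (the convention given for the azimuth coordinate variable in the range-azimuth output). This is not the code in post_resample_driver.py, and the function name is hypothetical.

```python
import math


def compass_to_math_angle(azimuth_deg):
    """Convert degrees clockwise from north to radians counter-clockwise from east."""
    return math.radians((90.0 - azimuth_deg) % 360.0)


# Cardinal directions as a quick check of the rotation
north = compass_to_math_angle(0.0)    # points along +y
east = compass_to_math_angle(90.0)    # points along +x
south = compass_to_math_angle(180.0)  # points along -y
west = compass_to_math_angle(270.0)   # points along -x
```

The modulo keeps the result in the range [0, 2π), so south maps to 3π/2 rather than a negative angle.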
The delta_range_km entry is a floating point value specifying the spacing of the range rings in kilometers.
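To get a feel for these three settings, the sketch below computes the azimuthal spacing and an approximate radial extent for the default parent and nest domains. It assumes the extent is simply n_range times delta_range_km; the exact positions of the first and last range rings are not specified here. The function name grid_summary is hypothetical.

```python
def grid_summary(n_range, n_azimuth, delta_range_km):
    """Summarize a range-azimuth grid definition.

    The radial extent is an approximation: n_range intervals of
    delta_range_km each (an assumption, see the text above).
    """
    return {
        "azimuth_spacing_deg": 360.0 / n_azimuth,
        "radial_extent_km": n_range * delta_range_km,
    }


parent = grid_summary(n_range=150, n_azimuth=8, delta_range_km=10.0)
nest = grid_summary(n_range=150, n_azimuth=8, delta_range_km=2.0)
print(parent, nest)
```

Under this assumption the default parent grid spans roughly 1500 km and the nest grid roughly 300 km, both with 45-degree azimuth spacing.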
The diag_script entry is an array of strings. Each string specifies the path to a Python script to be executed to compute diagnostics from the transformed cylindrical coordinates data for this domain. When multiple Python diagnostics scripts are run, the union of the computed diagnostics is written to the output.
The override_diags entry is an array of strings. Each string specifies the name of a diagnostic value to be used for that domain. If set to an empty list, all diagnostics computed by the Python scripts in diag_script for that domain will be used. If non-empty, only the specific diagnostics listed will be used.
In the default configuration, seen above, the same Python script is run for both the parent and nest domains, each using a different configuration file. For the parent domain, all computed diagnostics are used since override_diags is empty. For the nest domain, only the specific diagnostics listed in override_diags are used to override the parent values. In general, diagnostics computed earlier in the list of domain_info entries can be overridden by diagnostics computed later in the list.
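The override behavior described above can be sketched as a dictionary merge in domain_info order. This is an illustration of the documented logic, not the tool's implementation; the function name and the diagnostic values are hypothetical.

```python
def merge_domain_diags(domain_info, domain_results):
    """Merge per-domain diagnostics, letting later domains override earlier ones.

    domain_info: list of dicts with "domain" and "override_diags" keys.
    domain_results: dict mapping domain name to {diagnostic name: value}.
    """
    merged = {}
    for info in domain_info:
        computed = domain_results[info["domain"]]
        wanted = info["override_diags"]
        for name, value in computed.items():
            # An empty override_diags list means "use everything"
            if not wanted or name in wanted:
                merged[name] = value
    return merged


domain_info = [
    {"domain": "parent", "override_diags": []},
    {"domain": "nest", "override_diags": ["RMW", "SST"]},
]
# Hypothetical diagnostic values for one track point
domain_results = {
    "parent": {"RMW": 50.0, "SST": 28.5, "SHR_MAG": 13.0},
    "nest": {"RMW": 35.0, "SST": 28.9, "SHR_MAG": 11.0},
}
merged = merge_domain_diags(domain_info, domain_results)
print(merged)
```

Here RMW and SST come from the nest domain, while SHR_MAG keeps the parent value because it is not listed in the nest's override_diags.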
25.2.2.3. Configuring Data Censoring and Conversion Options
censor_thresh = [];
censor_val = [];
convert(x) = x;
These data censoring and conversion options are common to multiple MET tools and are described in Section 5. They do not actually appear in the default configuration file but can be specified separately in each data.field array entry, described below. If provided, those operations are performed after reading the gridded data but prior to converting to the cylindrical coordinate range-azimuth grid.
25.2.2.4. Configuring Regridding Options
regrid = {
method = BILIN;
width = 2;
vld_thresh = 0.5;
shape = SQUARE;
}
The regrid dictionary is common to multiple MET tools and is described in Section 5. It specifies how the input data should be regridded to cylindrical coordinates prior to computing diagnostics. It can be specified separately in each data.field array entry, described below. The default setting uses bilinear interpolation for all fields.
25.2.2.5. Configuring Fields, Levels, and Domains
data = {
// If empty, the field is processed for all domains
domain = [];
// Pressure levels to be used, unless overridden below
level = [ "P1000", "P925", "P850", "P700", "P500",
"P400", "P300", "P250", "P200", "P150",
"P100" ];
field = [
{ name = "TMP"; },
{ name = "UGRD"; },
{ name = "VGRD"; },
{ name = "RH"; },
{ name = "HGT"; },
{ name = "PRMSL"; level = "Z0"; },
{ name = "PWAT"; level = "L0"; },
{ name = "TMP"; level = "Z0"; },
{ name = "TMP"; level = "Z2"; },
{ name = "RH"; level = "Z2"; },
{ name = "UGRD"; level = "Z10"; },
{ name = "VGRD"; level = "Z10"; }
];
}
The data entry is a dictionary that contains the field entry to define what gridded data should be processed. The field entry is an array of dictionaries. Each field dictionary consists of at least three entries.
The name and level entries are common to multiple MET tools and are described in Section 5.
The domain entry is an array of strings. Each string specifies a domain name. If the domain_info domain name appears in this domain list, then this field will be read from that domain_info data source. If domain is set to an empty list, then this field will be read from all domain data sources.
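The domain selection rule can be sketched as a small filter. This only illustrates the rule stated above and is not taken from the tool's source; the function name is hypothetical.

```python
def domains_for_field(field_domains, domain_info_names):
    """Return the domain_info names from which a field should be read."""
    if not field_domains:
        # An empty domain list means the field is read from every domain
        return list(domain_info_names)
    return [name for name in domain_info_names if name in field_domains]


all_domains = domains_for_field([], ["parent", "nest"])
nest_only = domains_for_field(["nest"], ["parent", "nest"])
print(all_domains, nest_only)
```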
25.2.2.6. Configuring Vortex Removal Option
vortex_removal = FALSE;
The vortex_removal flag entry is a boolean specifying whether or not vortex removal logic should be applied.
Note
As of MET version 12.0.0, vortex removal logic is not yet supported.
25.2.2.7. Configuring Data Input and Output Options
one_time_per_file_flag = TRUE;
The one_time_per_file_flag entry controls the logic for reading data from input files. This option describes how data is stored in the gridded input files specified with the -data command line option. Set this to true if each input file contains all of the data for a single initialization time and for a single valid time. If the input files contain data for multiple initialization or valid times, or if data for one valid time is spread across multiple files, set this to false.
If true, all input fields are read efficiently from each file in a single call. If false, each field is processed separately in a less efficient manner.
nc_cyl_grid_flag = TRUE; // resulting output file ends with "_cyl_grid_{domain}.nc"
nc_diag_flag = TRUE; // resulting output file ends with "_diag.nc"
cira_diag_flag = TRUE; // resulting output file ends with "_diag.dat"
These three flag entries are booleans specifying what output data types should be written. At least one of these flags must be set to true.
The nc_cyl_grid_flag entry controls the writing of a NetCDF file containing the cylindrical coordinate range-azimuth data used to compute the diagnostics. These files are written with a _cyl_grid_{domain}.nc suffix, where {domain} is the domain name specified in the configuration file. One output file is written for each combination of model track and domain.
The nc_diag_flag entry controls the writing of the computed diagnostics to a NetCDF file. These files are written with a _diag.nc suffix. One output file is written for each model track processed.
The cira_diag_flag entry controls the writing of the computed diagnostics to a formatted ASCII output file. These files are written with a _diag.dat suffix. One output file is written for each model track processed.
output_base_format = "s{storm_id}_{technique}_doper_{init_ymdh}";
The output_base_format entry is a string that defines the naming convention that should be used when writing the output files described above. The following keywords are supported and will be replaced with values from the corresponding track: {storm_id}, {basin}, {cyclone}, {storm_name}, {technique_number}, {technique}, {init_ymdh}, {init_ymd_hms}, {init_hour}.
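Keyword substitution of this kind can be sketched with plain string replacement. The template below uses only keywords from the supported list, and the storm ID, technique, and initialization time are hypothetical example values; the function is illustrative and is not MET's implementation.

```python
def expand_output_base(template, keywords):
    """Replace each {keyword} in the template with its value."""
    result = template
    for key, value in keywords.items():
        result = result.replace("{" + key + "}", value)
    return result


# Hypothetical track attributes
track = {
    "storm_id": "AL092022",
    "technique": "GFSO",
    "init_ymdh": "2022092400",
}
base = expand_output_base("s{storm_id}_{technique}_doper_{init_ymdh}", track)
print(base + "_diag.dat")
```

Keywords that are absent from the dictionary are left untouched in the result, which makes a misspelled keyword easy to spot in the output file name.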
25.2.3. tc_diag Output
The TC-Diag tool writes up to three output data types, as specified by flags in the configuration file. Each time TC-Diag is run it processes track data for a single initialization time. The actual number of output files varies depending on the number of model tracks provided.
CIRA Diagnostics Output
When the cira_diag_flag configuration entry is set to true, an ASCII CIRA diagnostics output file is written for each model track provided. These files are named using the output_base_format, described above, followed by the _diag.dat suffix.
These output files contain tabular ASCII data with diagnostic values either extracted directly from the input ATCF track file or computed from the gridded data, after converting it to a storm-centric cylindrical grid. One output file is created for each track from each model source. The output consists of the following sections:
Two header lines list the model name, initialization time, storm basin, and integer storm number (of the season).
The STORM DATA section contains single diagnostic values either extracted from the ATCF track file or computed from the cylindrical grid for each forecast lead time. This section begins with a line named TIME defining the forecast lead time of each track point in hours. The following lines contain the requested storm diagnostics. For example, MAXWIND contains the maximum wind speed reported in the ATCF track file and SST contains the average sea surface temperature computed in the range/azimuth grid.
The SOUNDING DATA section contains diagnostics computed separately for each vertical level. The vertical levels are typically the surface (e.g. SURF) followed by pressure levels (e.g. 0850). This section begins with two lines named NLEV and TIME. NLEV defines the number of vertical levels and their values, and TIME defines the forecast lead times for which diagnostics were computed. The level name is appended to each diagnostic name. For example, T_0850 contains the average temperature value within the range/azimuth grid at the 850 mb pressure level.
Each diagnostic output line contains:
Diagnostic name (with or without the level) e.g. SHR_MAG for magnitude of wind shear
Units string enclosed in parentheses e.g. (KT) for knots
The diagnostic values computed for each lead time e.g. 13 10 14 …
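A line with this three-part layout can be split with a short regular expression. The sketch below assumes whitespace-separated fields and numeric values; the exact column widths and the missing data value used in the file are not specified here, and the function name is hypothetical.

```python
import re

# name, then units in parentheses, then the remaining values
DIAG_LINE = re.compile(r"^\s*(\S+)\s+\(([^)]*)\)\s*(.*)$")


def parse_diag_line(line):
    """Split one diagnostic output line into (name, units, values)."""
    match = DIAG_LINE.match(line)
    if match is None:
        raise ValueError(f"not a diagnostic line: {line!r}")
    name, units, rest = match.groups()
    values = [float(token) for token in rest.split()]
    return name, units, values


name, units, values = parse_diag_line("SHR_MAG (KT) 13 10 14")
print(name, units, values)
```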
NetCDF Diagnostics Output
When the nc_diag_flag configuration entry is set to true, a NetCDF output file containing the computed diagnostics is written for each model track provided. These files contain the same data provided in the CIRA Diagnostics Output but formatted in NetCDF instead of ASCII. These files are named using the output_base_format, described above, followed by the _diag.nc suffix.
NetCDF Dimension | Description |
---|---|
time | Time dimension for the number of track point valid times |
pressure | Vertical dimension for the number of pressure levels |
NetCDF Variable | Dimension | Description | Data Type |
---|---|---|---|
storm_id | NA | Tropical Cyclone Storm ID (BBNNYYYY) consisting of 2-letter basin name, 2-digit storm number, and 4-digit year | String |
model | NA | Track ATCF ID model name | String |
init_time | NA | Track initialization time string in YYYYMMDD_HHMMSS format | Datetime String |
init_time_ut | NA | Track initialization time string in unixtime (seconds since January 1, 1970) format | String |
valid_time | time | Track point valid time string in YYYYMMDD_HHMMSS format | Datetime String |
valid_time_ut | time | Track point valid time string in unixtime (seconds since January 1, 1970) format | String |
lead_time | time | Track point forecast lead time string in HHMMSS format | Time String |
lead_time_sec | time | Track point forecast lead time integer number of seconds | Integer |
{DOMAIN}_domain | NA | Attributes define the range/azimuth grid for the {DOMAIN} domain: n_range, n_azimuth, delta_range_km | Integer |
Diagnostic values | time or time and pressure | Computed diagnostic values for each track point and, optionally, pressure level. The units attribute defines the units of the diagnostic values. | Double |
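The time variables above are related by simple arithmetic, which the following sketch illustrates using the Python standard library. It assumes all times are in UTC; the function names are hypothetical and the example times are made up.

```python
from datetime import datetime, timezone

MET_TIME_FORMAT = "%Y%m%d_%H%M%S"  # YYYYMMDD_HHMMSS


def to_unixtime(time_str):
    """Convert a YYYYMMDD_HHMMSS string (assumed UTC) to seconds since 1970."""
    parsed = datetime.strptime(time_str, MET_TIME_FORMAT)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def lead_seconds(init_str, valid_str):
    """Forecast lead time in seconds between initialization and valid times."""
    return to_unixtime(valid_str) - to_unixtime(init_str)


def format_lead(seconds):
    """Format a lead time in seconds as an HHMMSS string."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}{minutes:02d}{secs:02d}"


# Hypothetical initialization and valid times, 6 hours apart
lead = lead_seconds("20220924_000000", "20220924_060000")
print(lead, format_lead(lead))
```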
NetCDF Range-Azimuth Output
When the nc_cyl_grid_flag configuration entry is set to true, a NetCDF output file containing the cylindrical range-azimuth data is written for each combination of model track provided and domain specified. For example, if three model tracks are provided and data for both parent and nest domains are provided, six of these NetCDF output files will be written.
The NetCDF range-azimuth output is named using the output_base_format, described above, followed by _cyl_grid_{DOMAIN}.nc, where {DOMAIN} is specified by the domain string in each domain_info array entry.
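The number and names of output files follow directly from the flags, the tracks, and the domains. The sketch below enumerates them from the suffixes documented in this section; the function name and the base names are hypothetical placeholders.

```python
def expected_outputs(base_names, domains, nc_cyl_grid_flag=True,
                     nc_diag_flag=True, cira_diag_flag=True):
    """List the output file names expected for a set of model tracks.

    base_names holds one expanded output_base_format string per track.
    """
    files = []
    for base in base_names:
        if cira_diag_flag:
            files.append(f"{base}_diag.dat")
        if nc_diag_flag:
            files.append(f"{base}_diag.nc")
        if nc_cyl_grid_flag:
            # One range-azimuth file per track and domain combination
            files.extend(f"{base}_cyl_grid_{domain}.nc" for domain in domains)
    return files


# Three hypothetical model tracks and two domains
bases = ["track_a", "track_b", "track_c"]
outputs = expected_outputs(bases, ["parent", "nest"])
cyl_grid_files = [name for name in outputs if "_cyl_grid_" in name]
print(len(outputs), len(cyl_grid_files))
```

With three tracks and two domains this yields six range-azimuth files, matching the example above, plus one _diag.dat and one _diag.nc file per track.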
This NetCDF file contains a concatenation of the data from the temporary NetCDF files created for each track point. For each track point, TC-Diag creates a temporary NetCDF file and calls Python code to read the cylindrical grid data and compute diagnostics. By default, these temporary NetCDF files are deleted at the end of each run, but if the nc_cyl_grid_flag is true, the data for each track point is concatenated into a single output file for each track.
Note
Setting the MET_KEEP_TEMP_FILE (Section 5.1.8) environment variable retains the temporary NetCDF cylindrical grid files for development, testing, and debugging purposes.
The NetCDF range-azimuth file contains the dimensions and variables shown in Table 25.3 and Table 25.4.
NetCDF Dimension | Description |
---|---|
track_line | Dimension for the raw ATCF track lines written to the TrackLines variable |
time | Time dimension for the number of track point valid times |
range | Dimension for the number of range rings in the range-azimuth grid |
azimuth | Dimension for the number of azimuths in the range-azimuth grid |
pressure | Vertical dimension for the number of pressure levels |
NetCDF Variable | Dimension | Description | Data Type |
---|---|---|---|
storm_id | NA | Tropical Cyclone Storm ID (BBNNYYYY) consisting of 2-letter basin name, 2-digit storm number, and 4-digit year | String |
model | NA | Track ATCF ID model name | String |
TrackLines | track_line | Raw input ATCF track lines | String |
TrackLat | time | Track point location latitude | Double |
TrackLon | time | Track point location longitude | Double |
TrackMSLP | time | Track point minimum sea level pressure | Double |
TrackVMax | time | Track point maximum wind speed | Double |
init_time | NA | Track initialization time string in YYYYMMDD_HHMMSS format | Datetime String |
init_time_ut | NA | Track initialization time string in unixtime (seconds since January 1, 1970) format | String |
valid_time | time | Track point valid time string in YYYYMMDD_HHMMSS format | Datetime String |
valid_time_ut | time | Track point valid time string in unixtime (seconds since January 1, 1970) format | String |
lead_time | time | Track point forecast lead time string in HHMMSS format | Time String |
lead_time_sec | time | Track point forecast lead time integer number of seconds | Integer |
range | range | Range ring coordinate variable in kilometers | Double |
azimuth | azimuth | Azimuth coordinate variable in degrees clockwise from north | Double |
pressure | pressure | Vertical level pressure coordinate variable in millibars | Double |
lat | time, range, azimuth | Latitude in degrees north for each range-azimuth grid point | Double |
lon | time, range, azimuth | Longitude in degrees east for each range-azimuth grid point | Double |
single level data (e.g. TMP_Z2, PRMSL_L0) | time, range, azimuth | Gridded range-azimuth data on a single level | Double |
pressure level data (e.g. TMP, HGT) | time, pressure, range, azimuth | Gridded range-azimuth data on pressure levels | Double |