Added check_tbl_derived_task_id_vals() check to validate_model_data() that ensures that values in derived task ID columns match expected values for the corresponding derived task IDs in the round as defined in tasks.json config (#110). Given the dependence of derived task IDs on the values of their source task IDs, the check ignores the combinations of derived task ID values with those of other task IDs and focuses only on identifying values that do not match corresponding accepted values.
submission_tmpl() gains the force_output_types argument, allowing users to force optional output types to be included in a submission template when required_vals_only = TRUE. In conjunction with the use of the output_types argument, this allows users to create submission templates which include optional output types they plan to submit.
check_tbl_values_required() no longer reports false positives for v4 hubs, which fixes the bug reported in #177. Evaluation of whether all combinations of required values have been submitted through check_tbl_values_required() is now chunked by output type for v4 config and above. This reduces memory pressure and should speed up required value validation in hubs with complex tasks.json files.
Use hubUtils::read_config() and hubUtils::read_config_file() for reading in hub configuration files.
Use hubData::create_hub_schema() and hubData::coerce_to_hub_schema() for creating and coercing data to the hub schema.
output_type_ids .is_required property to determine whether output types are required or not.
derived_task_ids are now extracted from the tasks.json config by default.
See the schemas repository NEWS.md for more details.
Downgraded a check result from check_error to check_failure and suppressed early return in case of check failure in validate_model_file() (#138).
Added check_file_n() function to validate that the number of files submitted per round does not exceed the allowed number of submissions per team (#139).
Ignored NAs in relevant tbl columns in opt_check_tbl_col_timediff() and opt_check_tbl_horizon_timediff() checks to ensure rows that may not be relevant to a modeling task do not cause false check failures (#140).
parse_file_name():
Added create_custom_check() for creating custom validation check function files from templates (#121).
Fixed bug in check_tbl_values_required() causing required missing values to not be identified correctly when all output types were optional (#123).
Fixed bug in check_tbl_col_types() where columns in model output data with more than one class were causing an EXEC error (#118). Thanks for the bug report @ruarai!
hub_validations object print() method to make more visible on lighter backgrounds.
Removed file_modification_check argument "warn" option and replaced it with "failure" in validate_pr() function.
As checks that return check_failure are required to pass for files to be considered valid, check_failure class objects are elevated to errors (#111). Also, to make it easier for users to identify errors by visually scanning the printed output, the following custom bullets have been assigned:
✖: check_failure class object. This indicates an error that does not impact the validation process.
ⓧ: check_error class object. This also indicates early termination of the validation process.
☒: check_exec_error class object. This indicates an error in the execution of a check function.
hub_validations class object combine() method now ensures that check names are made unique across all hub_validations objects being combined.
hub_validations class object print() method.
The check name used to access a check result in the hub_validations object is now included as the prefix to the check result message instead of the file name (#76).
octolog dependency removed. This removes the annotation of validation results onto GitHub Action workflow logs (#113).
arrow package and bump required version to 17.0.0.
This release introduces significant improvements in the performance of submission validation via the following changes:
output_type argument in expand_model_out_grid() (#98).
derived_task_ids in expand_model_out_grid().
validate_model_data(), validate_submission() and validate_pr().
Both of these changes allow for the creation of smaller, more focused expanded valid value grids, significantly reducing pressure on memory when working with large, complex hub configs and making submission validation much more efficient.
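To give a rough sense of why smaller expanded grids matter, here is a minimal base R sketch. It does not call expand_model_out_grid(); the task IDs, values, output types and the grid_rows() helper are all hypothetical and only illustrate how the number of rows in a grid of valid value combinations shrinks when a derived task ID is left out and when the grid is restricted to a single output type.

```r
# Hypothetical task ID values (illustrative only, not taken from a real hub config)
task_ids <- list(
  origin_date = c("2024-01-06", "2024-01-13"),
  horizon = 0:3,
  location = c("US", "01", "02"),
  # Treated here as a derived task ID: fully determined by origin_date and horizon
  target_end_date = as.character(as.Date("2024-01-06") + 7 * (0:4))
)

# Hypothetical output types and their output type ID values
output_type_ids <- list(
  mean = NA,
  quantile = c(0.1, 0.5, 0.9)
)

# Hypothetical helper: rows in a grid of every task ID combination
# crossed with every output type ID value
grid_rows <- function(task_ids, output_type_ids) {
  task_grid <- expand.grid(task_ids, stringsAsFactors = FALSE)
  nrow(task_grid) * sum(lengths(output_type_ids))
}

source_task_ids <- task_ids[names(task_ids) != "target_end_date"]

full <- grid_rows(task_ids, output_type_ids)                      # 480 rows
no_derived <- grid_rows(source_task_ids, output_type_ids)         # 96 rows
mean_only <- grid_rows(source_task_ids, output_type_ids["mean"])  # 24 rows
```

Even in this toy setup the grid drops from 480 rows to 24, and the gap grows quickly as task IDs gain more values.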
Additional useful functionality:
submission_tmpl(). Ignoring derived task IDs can be particularly useful to avoid creating templates with invalid derived task ID value combinations.
Added match_tbl_to_model_task() that matches the rows in a tbl of model output data to a model task of a given round (as defined in tasks.json).
Added check_tbl_spl_compound_taskid_set() check function to validate_model_data() that ensures that sample compound task ID sets for each modeling task match or are coarser than the expected set defined in tasks.json config.
Added get_tbl_compound_taskid_set() for detecting sample compound task ID set from submission data.
Added compound_taskid_set argument to expand_model_out_grid() and submission_tmpl() that allows users to override the compound task ID set when creating sample indices in the output_type_id column of samples.
Added output_type_id_datatype argument to validate_pr(), validate_submission(), validate_model_data() and expand_model_out_grid() and set default value to "from_config". This default means the data type specified in the output_type_id_datatype property in tasks.json (introduced in schema version v3.0.1) is used to cast the hub level output_type_id column data type. If not set in the config, the functions fall back to "auto", which detects the simplest data type that can represent all output type ID values across all output types and rounds. The argument also allows hub administrators to override this setting manually during validation.
Moved hubData functions to hubValidations:
hubData::expand_model_out_val_grid to expand_model_out_grid.
hubData::create_model_out_submit_tmpl to submission_tmpl.
hubData (0.1.0) & hubAdmin (0.1.0). This allows for successful validation of submissions to hubs with multiple model tasks, where a given model task might contain non-relevant task IDs and both required and optional properties have been set to null in tasks.json (#75). See the relevant section in hubDocs documentation for more details.
Improved validate_submission_time() message by removing decimal seconds and including local time zone.
<hub_validations> class objects.
validate_*() function to documentation.
Fixed bug where the check that values in the value column are non-decreasing as output_type_ids increase for all unique task ID / output type value combinations for cdf and quantile output types was erroneously returning validation errors if the output_type_id column was not ordered. (Thanks @M-7th).
validate_pr() now has arguments for controlling whether modification/deletion checks are performed on model output and model metadata files (#65):
file_modification_check, which controls whether modification/deletion checks are performed and what is returned if modifications/deletions are detected.
allow_submit_window_mods, which controls whether modifications/deletions of model output files are allowed within their submission windows.
validate_pr() now checks for deletions of previously submitted model metadata files and modifications or deletions of previously submitted model output files, adding an <error/check_error> class object to the function output for each detected modified/deleted file (#43 & #44).
This release contains a bug fix for reading in and validating CSV column types correctly (#54).
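The note above does not say what the CSV bug was, so the following is only a general illustration of how column types can go wrong when a CSV file is read with type guessing. It uses base R's read.csv() as a stand-in (an assumption, not the reader the package uses), with made-up data in which location codes carry leading zeros.

```r
# Made-up model output data; location codes have leading zeros
csv_text <- "location,output_type_id,value\n01,0.5,10\n02,0.9,12"

# With default type guessing, location is parsed as integer and the
# leading zeros are lost
guessed <- read.csv(text = csv_text)

# Declaring the expected column types up front keeps the codes intact
typed <- read.csv(
  text = csv_text,
  colClasses = c(
    location = "character",
    output_type_id = "character",
    value = "numeric"
  )
)

class(guessed$location)  # "integer"
typed$location           # "01" "02"
```

Checking column types against a schema only gives meaningful results if the data were read with the intended types in the first place.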
This release includes a number of bug fixes:
validations.yml can now be accessed directly from pkg namespace, addressing bug which required pkg library to be loaded (#51).
Use all.equal to check that sums of pmf probabilities equal 1 (#52).
This release includes improvements designed after the first round of sandbox testing on setting up the CDC FluSight hub. Improvements include:
Added parse_file_name function for parsing model output metadata from a model output file name.
check_tbl_values() check fails.
Added verbose option to check_for_errors() function which prints the results of all checks in addition to the default overall result and subset of failed checks.
hubValidations package
Added a NEWS.md file to track changes to the package.
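As a note on the pmf fix listed earlier in this changelog (#52): comparing a sum of probabilities to 1 with == can reject values that are off only by a tiny rounding error, whereas all.equal() compares within a numeric tolerance. The sketch below shows the difference in base R. The pmf_sums_to_one() helper and the example vectors are hypothetical and are not the package's implementation.

```r
# Hypothetical helper: TRUE if the probabilities sum to 1 within
# all.equal()'s default numeric tolerance
pmf_sums_to_one <- function(probs) {
  isTRUE(all.equal(sum(probs), 1))
}

# Probabilities that are off from 1 by a tiny amount, standing in for
# accumulated rounding error
nearly_one <- c(0.25, 0.25, 0.5 + 1e-12)

# Probabilities that clearly do not sum to 1
too_small <- c(0.2, 0.3, 0.4)

sum(nearly_one) == 1         # FALSE: exact comparison rejects the values
pmf_sums_to_one(nearly_one)  # TRUE: tolerance-based comparison accepts them
pmf_sums_to_one(too_small)   # FALSE: a real discrepancy is still caught
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a character description of the difference, not FALSE, when the values differ.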