Performs existence checks for expected splits and annotations, then runs annotation diagnostics (counts, size distribution, potential issues) for each split.
Arguments
- data_dir
Directory containing 'train' and 'valid' subdirectories, or a
.tar.gz/.tgz archive thereof (e.g. the path returned by get_training_dataset()). Archives are transparently extracted into a session-scoped cache.
- quiet
If TRUE, suppress CLI output while still returning diagnostics.
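
The split existence check described above can be illustrated with a minimal base-R sketch. The helper name `check_split_dirs` is hypothetical and not part of the package. It only covers the case where data_dir is a plain directory, so archive extraction and the annotation diagnostics are left out.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# reports which of the expected split subdirectories exist under data_dir.
check_split_dirs <- function(data_dir, splits = c("train", "valid")) {
  present <- dir.exists(file.path(data_dir, splits))
  names(present) <- splits
  present
}

# Build a throwaway layout containing only a 'train' split to exercise the check.
demo_dir <- tempfile("dataset_")
dir.create(file.path(demo_dir, "train"), recursive = TRUE)

result <- check_split_dirs(demo_dir)
print(result)
```

A named logical vector keeps the outcome per split inspectable by the caller, which is in the spirit of returning diagnostics rather than only printing them.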
