Version 1.0.1
Pre-release [v1.0.1] - 2024-07-18
API-breaking changes:
- The output data frames produced by the dgd_get_recount3_data executable now contain both gene expression data and metadata unless otherwise filtered (see below).
Other changes:
- The experiment_attributes column, if present in the metadata columns of an SRA study, is now split into its constituent components when writing the output data frames for the dgd_get_recount3_data executable (as is already the case with the sample_attributes column).
- The user can now pass a YAML file to dgd_get_recount3_data to download data from the Recount3 platform in bulk and filter them.
- The user can now pass metadata_to_keep and metadata_to_drop lists of metadata columns in the input file to dgd_get_recount3_data to keep or drop specific metadata columns in the output data frames. These can be passed either as columns, if the input file is a CSV file, or as keywords, if the input file is a YAML file.
- The recount3.util.get_metadata function now returns the metadata data frame with the recount3_project_name and recount3_samples_category columns added.
- The model_untrained.yaml configuration file was added to the examples of configuration files available within the package.
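
The attribute splitting and the keep/drop filtering described above can be illustrated with a minimal sketch. It is not bulkDGD's implementation: the helper names split_attributes and filter_metadata are hypothetical, the "|" and ";;" separators are an assumption about how the attribute strings are encoded, and applying the keep list before the drop list is an arbitrary choice made for this example. Rows are plain dictionaries so that the sketch only needs the standard library.

```python
# Hypothetical helpers for illustration only; not part of the bulkDGD API.


def split_attributes(row, column, item_sep="|", key_sep=";;"):
    """Return a copy of 'row' where 'column' is expanded into one key
    per attribute. The separators are assumptions, not confirmed values."""
    expanded = {k: v for k, v in row.items() if k != column}
    raw = row.get(column)
    if not raw:
        # Column missing or empty: nothing to expand.
        return expanded
    for item in raw.split(item_sep):
        key, sep, value = item.partition(key_sep)
        if sep:  # ignore malformed items without a key/value separator
            expanded[key.strip()] = value.strip()
    return expanded


def filter_metadata(row, metadata_columns, keep=None, drop=None):
    """Keep or drop metadata columns; other columns are left untouched."""
    result = {}
    for name, value in row.items():
        if name in metadata_columns:
            if keep is not None and name not in keep:
                continue
            if drop is not None and name in drop:
                continue
        result[name] = value
    return result


# Example row mixing one expression value and two metadata columns.
row = {
    "gene_a": 12,
    "study": "SRP000001",
    "experiment_attributes": "tissue;;liver|age;;45",
}

expanded = split_attributes(row, "experiment_attributes")
# expanded now has the keys: gene_a, study, tissue, age

kept = filter_metadata(
    expanded, metadata_columns={"study", "tissue", "age"}, keep=["tissue"]
)
# kept now has the keys: gene_a, tissue
```

The expression column survives the filtering because only names listed in metadata_columns are candidates for removal.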
Internal changes (for contributors):
- Two new internal functions in the bulkDGD.recount3.util module (_load_samples_batches_csv and load_samples_batches_yaml) were introduced to parse the input files to dgd_get_recount3_data. The public function load_samples_batches simply calls one of them depending on the file's extension.
- The bulkDGD.util.get_handlers function now accepts two new arguments, log_level_console and log_level_file, instead of the old log_level, for more fine-grained control over the log level of the handlers.
- The log level of the console handler for the _dgd_get_recount3_data_single_batch executable was changed to ERROR so as not to clutter the console with the INFO messages from the subprocesses (which are logged to their own log files anyway if the overall log level is INFO or below).
- The header of the bulkDGD/recount3/data/sra_metadata_fields.txt file was changed to better describe the metadata fields included in it.
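
The idea of separate console and file log levels can be sketched with Python's standard logging module. This is only an illustration under stated assumptions, not the code of bulkDGD.util.get_handlers: the function name build_handlers, its signature, and its defaults are hypothetical; only the argument names log_level_console and log_level_file come from the notes above.

```python
import logging
import sys


def build_handlers(log_file=None,
                   log_level_console=logging.ERROR,
                   log_level_file=logging.INFO):
    """Hypothetical helper: return a console handler and, if 'log_file'
    is given, a file handler, each with its own log level."""
    handlers = []

    # Console handler: only records at or above 'log_level_console'.
    console_handler = logging.StreamHandler(sys.stderr)
    console_handler.setLevel(log_level_console)
    handlers.append(console_handler)

    if log_file is not None:
        # 'delay=True' postpones opening the file until the first record.
        file_handler = logging.FileHandler(log_file, delay=True)
        file_handler.setLevel(log_level_file)
        handlers.append(file_handler)

    return handlers
```

With the defaults used here, a logger whose own level is INFO or below would write INFO records to the file while the console shows only ERROR and above, which mirrors the behavior described for the single-batch executable.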
Documentation:
- The documentation was updated to reflect the user-facing changes.
- The readme files for the configurations were removed because their content was redundant with the documentation and the configuration files themselves.