Table 1 Summary of the variables.
From: Unfolding the downloads of datasets: A multifaceted exploration of influencing factors
| Variable type | Variable | Abbreviation | Description |
|---|---|---|---|
| Dependent variables | Total downloads | Normalized_download | The total number of downloads of files in the dataset. |
| | Average downloads | Normalized_average_download | The average daily downloads of files in the dataset. |
| Independent variables | Length of the descriptive text | Description_length | The number of words in the description metadata field. |
| | New Dale-Chall score | Dale_chall | The New Dale-Chall readability score of the description text in the dataset metadata. |
| | Degree of file accessibility | File_openness | The proportion of files in the dataset that are completely open and can be directly downloaded by users. |
| | Authority of the dataset author’s institution | Institution_rank | The impact of the dataset author’s institution in the world. |
| | Citations of the dataset-related papers | Related_citation | The total number of citations of papers supported by the dataset. |
| Control variables | Length of time the dataset was released from | Publication_duration | The number of days between the publication date and the most recent date in the dataset. |
| | Number of files | File_num | Number of files in the dataset. |
| | Dataset size | File_size | The total size of the files in the dataset. |
| | Number of subjects | Subject_num | The number of subjects to which a dataset belongs. |
| | Indexed by the registry of research data repositories | Re3data_fairsharing | Whether the data repository where the dataset comes from is indexed by re3data.org or fairsharing.org. |
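
Three of the variables in Table 1 are simple counts or ratios, and a minimal sketch may make their definitions concrete. The code below is illustrative only and is not the authors' implementation: the `FileRecord` type, its field names, and the function names are hypothetical, word counting by whitespace splitting is an assumption, and the reference date for `Publication_duration` is passed in explicitly because the table does not specify how "the most recent date" is determined.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    # Hypothetical per-file metadata; field names are assumptions.
    name: str
    is_open: bool  # True if the file can be downloaded directly by users


def description_length(description: str) -> int:
    """Description_length: number of whitespace-separated words (assumed tokenization)."""
    return len(description.split())


def file_openness(files: list[FileRecord]) -> float:
    """File_openness: proportion of files that are completely open.

    Returns 0.0 for a dataset with no files (an assumption; the table
    does not say how empty datasets are handled).
    """
    if not files:
        return 0.0
    return sum(f.is_open for f in files) / len(files)


def publication_duration(published: date, reference: date) -> int:
    """Publication_duration: days between publication and a reference date."""
    return (reference - published).days
```

For example, a dataset with three files of which two are open would get a `File_openness` of 2/3 under this sketch, and a dataset published on 1 January 2020 and measured on 31 January 2020 would get a `Publication_duration` of 30 days.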