Table 1 Schema of the DomainDemo-multivariate tables.

From: DomainDemo: a dataset of domain-sharing activities among different demographic groups on Twitter

| Column | Type | Category | Note |
| --- | --- | --- | --- |
| domain | string | key | second-level domains of websites, e.g., nytimes.com |
| time | string | time | year and month in the format “YYYY-MM” for monthly data; equal to “alltime” for all-time data |
| shares | integer | statistic | number of sharing events in the bucket |
| users | integer | statistic | number of unique users in the bucket |
| gini | float | statistic | Gini index of the sharing count across users in the bucket |
| domains_count_mean | float | statistic | average number of unique domains shared by users in the bucket |
| domains_count_std | float | statistic | standard deviation of the number of unique domains shared by users in the bucket |
| state | string | demographic | two-letter abbreviations, e.g., MA; covers the 50 U.S. states and the District of Columbia |
| race | string | demographic | one of “African-American,” “Caucasian,” “Hispanic,” “Asian,” “Other,” and “Unknown” |
| gender | string | demographic | one of “Male,” “Female,” and “Unknown” |
| age | string | demographic | age bucket; one of “<18,” “18-29,” “30-49,” “50-64,” “65+,” and “Unknown” |
| party | string | demographic | one of “Democrat,” “Independent,” and “Republican” |

  1. The “domain” column is only included in the distribution variants, whereas the “domains_count_mean” and “domains_count_std” columns are only included in the baseline variants.
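
As an illustration of how the schema might be used, the following is a minimal Python sketch that checks a single row against the types and allowed values listed in Table 1. The function name `validate_row`, the dictionary-based row representation, and the regular expressions are assumptions made for this example and are not part of the dataset or its tooling. Only the columns present in a row are checked, since a given table may carry only a subset of the demographic columns.

```python
import re

# Allowed categorical values, transcribed from the Note column of Table 1.
ALLOWED_VALUES = {
    "race": {"African-American", "Caucasian", "Hispanic", "Asian", "Other", "Unknown"},
    "gender": {"Male", "Female", "Unknown"},
    "age": {"<18", "18-29", "30-49", "50-64", "65+", "Unknown"},
    "party": {"Democrat", "Independent", "Republican"},
}

# Expected Python types for the remaining columns of Table 1.
COLUMN_TYPES = {
    "domain": str, "time": str, "state": str,
    "shares": int, "users": int,
    "gini": float, "domains_count_mean": float, "domains_count_std": float,
}

MONTH_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")  # "YYYY-MM"
STATE_RE = re.compile(r"[A-Z]{2}")               # two-letter abbreviation (format only)


def validate_row(row):
    """Return a list of problems found in one row; an empty list means OK."""
    problems = []
    for col, value in row.items():
        if col in ALLOWED_VALUES:
            if value not in ALLOWED_VALUES[col]:
                problems.append(f"{col}: unexpected value {value!r}")
        elif col in COLUMN_TYPES:
            if not isinstance(value, COLUMN_TYPES[col]):
                problems.append(f"{col}: expected {COLUMN_TYPES[col].__name__}")
            elif col == "time" and value != "alltime" and not MONTH_RE.fullmatch(value):
                problems.append(f"time: expected 'YYYY-MM' or 'alltime', got {value!r}")
            elif col == "state" and not STATE_RE.fullmatch(value):
                problems.append(f"state: expected two-letter abbreviation, got {value!r}")
        else:
            problems.append(f"{col}: not a Table 1 column")
    # Per the table footnote, "domain" and the "domains_count_*" columns
    # belong to different variants and should not appear together.
    if "domain" in row and ("domains_count_mean" in row or "domains_count_std" in row):
        problems.append("domain and domains_count_* should not appear together")
    return problems
```

For example, `validate_row({"domain": "nytimes.com", "time": "2020-03", "shares": 10, "users": 4, "gini": 0.25, "state": "MA"})` returns an empty list, whereas a row with `"time": "2020-13"` would be flagged. The state check only verifies the two-letter format; it does not confirm that the abbreviation is one of the 51 valid codes.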