1 VERITAS dataset description

Unlike the Eligibility or Health questionnaires, which can mostly be encoded as a flat table, the VERITAS questionnaire implicitly records a series of entities and their relationships:

Places: list of geocoded locations visited by participants, along with the following characteristics: category, name, visit frequency, transportation mode
Social contacts: people and/or groups frequented by participants
Relationships: between social contacts (who knows who / who belongs to which group) as well as between locations and social contacts (places visited along with whom)
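To make this structure concrete, here is a minimal sketch in base R of how the three kinds of entities could be laid out as linked tables. All table names, column names and values below are hypothetical illustrations, not the actual VERITAS schema.

```r
# Hypothetical toy tables; names and values are illustrative only
places <- data.frame(
  interact_id = c("p01", "p01", "p02"),
  location_id = c(1, 2, 1),
  category    = c("Home", "Supermarket", "Home")
)
contacts <- data.frame(
  interact_id = c("p01", "p01"),
  contact_id  = c(1, 2),
  type        = c("person", "group")
)
# Link table: which place is visited along with which contact
place_contact <- data.frame(
  interact_id = "p01",
  location_id = 2,
  contact_id  = 1
)

# Join the link table back to both entity tables
visits <- merge(place_contact, places, by = c("interact_id", "location_id"))
visits <- merge(visits, contacts, by = c("interact_id", "contact_id"))
visits
```

Keeping relationships in their own link table, rather than as extra columns on the places table, is what prevents the questionnaire from fitting into a single flat table.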

The diagram below illustrates the various entities collected through the VERITAS questionnaire:

VERITAS entities

New participants and returning participants are presented separately below, as they were presented with two slightly different question flows.

2 Basic descriptive statistics for new participants

2.1 Section 1: Residence and Neighbourhood

2.1.1 Now, let’s start with your home. What is your address?

home_location <- locations[locations$location_category == 1, ]

## version ggmap
vic_aoi <- st_bbox(home_location)
names(vic_aoi) <- c("left", "bottom", "right", "top")
vic_aoi[["left"]] <- vic_aoi[["left"]] - .07
vic_aoi[["right"]] <- vic_aoi[["right"]] + .07
vic_aoi[["top"]] <- vic_aoi[["top"]] + .01
vic_aoi[["bottom"]] <- vic_aoi[["bottom"]] - .01

bm <- get_stadiamap(vic_aoi, zoom = 11, maptype = "stamen_toner_lite") %>%
  ggmap(extent = "device")
bm + geom_sf(data = st_jitter(home_location, .008), inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3) # see https://github.com/r-spatial/sf/issues/336

NB: Home locations have been randomly shifted from their original position to protect privacy.
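As a rough illustration of this kind of random shift, the sketch below applies a uniform offset to each coordinate in base R. It mirrors the idea of jittering rather than the exact sf implementation, and the coordinates are made-up values, not participant data.

```r
set.seed(42)  # reproducible illustration
# Hypothetical coordinates (longitude, latitude), not real participant data
home <- data.frame(lon = c(-123.37, -123.35), lat = c(48.43, 48.45))
amount <- 0.008  # same magnitude as the st_jitter() call above, in degrees

jittered <- data.frame(
  lon = home$lon + runif(nrow(home), -amount, amount),
  lat = home$lat + runif(nrow(home), -amount, amount)
)
# Largest displacement along either axis; bounded by `amount`
max(abs(jittered$lon - home$lon), abs(jittered$lat - home$lat))
```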

# Number of participants by municipality
home_by_municipalites <- st_join(home_location, municipalities["NAME"])
home_by_mun_cnt <- as.data.frame(home_by_municipalites) %>%
  group_by(NAME) %>%
  dplyr::count() %>%
  arrange(desc(n), NAME)
home_by_mun_cnt$Shape <- NULL
kable(home_by_mun_cnt, caption = "Number of participants by municipalities") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of participants by municipalities
NAME	n
Victoria	65
Saanich	12
Esquimalt	9
Langford	4
View Royal	2
Oak Bay	1

2.1.2 If you were asked to draw the boundaries of your neighbourhood, what would they be?

prn <- poly_geom[poly_geom$area_type == "neighborhood", ]

## version ggmap
bm + geom_sf(data = prn, inherit.aes = FALSE, fill = alpha("blue", 0.05), color = alpha("blue", 0.3))

# Min, max, median & mean area of PRN
prn$area_m2 <- st_area(prn$geom)
kable(t(as.matrix(summary(prn$area_m2))),
  caption = "Area (in square meters) of the perceived residential neighborhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived residential neighborhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
19010.3	606003	1212794	1886093	1756926	28868855

NB: Only 87 valid neighborhoods were collected, as many participants struggled to draw polygons on the map.

2.1.3 How attached are you to your neighbourhood?

# extract and recode
.ngh_att <- veritas_main[veritas_main$neighbourhood_attach != 99, c("interact_id", "neighbourhood_attach")] %>% dplyr::rename(neighbourhood_attach_code = neighbourhood_attach)
.ngh_att$neighbourhood_attach <- factor(ifelse(.ngh_att$neighbourhood_attach_code == 1, "1 [Not attached at all]",
  ifelse(.ngh_att$neighbourhood_attach_code == 6, "6 [Very attached]",
    .ngh_att$neighbourhood_attach_code
  )
))

# histogram of attachment
ggplot(data = .ngh_att) +
  geom_histogram(aes(x = neighbourhood_attach), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "neighbourhood_attach")

.ngh_att_cnt <- .ngh_att %>%
  group_by(neighbourhood_attach) %>%
  dplyr::count() %>%
  arrange(neighbourhood_attach)
kable(.ngh_att_cnt, caption = "Neighbourhood attachment") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Neighbourhood attachment
neighbourhood_attach	n
1 [Not attached at all]	3
2	7
3	10
4	24
5	26
6 [Very attached]	22
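The recoding above relies on nested ifelse() calls. As an alternative sketch, the same labels can be attached by declaring the levels of the factor explicitly, which also keeps the answer categories in order and retains categories nobody chose. The input vector is a hypothetical example.

```r
codes <- c(1, 3, 6, 99, 4)   # hypothetical answers; 99 = no answer
codes <- codes[codes != 99]  # drop non-answers, as in the chunk above

attach_f <- factor(
  codes,
  levels = 1:6,
  labels = c("1 [Not attached at all]", "2", "3", "4", "5", "6 [Very attached]")
)
table(attach_f)  # unused levels still appear, with a count of 0
```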

2.1.4 On average, how many hours per day do you spend outside of your home?

# histogram of n hours out
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out_w3))

# Min, max, median & mean hours/day out
kable(t(as.matrix(summary(veritas_main$hours_out_w3))),
  caption = "Hours/day outside home",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside home
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1	2	5	5.5	8	20

2.1.5 Of this time spent outside your home, on average how many hours do you spend outside your neighbourhood?

# histogram of n hours out
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out_neighb_w3))

# Min, max, median & mean hours/day out of neighborhood
kable(t(as.matrix(summary(veritas_main$hours_out_neighb_w3))),
  caption = "Hours/day outside neighbourhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside neighbourhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	3.7	7	15

2.1.6 Are there one or more areas close to where you live that you tend to avoid because you do not feel safe there (for any reason)?

# extract and recode
.unsafe <- veritas_main[c("interact_id", "unsafe_area")] %>% dplyr::rename(unsafe_area_code = unsafe_area)
.unsafe$unsafe_area <- factor(ifelse(.unsafe$unsafe_area_code == 1, "1 [Yes]",
  ifelse(.unsafe$unsafe_area_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .unsafe) +
  geom_histogram(aes(x = unsafe_area), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "unsafe_area")

.unsafe_cnt <- .unsafe %>%
  group_by(unsafe_area) %>%
  dplyr::count() %>%
  arrange(unsafe_area)
kable(.unsafe_cnt, caption = "Unsafe areas") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Unsafe areas
unsafe_area	n
1 [Yes]	21
2 [No]	72

# map
unsafe <- poly_geom[poly_geom$area_type == "unsafe area", ]

## version ggmap
bm + geom_sf(data = unsafe, inherit.aes = FALSE, fill = alpha("blue", 0.3), color = alpha("blue", 0.5))

# Min, max, median & mean area of unsafe areas
unsafe$area_m2 <- st_area(unsafe$geom)
kable(t(as.matrix(summary(unsafe$area_m2))),
  caption = "Area (in square meters) of the perceived unsafe area",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived unsafe area
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
4467	35727.4	262389.7	470977.9	698392.1	1952029

2.1.7 Do you spend the night somewhere other than your home at least once per week?

# extract and recode
.o_res <- veritas_main[c("interact_id", "other_resid_w3")] %>% dplyr::rename(other_resid_code = other_resid_w3)
.o_res$other_resid_w3 <- factor(ifelse(.o_res$other_resid_code == 1, "1 [Yes]",
  ifelse(.o_res$other_resid_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .o_res) +
  geom_histogram(aes(x = other_resid_w3), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other_resid")

.o_res_cnt <- .o_res %>%
  group_by(other_resid_w3) %>%
  dplyr::count() %>%
  arrange(other_resid_w3)
kable(.o_res_cnt, caption = "Other residence") %>% 
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Other residence
other_resid_w3	n
1 [Yes]	7
2 [No]	86

2.2 Section 2: Occupation

2.2.1 Are you currently working?

# extract and recode
.work <- veritas_main[c("interact_id", "working")] %>% dplyr::rename(working_code = working)
.work$working <- factor(ifelse(.work$working_code == 1, "1 [Yes]",
  ifelse(.work$working_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .work) +
  geom_histogram(aes(x = working), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "working")

.work_cnt <- .work %>%
  group_by(working) %>%
  dplyr::count() %>%
  arrange(working)
kable(.work_cnt, caption = "Currently working") %>% 
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently working
working	n
1 [Yes]	67
2 [No]	26

2.2.2 Where do you work?

work_location <- locations[locations$location_category == 3, ]

bm + geom_sf(data = work_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.2.3 On average, how many hours per week do you work?

# histogram of work hours per week
ggplot(data = veritas_main[veritas_main$working == 1, ]) +
  geom_histogram(aes(x = work_hours_w3))

# Min, max, median & mean work hours/week
kable(t(as.matrix(summary(veritas_main$work_hours_w3[veritas_main$working == 1]))),
  caption = "Work hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Work hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
3	35	38	36.3	40	90

2.2.4 Are you currently a registered student?

# extract and recode
.study <- veritas_main[c("interact_id", "studying")] %>% dplyr::rename(studying_code = studying)
.study$studying <- factor(ifelse(.study$studying_code == 1, "1 [Yes]",
  ifelse(.study$studying_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .study) +
  geom_histogram(aes(x = studying), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Studying")

.study_cnt <- .study %>%
  group_by(studying) %>%
  dplyr::count() %>%
  arrange(studying)
kable(.study_cnt, caption = "Currently studying") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently studying
studying	n
1 [Yes]	12
2 [No]	81

2.2.5 Where do you study?

study_location <- locations[locations$location_category == 4, ]

bm + geom_sf(data = study_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.2.6 On average, how many hours per week do you study?

# histogram of study hours per week
ggplot(data = veritas_main[veritas_main$studying == 1, ]) +
  geom_histogram(aes(x = study_hours_w3))

# Min, max, median & mean study hours/week
kable(t(as.matrix(summary(veritas_main$study_hours_w3[veritas_main$studying == 1]))),
  caption = "study hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

study hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
2	5.8	15	19.1	30	50

2.3 Section 3: Shopping activities

The following questions are used to generate the locations grouped into this section:

Do you shop for groceries at a supermarket at least once per month?
Do you shop at a public/farmer’s market at least once per month?
Do you shop at a bakery at least once per month?
Do you go to a specialty food store at least once per month? For example: a cheese shop, fruit and vegetable store, butcher’s shop, natural and health food store.
Do you go to a convenience store at least once per month?
Do you go to a liquor store at least once per month?

shop_lut <- data.frame(
  location_category_code = c(5, 6, 7, 8, 9, 10),
  location_category = factor(c(
    " 5 [Supermarket]",
    " 6 [Public/farmer’s market]",
    " 7 [Bakery]",
    " 8 [Specialty food store]",
    " 9 [Convenience store/Dépanneur]",
    "10 [Liquor store/SAQ]"
  ))
)
shop_location <- locations[locations$location_category %in% shop_lut$location_category_code, ] %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(shop_lut, by = "location_category_code")

# map
bm + geom_sf(data = shop_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of shopping locations by category
ggplot(data = shop_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Shopping locations by categories")

.location_category_cnt <- as.data.frame(shop_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Shopping locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Shopping locations by categories
location_category	n
5 [Supermarket]	241
6 [Public/farmer’s market]	46
7 [Bakery]	62
8 [Specialty food store]	53
9 [Convenience store/Dépanneur]	25
10 [Liquor store/SAQ]	95

# compute statistics on shopping locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(shop_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE to build list of all combination iid/shopping categ
# NB: tibble() replaces the deprecated data_frame()
.dummy <- tibble(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = shop_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_shop_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_shop_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of shopping locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of shopping locations by participant and category
location_category	mean	median	max
5 [Supermarket]	2.59	3	5
6 [Public/farmer’s market]	0.49	0	2
7 [Bakery]	0.67	0	3
8 [Specialty food store]	0.57	0	5
9 [Convenience store/Dépanneur]	0.27	0	3
10 [Liquor store/SAQ]	1.02	1	5
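The zero-filling step used above (building every participant/category combination, then adding the missing ones with n = 0) can also be sketched in base R. This is an illustrative alternative under made-up data, not the code used to produce the table: declaring all expected factor levels makes table() return explicit zero counts.

```r
# Hypothetical toy data: p03 reported no shopping location at all
all_ids <- c("p01", "p02", "p03")
categories <- c("Supermarket", "Bakery")
reported <- data.frame(
  interact_id = c("p01", "p01", "p02"),
  location_category = c("Supermarket", "Supermarket", "Bakery")
)

# Declaring every expected level yields one cell per combination, zeros included
counts <- as.data.frame(
  table(
    interact_id = factor(reported$interact_id, levels = all_ids),
    location_category = factor(reported$location_category, levels = categories)
  ),
  responseName = "n"
)

# Per-category statistics now include participants with zero locations
aggregate(n ~ location_category, data = counts, FUN = mean)
```

Without the zero rows, the mean and median would be computed only over participants who reported at least one location in a category, which overstates typical usage.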

2.4 Section 4: Services

The following questions are used to generate the locations grouped into this section:

Where is the bank you go to most often located?
Where is the hair salon or barber shop you go to most often?
Where is the post office where you go to most often?
Where is the drugstore you go to most often?
If you need to visit a doctor or other healthcare provider, where do you go most often?

serv_lut <- data.frame(
  location_category_code = c(11, 12, 13, 14, 15),
  location_category = factor(c(
    "11 [Bank]",
    "12 [Hair salon/barbershop]",
    "13 [Post office]",
    "14 [Drugstore]",
    "15 [Doctor/healthcare provider]"
  ))
)
serv_location <- locations[locations$location_category %in% serv_lut$location_category_code, ] %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(serv_lut, by = "location_category_code")

# map
bm + geom_sf(data = serv_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of service locations by category
ggplot(data = serv_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Service locations by categories")

.location_category_cnt <- as.data.frame(serv_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Service locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Service locations by categories
location_category	n
11 [Bank]	46
12 [Hair salon/barbershop]	35
13 [Post office]	33
14 [Drugstore]	64
15 [Doctor/healthcare provider]	44

# compute statistics on service locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(serv_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE
# NB: tibble() replaces the deprecated data_frame()
.dummy <- tibble(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = serv_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_serv_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_serv_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of service locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of service locations by participant and category
location_category	mean	median	max
11 [Bank]	0.49	0	1
12 [Hair salon/barbershop]	0.38	0	1
13 [Post office]	0.35	0	1
14 [Drugstore]	0.69	1	1
15 [Doctor/healthcare provider]	0.47	0	5

2.5 Section 5: Transportation

2.5.1 Do you use public transit from your home?

# extract and recode
.transp <- veritas_main[c("interact_id", "public_transit_w3")] %>% dplyr::rename(public_transit_code = public_transit_w3)
.transp$public_transit_w3 <- factor(ifelse(.transp$public_transit_code == 1, "1 [Yes]",
  ifelse(.transp$public_transit_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .transp) +
  geom_histogram(aes(x = public_transit_w3), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "public_transit")

.transp_cnt <- .transp %>%
  group_by(public_transit_w3) %>%
  dplyr::count() %>%
  arrange(public_transit_w3)
kable(.transp_cnt, caption = "Use public transit") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Use public transit
public_transit_w3	n
1 [Yes]	21
2 [No]	71
NA	1

2.5.2 Where are the public transit stops that you access from your home?

transp_location <- locations[locations$location_category == 16, ]

bm + geom_sf(data = transp_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.6 Section 6: Leisure activities

The following questions are used to generate the locations grouped into this section:

Do you participate in any (individual or group) sports or leisure-time physical activities at least once per month?
Do you visit a park at least once per month?
Do you participate in or attend as a spectator a cultural or non-sport leisure activity at least once per month? For example: singing or drawing lessons, book or poker club, concert or play.
Do you volunteer at least once per month?
Do you engage in any religious or spiritual activities at least once per month?
Do you go to a restaurant, café, bar or other food and drink establishment at least once per month?
Do you get take-out food at least once per month?
Do you regularly go for walks?

leisure_lut <- data.frame(
  location_category_code = c(17, 18, 19, 20, 21, 22, 23, 24),
  location_category = factor(c(
    "17 [Leisure-time physical activity]",
    "18 [Park]",
    "19 [Cultural activity]",
    "20 [Volunteering place]",
    "21 [Religious or spiritual activity]",
    "22 [Restaurant, café, bar, etc. ]",
    "23 [Take-out]",
    "24 [Walk]"
  ))
)
leisure_location <- locations[locations$location_category %in% leisure_lut$location_category_code, ] %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(leisure_lut, by = "location_category_code")

# map
bm + geom_sf(data = leisure_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of leisure locations by category
ggplot(data = leisure_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Leisure locations by categories")

.location_category_cnt <- as.data.frame(leisure_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Leisure locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Leisure locations by categories
location_category	n
17 [Leisure-time physical activity]	138
18 [Park]	237
19 [Cultural activity]	12
20 [Volunteering place]	47
21 [Religious or spiritual activity]	8
22 [Restaurant, café, bar, etc. ]	179
23 [Take-out]	161
24 [Walk]	187

# compute statistics on leisure locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(leisure_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE
# NB: tibble() replaces the deprecated data_frame()
.dummy <- tibble(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = leisure_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_leisure_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)

.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_leisure_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of leisure locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of leisure locations by participant and category
location_category	mean	median	max
17 [Leisure-time physical activity]	1.48	1	5
18 [Park]	2.55	2	5
19 [Cultural activity]	0.13	0	2
20 [Volunteering place]	0.51	0	3
21 [Religious or spiritual activity]	0.09	0	2
22 [Restaurant, café, bar, etc. ]	1.92	1	5
23 [Take-out]	1.73	1	5
24 [Walk]	2.01	2	5

2.7 Section 7: Other places/activities

2.7.1 Are there other places that you go to at least once per month that we have not mentioned? For example: a mall, a daycare, a hardware store, or a community center.

# extract and recode
.other <- veritas_main[c("interact_id", "other_w3")] %>% dplyr::rename(other_code = other_w3)
.other$other_w3 <- factor(ifelse(.other$other_code == 1, "1 [Yes]",
  ifelse(.other$other_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .other) +
  geom_histogram(aes(x = other_w3), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other")

.other_cnt <- .other %>%
  group_by(other_w3) %>%
  dplyr::count() %>%
  arrange(other_w3)
kable(.other_cnt, caption = "Other places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Other places
other_w3	n
1 [Yes]	41
2 [No]	52

2.7.2 Can you locate this place?

other_location <- locations[locations$location_category == 25, ]

bm + geom_sf(data = other_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.8 Section 8: Areas of change

Participants were not asked for areas of change in Victoria.

2.9 Section 9: Social contact

2.9.1 Do you visit anyone at his or her home at least once per month?

# extract and recode
.visiting <- veritas_main[c("interact_id", "visiting_w3")] %>% dplyr::rename(visiting_code = visiting_w3)
.visiting$visiting_w3 <- factor(ifelse(.visiting$visiting_code == 1, "1 [Yes]",
  ifelse(.visiting$visiting_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .visiting) +
  geom_histogram(aes(x = visiting_w3), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "visiting")

.visiting_cnt <- .visiting %>%
  group_by(visiting_w3) %>%
  dplyr::count() %>%
  arrange(visiting_w3)
kable(.visiting_cnt, caption = "Social contact") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Social contact
visiting_w3	n
1 [Yes]	52
2 [No]	41

2.9.2 Great, we are almost done completing this questionnaire. You have documented all your activity places on a map, and specified with whom you generally do these activities. These last few questions concern the people you documented earlier.

# compute statistics on groups / participant
# > one needs to account for participants who did not report any group
.gr_iid_cnt <- as.data.frame(group[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas group
.no_gr_iid <- anti_join(veritas_main[c("interact_id")], .gr_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.gr_iid_cnt <- bind_rows(.gr_iid_cnt, .no_gr_iid)

kable(t(as.matrix(summary(.gr_iid_cnt$n))), caption = "Number of groups per participant") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of groups per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	1	1.258064	2	5

# compute statistics on people / participant
# > one needs to account for participants who did not report any people
.pl_iid_cnt <- as.data.frame(people[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas people
.no_pl_iid <- anti_join(veritas_main[c("interact_id")], .pl_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.pl_iid_cnt <- bind_rows(.pl_iid_cnt, .no_pl_iid)

kable(t(as.matrix(summary(.pl_iid_cnt$n))), caption = "Number of people per participant") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	4.387097	5	23

# histogram
.sc_iid_cnt <- .pl_iid_cnt %>% mutate(soc_type = "people")
.sc_iid_cnt <- .gr_iid_cnt %>%
  mutate(soc_type = "group") %>%
  bind_rows(.sc_iid_cnt)

ggplot(data = .sc_iid_cnt) +
  geom_histogram(aes(x = n, y = after_stat(count), fill = soc_type), position = "dodge") + # after_stat() replaces the deprecated stat()
  labs(x = "Social network size by element type", fill = element_blank())

2.9.2.1 Among these people, who do you discuss important matters with?

# extract number of important people / participant
.n_important <- important %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_imp <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_important, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_important = n.y) %>%
  mutate(pct = 100 * n_important / n_people)

kable(t(as.matrix(summary(.n_people_imp$n_important))),
  caption = "Number of important people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of important people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	2.8	3	17

kable(t(as.matrix(summary(.n_people_imp$pct))),
  caption = "% of important people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of important people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	50	66.7	65.3	100	100	3

NB: In Victoria, 3 participants also listed groups in this category, which can push this percentage above 100%.

2.9.2.2 Among these people, who do you like to socialize with?

# extract number of people to socialize with / participant
.n_socialize <- socialize %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_soc <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_socialize, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_socialize = n.y) %>%
  mutate(pct = 100 * n_socialize / n_people)

kable(t(as.matrix(summary(.n_people_soc$n_socialize))),
  caption = "Number of people with whom to socialize per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people with whom to socialize per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	3.6	5	23

kable(t(as.matrix(summary(.n_people_soc$pct))),
  caption = "% of people with whom to socialize among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of people with whom to socialize among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	66.7	100	83.3	100	100	3

NB: In Victoria, 6 participants also listed groups in this category, which can push this percentage above 100%.

2.9.2.3 Among these people, who do you meet often with but do not necessarily feel close to?

# extract number of not so close people / participant
.n_not_close <- not_close %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_not_close <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_not_close, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_not_close = n.y) %>%
  mutate(pct = 100 * n_not_close / n_people)

kable(t(as.matrix(summary(.n_people_not_close$n_not_close))),
  caption = "Number of not so close people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of not so close people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	0.5	1	8

kable(t(as.matrix(summary(.n_people_not_close$pct))),
  caption = "% of not so close people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of not so close people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	0	0	14.1	20	100	3

2.9.2.4 Among these people, who knows whom?

# extract number of who knows who relationships
.n_relat <- relationship %>%
  filter(relationship_type == 1) %>%
  dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_relat <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_relat, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_relat = n.y) %>%
  mutate(pct = 100 * n_relat * 2 / (n_people * (n_people - 1))) # potential number of relationships = N x (N -1) / 2

kable(t(as.matrix(summary(.n_people_relat$n_relat))),
  caption = "Number of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	3	7.8	8	122

kable(t(as.matrix(summary(.n_people_relat$pct))),
  caption = "% of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	47.9	81	72.6	100	100	18
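
As a worked illustration of this percentage: with N contacts there are at most N × (N − 1) / 2 distinct pairs, so 3 contacts who all know each other give 3 relationships out of 3 possible, i.e. 100%. The sketch below uses a hypothetical helper, tie_pct(), which is not part of the report's code. It also shows that participants with 0 or 1 contact have no possible pair, which yields 0/0 = NaN and most likely accounts for the NA’s in the table.

```r
# Hypothetical helper: share of realised "who knows who" pairs
tie_pct <- function(n_relat, n_people) {
  potential <- n_people * (n_people - 1) / 2 # N x (N - 1) / 2
  100 * n_relat / potential
}

print(tie_pct(3, 3)) # all 3 possible pairs reported
print(tie_pct(3, 4)) # 3 of the 6 possible pairs reported
print(tie_pct(0, 1)) # a single contact: no possible pair
```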

2.10 Derived metrics

2.10.1 Transportation mode preferences

Based on the answers to the question “Usually, how do you go there? (Check all that apply.)”

# code  en
# 1 By car and you drive
# 2 By car and someone else drives
# 3 By taxi/Uber
# 4 On foot
# 5 By bike
# 6 By bus
# 7 By subway
# 8 By train
# 99    Other

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.tm <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category")

.tm_grouped <- .tm %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "By car (driver)" = sum(location_tmode_1),
    "By car (passenger)" = sum(location_tmode_2),
    "By taxi/Uber" = sum(location_tmode_3),
    "On foot" = sum(location_tmode_4),
    "By bike" = sum(location_tmode_5),
    "By bus" = sum(location_tmode_6),
    "By train" = sum(location_tmode_7),
    "Other" = sum(location_tmode_99)
  )

kable(.tm_grouped, caption = "Transportation mode preferences") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Transportation mode preferences
description	N	By car (driver)	By car (passenger)	By taxi/Uber	On foot	By bike	By bus	By train	Other
2 [Other residence]	7	2	3	0	1	4	2	0	0
3 [Work]	88	12	3	3	12	37	7	0	0
4 [School/College/University]	14	0	0	0	2	4	3	0	3
5 [Supermarket]	241	95	22	0	104	98	3	0	2
6 [Public/farmer’s market]	46	7	3	0	21	21	1	0	0
7 [Bakery]	62	13	3	0	37	30	1	0	0
8 [Specialty food store]	53	16	1	0	20	22	1	1	0
9 [Convenience store/Dépanneur]	25	2	1	0	14	10	2	0	0
10 [Liquor store/SAQ]	95	33	11	0	39	33	4	0	0
11 [Bank]	46	11	2	0	19	19	2	0	0
12 [Hair salon/barbershop]	35	8	1	0	14	13	1	0	0
13 [Post office]	33	8	2	0	17	11	0	0	1
14 [Drugstore]	64	17	7	0	35	24	0	0	0
15 [Doctor/healthcare provider]	44	8	3	1	10	23	1	0	1
16 [Public transit stop]	29	0	1	0	22	0	6	0	0
17 [Leisure-time physical activity]	138	35	13	0	28	70	3	0	3
18 [Park]	237	53	26	0	112	89	2	0	1
19 [Cultural activity]	12	1	0	0	4	5	1	0	0
20 [Volunteering place]	47	6	0	0	10	14	0	0	9
21 [Religious/spiritual activity]	8	0	0	0	2	1	0	0	3
22 [Restaurant, café, bar, etc.]	179	39	20	0	77	68	2	0	1
23 [Take-out]	161	49	23	0	57	34	0	0	19
24 [Walk]	187	25	16	0	144	30	0	0	4
25 [Other place]	95	37	9	0	17	54	2	0	0
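
Because respondents could check several modes per place, the mode counts in a row can add up to more than N. A minimal sketch of how one row could be turned into shares, using the Supermarket row of the table above (the object names are illustrative):

```r
# Supermarket row of the table above
n_places <- 241
mode_counts <- c(
  "By car (driver)" = 95, "By car (passenger)" = 22, "By taxi/Uber" = 0,
  "On foot" = 104, "By bike" = 98, "By bus" = 3, "By train" = 0, "Other" = 2
)

# Share of supermarkets for which each mode was checked
mode_share <- round(100 * mode_counts / n_places, 1)
print(mode_share)

# Multiple answers per place: the counts sum to more than N
total_checked <- sum(mode_counts)
print(total_checked > n_places)
```

The shares therefore do not sum to 100% and should be read one mode at a time.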

# graph
.tm1 <- .tm %>%
  filter(location_tmode_1 == 1) %>%
  mutate(tm = "[1] By car (driver)")
.tm2 <- .tm %>%
  filter(location_tmode_2 == 1) %>%
  mutate(tm = "[2] By car (passenger)")
.tm3 <- .tm %>%
  filter(location_tmode_3 == 1) %>%
  mutate(tm = "[3] By taxi/Uber")
.tm4 <- .tm %>%
  filter(location_tmode_4 == 1) %>%
  mutate(tm = "[4] On foot")
.tm5 <- .tm %>%
  filter(location_tmode_5 == 1) %>%
  mutate(tm = "[5] By bike")
.tm6 <- .tm %>%
  filter(location_tmode_6 == 1) %>%
  mutate(tm = "[6] By bus")
.tm7 <- .tm %>%
  filter(location_tmode_7 == 1) %>%
  mutate(tm = "[7] By train")
.tm99 <- .tm %>%
  filter(location_tmode_99 == 1) %>%
  mutate(tm = "[99] Other")
.tm <- bind_rows(.tm1, .tm2) %>%
  bind_rows(.tm3) %>%
  bind_rows(.tm4) %>%
  bind_rows(.tm5) %>%
  bind_rows(.tm6) %>%
  bind_rows(.tm7) %>%
  bind_rows(.tm99)

# histogram of answers
ggplot(data = .tm) +
  geom_bar(aes(x = fct_rev(description), fill = tm), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Transport modes") +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of transportation mode by location category", x = element_blank()) +
  coord_flip() +
  theme(legend.position = "bottom", legend.justification = c(0, 0), legend.text = element_text(size = 8)) +
  guides(fill = guide_legend(nrow = 3))
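
The eight filter/mutate blocks above stack one copy of the data per checked mode. The same long format could be obtained with a loop over the mode columns. Here is a minimal base R sketch on toy data; the toy data frame and the mode_labels lookup are assumptions made for illustration and cover only two modes.

```r
# Toy data: one row per place, one 0/1 column per transport mode
toy <- data.frame(
  description = c(" 3 [Work]", " 3 [Work]", "18 [Park]"),
  location_tmode_4 = c(1, 0, 1),
  location_tmode_5 = c(1, 1, 0)
)

# Hypothetical lookup from column name to legend label
mode_labels <- c(
  location_tmode_4 = "[4] On foot",
  location_tmode_5 = "[5] By bike"
)

# For each mode, keep the places where it was checked and tag them
tm_long <- do.call(rbind, lapply(names(mode_labels), function(col) {
  sel <- toy[toy[[col]] == 1, "description", drop = FALSE]
  sel$tm <- rep(mode_labels[[col]], nrow(sel))
  sel
}))

print(table(tm_long$description, tm_long$tm))
```

A place checked for two modes appears twice in the long table, which is what the stacked bar chart relies on.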

2.10.2 Visiting places alone

Based on the answers to the question “Do you usually go to this place alone or with other people?”

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.alone <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category") %>%
  mutate(location_alone_recode = case_when(
    location_alone2 == 1 ~ 1,
    location_alone2 == 2 ~ 0
  ))

.alone_grouped <- .alone %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "Visited alone" = sum(location_alone_recode),
    "Visited alone (%)" = round(sum(location_alone_recode) * 100.0 / n(), digits = 1)
  )

kable(.alone_grouped, caption = "Visiting places alone") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visiting places alone
description	N	Visited alone	Visited alone (%)
2 [Other residence]	7	2	28.6
3 [Work]	88	41	46.6
4 [School/College/University]	14	11	78.6
5 [Supermarket]	241	179	74.3
6 [Public/farmer’s market]	46	25	54.3
7 [Bakery]	62	42	67.7
8 [Specialty food store]	53	43	81.1
9 [Convenience store/Dépanneur]	25	23	92.0
10 [Liquor store/SAQ]	95	71	74.7
11 [Bank]	46	44	95.7
12 [Hair salon/barbershop]	35	34	97.1
13 [Post office]	33	31	93.9
14 [Drugstore]	64	44	68.8
15 [Doctor/healthcare provider]	44	38	86.4
16 [Public transit stop]	29	28	96.6
17 [Leisure-time physical activity]	138	52	37.7
18 [Park]	237	62	26.2
19 [Cultural activity]	12	3	25.0
20 [Volunteering place]	47	17	36.2
21 [Religious/spiritual activity]	8	2	25.0
22 [Restaurant, café, bar, etc.]	179	25	14.0
23 [Take-out]	161	87	54.0
24 [Walk]	187	93	49.7
25 [Other place]	95	41	43.2

# histogram of answers
ggplot(data = .alone) +
  geom_bar(aes(x = fct_rev(description), fill = factor(location_alone2)), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Visiting places", labels = c("Alone", "With someone")) +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of places visited alone", x = element_blank()) +
  coord_flip()

2.10.3 Visit frequency

Based on the answers to the question “How often do you go there?”

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.freq <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category")

.freq_grouped <- .freq %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), min = min(location_freq_visit),
    max = max(location_freq_visit),
    mean = mean(location_freq_visit),
    median = median(location_freq_visit),
    sd = sd(location_freq_visit)
  )

kable(.freq_grouped, caption = "Visit frequency (expressed in times/year)") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visit frequency (expressed in times/year)
description	N	min	max	mean	median	sd
2 [Other residence]	7	1	208	98.428571	104.0	79.355439
3 [Work]	88	0	364	191.943182	260.0	100.711761
4 [School/College/University]	14	0	364	189.428571	208.0	131.562044
5 [Supermarket]	241	1	364	55.684647	48.0	57.289180
6 [Public/farmer’s market]	46	1	156	30.739130	24.0	27.824398
7 [Bakery]	62	1	260	34.096774	24.0	41.993641
8 [Specialty food store]	53	1	208	26.037736	12.0	42.751501
9 [Convenience store/Dépanneur]	25	1	104	28.720000	12.0	32.368349
10 [Liquor store/SAQ]	95	1	156	26.115790	24.0	24.877942
11 [Bank]	46	1	52	16.195652	12.0	12.969399
12 [Hair salon/barbershop]	35	1	12	7.942857	8.0	4.371989
13 [Post office]	33	3	104	24.727273	12.0	28.799167
14 [Drugstore]	64	4	156	24.687500	12.0	24.045509
15 [Doctor/healthcare provider]	44	1	52	12.772727	12.0	11.563502
16 [Public transit stop]	29	1	104	28.551724	12.0	30.375726
17 [Leisure-time physical activity]	138	1	364	88.420290	52.0	94.149269
18 [Park]	237	1	520	64.936709	24.0	88.238843
19 [Cultural activity]	12	1	156	24.500000	2.5	45.578105
20 [Volunteering place]	47	0	365	76.914894	52.0	99.362715
21 [Religious/spiritual activity]	8	12	365	176.625000	104.0	159.616449
22 [Restaurant, café, bar, etc.]	179	1	260	20.804469	12.0	33.967611
23 [Take-out]	161	1	156	21.118012	12.0	23.510737
24 [Walk]	187	1	468	95.160428	52.0	105.932906
25 [Other place]	95	1	520	59.757895	24.0	84.915191

# graph
ggplot(data = .freq) +
  geom_boxplot(aes(x = fct_rev(description), y = location_freq_visit)) +
  scale_y_continuous(limits = c(0, 365)) +
  labs(y = "Visits/year (Frequency over 1 visit/day not shown)", x = element_blank()) +
  coord_flip()
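
Many values in the visit frequency table (52, 104, 156, 208, 260, 364) are multiples of 52, which is consistent with weekly frequencies having been converted to a yearly count. This is an inference from the output, not something the report states. Under that assumption, the conversion could look like the following sketch; times_per_year() and its period multipliers are hypothetical.

```r
# Hypothetical helper: convert "n times per period" into times/year.
# The multipliers (52 weeks, 12 months) are assumptions.
times_per_year <- function(n, period = c("week", "month", "year")) {
  period <- match.arg(period)
  per_year <- c(week = 52, month = 12, year = 1)
  n * per_year[[period]]
}

print(times_per_year(5, "week"))  # five times a week
print(times_per_year(1, "month")) # once a month
```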

2.10.4 Spatial indicators: Camille Perchoux’s toolbox

Below is a list of indicators proposed by Camille Perchoux in her paper “Assessing patterns of spatial behavior in health studies: Their socio-demographic determinants and associations with transportation modes (the RECORD Cohort Study)”.

-- Reading Camille tbx indics from Essence table
SELECT interact_id,
  n_acti_places, n_weekly_vst, n_acti_types,
  cvx_perimeter, cvx_surface,
  min_length, max_length, median_length, 
  pct_visits_neighb, 
  n_acti_prn, pct_visits_prn, prn_area_km2
FROM essence_table.essence_perchoux_tbx
WHERE city_id = 'Victoria' AND wave_id = 3 AND status = 'new'

2.10.4.1 Indicators related to the lifestyle

Indicators	Measurement approach
Number of activity places (`n_acti_places`)	Count of activity places
Number of visits to places per week (`n_weekly_vst`)	Number of activity places per individual multiplied by the frequency of visit per week to each location, excluding the residence
Number of activity types (`n_acti_types`)	6 types of activities considered: 1-Residential; 2-Occupation; 3-Shopping activities; 4-Services; 5-Transportation; 6-Leisure activities (NB: the original categories were as follows: 1-Residential; 2-work; 3-food and other services; 4-transport station/stop; 5-recreational activity; 6-social activity)

LLindic <- ess.tab.camille %>%
  select("interact_id", "n_acti_places", "n_weekly_vst", "n_acti_types")
.llmtx <- as.matrix(summary(LLindic[c("n_acti_places", "n_weekly_vst", "n_acti_types")], digits = 1))
.llmtx <- apply(.llmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.llmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
kable(.llmtx, caption = "Lifestyle statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Lifestyle statistics
	n_acti_places	n_weekly_vst	n_acti_types
Min.	6	4	4
1st Qu.	15	16	5
Median	21	20	5
Mean	22	23	5
3rd Qu.	29	27	5
Max.	58	73	6
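
To make the definition of `n_weekly_vst` concrete, here is a minimal base R sketch on toy data. It assumes that visit frequencies are stored in times/year (as in `location_freq_visit`) and that a weekly figure is obtained by dividing by 52; the indicators in the Essence table may be computed differently.

```r
# Toy places for one participant (category 1 is the residence)
toy <- data.frame(
  location_category = c(1, 3, 5, 18),
  location_freq_visit = c(364, 260, 52, 104) # times/year
)

# Exclude the residence, then sum the weekly visit frequencies
acti <- toy[toy$location_category != 1, ]
n_weekly_vst <- sum(acti$location_freq_visit) / 52
print(n_weekly_vst)
```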

2.10.4.2 Indicators related to the geometry of the activity space

Indicators	Measurement approach
Perimeter of the convex hull (`cvx_perimeter`)	Perimeter of the smallest polygon containing all the activity locations of the participant (unit: km)
Surface of the convex hull (`cvx_surface`)	Surface of the smallest polygon containing all the activity locations of the participant (unit: km2)
Minimal road network distance from the residence to an activity place (`min_length`)	Minimal distance from the residence to an activity place using the road network (in meters)
Maximal road network distance from the residence to an activity place (`max_length`)	Maximal distance from the residence to an activity place using the road network (in meters)
Median road network distance from the residence to all activity places (`median_length`)	Median distance from home to all activity places using the road network (in meters)

ASindic <- ess.tab.camille %>%
  select("interact_id", "cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")
.asmtx <- as.matrix(summary(ASindic[c("cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")], digits = 1))
.asmtx <- apply(.asmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.asmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ", "N/A.")
.asmtx[is.na(.asmtx)] <- 0
kable(.asmtx, caption = "Activity space statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Activity space statistics
	cvx_perimeter	cvx_surface	min_length	max_length	median_length
Min.	7	1	6e-02	2e+03	571
1st Qu.	16	13	4e+00	6e+03	1249
Median	30	32	5e+01	1e+04	1728
Mean	155	427	2e+02	3e+04	2321
3rd Qu.	60	137	3e+02	3e+04	2444
Max.	6754	11474	3e+03	2e+05	14456
N/A.	0	0	1	1	1
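
The convex hull indicators can be illustrated with a small base R sketch. It assumes planar coordinates already expressed in km (real data would first need to be projected) and uses a hypothetical helper, hull_metrics(), built on grDevices::chull() and the shoelace formula. It is not the code used to populate the Essence table.

```r
# Hypothetical helper: perimeter and surface of the convex hull of a
# set of planar points (x, y in km)
hull_metrics <- function(x, y) {
  idx <- grDevices::chull(x, y) # indices of the hull vertices, in order
  hx <- x[idx]
  hy <- y[idx]
  # Next vertex along the hull, wrapping around to the first one
  nx <- c(hx[-1], hx[1])
  ny <- c(hy[-1], hy[1])
  list(
    perimeter = sum(sqrt((nx - hx)^2 + (ny - hy)^2)),
    surface = abs(sum(hx * ny - nx * hy)) / 2 # shoelace formula
  )
}

# Four corners of a 1 km square plus one interior point
m <- hull_metrics(c(0, 1, 1, 0, 0.5), c(0, 0, 1, 1, 0.5))
print(m$perimeter)
print(m$surface)
```

The interior point does not affect the result, since only the outermost locations define the hull. This is also why a single remote location can inflate both indicators, as the large maxima in the table suggest.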

2.10.4.3 Indicators related to the importance of the residential neighborhood

Indicators	Measurement approach
Percentage of visits to places in the residential neighborhood (`pct_visits_neighb`)	Count of visits to places within the 500 m road network buffer centered on the residence divided by the total number of visits to places
Number of activity locations in the PRN (`n_acti_prn`)	Count of activity locations in the PRN
Percentage of visits in the PRN (`pct_visits_prn`)	Count of visits to places in the PRN divided by the total number of visits to places
Surface of the PRN (`prn_area_km2`)	Unit: km2

RNindic <- ess.tab.camille %>%
  select("interact_id", "pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")
RNindic <- RNindic[RNindic$interact_id %in% prn$interact_id, ]
.rnmtx <- as.matrix(summary(RNindic[c("pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")], digits = 1))
.rnmtx <- apply(.rnmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.rnmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
.rnmtx[is.na(.rnmtx)] <- 0
kable(.rnmtx, caption = "Residential neighbourhood statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Residential neighbourhood statistics
	pct_visits_neighb	n_acti_prn	pct_visits_prn	prn_area_km2
Min.	0	0	0	0.02
1st Qu.	11	2	25	0.61
Median	27	5	45	1.21
Mean	30	6	42	1.89
3rd Qu.	45	8	58	1.76
Max.	83	16	95	28.87

Only the 87 participants with a valid PRN are included in the residential neighbourhood indicators.
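
As an illustration of the PRN-based indicators, the sketch below computes `n_acti_prn` and `pct_visits_prn` for one fictitious participant. It assumes that each activity place has already been flagged as inside or outside the PRN polygon (the `in_prn` column is hypothetical) and that visits are weighted by the yearly visit frequency.

```r
# Toy activity places for one participant
toy <- data.frame(
  in_prn = c(TRUE, TRUE, FALSE, FALSE),      # hypothetical flag
  location_freq_visit = c(52, 104, 260, 104) # times/year
)

# Number of activity locations inside the PRN
n_acti_prn <- sum(toy$in_prn)

# Visits to places inside the PRN over all visits
visits_in <- sum(toy$location_freq_visit[toy$in_prn])
pct_visits_prn <- 100 * visits_in / sum(toy$location_freq_visit)

print(n_acti_prn)
print(pct_visits_prn)
```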

2.10.5 Social indicators: Alexandre Naud’s toolbox

See Alex’s document for a more comprehensive presentation of the social indicators.

-- Reading Alex tbx indics from Essence table
SELECT interact_id,
  people_degree, 
  socialize_size, socialize_meet, socialize_chat,
  important_size, group_degree, simmelian
FROM essence_table.essence_naud_social
WHERE city_id = 'Victoria' AND wave_id = 3 AND status = 'new'

2.10.5.1 Number of people in the network (`people_degree`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = people_degree))

kable(t(as.matrix(summary(ess.tab.alex$people_degree))), caption = "people_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

people_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	4.387097	5	23

2.10.5.2 Simmelian Brokerage (`simmelian`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = simmelian))

kable(t(as.matrix(summary(ess.tab.alex$simmelian))), caption = "simmelian") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

simmelian
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
1	2.291667	5	6.733041	10.00833	25.90323	10

2.10.5.3 Number of people with whom the participant likes to socialize (`socialize_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = socialize_size))

kable(t(as.matrix(summary(ess.tab.alex$socialize_size))), caption = "socialize_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	3.591398	5	23

2.10.5.4 Weekly face-to-face interactions among people with whom the participant likes to socialize (`socialize_meet`)

ggplot(filter(ess.tab.alex, socialize_meet < 100)) +
  geom_histogram(aes(x = socialize_meet)) +
  annotate(geom = "text", x = 75, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_meet))), caption = "socialize_meet") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_meet
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	364	388	500.7312	636	2108

2.10.5.5 Weekly ICT interactions among people with whom the participant likes to socialize (`socialize_chat`)

ggplot(filter(ess.tab.alex, socialize_chat < 100)) +
  geom_histogram(aes(x = socialize_chat)) +
  annotate(geom = "text", x = 55, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_chat))), caption = "socialize_chat") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_chat
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	232	468	2188.484	832	132120

2.10.5.6 Number of people with whom the participant discusses important matters (`important_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = important_size))

kable(t(as.matrix(summary(ess.tab.alex$important_size))), caption = "important_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

important_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	2.806452	3	17

2.10.5.7 Number of people in all groups (`group_degree`)

ggplot(filter(ess.tab.alex, group_degree < 100)) +
  geom_histogram(aes(x = group_degree)) +
  annotate(geom = "text", x = 20, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$group_degree))), caption = "group_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

group_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	3	4.741936	8	27

3 Basic descriptive statistics for returning participants

3.1 Section 1: Residence and Neighbourhood

3.1.1 Now, let’s start with your home. What is your address?

home_location <- locations[locations$location_category == 1, ]

## version ggmap
vic_aoi <- st_bbox(home_location)
names(vic_aoi) <- c("left", "bottom", "right", "top")
vic_aoi[["left"]] <- vic_aoi[["left"]] - .07
vic_aoi[["right"]] <- vic_aoi[["right"]] + .07
vic_aoi[["top"]] <- vic_aoi[["top"]] + .01
vic_aoi[["bottom"]] <- vic_aoi[["bottom"]] - .01

bm <- get_stadiamap(vic_aoi, zoom = 11, maptype = "stamen_toner_lite") %>%
  ggmap(extent = "device")
bm + geom_sf(data = st_jitter(home_location, .008), inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3) # see https://github.com/r-spatial/sf/issues/336

NB: Home locations have been randomly shifted from their original position to protect privacy.

# Number of participants by municipalities
home_by_municipalites <- st_join(home_location, municipalities["NAME"])
home_by_mun_cnt <- as.data.frame(home_by_municipalites) %>%
  group_by(NAME) %>%
  dplyr::count() %>%
  arrange(desc(n), NAME)
home_by_mun_cnt$Shape <- NULL
kable(home_by_mun_cnt, caption = "Number of participants by municipalities") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of participants by municipalities
NAME	n
Victoria	37
Saanich	22
Esquimalt	5
Langford	3
Oak Bay	2
View Royal	1

3.1.2 If you were asked to draw the boundaries of your neighbourhood, what would they be?

prn <- poly_geom[poly_geom$area_type == "neighborhood", ]

## version ggmap
bm + geom_sf(data = prn, inherit.aes = FALSE, fill = alpha("blue", 0.05), color = alpha("blue", 0.3))

# Min, max, median & mean area of PRN
prn$area_m2 <- st_area(prn$geom)
kable(t(as.matrix(summary(prn$area_m2))),
  caption = "Area (in square meters) of the perceived residential neighborhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived residential neighborhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
20259.9	837288.6	1584197	2840301	2691140	49044855

NB only 99 valid neighborhoods were collected, as many participants struggled to draw polygons on the map.
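
Areas in square meters are hard to read at this scale. As a small illustration, the values from the table above can be converted to km² by dividing by 1e6 (the vector below copies three of the reported statistics; the object names are illustrative):

```r
# Min, median and max PRN areas from the table above (square meters)
area_m2 <- c(min = 20259.9, median = 1584197, max = 49044855)

# 1 km2 = 1,000,000 m2
area_km2 <- area_m2 / 1e6
print(round(area_km2, 2))
```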

3.1.3 How attached are you to your neighbourhood?

# extract and recode
.ngh_att <- veritas_main[veritas_main$neighbourhood_attach != 99, c("interact_id", "neighbourhood_attach")] %>% dplyr::rename(neighbourhood_attach_code = neighbourhood_attach)
.ngh_att$neighbourhood_attach <- factor(ifelse(.ngh_att$neighbourhood_attach_code == 1, "1 [Not attached at all]",
  ifelse(.ngh_att$neighbourhood_attach_code == 6, "6 [Very attached]",
    .ngh_att$neighbourhood_attach_code
  )
))

# histogram of attachment
ggplot(data = .ngh_att) +
  geom_histogram(aes(x = neighbourhood_attach), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "neighbourhood_attach")

.ngh_att_cnt <- .ngh_att %>%
  group_by(neighbourhood_attach) %>%
  dplyr::count() %>%
  arrange(neighbourhood_attach)
kable(.ngh_att_cnt, caption = "Neighbourhood attachment") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Neighbourhood attachment
neighbourhood_attach	n
1 [Not attached at all]	1
2	4
3	7
4	25
5	35
6 [Very attached]	32

3.1.4 On average, how many hours per day do you spend outside of your home?

# histogram of n hours out
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out_w3))

# Min, max, median & mean hours/day out
kable(t(as.matrix(summary(veritas_main$hours_out_w3))),
  caption = "Hours/day outside home",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside home
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1	2	4	4.8	7	20

3.1.5 Of this time spent outside your home, on average how many hours do you spend outside your neighbourhood?

# histogram of n hours out
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out_neighb_w3))

# Min, max, median & mean hours/day out of neighborhood
kable(t(as.matrix(summary(veritas_main$hours_out_neighb_w3))),
  caption = "Hours/day outside neighbourhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside neighbourhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	1	2.9	3	15

3.1.6 Are there one or more areas close to where you live that you tend to avoid because you do not feel safe there (for any reason)?

# extract and recode
.unsafe <- veritas_main[c("interact_id", "unsafe_area")] %>% dplyr::rename(unsafe_area_code = unsafe_area)
.unsafe$unsafe_area <- factor(ifelse(.unsafe$unsafe_area_code == 1, "1 [Yes]",
  ifelse(.unsafe$unsafe_area_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .unsafe) +
  geom_histogram(aes(x = unsafe_area), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "unsafe_area")

.unsafe_cnt <- .unsafe %>%
  group_by(unsafe_area) %>%
  dplyr::count() %>%
  arrange(unsafe_area)
kable(.unsafe_cnt, caption = "Unsafe areas") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Unsafe areas
unsafe_area	n
1 [Yes]	27
2 [No]	77

# map
unsafe <- poly_geom[poly_geom$area_type == "unsafe area", ]

## version ggmap
bm + geom_sf(data = unsafe, inherit.aes = FALSE, fill = alpha("blue", 0.3), color = alpha("blue", 0.5))

# Min, max, median & mean area of unsafe areas
unsafe$area_m2 <- st_area(unsafe$geom)
kable(t(as.matrix(summary(unsafe$area_m2))),
  caption = "Area (in square meters) of the perceived unsafe area",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived unsafe area
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
10939.2	76094.9	117023.4	289010.1	364241.4	2471570

3.1.7 Do you spend the night somewhere other than your home at least once per week?

# extract and recode
.o_res <- veritas_main[c("interact_id", "other_resid_w3")] %>% dplyr::rename(other_resid_code = other_resid_w3)
.o_res$other_resid <- factor(ifelse(.o_res$other_resid_code == 1, "1 [Yes]",
  ifelse(.o_res$other_resid_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .o_res) +
  geom_histogram(aes(x = other_resid), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other_resid")

.o_res_cnt <- .o_res %>%
  group_by(other_resid) %>%
  dplyr::count() %>%
  arrange(other_resid)
kable(.o_res_cnt, caption = "Other residence") %>% 
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Other residence
other_resid	n
1 [Yes]	9
2 [No]	95
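
The yes/no questions in this report all use the same nested ifelse() recoding. A minimal sketch of how that pattern could be factored into one function is given below; recode_yes_no() is a hypothetical helper, not part of the report's code. Unlike the nested ifelse() calls, it also maps missing codes to "N/A" explicitly and fixes the order of the factor levels.

```r
# Hypothetical helper: recode 1/2 answers into labelled factor levels
recode_yes_no <- function(code) {
  lbl <- ifelse(code == 1, "1 [Yes]",
    ifelse(code == 2, "2 [No]", "N/A")
  )
  lbl[is.na(lbl)] <- "N/A" # ifelse() propagates NA, so handle it here
  factor(lbl, levels = c("1 [Yes]", "2 [No]", "N/A"))
}

# Toy answers: one yes, two no, one missing, one unexpected code
answers <- recode_yes_no(c(1, 2, 2, NA, 99))
print(table(answers))
```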

3.2 Section 2: Occupation

3.2.1 Are you currently working?

# extract and recode
.work <- veritas_main[c("interact_id", "working")] %>% dplyr::rename(working_code = working)
.work$working <- factor(ifelse(.work$working_code == 1, "1 [Yes]",
  ifelse(.work$working_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .work) +
  geom_histogram(aes(x = working), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "working")

.work_cnt <- .work %>%
  group_by(working) %>%
  dplyr::count() %>%
  arrange(working)
kable(.work_cnt, caption = "Currently working") %>% 
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently working
working	n
1 [Yes]	84
2 [No]	20

3.2.2 Where do you work?

work_location <- locations[locations$location_category == 3, ]

bm + geom_sf(data = work_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

3.2.3 On average, how many hours per week do you work?

# histogram of work hours per week
ggplot(data = veritas_main[veritas_main$working == 1, ]) +
  geom_histogram(aes(x = work_hours_w3))

# Min, max, median & mean work hours/week
kable(t(as.matrix(summary(veritas_main$work_hours_w3[veritas_main$working == 1]))),
  caption = "Work hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Work hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
2	35	36	34.9	40	74

3.2.4 Are you currently a registered student?

# extract and recode
.study <- veritas_main[c("interact_id", "studying")] %>% dplyr::rename(studying_code = studying)
.study$studying <- factor(ifelse(.study$studying_code == 1, "1 [Yes]",
  ifelse(.study$studying_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .study) +
  geom_histogram(aes(x = studying), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Studying")

.study_cnt <- .study %>%
  group_by(studying) %>%
  dplyr::count() %>%
  arrange(studying)
kable(.study_cnt, caption = "Currently studying") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently studying
studying	n
1 [Yes]	2
2 [No]	102

3.2.5 Where do you study?

study_location <- locations[locations$location_category == 4, ]

bm + geom_sf(data = study_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

3.2.6 On average, how many hours per week do you study?

# histogram of study hours per week
ggplot(data = veritas_main[veritas_main$studying == 1, ]) +
  geom_histogram(aes(x = study_hours_w3))

# Min, max, median & mean study hours/week
kable(t(as.matrix(summary(veritas_main$study_hours_w3[veritas_main$studying == 1]))),
  caption = "Study hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Study hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
8	12.2	16.5	16.5	20.8	25

3.3 Section 3: Shopping activities

3.3.1 In [Date of Previous Data Collection Wave], you reported shopping at these locations. Do you still visit these places?

shop_lut <- data.frame(
  location_category_code = c(5, 6, 7, 8, 9, 10),
  location_category = factor(c(
    " 5 [Supermarket]",
    " 6 [Public/farmer’s market]",
    " 7 [Bakery]",
    " 8 [Specialty food store]",
    " 9 [Convenience store/Dépanneur]",
    "10 [Liquor store/SAQ]"
  ))
)
shop_location <- locations[locations$location_category %in% shop_lut$location_category_code, ] %>%
  filter(location_current == 1) %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(shop_lut, by = "location_category_code")

# map
bm + geom_sf(data = shop_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of shopping locations by category
ggplot(data = shop_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Shopping locations by categories")

.location_category_cnt <- as.data.frame(shop_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Shopping locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Shopping locations by categories
location_category	n
5 [Supermarket]	280
6 [Public/farmer’s market]	26
7 [Bakery]	31
8 [Specialty food store]	62
9 [Convenience store/Dépanneur]	5
10 [Liquor store/SAQ]	62

# compute statistics on shopping locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(shop_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE to build list of all combination iid/shopping categ
# NB: tibble() replaces the deprecated data_frame()
.dummy <- tibble(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = shop_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_shop_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_shop_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of shopping locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of shopping locations by participant and category
location_category	mean	median	max
5 [Supermarket]	2.69	2	8
6 [Public/farmer’s market]	0.25	0	2
7 [Bakery]	0.30	0	2
8 [Specialty food store]	0.60	0	4
9 [Convenience store/Dépanneur]	0.05	0	1
10 [Liquor store/SAQ]	0.60	0	3
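
The zero-filling step above (list every participant/category pair, then set n = 0 for pairs with no reported location) can also be written without an explicit loop. The following is a minimal sketch on toy data, assuming the tidyr package is available; `toy_locations`, `toy_ids` and `toy_categories` are hypothetical stand-ins for `shop_location`, `veritas_main$interact_id` and `shop_lut$location_category`.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): one row per reported location
toy_locations <- tibble(
  interact_id = c("A", "A", "B"),
  location_category = c("5 [Supermarket]", "5 [Supermarket]", "7 [Bakery]")
)
# All participants, including "C" who reported no location at all
toy_ids <- c("A", "B", "C")
toy_categories <- c("5 [Supermarket]", "7 [Bakery]")

toy_counts <- toy_locations %>%
  count(interact_id, location_category) %>%
  # add every missing participant/category pair with n = 0
  complete(
    interact_id = toy_ids,
    location_category = toy_categories,
    fill = list(n = 0L)
  )

# Same summary as in the report, now including the zero counts
toy_summary <- toy_counts %>%
  group_by(location_category) %>%
  summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n))
```

Without the zero rows, the mean and median would describe only participants who reported at least one location in a category, which overstates the typical count.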

3.3.2 Thinking about the places where you shop, are there other supermarkets, farmers markets, bakeries, specialty stores, convenience stores or liquor stores you visit at least once per month?

NB: Variable grp_shopping_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.grp_shopping <- veritas_main[c("interact_id", "grp_shopping_new_w3")] %>% dplyr::rename(grp_shopping_new_code = grp_shopping_new_w3)
.grp_shopping$grp_shopping_new <- factor(ifelse(.grp_shopping$grp_shopping_new_code == 1, "1 [Yes]",
  ifelse(.grp_shopping$grp_shopping_new_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .grp_shopping) +
  geom_histogram(aes(x = grp_shopping_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "grp_shopping_new")

.grp_shopping_cnt <- .grp_shopping %>%
  group_by(grp_shopping_new) %>%
  dplyr::count() %>%
  arrange(grp_shopping_new)
kable(.grp_shopping_cnt, caption = "New shopping places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")
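
One detail of the recoding above is worth keeping in mind: with nested `ifelse()`, a missing code stays `NA` rather than becoming "N/A", because comparing `NA` with a number yields `NA`. The sketch below illustrates this on hypothetical codes (`toy_codes` is not a variable of the dataset) and shows a `case_when()` variant in which missing codes fall through to an explicit default.

```r
library(dplyr)

# Hypothetical answer codes: 1 = Yes, 2 = No, NA = answer not recorded
toy_codes <- c(1, 2, NA, 2)

# Nested ifelse(): the NA comparison propagates, so the "N/A" branch
# is never reached for missing codes
nested <- ifelse(toy_codes == 1, "1 [Yes]",
  ifelse(toy_codes == 2, "2 [No]", "N/A")
)

# case_when(): rows matching no condition, including NA, get the default
explicit <- case_when(
  toy_codes == 1 ~ "1 [Yes]",
  toy_codes == 2 ~ "2 [No]",
  TRUE ~ "N/A"
)
```

Which behaviour is preferable depends on whether missing answers should appear as their own labelled bar in the histogram or as an `NA` category.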

3.4 Section 4: Services

3.4.1 In Date of Previous Data Collection Wave, you reported using services at these locations. Do you still visit these places?

serv_lut <- data.frame(
  location_category_code = c(11, 12, 13, 14, 15),
  location_category = factor(c(
    "11 [Bank]",
    "12 [Hair salon/barbershop]",
    "13 [Post office]",
    "14 [Drugstore]",
    "15 [Doctor/healthcare provider]"
  ))
)
serv_location <- locations[locations$location_category %in% serv_lut$location_category_code, ] %>%
  filter(location_current == 1) %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(serv_lut, by = "location_category_code")

# map
bm + geom_sf(data = serv_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())
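
The labelling above relies on a lookup table joined with `inner_join()`, which silently drops any row whose code is absent from the lookup. As a minimal sketch on hypothetical toy data (`toy_lut` and `toy_loc` are illustrative names, and code 99 is made up), an `anti_join()` can be used to check which rows would be lost:

```r
library(dplyr)

toy_lut <- data.frame(
  location_category_code = c(11, 12),
  location_category = factor(c("11 [Bank]", "12 [Hair salon/barbershop]"))
)
toy_loc <- data.frame(
  interact_id = c("A", "B", "C"),
  location_category = c(11, 12, 99) # 99 is not in the lookup table
)

# Rows with a known code receive their label
toy_labelled <- toy_loc %>%
  rename(location_category_code = location_category) %>%
  inner_join(toy_lut, by = "location_category_code")

# Rows that the inner join would drop
toy_unmatched <- toy_loc %>%
  rename(location_category_code = location_category) %>%
  anti_join(toy_lut, by = "location_category_code")
```

In the report itself the locations are pre-filtered with `%in%` on the lookup codes, so no row should be dropped; the check is mainly useful when the lookup and the filter are maintained separately.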

# compute number of service locations by category
ggplot(data = serv_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Service locations by categories")

.location_category_cnt <- as.data.frame(serv_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Service locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Service locations by categories
location_category	n
11 [Bank]	36
12 [Hair salon/barbershop]	31
13 [Post office]	33
14 [Drugstore]	60
15 [Doctor/healthcare provider]	67

# compute statistics on service locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(serv_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE
.dummy <- tibble( # tibble() replaces the deprecated data_frame()
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = serv_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_serv_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_serv_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of service locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of service locations by participant and category
location_category	mean	median	max
11 [Bank]	0.35	0.0	2
12 [Hair salon/barbershop]	0.30	0.0	1
13 [Post office]	0.32	0.0	2
14 [Drugstore]	0.58	0.5	3
15 [Doctor/healthcare provider]	0.64	0.0	5

3.4.2 Thinking about the places where you use services, are there other banks, hair salons, post offices, drugstores, doctors or other healthcare providers you visit at least once per month?

NB: Variable grp_services_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.grp_services <- veritas_main[c("interact_id", "grp_services_new_w3")] %>% dplyr::rename(grp_services_new_code = grp_services_new_w3)
.grp_services$grp_services_new <- factor(ifelse(.grp_services$grp_services_new_code == 1, "1 [Yes]",
  ifelse(.grp_services$grp_services_new_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .grp_services) +
  geom_histogram(aes(x = grp_services_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "grp_services_new")

.grp_services_cnt <- .grp_services %>%
  group_by(grp_services_new) %>%
  dplyr::count() %>%
  arrange(grp_services_new)
kable(.grp_services_cnt, caption = "New services places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

3.5 Section 5: Transportation

3.5.1 In Date of Previous Data Collection Wave, you reported accessing these public transit stops from your home. Do you still access these places?

transp_location <- locations[locations$location_category == 16, ] %>% filter(location_current == 1)

bm + geom_sf(data = transp_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

3.5.2 Are there other public transit stops you access from your home at least once per month?

NB: Variable grp_ptransit_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.grp_ptransit <- veritas_main[c("interact_id", "grp_ptransit_new_w3")] %>% dplyr::rename(grp_ptransit_new_code = grp_ptransit_new_w3)
.grp_ptransit$grp_ptransit_new <- factor(ifelse(.grp_ptransit$grp_ptransit_new_code == 1, "1 [Yes]",
  ifelse(.grp_ptransit$grp_ptransit_new_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .grp_ptransit) +
  geom_histogram(aes(x = grp_ptransit_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "grp_ptransit_new")

.grp_ptransit_cnt <- .grp_ptransit %>%
  group_by(grp_ptransit_new) %>%
  dplyr::count() %>%
  arrange(grp_ptransit_new)
kable(.grp_ptransit_cnt, caption = "New transit places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

3.6 Section 6: Leisure activities

3.6.1 In Date of Previous Data Collection Wave, you reported doing leisure activities at these locations. Do you still visit these places?

leisure_lut <- data.frame(
  location_category_code = c(17, 18, 19, 20, 21, 22, 23, 24),
  location_category = factor(c(
    "17 [Leisure-time physical activity]",
    "18 [Park]",
    "19 [Cultural activity]",
    "20 [Volunteering place]",
    "21 [Religious or spiritual activity]",
    "22 [Restaurant, café, bar, etc. ]",
    "23 [Take-out]",
    "24 [Walk]"
  ))
)
leisure_location <- locations[locations$location_category %in% leisure_lut$location_category_code, ] %>%
  dplyr::rename(location_category_code = location_category) %>%
  inner_join(leisure_lut, by = "location_category_code")

# map
bm + geom_sf(data = leisure_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of leisure locations by category
ggplot(data = leisure_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Leisure locations by categories")

.location_category_cnt <- as.data.frame(leisure_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Leisure locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Leisure locations by categories
location_category	n
17 [Leisure-time physical activity]	88
18 [Park]	154
19 [Cultural activity]	5
20 [Volunteering place]	17
21 [Religious or spiritual activity]	2
22 [Restaurant, café, bar, etc. ]	129
23 [Take-out]	49
24 [Walk]	143

# compute statistics on leisure locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(leisure_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE
.dummy <- tibble( # tibble() replaces the deprecated data_frame()
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- tibble(
    interact_id = as.character(iid),
    location_category = leisure_lut$location_category
  )
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_leisure_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)

.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_leisure_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of leisure locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of leisure locations by participant and category
location_category	mean	median	max
17 [Leisure-time physical activity]	0.85	0.5	7
18 [Park]	1.48	1.0	7
19 [Cultural activity]	0.05	0.0	2
20 [Volunteering place]	0.16	0.0	2
21 [Religious or spiritual activity]	0.02	0.0	1
22 [Restaurant, café, bar, etc. ]	1.24	1.0	6
23 [Take-out]	0.47	0.0	3
24 [Walk]	1.38	1.0	6

3.6.2 Thinking about the places where you do leisure activities, are there other parks, gyms, movie theaters, concert halls, churches, temples, restaurants, cafés, bars or any places where you do leisure activities and that you visit at least once per month?

NB: Variable grp_leisure_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.grp_leisure <- veritas_main[c("interact_id", "grp_leisure_new_w3")] %>% dplyr::rename(grp_leisure_new_code = grp_leisure_new_w3)
.grp_leisure$grp_leisure_new <- factor(ifelse(.grp_leisure$grp_leisure_new_code == 1, "1 [Yes]",
  ifelse(.grp_leisure$grp_leisure_new_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .grp_leisure) +
  geom_histogram(aes(x = grp_leisure_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "grp_leisure_new")

.grp_leisure_cnt <- .grp_leisure %>%
  group_by(grp_leisure_new) %>%
  dplyr::count() %>%
  arrange(grp_leisure_new)
kable(.grp_leisure_cnt, caption = "New leisure places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

3.7 Section 7: Other places/activities

3.7.1 Here are the other places you reported regularly visiting in Date of Previous Data Collection Wave. Do you still visit these places?

other_location <- locations[locations$location_category == 25, ] %>% filter(location_current == 1)

bm + geom_sf(data = other_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

3.7.2 Are there other places that you go to at least once per month that we have not mentioned? For example: a mall, a daycare, a hardware store, or a community center.

NB: Variable other_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.other <- veritas_main[c("interact_id", "other_new_w3")] %>% dplyr::rename(other_new_code = other_new_w3)
.other$other_new <- factor(ifelse(.other$other_new_code == 1, "1 [Yes]",
  ifelse(.other$other_new_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .other) +
  geom_histogram(aes(x = other_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other_new")

.other_cnt <- .other %>%
  group_by(other_new) %>%
  dplyr::count() %>%
  arrange(other_new)
kable(.other_cnt, caption = "New other places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

3.8 Section 8: Areas of change

Participants were not asked for areas of change in Victoria.

3.9 Section 9: Social contact

3.9.1 In Date of Previous Data Collection Wave, you reported visiting people at their home. Do you still visit these places?

visiting_location <- locations[locations$location_category == 26, ] %>% filter(location_current == 1)

bm + geom_sf(data = visiting_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

3.9.2 Do you visit anyone else at his or her home at least once per month?

NB: Variable visiting_new has not been properly recorded in Victoria wave 2 for returning participants.

# extract and recode
.visiting <- veritas_main[c("interact_id", "visiting_new_w3")] %>% dplyr::rename(visiting_code = visiting_new_w3)
.visiting$visiting_new <- factor(ifelse(.visiting$visiting_code == 1, "1 [Yes]",
  ifelse(.visiting$visiting_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .visiting) +
  geom_histogram(aes(x = visiting_new), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "visiting_new")

.visiting_cnt <- .visiting %>%
  group_by(visiting_new) %>%
  dplyr::count() %>%
  arrange(visiting_new)
kable(.visiting_cnt, caption = "Social contact") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

3.9.3 Great, we are almost done completing this questionnaire. You have documented all your activity places on a map, and specified with whom you generally do these activities. These last few questions concern the people you documented earlier.

# compute statistics on groups / participant
# > one needs to account for participants who did not report any group
.gr_iid_cnt <- as.data.frame(group[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas group
.no_gr_iid <- anti_join(veritas_main[c("interact_id")], .gr_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.gr_iid_cnt <- bind_rows(.gr_iid_cnt, .no_gr_iid)

kable(t(as.matrix(summary(.gr_iid_cnt$n))), caption = "Number of groups per participant") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of groups per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	2.038461	3	8

# compute statistics on people / participant
# > one needs to account for participants who did not report any people
.pl_iid_cnt <- as.data.frame(people[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas people
.no_pl_iid <- anti_join(veritas_main[c("interact_id")], .pl_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.pl_iid_cnt <- bind_rows(.pl_iid_cnt, .no_pl_iid)

kable(t(as.matrix(summary(.pl_iid_cnt$n))), caption = "Number of people per participant") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	3	5	6.778846	10	34

# histogram
.sc_iid_cnt <- .pl_iid_cnt %>% mutate(soc_type = "people")
.sc_iid_cnt <- .gr_iid_cnt %>%
  mutate(soc_type = "group") %>%
  bind_rows(.sc_iid_cnt)

ggplot(data = .sc_iid_cnt) +
  geom_histogram(aes(x = n, y = after_stat(count), fill = soc_type), position = "dodge") + # after_stat() replaces the deprecated stat()
  labs(x = "Social network size by element type", fill = element_blank())

3.9.3.1 Among these people, who do you discuss important matters with?

# extract number of important people / participant
.n_important <- important %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_imp <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_important, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_important = n.y) %>%
  mutate(pct = 100 * n_important / n_people)

kable(t(as.matrix(summary(.n_people_imp$n_important))),
  caption = "Number of important people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of important people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	3	3.6	5.2	11

kable(t(as.matrix(summary(.n_people_imp$pct))),
  caption = "% of important people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of important people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	37.7	59.2	62	100	100	2

NB: In Victoria, 3 participants also listed groups in this category, hence the max value over 100%.
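
The NA's column in the percentage table most likely comes from participants who reported no people at all: their percentage is 0 / 0, which R evaluates to `NaN` and `summary()` reports under NA's. A minimal sketch with hypothetical counts:

```r
# Hypothetical counts for three participants; the last one reported no people
n_people <- c(4, 10, 0)
n_important <- c(2, 10, 0)

pct <- 100 * n_important / n_people

# 0 / 0 evaluates to NaN, which summary() counts under NA's
pct_summary <- summary(pct)
```

The other statistics returned by `summary()` are computed on the remaining participants only, so the mean percentage describes participants with at least one reported person.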

3.9.3.2 Among these people, who do you like to socialize with?

# extract number of people to socialize with / participant
.n_socialize <- socialize %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_soc <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_socialize, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_socialize = n.y) %>%
  mutate(pct = 100 * n_socialize / n_people)

kable(t(as.matrix(summary(.n_people_soc$n_socialize))),
  caption = "Number of people with whom to socialize per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people with whom to socialize per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	4	4.7	7	16

kable(t(as.matrix(summary(.n_people_soc$pct))),
  caption = "% of people with whom to socialize among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of people with whom to socialize among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	50	81.7	74.1	100	100	2

3.9.3.3 Among these people, who do you meet often with but do not necessarily feel close to?

# extract number of not so close people / participant
.n_not_close <- not_close %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_not_close <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_not_close, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_not_close = n.y) %>%
  mutate(pct = 100 * n_not_close / n_people)

kable(t(as.matrix(summary(.n_people_not_close$n_not_close))),
  caption = "Number of not so close people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of not so close people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	0.8	1	6

kable(t(as.matrix(summary(.n_people_not_close$pct))),
  caption = "% of not so close people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of not so close people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	0	0	9.5	14.7	100	2

3.9.3.4 Among these people, who knows whom?

# extract number of who knows who relationships
.n_relat <- relationship %>%
  filter(relationship_type == 1) %>%
  dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_relat <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_relat, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_relat = n.y) %>%
  mutate(pct = 100 * n_relat * 2 / (n_people * (n_people - 1))) # potential number of relationships = N x (N -1) / 2

kable(t(as.matrix(summary(.n_people_relat$n_relat))),
  caption = "Number of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	6	14.9	19.5	189

kable(t(as.matrix(summary(.n_people_relat$pct))),
  caption = "% of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	33.4	60	61.3	100	100	14
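
The percentage above is a network density: the number of reported "who knows who" ties divided by the number of possible undirected pairs, N x (N - 1) / 2. The sketch below wraps that formula in a helper; the name `tie_density_pct` is hypothetical, and returning `NA` for participants with fewer than two people is an assumption made here for readability (the report's code yields `NaN` in that case, which `summary()` also counts under NA's).

```r
# Share (in %) of observed ties among all possible undirected pairs
tie_density_pct <- function(n_ties, n_people) {
  possible <- n_people * (n_people - 1) / 2
  # with fewer than two people no pair exists, so the density is undefined
  ifelse(possible > 0, 100 * n_ties / possible, NA_real_)
}

# 4 people allow 6 pairs: 3 ties give 50 %, 6 ties give 100 %
density_example <- tie_density_pct(c(3, 6, 0, 0), c(4, 4, 1, 0))
```

This also explains why this table has more NA's than the previous ones: participants with exactly one reported person have a defined share of important people but no possible pair.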

3.10 Derived metrics

3.10.1 Transportation mode preferences

Based on the answers to the question Usually, how do you go there? (Check all that apply.).

# code  en
# 1 By car and you drive
# 2 By car and someone else drives
# 3 By taxi/Uber
# 4 On foot
# 5 By bike
# 6 By bus
# 7 By subway
# 8 By train
# 99    Other

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.tm <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1 & location_current == 1) %>%
  left_join(loc_labels, by = "location_category") # explicit key avoids the join message

.tm_grouped <- .tm %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "By car (driver)" = sum(location_tmode_1),
    "By car (passenger)" = sum(location_tmode_2),
    "By taxi/Uber" = sum(location_tmode_3),
    "On foot" = sum(location_tmode_4),
    "By bike" = sum(location_tmode_5),
    "By bus" = sum(location_tmode_6),
    "By train" = sum(location_tmode_7),
    "Other" = sum(location_tmode_99)
  )

kable(.tm_grouped, caption = "Transportation mode preferences") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Transportation mode preferences
description	N	By car (driver)	By car (passenger)	On foot	By bike	By bus	By train	Other
2 [Other residence]	13	8	1	2	3	1	1	2
3 [Work]	139	26	3	8	61	5	0	7
4 [School/College/University]	3	1	0	0	1	0	0	1
5 [Supermarket]	280	142	22	98	118	2	0	0
6 [Public/farmer’s market]	26	10	5	9	12	0	0	0
7 [Bakery]	31	10	2	14	17	0	0	0
8 [Specialty food store]	62	25	6	21	24	2	0	0
9 [Convenience store/Dépanneur]	5	1	0	4	1	0	0	0
10 [Liquor store/SAQ]	62	16	2	30	20	1	0	1
11 [Bank]	36	13	2	10	17	0	0	0
12 [Hair salon/barbershop]	31	12	0	7	20	1	0	0
13 [Post office]	33	7	0	18	12	0	0	0
14 [Drugstore]	60	20	2	32	26	0	0	0
15 [Doctor/healthcare provider]	67	21	3	19	38	2	0	0
16 [Public transit stop]	11	0	0	9	1	1	0	0
17 [Leisure-time physical activity]	88	27	7	16	47	0	0	0
18 [Park]	154	21	10	94	51	0	0	2
19 [Cultural activity]	5	0	0	0	3	0	0	0
20 [Volunteering place]	17	7	0	4	5	0	0	0
21 [Religious/spiritual activity]	2	0	0	0	1	0	0	0
22 [Restaurant, café, bar, etc.]	129	17	11	73	53	1	0	1
23 [Take-out]	49	16	8	18	18	0	0	0
24 [Walk]	143	29	0	97	37	1	0	0
25 [Other place]	212	102	16	60	90	1	1	3
26 [Social contact residence]	110	60	16	28	47	1	0	0

# graph
.tm1 <- .tm %>%
  filter(location_tmode_1 == 1) %>%
  mutate(tm = "[1] By car (driver)")
.tm2 <- .tm %>%
  filter(location_tmode_2 == 1) %>%
  mutate(tm = "[2] By car (passenger)")
.tm3 <- .tm %>%
  filter(location_tmode_3 == 1) %>%
  mutate(tm = "[3] By taxi/Uber")
.tm4 <- .tm %>%
  filter(location_tmode_4 == 1) %>%
  mutate(tm = "[4] On foot")
.tm5 <- .tm %>%
  filter(location_tmode_5 == 1) %>%
  mutate(tm = "[5] By bike")
.tm6 <- .tm %>%
  filter(location_tmode_6 == 1) %>%
  mutate(tm = "[6] By bus")
.tm7 <- .tm %>%
  filter(location_tmode_7 == 1) %>%
  mutate(tm = "[7] By train")
.tm99 <- .tm %>%
  filter(location_tmode_99 == 1) %>%
  mutate(tm = "[99] Other")
.tm <- bind_rows(.tm1, .tm2) %>%
  bind_rows(.tm3) %>%
  bind_rows(.tm4) %>%
  bind_rows(.tm5) %>%
  bind_rows(.tm6) %>%
  bind_rows(.tm7) %>%
  bind_rows(.tm99)

# histogram of answers
ggplot(data = .tm) +
  geom_bar(aes(x = fct_rev(description), fill = tm), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Transport modes") +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of transportation mode by location category", x = element_blank()) +
  coord_flip() +
  theme(legend.position = "bottom", legend.justification = c(0, 0), legend.text = element_text(size = 8)) +
  guides(fill = guide_legend(nrow = 3))
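
The eight filter-and-bind steps used to build the plotting table amount to reshaping the `location_tmode_*` indicator columns from wide to long format. As a minimal sketch, assuming tidyr is available and using a hypothetical three-mode toy table (`toy_tm` and `tm_labels` are illustrative names), the same structure can be obtained with `pivot_longer()`:

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): one row per location, one 0/1 column per mode
toy_tm <- tibble(
  description = c(" 3 [Work]", " 3 [Work]", "18 [Park]"),
  location_tmode_1 = c(1, 0, 0),
  location_tmode_4 = c(0, 1, 1),
  location_tmode_5 = c(1, 0, 1)
)

# Label shown in the legend for each indicator column
tm_labels <- c(
  location_tmode_1 = "[1] By car (driver)",
  location_tmode_4 = "[4] On foot",
  location_tmode_5 = "[5] By bike"
)

toy_tm_long <- toy_tm %>%
  pivot_longer(starts_with("location_tmode_"), names_to = "tm_var", values_to = "checked") %>%
  # keep one row per location and checked mode
  filter(checked == 1) %>%
  mutate(tm = unname(tm_labels[tm_var])) %>%
  select(description, tm)
```

Because the question allows several answers, a location can contribute more than one row, so the bar heights are proportions of mode mentions, not of locations.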

3.10.2 Visiting places alone

Based on the answers to the question Do you usually go to this place alone or with other people?.

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.alone <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1 & location_current == 1) %>%
  left_join(loc_labels, by = "location_category") %>% # explicit key avoids the join message
  mutate(location_alone_recode = case_when(
    location_alone2 == 1 ~ 1,
    location_alone2 == 2 ~ 0
  ))

.alone_grouped <- .alone %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "Visited alone" = sum(location_alone_recode),
    "Visited alone (%)" = round(sum(location_alone_recode) * 100.0 / n(), digits = 1)
  )

kable(.alone_grouped, caption = "Visiting places alone") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visiting places alone
description	N	Visited alone	Visited alone (%)
2 [Other residence]	13	1	7.7
3 [Work]	139	57	41.0
4 [School/College/University]	3	2	66.7
5 [Supermarket]	280	193	68.9
6 [Public/farmer’s market]	26	14	53.8
7 [Bakery]	31	22	71.0
8 [Specialty food store]	62	47	75.8
9 [Convenience store/Dépanneur]	5	4	80.0
10 [Liquor store/SAQ]	62	52	83.9
11 [Bank]	36	31	86.1
12 [Hair salon/barbershop]	31	31	100.0
13 [Post office]	33	28	84.8
14 [Drugstore]	60	51	85.0
15 [Doctor/healthcare provider]	67	59	88.1
16 [Public transit stop]	11	11	100.0
17 [Leisure-time physical activity]	88	26	29.5
18 [Park]	154	46	29.9
19 [Cultural activity]	5	1	20.0
20 [Volunteering place]	17	5	29.4
21 [Religious/spiritual activity]	2	0	0.0
22 [Restaurant, café, bar, etc.]	129	36	27.9
23 [Take-out]	49	24	49.0
24 [Walk]	143	53	37.1
25 [Other place]	212	97	45.8
26 [Social contact residence]	110	NA	NA
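
The NA values in the last row follow from how `sum()` treats missing data: a single `NA` in `location_alone_recode` makes the whole sum `NA`. The sketch below, on hypothetical toy data, shows a variant that ignores missing answers and reports how many answers were available. Note that it changes the denominator of the percentage from all locations to answered locations, which is an assumption of this sketch and not what the report computes.

```r
library(dplyr)

# Toy data (hypothetical): 1 = alone, 0 = with someone, NA = no answer
toy_alone <- tibble(
  description = c("18 [Park]", "18 [Park]", "18 [Park]", "26 [Social contact residence]"),
  location_alone_recode = c(1, 0, NA, NA)
)

toy_alone_grouped <- toy_alone %>%
  group_by(description) %>%
  summarise(
    N = n(),
    answered = sum(!is.na(location_alone_recode)),
    visited_alone = sum(location_alone_recode, na.rm = TRUE),
    # share among answered locations; NaN when no answer is available
    visited_alone_pct = round(100 * mean(location_alone_recode, na.rm = TRUE), 1)
  )
```

Reporting `answered` next to `N` makes it visible when a percentage rests on only part of the locations in a category.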

# histogram of answers
ggplot(data = .alone) +
  geom_bar(aes(x = fct_rev(description), fill = factor(location_alone2)), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Visiting places", labels = c("N/A", "Alone", "With someone")) +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of places visited alone", x = element_blank()) +
  coord_flip()

3.10.3 Visit frequency

Based on the answers to the question How often do you go there?.

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.freq <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1 & location_current == 1) %>%
  left_join(loc_labels, by = "location_category") # explicit key avoids the join message

.freq_grouped <- .freq %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), min = min(location_freq_visit),
    max = max(location_freq_visit),
    mean = mean(location_freq_visit),
    median = median(location_freq_visit),
    sd = sd(location_freq_visit)
  )

kable(.freq_grouped, caption = "Visit frequency (expressed in times/year)") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visit frequency (expressed in times/year)
description	N	min	max	mean	median	sd
2 [Other residence]	13	3	156	43.923077	24	43.622360
3 [Work]	139	0	364	157.122302	208	117.679648
4 [School/College/University]	3	0	416	156.000000	52	226.662745
5 [Supermarket]	280	1	260	43.650000	24	43.476591
6 [Public/farmer’s market]	26	1	104	32.461539	24	27.530682
7 [Bakery]	31	3	156	36.935484	24	38.765049
8 [Specialty food store]	62	1	156	25.370968	12	23.482281
9 [Convenience store/Dépanneur]	5	12	52	35.200000	36	17.527122
10 [Liquor store/SAQ]	62	3	208	31.693548	24	35.308286
11 [Bank]	36	2	104	16.333333	12	17.443787
12 [Hair salon/barbershop]	31	1	52	8.032258	6	9.012894
13 [Post office]	33	1	52	14.151515	12	9.877757
14 [Drugstore]	60	3	156	24.833333	12	24.772980
15 [Doctor/healthcare provider]	67	1	96	14.074627	12	16.216633
16 [Public transit stop]	11	1	156	24.545455	12	44.587809
17 [Leisure-time physical activity]	88	1	1248	87.159091	52	148.545893
18 [Park]	154	1	520	72.064935	24	92.397173
19 [Cultural activity]	5	2	260	74.000000	52	106.826963
20 [Volunteering place]	17	4	364	70.705882	36	111.912781
21 [Religious/spiritual activity]	2	364	364	364.000000	364	0.000000
22 [Restaurant, café, bar, etc.]	129	1	364	29.007752	12	46.093061
23 [Take-out]	49	2	156	19.714286	12	22.850055
24 [Walk]	143	1	364	85.447552	36	105.887309
25 [Other place]	212	1	260	33.575472	24	43.624409
26 [Social contact residence]	110	1	156	31.554545	24	32.278202

# graph
ggplot(data = .freq) +
  geom_boxplot(aes(x = fct_rev(description), y = location_freq_visit)) +
  labs(y = "Visits/year (Frequency over 1 visit/day not shown)", x = element_blank()) +
  # zoom on 0-365 instead of filtering, so box statistics still use every observation
  coord_flip(ylim = c(0, 365))
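
The axis label states that frequencies above one visit per day are not shown. In ggplot2, limits set on a continuous scale discard out-of-range values before the box statistics are computed, while limits set on the coordinate system only zoom the view. The base R sketch below, on made-up values, illustrates how much the median can move when high frequencies are dropped first.

```r
# Toy frequencies (visits/year); two values exceed one visit per day
freq <- c(12, 24, 24, 52, 104, 208, 364, 520, 1248)

# Five-number statistics computed on all values
stats_all <- boxplot.stats(freq)$stats

# Same statistics once values above 365 have been dropped
stats_trunc <- boxplot.stats(freq[freq <= 365])$stats

stats_all[3]    # median of all values: 104
stats_trunc[3]  # median after dropping high frequencies: 52
```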

3.10.4 Spatial indicators: Camille Perchoux’s toolbox

-- Reading Camille tbx indics from Essence table
SELECT interact_id,
  n_acti_places, n_weekly_vst, n_acti_types,
  cvx_perimeter, cvx_surface,
  min_length, max_length, median_length, 
  pct_visits_neighb, 
  n_acti_prn, pct_visits_prn, prn_area_km2
FROM essence_table.essence_perchoux_tbx
WHERE city_id = 'Victoria' AND wave_id = 3 AND status = 'return'

3.10.4.1 Indicators related to the lifestyle

Indicators	Measurement approach
Number of activity places (`n_acti_places`)	Count of activity places
Number of visits to places per week (`n_weekly_vst`)	Sum of the weekly visit frequencies over all activity places of the participant, excluding the residence
Number of activity types (`n_acti_types`)	6 types of activities considered: 1-Residential; 2-Occupation; 3-Shopping activities; 4-Services; 5-Transportation; 6-Leisure activities (NB: the original categories were as follows: 1-Residential; 2-Work; 3-Food and other services; 4-Transport station/stop; 5-Recreational activity; 6-Social activity)
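
As an illustration of the three definitions above, the sketch below derives them from a toy table of activity places. The column names `acti_type` and `visits_per_year` are hypothetical, the residence is assumed to count as an activity place, and the weekly number of visits is assumed to be the yearly total divided by 52. It is not the toolbox's actual implementation.

```r
# Toy activity places for two hypothetical participants.
# `acti_type` follows the six activity types listed above (1 = residential);
# `visits_per_year` is assumed to be expressed in times/year.
places <- data.frame(
  interact_id     = c("A", "A", "A", "A", "B", "B"),
  acti_type       = c(1, 2, 3, 3, 1, 6),
  visits_per_year = c(NA, 260, 52, 104, NA, 26)
)

lifestyle <- do.call(rbind, lapply(split(places, places$interact_id), function(p) {
  not_home <- p$acti_type != 1
  data.frame(
    interact_id   = p$interact_id[1],
    n_acti_places = nrow(p),                                # count of activity places
    n_weekly_vst  = sum(p$visits_per_year[not_home]) / 52,  # weekly visits, residence excluded
    n_acti_types  = length(unique(p$acti_type))             # distinct activity types
  )
}))
lifestyle
```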

LLindic <- ess.tab.camille %>%
  select("interact_id", "n_acti_places", "n_weekly_vst", "n_acti_types")
# summary() returns formatted strings such as "Min.   : 2  "; keep the part after the colon
.llmtx <- as.matrix(summary(LLindic[c("n_acti_places", "n_weekly_vst", "n_acti_types")], digits = 1))
.llmtx <- apply(.llmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
# summary() leaves row names empty, so label the six statistics explicitly
rownames(.llmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
kable(.llmtx, caption = "Lifestyle statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Lifestyle statistics
	n_acti_places	n_weekly_vst	n_acti_types
Min.	2	3	2
1st Qu.	13	12	4
Median	18	17	4
Mean	18	17	4
3rd Qu.	21	22	5
Max.	38	37	6
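
The summary tables in this section are built by splitting the strings printed by `summary()`. A possible alternative, sketched below on made-up values, computes the same six statistics directly and keeps them numeric, which avoids string parsing and scientific notation in the output. The helper name `six_stats` and the toy table are assumptions.

```r
# Toy indicator table standing in for `LLindic` (values are made up)
toy <- data.frame(
  n_acti_places = c(2, 13, 18, 21, 38),
  n_weekly_vst  = c(3, 12, 17, 22, 37)
)

# Six descriptive statistics for one numeric vector, in summary() order
six_stats <- function(x) {
  q <- quantile(x, probs = c(0, .25, .5, .75, 1), na.rm = TRUE, names = FALSE)
  c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
    "Mean" = mean(x, na.rm = TRUE), "3rd Qu." = q[4], "Max." = q[5])
}

# One column per indicator, one row per statistic
stats_mtx <- sapply(toy, six_stats)
stats_mtx
```

The resulting matrix can be rounded with `round()` or `signif()` before being passed to `kable()`.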

3.10.4.2 Indicators related to the geometry of the activity space

Indicators	Measurement approach
Perimeter of the convex hull (`cvx_perimeter`)	Perimeter of the smallest polygon containing all the activity locations of the participant (unit: km)
Surface of the convex hull (`cvx_surface`)	Surface of the smallest polygon containing all the activity locations of the participant (unit: km²)
Minimal road network distance from the residence to an activity place (`min_length`)	Minimal distance from the residence to an activity place using the road network (in meters)
Maximal road network distance from the residence to an activity place (`max_length`)	Maximal distance from the residence to an activity place using the road network (in meters)
Median road network distance from the residence to all activity places (`median_length`)	Median distance from home to all activity places using the road network (in meters)
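
The convex hull indicators can be illustrated with base R alone. The sketch below uses `chull()` on toy coordinates that are assumed to be already projected and expressed in km; real locations would first need to be projected from longitude/latitude. It is a simplified illustration, not the toolbox's implementation, and it does not cover the road network distances.

```r
# Toy activity locations in a projected coordinate system (units: km)
xy <- cbind(x = c(0, 4, 4, 0, 2), y = c(0, 0, 3, 3, 1))

hull_idx <- chull(xy)                              # indices of the hull vertices
hull <- xy[hull_idx, , drop = FALSE]
nxt  <- hull[c(2:nrow(hull), 1), , drop = FALSE]   # successor of each vertex

# Perimeter: sum of the edge lengths around the hull
cvx_perimeter <- sum(sqrt(rowSums((nxt - hull)^2)))

# Surface: shoelace formula (absolute value, so orientation does not matter)
cvx_surface <- abs(sum(hull[, "x"] * nxt[, "y"] - nxt[, "x"] * hull[, "y"])) / 2

cvx_perimeter  # 14 (km)
cvx_surface    # 12 (km²)
```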

ASindic <- ess.tab.camille %>%
  select("interact_id", "cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")
.asmtx <- as.matrix(summary(ASindic[c("cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")], digits = 1))
.asmtx <- apply(.asmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.asmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
.asmtx[is.na(.asmtx)] <- 0
kable(.asmtx, caption = "Activity space statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Activity space statistics
	cvx_perimeter	cvx_surface	min_length	max_length	median_length
Min.	4	0	0	2e+03	500
1st Qu.	16	11	7	6e+03	1397
Median	28	36	48	1e+04	1961
Mean	73	1054	260	2e+04	2389
3rd Qu.	47	84	388	2e+04	2839
Max.	1627	93167	2514	2e+05	7555

3.10.4.3 Indicators related to the importance of the residential neighborhood

Indicators	Measurement approach
Percentage of visits to places in the residential neighborhood (`pct_visits_neighb`)	Count of visits to places within the 500 m road network buffer centered on the residence divided by the total number of visits to places
Number of activity locations in the PRN (`n_acti_prn`)	Count of activity locations in the PRN
Percentage of visits in the PRN (`pct_visits_prn`)	Count of visits to places in the PRN divided by the total number of visits to places
Surface of the PRN (`prn_area_km2`)	Surface area of the PRN polygon (unit: km²)
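
The PRN indicators reduce to counts and shares once each activity place is flagged as inside or outside the PRN. The sketch below assumes a hypothetical logical column `in_prn` (for example, the result of a prior point-in-polygon test) and frequencies in visits per year; the share is expressed on a 0-100 scale.

```r
# Toy activity places for one hypothetical participant.
# `in_prn` flags places located inside the PRN polygon (assumed column).
prn_places <- data.frame(
  visits_per_year = c(260, 52, 104, 24),
  in_prn          = c(FALSE, TRUE, TRUE, FALSE)
)

# Number of activity locations inside the PRN
n_acti_prn <- sum(prn_places$in_prn)

# Share of all visits made to places inside the PRN
pct_visits_prn <- 100 * sum(prn_places$visits_per_year[prn_places$in_prn]) /
  sum(prn_places$visits_per_year)

n_acti_prn      # 2
pct_visits_prn  # 156 / 440 * 100, about 35.5
```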

RNindic <- ess.tab.camille %>%
  select("interact_id", "pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")
RNindic <- RNindic[RNindic$interact_id %in% prn$interact_id, ]
.rnmtx <- as.matrix(summary(RNindic[c("pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")], digits = 1))
.rnmtx <- apply(.rnmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.rnmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
.rnmtx[is.na(.rnmtx)] <- 0
kable(.rnmtx, caption = "Residential neighbourhood statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Residential neighbourhood statistics
	pct_visits_neighb	n_acti_prn	pct_visits_prn	prn_area_km2
Min.	0	0	0	0.02
1st Qu.	3	2	29	0.84
Median	25	4	52	1.58
Mean	28	5	49	2.84
3rd Qu.	46	7	71	2.69
Max.	98	18	100	49.04

Only participants with a valid PRN (n = 99) are included in the residential neighbourhood indicators.

3.10.5 Social indicators: Alexandre Naud’s toolbox

See Alex’s document for a more comprehensive presentation of the social indicators.

-- Reading Alex tbx indics from Essence table
SELECT interact_id,
  people_degree, 
  socialize_size, socialize_meet, socialize_chat,
  important_size, group_degree, simmelian
FROM essence_table.essence_naud_social
WHERE city_id = 'Victoria' AND wave_id = 3 AND status = 'return'

3.10.5.1 Number of people in the network (`people_degree`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = people_degree))

kable(t(as.matrix(summary(ess.tab.alex$people_degree))), caption = "people_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

people_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	5	6.230769	9	29

3.10.5.2 Simmelian Brokerage (`simmelian`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = simmelian))

kable(t(as.matrix(summary(ess.tab.alex$simmelian))), caption = "simmelian") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

simmelian
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
1	3.857143	7.625	9.24397	12.10882	40.57447	4

3.10.5.3 Number of people with whom the participant likes to socialize (`socialize_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = socialize_size))

kable(t(as.matrix(summary(ess.tab.alex$socialize_size))), caption = "socialize_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	4	4.586538	7	16

3.10.5.4 Weekly face-to-face interactions among people with whom the participant likes to socialize (`socialize_meet`)

ggplot(filter(ess.tab.alex, socialize_meet < 100)) +
  geom_histogram(aes(x = socialize_meet)) +
  annotate(geom = "text", x = 75, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_meet))), caption = "socialize_meet") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_meet
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	339.75	395	596.4519	680	5588

3.10.5.5 Weekly ICT interactions among people with whom the participant likes to socialize (`socialize_chat`)

ggplot(filter(ess.tab.alex, socialize_chat < 100)) +
  geom_histogram(aes(x = socialize_chat)) +
  annotate(geom = "text", x = 55, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_chat))), caption = "socialize_chat") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_chat
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	214	514	805.5096	1040	6084

3.10.5.6 Number of people with whom the participant discusses important matters (`important_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = important_size))

kable(t(as.matrix(summary(ess.tab.alex$important_size))), caption = "important_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

important_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	3	3.509615	5	11
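
The count-based indicators (`people_degree`, `socialize_size`, `important_size`) can be pictured as per-participant counts over a table of social contacts. The sketch below is only an illustration under that assumption: the table layout and the flag columns `socialize` and `important` are hypothetical, and the indicators actually stored in the Essence table may be computed differently.

```r
# Toy social contacts for two hypothetical participants.
# `socialize` and `important` are illustrative flags, one row per contact.
contacts <- data.frame(
  interact_id = c("A", "A", "A", "B"),
  socialize   = c(TRUE, TRUE, FALSE, TRUE),
  important   = c(TRUE, FALSE, FALSE, FALSE)
)

# table() and tapply() both order groups by sorted participant id
social <- data.frame(
  interact_id    = sort(unique(contacts$interact_id)),
  people_degree  = as.vector(table(contacts$interact_id)),
  socialize_size = as.vector(tapply(contacts$socialize, contacts$interact_id, sum)),
  important_size = as.vector(tapply(contacts$important, contacts$interact_id, sum))
)
social
```

Participants who reported no contact at all would not appear in such a table and would need to be added back with zeros.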

3.10.5.7 Number of people in all groups (`group_degree`)

ggplot(filter(ess.tab.alex, group_degree < 100)) +
  geom_histogram(aes(x = group_degree)) +
  annotate(geom = "text", x = 20, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$group_degree))), caption = "group_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

group_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	5	7.519231	10.25	34

INTERACT Victoria Participant VERITAS Summary - W3 - new and returning

Benoit THIERRY

04 April, 2024
