INTERACT Montréal Participant VERITAS Summary

2.1 Section 1: Residence and Neighbourhood

2.1.1 Now, let’s start with your home. What is your address?

home_location <- locations[locations$location_category == 1, ]

## version ggmap
mtl_aoi <- st_bbox(home_location)
names(mtl_aoi) <- c("left", "bottom", "right", "top")
mtl_aoi[["left"]] <- mtl_aoi[["left"]] - .05
mtl_aoi[["right"]] <- mtl_aoi[["right"]] + .05
mtl_aoi[["top"]] <- mtl_aoi[["top"]] + .025
mtl_aoi[["bottom"]] <- mtl_aoi[["bottom"]] - .025

bm <- get_stadiamap(mtl_aoi, zoom = 11, maptype = "stamen_toner_lite") %>%
  ggmap(extent = "device")
bm + geom_sf(data = st_jitter(home_location, .008), inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3) # see https://github.com/r-spatial/sf/issues/336

NB: Home locations have been randomly shifted from their original position to protect privacy.
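The idea behind this random shift can be sketched in base R as follows. This is only an illustration under stated assumptions: `jitter_coords` is a hypothetical helper, the two points are made up, and the noise is assumed to be uniform within `amount` coordinate units (degrees here). The exact behaviour of `st_jitter()` may differ.

```r
# Hypothetical helper: shift each coordinate by uniform noise in [-amount, amount].
jitter_coords <- function(coords, amount) {
  noise <- matrix(
    runif(length(coords), min = -amount, max = amount),
    nrow = nrow(coords), ncol = ncol(coords)
  )
  coords + noise
}

set.seed(42) # fixed seed so the illustration is reproducible
# Made-up longitude/latitude pairs (not participant data)
home_xy <- cbind(lon = c(-73.57, -73.62), lat = c(45.50, 45.53))
shifted_xy <- jitter_coords(home_xy, amount = 0.008)

# Largest displacement along any axis; it cannot exceed `amount`
max_shift <- max(abs(shifted_xy - home_xy))
```

With `amount = 0.008`, each coordinate moves by at most 0.008 degrees, so the displayed points stay close to the true locations without revealing them.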

# Number of participants by municipality
home_by_municipalites <- st_join(home_location, municipalities["NAME"]) %>%
  as.data.frame()
home_by_mun_cnt <- home_by_municipalites %>%
  group_by(NAME) %>%
  dplyr::count() %>%
  arrange(desc(n), NAME)
home_by_mun_cnt$Shape <- NULL
kable(home_by_mun_cnt, caption = "Number of participants by municipalities") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of participants by municipalities
NAME	n
Montréal	662
Longueuil	63
Laval	44
Brossard	17
Saint-Lambert	17
Mont-Royal	5
Dollard-Des Ormeaux	4
Montréal-Ouest	4
Pointe-Claire	4
Westmount	4
Baie-D’Urfé	3
Côte-Saint-Luc	2
Beaconsfield	1
Dorval	1
Hampstead	1
Kirkland	1

2.1.2 When did you move to your current address?

# N of addresses by date of move
year_of_move <- veritas_main[c("interact_id", "home_move_date")]
year_of_move$home_move_date <- year(ymd(year_of_move$home_move_date))
ggplot(data = year_of_move) +
  geom_histogram(aes(x = home_move_date))

# recode date of move (order matters: each later assignment overwrites part of the previous one)
year_of_move$home_move_date_recode <- as.character(year_of_move$home_move_date)
year_of_move$home_move_date_recode[year_of_move$home_move_date <= 2005] <- "2005 - 2001"
year_of_move$home_move_date_recode[year_of_move$home_move_date <= 2000] <- "2000 - 1991"
year_of_move$home_move_date_recode[year_of_move$home_move_date <= 1990] <- paste("1990 -", min(year_of_move$home_move_date))

year_of_move_cnt <- year_of_move %>%
  group_by(home_move_date_recode) %>%
  dplyr::count() %>%
  arrange(desc(home_move_date_recode))
kable(year_of_move_cnt, caption = "Year of move to current address") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Year of move to current address
home_move_date_recode	n
2019	1
2018	115
2017	88
2016	86
2015	74
2014	46
2013	42
2012	34
2011	28
2010	37
2009	15
2008	29
2007	13
2006	14
2005 - 2001	77
2000 - 1991	77
1990 - 1936	57
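The same binning could also be expressed with `cut()`, which makes the interval boundaries explicit. The sketch below uses made-up years and a placeholder label (`"recent"`) that is never shown, because years after 2005 keep their own value. It is an alternative formulation, not the code used to build the table above.

```r
# Made-up move years (not participant data)
move_year <- c(1936, 1990, 1991, 2000, 2001, 2005, 2006, 2018)

# Intervals are right-closed: (-Inf, 1990], (1990, 2000], (2000, 2005], (2005, Inf]
period <- cut(
  move_year,
  breaks = c(-Inf, 1990, 2000, 2005, Inf),
  labels = c(paste("1990 -", min(move_year)), "2000 - 1991", "2005 - 2001", "recent")
)

# Keep the exact year after 2005, use the period label otherwise
move_year_recode <- ifelse(move_year > 2005, as.character(move_year), as.character(period))
```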

2.1.3 Please list all of the other places you have lived for six months or longer since 2006 and the corresponding time period.

NB: The counts below cover all addresses since 2006, including the current one.

# Min, max, median & mean N of addresses by participant since 2006
histo_addr_cnt <- histo_address[c("interact_id")] %>%
  bind_rows(veritas_main[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()
kable(t(as.matrix(summary(histo_addr_cnt$n))),
  caption = "Number of residential addresses by participant since 2006",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of residential addresses by participant since 2006
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1	1	2	2.3	3	12

2.1.4 If you were asked to draw the boundaries of your neighbourhood, what would they be?

prn <- poly_geom[poly_geom$area_type == "neighborhood", ]

## version ggmap
bm + geom_sf(data = prn, inherit.aes = FALSE, fill = alpha("blue", 0.05), color = alpha("blue", 0.3))

# Min, max, median & mean area of PRN
prn <- prn %>%
  mutate(area_m2 = st_area(.))
kable(t(as.matrix(summary(prn$area_m2))),
  caption = "Area (in square meters) of the perceived residential neighborhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived residential neighborhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
286.9	1305192	2637861	4391360	5081679	49572318

NB: Only 697 valid neighborhoods were collected, as many participants struggled to draw polygons on the map.
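Areas of this size are often easier to read in square kilometres. As a small worked conversion, the sketch below divides the summary values copied from the table above by 1,000,000. The vector name `prn_area_m2` is introduced here for illustration only.

```r
# Summary values copied from the table above (square metres)
prn_area_m2 <- c(
  Min. = 286.9, `1st Qu.` = 1305192, Median = 2637861,
  Mean = 4391360, `3rd Qu.` = 5081679, Max. = 49572318
)

# 1 km2 = 1,000,000 m2
prn_area_km2 <- round(prn_area_m2 / 1e6, 2)
```

The median perceived neighbourhood is thus about 2.64 km2, while the largest one is close to 50 km2.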

2.1.5 How attached are you to your neighbourhood?

# extract and recode
.ngh_att <- veritas_main[veritas_main$neighbourhood_attach != 99, c("interact_id", "neighbourhood_attach")] %>%
  dplyr::rename(neighbourhood_attach_code = neighbourhood_attach)
.ngh_att$neighbourhood_attach <- factor(ifelse(.ngh_att$neighbourhood_attach_code == 1, "1 [Not attached at all]",
  ifelse(.ngh_att$neighbourhood_attach_code == 6, "6 [Very attached]",
    .ngh_att$neighbourhood_attach_code
  )
))

# histogram of attachment
ggplot(data = .ngh_att) +
  geom_histogram(aes(x = neighbourhood_attach), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "neighbourhood_attach")

.ngh_att_cnt <- .ngh_att %>%
  group_by(neighbourhood_attach) %>%
  dplyr::count() %>%
  arrange(neighbourhood_attach)
kable(.ngh_att_cnt, caption = "Neighbourhood attachment") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Neighbourhood attachment
neighbourhood_attach	n
1 [Not attached at all]	22
2	51
3	81
4	138
5	276
6 [Very attached]	255
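The nested `ifelse()` recoding above could also be written by passing `levels` and `labels` to `factor()` directly. The sketch below uses made-up response codes; it is one possible alternative, not the report's own code.

```r
# Made-up response codes on the 1-6 scale (not participant data)
attach_code <- c(1, 6, 3, 5, 6, 2)

attach_labels <- c("1 [Not attached at all]", "2", "3", "4", "5", "6 [Very attached]")
attach_fct <- factor(attach_code, levels = 1:6, labels = attach_labels)

# table() keeps empty levels, so a category with no answers still shows up as 0
attach_cnt <- table(attach_fct)
```

One side effect of this design is that codes outside 1 to 6 (such as 99) become `NA` rather than needing to be filtered out beforehand.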

2.1.6 On average, how many hours per day do you spend outside of your home?

# histogram of n hours out
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out))

# Min, max, median & mean hours/day out
kable(t(as.matrix(summary(veritas_main$hours_out))),
  caption = "Hours/day outside home",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside home
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	4	8	7.4	10	24

2.1.7 Of this time spent outside your home, on average how many hours do you spend outside your neighbourhood?

# histogram of n hours out of neighbourhood
ggplot(data = veritas_main) +
  geom_histogram(aes(x = hours_out_neighb))

# Min, max, median & mean hours/day out of neighborhood
kable(t(as.matrix(summary(veritas_main$hours_out_neighb))),
  caption = "Hours/day outside neighbourhood",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Hours/day outside neighbourhood
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	7	5.6	9	24

2.1.8 Are there one or more areas close to where you live that you tend to avoid because you do not feel safe there (for any reason)?

# extract and recode
.unsafe <- veritas_main[c("interact_id", "unsafe_area")] %>% dplyr::rename(unsafe_area_code = unsafe_area)
.unsafe$unsafe_area <- factor(ifelse(.unsafe$unsafe_area_code == 1, "1 [Yes]",
  ifelse(.unsafe$unsafe_area_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .unsafe) +
  geom_histogram(aes(x = unsafe_area), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "unsafe_area")

.unsafe_cnt <- .unsafe %>%
  group_by(unsafe_area) %>%
  dplyr::count() %>%
  arrange(unsafe_area)
kable(.unsafe_cnt, caption = "unsafe_area") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

unsafe_area
unsafe_area	n
1 [Yes]	103
2 [No]	730
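The same Yes/No recoding is repeated for several questions in this report, so it could be factored into a small function. The sketch below assumes a hypothetical helper named `recode_yes_no`. Unlike the `==` comparison used above, it relies on `%in%`, so missing values are labelled "N/A" instead of staying `NA`.

```r
# Hypothetical helper: map 1 -> "1 [Yes]", 2 -> "2 [No]", anything else -> "N/A"
recode_yes_no <- function(code) {
  label <- ifelse(code %in% 1, "1 [Yes]",
    ifelse(code %in% 2, "2 [No]", "N/A")
  )
  factor(label, levels = c("1 [Yes]", "2 [No]", "N/A"))
}

# Made-up codes, including a missing value and a non-response code
answers <- recode_yes_no(c(1, 2, 2, NA, 99))
```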

# map
unsafe <- poly_geom[poly_geom$area_type == "unsafe area", ]

## version ggmap
bm + geom_sf(data = unsafe, inherit.aes = FALSE, fill = alpha("blue", 0.3), color = alpha("blue", 0.5))

# Min, max, median & mean area of unsafe areas
unsafe <- unsafe %>%
  mutate(area_m2 = st_area(.))
kable(t(as.matrix(summary(unsafe$area_m2))),
  caption = "Area (in square meters) of the perceived unsafe area",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived unsafe area
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1408.7	90013.6	277801	670393.3	690785.5	5632692

2.1.9 Do you spend the night somewhere other than your home at least once per week?

# extract and recode
.o_res <- veritas_main[c("interact_id", "other_resid")] %>% dplyr::rename(other_resid_code = other_resid)
.o_res$other_resid <- factor(ifelse(.o_res$other_resid_code == 1, "1 [Yes]",
  ifelse(.o_res$other_resid_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .o_res) +
  geom_histogram(aes(x = other_resid), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other_resid")

.o_res_cnt <- .o_res %>%
  group_by(other_resid) %>%
  dplyr::count() %>%
  arrange(other_resid)
kable(.o_res_cnt, caption = "Other residence") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Other residence
other_resid	n
1 [Yes]	101
2 [No]	732

2.2 Section 2: Occupation

2.2.1 Are you currently working?

# extract and recode
.work <- veritas_main[c("interact_id", "working")] %>% dplyr::rename(working_code = working)
.work$working <- factor(ifelse(.work$working_code == 1, "1 [Yes]",
  ifelse(.work$working_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .work) +
  geom_histogram(aes(x = working), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "working")

.work_cnt <- .work %>%
  group_by(working) %>%
  dplyr::count() %>%
  arrange(working)
kable(.work_cnt, caption = "Currently working") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently working
working	n
1 [Yes]	601
2 [No]	232

2.2.2 Where do you work?

work_location <- locations[locations$location_category == 3, ]

bm + geom_sf(data = work_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.2.3 On average, how many hours per week do you work?

# histogram of work hours/week
ggplot(data = veritas_main[veritas_main$working == 1, ]) +
  geom_histogram(aes(x = work_hours))

# Min, max, median & mean work hours/week
kable(t(as.matrix(summary(veritas_main$work_hours[veritas_main$working == 1]))),
  caption = "Work hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Work hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	32	36	34.9	40	70

2.2.4 Which of the following categories best describes the amount of physical activity required for your job?

# extract and recode
.work_pa <- veritas_main[veritas_main$working == 1, c("interact_id", "work_pa")] %>% dplyr::rename(work_pa_code = work_pa)
.work_pa$work_pa <- factor(ifelse(.work_pa$work_pa_code == 1, "1 [Mainly sitting with slight arm movements]",
  ifelse(.work_pa$work_pa_code == 2, "2 [Sitting and standing with some walking]",
    ifelse(.work_pa$work_pa_code == 3, "3 [Walking, with some handling of materials generally weighing less than 25 kg (55 lbs)]",
      ifelse(.work_pa$work_pa_code == 4, "4 [Walking and heavy manual work often requiring handling of materials weighing over 25 kg (50 lbs)]", "N/A")
    )
  )
))

# histogram of answers
ggplot(data = .work_pa) +
  geom_histogram(aes(x = work_pa), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Physical activity at work")

.work_pa_cnt <- .work_pa %>%
  group_by(work_pa) %>%
  dplyr::count() %>%
  arrange(work_pa)
kable(.work_pa_cnt, caption = "Physical activity at work") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Physical activity at work
work_pa	n
1 [Mainly sitting with slight arm movements]	328
2 [Sitting and standing with some walking]	202
3 [Walking, with some handling of materials generally weighing less than 25 kg (55 lbs)]	63
4 [Walking and heavy manual work often requiring handling of materials weighing over 25 kg (50 lbs)]	8

2.2.5 Are you currently a registered student?

# extract and recode
.study <- veritas_main[c("interact_id", "studying")] %>% dplyr::rename(studying_code = studying)
.study$studying <- factor(ifelse(.study$studying_code == 1, "1 [Yes]",
  ifelse(.study$studying_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .study) +
  geom_histogram(aes(x = studying), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Studying")

.study_cnt <- .study %>%
  group_by(studying) %>%
  dplyr::count() %>%
  arrange(studying)
kable(.study_cnt, caption = "Currently studying") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Currently studying
studying	n
1 [Yes]	126
2 [No]	707

2.2.6 Where do you study?

study_location <- locations[locations$location_category == 4, ]

bm + geom_sf(data = study_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.2.7 On average, how many hours per week do you study?

# histogram of study hours/week
ggplot(data = veritas_main[veritas_main$studying == 1, ]) +
  geom_histogram(aes(x = study_hours))

# Min, max, median & mean study hours/week
kable(t(as.matrix(summary(veritas_main$study_hours[veritas_main$studying == 1]))),
  caption = "study hours/week",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

study hours/week
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	10	20	23.5	34.2	65

2.3 Section 3: Shopping activities

The following questions are used to generate the locations grouped into this section:

Do you shop for groceries at a supermarket at least once per month?
Do you shop at a public/farmer’s market at least once per month?
Do you shop at a bakery at least once per month?
Do you go to a specialty food store at least once per month? For example: a cheese shop, fruit and vegetable store, butcher’s shop, natural and health food store.
Do you go to a convenience store at least once per month?
Do you go to a liquor store at least once per month?

shop_location <- locations[locations$location_category %in% c(5, 6, 7, 8, 9, 10), ] %>% dplyr::rename(location_category_code = location_category)
shop_location$location_category <- factor(ifelse(shop_location$location_category_code == 5, " 5 [Supermarket]",
  ifelse(shop_location$location_category_code == 6, " 6 [Public/farmer’s market]",
    ifelse(shop_location$location_category_code == 7, " 7 [Bakery]",
      ifelse(shop_location$location_category_code == 8, " 8 [Specialty food store]",
        ifelse(shop_location$location_category_code == 9, " 9 [Convenience store/Dépanneur]", "10 [Liquor store/SAQ]")
      )
    )
  )
))

# map
bm + geom_sf(data = shop_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of shopping locations by category
ggplot(data = shop_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Shopping locations by categories")

.location_category_cnt <- as.data.frame(shop_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Shopping locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Shopping locations by categories
location_category	n
5 [Supermarket]	2205
6 [Public/farmer’s market]	427
7 [Bakery]	621
8 [Specialty food store]	711
9 [Convenience store/Dépanneur]	466
10 [Liquor store/SAQ]	697
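The labels in the code above are padded with a leading space so that alphabetical sorting puts category 5 before category 10. An alternative, sketched below with made-up codes, is a named lookup vector combined with explicit factor levels, so the display order follows the lookup rather than the label text. The name `shop_labels` is an assumption introduced for this illustration.

```r
# Lookup from category code to label, in the intended display order
shop_labels <- c(
  "5" = "5 [Supermarket]",
  "6" = "6 [Public/farmer’s market]",
  "7" = "7 [Bakery]",
  "8" = "8 [Specialty food store]",
  "9" = "9 [Convenience store/Dépanneur]",
  "10" = "10 [Liquor store/SAQ]"
)

# Made-up category codes (not participant data)
codes <- c(10, 5, 9, 5, 7)

# Explicit levels fix the order, so no padding is needed in the labels
shop_fct <- factor(shop_labels[as.character(codes)], levels = unname(shop_labels))
```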

# compute statistics on shopping locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(shop_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE, i.e. a cross join of participants and categories
.dummy <- data.frame(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- distinct(.loc_iid_category_cnt[c("location_category")])
  .dmy$interact_id <- as.character(iid)
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_shop_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_shop_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of shopping locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of shopping locations by participant and category
location_category	mean	median	max
5 [Supermarket]	2.65	2	5
6 [Public/farmer’s market]	0.51	0	4
7 [Bakery]	0.75	0	5
8 [Specialty food store]	0.85	0	5
9 [Convenience store/Dépanneur]	0.56	0	5
10 [Liquor store/SAQ]	0.84	1	5
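The loop used above to build every participant/category combination, and then fill in the missing ones with 0, can be written more compactly with a cross join followed by a left join. The following base R sketch uses made-up data and illustrative object names (`counts`, `participants`, `grid`, `full`, `stats`); it is not the code that produced the table.

```r
# Made-up counts: participant "c" reported no location at all,
# and participant "b" reported none in category "6"
counts <- data.frame(
  interact_id = c("a", "a", "b"),
  location_category = c("5", "6", "5"),
  n = c(2, 1, 3),
  stringsAsFactors = FALSE
)
participants <- c("a", "b", "c")

# Cross join: one row per participant/category combination
grid <- expand.grid(
  interact_id = participants,
  location_category = unique(counts$location_category),
  stringsAsFactors = FALSE
)

# Left join the observed counts, then set the unmatched combinations to 0
full <- merge(grid, counts, by = c("interact_id", "location_category"), all.x = TRUE)
full$n[is.na(full$n)] <- 0

# Per-category means now include the zeros
stats <- aggregate(n ~ location_category, data = full, FUN = mean)
```

Without the zero-filling step, the mean for category "6" would be computed over the single participant who reported a location, which overstates how many locations a typical participant has.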

2.4 Section 4: Services

The following questions are used to generate the locations grouped into this section:

Where is the bank you go to most often located?
Where is the hair salon or barber shop you go to most often?
Where is the post office where you go to most often?
Where is the drugstore you go to most often?
If you need to visit a doctor or other healthcare provider, where do you go most often?

serv_location <- locations[locations$location_category %in% c(11, 12, 13, 14, 15), ] %>% dplyr::rename(location_category_code = location_category)
serv_location$location_category <- factor(ifelse(serv_location$location_category_code == 11, "11 [Bank]",
  ifelse(serv_location$location_category_code == 12, "12 [Hair salon/barbershop]",
    ifelse(serv_location$location_category_code == 13, "13 [Post office]",
      ifelse(serv_location$location_category_code == 14, "14 [Drugstore]", "15 [Doctor/healthcare provider]")
    )
  )
))

# map
bm + geom_sf(data = serv_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of service locations by category
ggplot(data = serv_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Service locations by categories")

.location_category_cnt <- as.data.frame(serv_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Service locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Service locations by categories
location_category	n
11 [Bank]	464
12 [Hair salon/barbershop]	528
13 [Post office]	472
14 [Drugstore]	770
15 [Doctor/healthcare provider]	734

# compute statistics on service locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(serv_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE, i.e. a cross join of participants and categories
.dummy <- data.frame(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- distinct(.loc_iid_category_cnt[c("location_category")])
  .dmy$interact_id <- as.character(iid)
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_serv_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)
.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_serv_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of service locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of service locations by participant and category
location_category	mean	median	max
11 [Bank]	0.56	1	1
12 [Hair salon/barbershop]	0.63	1	1
13 [Post office]	0.57	1	1
14 [Drugstore]	0.92	1	1
15 [Doctor/healthcare provider]	0.88	1	5

2.5 Section 5: Transportation

2.5.1 Do you use public transit from your home?

# extract and recode
.transp <- veritas_main[c("interact_id", "public_transit")] %>% dplyr::rename(public_transit_code = public_transit)
.transp$public_transit <- factor(ifelse(.transp$public_transit_code == 1, "1 [Yes]",
  ifelse(.transp$public_transit_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .transp) +
  geom_histogram(aes(x = public_transit), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "public_transit")

.transp_cnt <- .transp %>%
  group_by(public_transit) %>%
  dplyr::count() %>%
  arrange(public_transit)
kable(.transp_cnt, caption = "Use public transit") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Use public transit
public_transit	n
1 [Yes]	706
2 [No]	127

2.5.2 Where are the public transit stops that you access from your home?

transp_location <- locations[locations$location_category == 16, ]

bm + geom_sf(data = transp_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.6 Section 6: Leisure activities

The following questions are used to generate the locations grouped into this section:

Do you participate in any (individual or group) sports or leisure-time physical activities at least once per month?
Do you visit a park at least once per month?
Do you participate in or attend as a spectator a cultural or non-sport leisure activity at least once per month? For example: singing or drawing lessons, book or poker club, concert or play.
Do you volunteer at least once per month?
Do you engage in any religious or spiritual activities at least once per month?
Do you go to a restaurant, café, bar or other food and drink establishment at least once per month?
Do you get take-out food at least once per month?
Do you regularly go for walks?

leisure_location <- locations[locations$location_category %in% c(17, 18, 19, 20, 21, 22, 23, 24), ] %>% dplyr::rename(location_category_code = location_category)
leisure_location$location_category <- factor(ifelse(leisure_location$location_category_code == 17, "17 [Leisure-time physical activity]",
  ifelse(leisure_location$location_category_code == 18, "18 [Park]",
    ifelse(leisure_location$location_category_code == 19, "19 [Cultural activity]",
      ifelse(leisure_location$location_category_code == 20, "20 [Volunteering place]",
        ifelse(leisure_location$location_category_code == 21, "21 [Religious or spiritual activity]",
          ifelse(leisure_location$location_category_code == 22, "22 [Restaurant, café, bar, etc. ]",
            ifelse(leisure_location$location_category_code == 23, "23 [Take-out]", "24 [Walk]")
          )
        )
      )
    )
  )
))

# map
bm + geom_sf(data = leisure_location, inherit.aes = FALSE, aes(color = location_category), size = 1.5, alpha = .3) +
  scale_color_brewer(palette = "Accent") +
  theme(legend.position = "bottom", legend.text = element_text(size = 8), legend.title = element_blank())

# compute number of leisure locations by category
ggplot(data = leisure_location) +
  geom_histogram(aes(x = location_category), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "Leisure locations by categories")

.location_category_cnt <- as.data.frame(leisure_location[c("location_category")]) %>%
  group_by(location_category) %>%
  dplyr::count() %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Leisure locations by categories") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Leisure locations by categories
location_category	n
17 [Leisure-time physical activity]	947
18 [Park]	1031
19 [Cultural activity]	704
20 [Volunteering place]	262
21 [Religious or spiritual activity]	62
22 [Restaurant, café, bar, etc. ]	1579
23 [Take-out]	491
24 [Walk]	921

# compute statistics on leisure locations by participants and categories
# > one needs to account for participants who did not report location for some categories
.loc_iid_category_cnt <- as.data.frame(leisure_location[c("interact_id", "location_category")]) %>%
  group_by(interact_id, location_category) %>%
  dplyr::count()

# (cont'd) simulate SQL JOIN TABLE ON TRUE, i.e. a cross join of participants and categories
.dummy <- data.frame(
  interact_id = character(),
  location_category = character()
)
for (iid in as.vector(veritas_main$interact_id)) {
  .dmy <- distinct(.loc_iid_category_cnt[c("location_category")])
  .dmy$interact_id <- as.character(iid)
  .dummy <- rbind(.dummy, .dmy)
}

# (cont'd) find iid/categ combination without match in veritas locations
.no_leisure_iid <- dplyr::setdiff(.dummy, .loc_iid_category_cnt[c("location_category", "interact_id")]) %>%
  mutate(n = 0)

.loc_iid_category_cnt <- bind_rows(.loc_iid_category_cnt, .no_leisure_iid)

.location_category_cnt <- .loc_iid_category_cnt %>%
  group_by(location_category) %>%
  dplyr::summarise(min = min(n), mean = round(mean(n), 2), median = median(n), max = max(n)) %>%
  arrange(location_category)
kable(.location_category_cnt, caption = "Number of leisure locations by participant and category") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of leisure locations by participant and category
location_category	mean	median	max
17 [Leisure-time physical activity]	1.14	1	5
18 [Park]	1.24	1	5
19 [Cultural activity]	0.85	0	5
20 [Volunteering place]	0.31	0	5
21 [Religious or spiritual activity]	0.07	0	3
22 [Restaurant, café, bar, etc. ]	1.90	1	5
23 [Take-out]	0.59	0	5
24 [Walk]	1.11	1	5

2.7 Section 7: Other places/activities

2.7.1 Are there other places that you go to at least once per month that we have not mentioned? For example: a mall, a daycare, a hardware store, or a community center.

# extract and recode
.other <- veritas_main[c("interact_id", "other")] %>% dplyr::rename(other_code = other)
.other$other <- factor(ifelse(.other$other_code == 1, "1 [Yes]",
  ifelse(.other$other_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .other) +
  geom_histogram(aes(x = other), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "other")

.other_cnt <- .other %>%
  group_by(other) %>%
  dplyr::count() %>%
  arrange(other)
kable(.other_cnt, caption = "Other places") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Other places
other	n
1 [Yes]	356
2 [No]	477

2.7.2 Can you locate this place?

other_location <- locations[locations$location_category == 25, ]

bm + geom_sf(data = other_location, inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

2.8 Section 8: Areas of change

2.8.1 Can you locate areas where you have noticed an improvement of the urban environment?

# extract and recode
.improv <- veritas_main[c("interact_id", "improvement_none")] %>% dplyr::rename(improvement_none_code = improvement_none)
.improv$improvement_none <- factor(ifelse(.improv$improvement_none_code == 1, "1 [TRUE]",
  ifelse(.improv$improvement_none_code == 0, "0 [FALSE]", "N/A")
))

# histogram of answers
ggplot(data = .improv) +
  geom_histogram(aes(x = improvement_none), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "improvement_none")

.improv_cnt <- .improv %>%
  group_by(improvement_none) %>%
  dplyr::count() %>%
  arrange(improvement_none)
kable(.improv_cnt, caption = "No area of improvement") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

No area of improvement
improvement_none	n
0 [FALSE]	502
1 [TRUE]	331

# polygon extraction
improv <- poly_geom[poly_geom$area_type == "improvement", ]

# Map
bm + geom_sf(data = improv, inherit.aes = FALSE, fill = alpha("blue", 0.3), color = alpha("blue", 0.5))

# Min, max, median & mean area of improvement areas
improv <- improv %>%
  mutate(area_m2 = st_area(.))
kable(t(as.matrix(summary(improv$area_m2))),
  caption = "Area (in square meters) of the perceived improvement areas",
  digits = 1
) %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived improvement areas
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
28.5	22760.3	62364.5	404274.8	214217.8	31631988

2.8.2 Can you locate areas where you have noticed a deterioration of the urban environment?

# extract and recode
.deter <- veritas_main[c("interact_id", "deterioration_none")] %>% dplyr::rename(deterioration_none_code = deterioration_none)
.deter$deterioration_none <- factor(ifelse(.deter$deterioration_none_code == 1, "1 [TRUE]",
  ifelse(.deter$deterioration_none_code == 0, "0 [FALSE]", "N/A")
))

# histogram of answers
ggplot(data = .deter) +
  geom_histogram(aes(x = deterioration_none), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "deterioration_none")

.deter_cnt <- .deter %>%
  group_by(deterioration_none) %>%
  dplyr::count() %>%
  arrange(deterioration_none)
kable(.deter_cnt, caption = "No area of deterioration") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

No area of deterioration
deterioration_none	n
0 [FALSE]	278
1 [TRUE]	555

# polygon extraction
deter <- poly_geom[poly_geom$area_type == "deterioration", ]

# Map
bm + geom_sf(data = deter, inherit.aes = FALSE, fill = alpha("blue", 0.3), color = alpha("blue", 0.5))

# Min, max, median & mean area of deterioration areas
deter <- deter %>%
  mutate(area_m2 = st_area(.))
kable(t(as.matrix(summary(deter$area_m2))),
  caption = "Area (in square meters) of the perceived deterioration areas",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Area (in square meters) of the perceived deterioration areas
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
22.8	20862	72626.9	8436780	340538.2	1920251731

2.9 Section 9: Social contact

2.9.1 Do you visit anyone at his or her home at least once per month?

# extract and recode
.visiting <- veritas_main[c("interact_id", "visiting")] %>% dplyr::rename(visiting_code = visiting)
.visiting$visiting <- factor(ifelse(.visiting$visiting_code == 1, "1 [Yes]",
  ifelse(.visiting$visiting_code == 2, "2 [No]", "N/A")
))

# histogram of answers
ggplot(data = .visiting) +
  geom_histogram(aes(x = visiting), stat = "count") +
  scale_x_discrete(labels = function(lbl) str_wrap(lbl, width = 20)) +
  labs(x = "visiting")

.visiting_cnt <- .visiting %>%
  group_by(visiting) %>%
  dplyr::count() %>%
  arrange(visiting)
kable(.visiting_cnt, caption = "Social contact") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Social contact
visiting	n
1 [Yes]	550
2 [No]	283

2.9.2 Where does this person live?

visiting_location <- locations[locations$location_category == 26, ]

bm + geom_sf(data = st_jitter(visiting_location, .008), inherit.aes = FALSE, color = "blue", size = 1.8, alpha = .3)

ATTENTION: A bug in the Treksoft survey prevented the collection of any visiting locations in Montréal.

2.9.3 Great, we are almost done completing this questionnaire. You have documented all your activity places on a map, and specified with whom you generally do these activities. These last few questions concern the people you documented earlier.

# compute statistics on groups / participant
# > one needs to account for participants who did not report any group
.gr_iid_cnt <- as.data.frame(group[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas group
.no_gr_iid <- anti_join(veritas_main[c("interact_id")], .gr_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.gr_iid_cnt <- bind_rows(.gr_iid_cnt, .no_gr_iid)

kable(t(as.matrix(summary(.gr_iid_cnt$n))),
  caption = "Number of groups per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of groups per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	0.9	1	8
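
The count above is zero-filled with `anti_join()` and `bind_rows()` so that participants without any group enter the summary with `n = 0`. A minimal base R sketch of the same idea, on hypothetical toy identifiers, uses the full participant list as factor levels so that `table()` keeps the zero counts:

```r
# Hypothetical toy data: all participant ids, and one id per reported group
all_ids <- c("p1", "p2", "p3", "p4")
group_ids <- c("p1", "p1", "p3")

# Using every participant as a factor level keeps the zero counts
n_groups <- table(factor(group_ids, levels = all_ids))
gr_cnt <- data.frame(
  interact_id = names(n_groups),
  n = as.integer(n_groups)
)
print(gr_cnt)
print(summary(gr_cnt$n))
```

Without the zero-filling step, the minimum and the quartiles would describe only participants who reported at least one group.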

# compute statistics on people / participant
# > one needs to account for participants who did not report any people
.pl_iid_cnt <- as.data.frame(people[c("interact_id")]) %>%
  group_by(interact_id) %>%
  dplyr::count()

# (cont'd) find iid combination without match in veritas people
.no_pl_iid <- anti_join(veritas_main[c("interact_id")], .pl_iid_cnt, by = "interact_id") %>%
  mutate(n = 0)
.pl_iid_cnt <- bind_rows(.pl_iid_cnt, .no_pl_iid)

kable(t(as.matrix(summary(.pl_iid_cnt$n))),
  caption = "Number of people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	4.5	6	35

# histogram
.sc_iid_cnt <- .pl_iid_cnt %>% mutate(soc_type = "people")
.sc_iid_cnt <- .gr_iid_cnt %>%
  mutate(soc_type = "group") %>%
  bind_rows(.sc_iid_cnt)

ggplot(data = .sc_iid_cnt) +
  geom_histogram(aes(x = n, y = after_stat(count), fill = soc_type), position = "dodge") +
  labs(x = "Social network size by element type", fill = element_blank())

2.9.3.1 Among these people, who do you discuss important matters with?

# extract number of important people / participant
.n_important <- important %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_imp <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_important, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_important = n.y) %>%
  mutate(pct = 100 * n_important / n_people)

kable(t(as.matrix(summary(.n_people_imp$n_important))),
  caption = "Number of important people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of important people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	2.9	4	26

kable(t(as.matrix(summary(.n_people_imp$pct))),
  caption = "% of important people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of important people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	50	68.8	68	100	100	60
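
The `NA’s` column in the table above follows from the percentage formula: when both counts are zero, `0 / 0` evaluates to `NaN`, which `summary()` reports as missing. The sketch below reproduces this on hypothetical toy counts and shows one possible way to make the rule explicit; the variable names are illustrative only.

```r
# Hypothetical toy counts for four participants
n_people <- c(4, 2, 0, 5)
n_important <- c(2, 2, 0, 0)

# Same formula as above: the third participant yields 0 / 0 = NaN
pct <- 100 * n_important / n_people
print(pct)
print(summary(pct)) # the NaN is counted under "NA's"

# Explicit alternative: mark empty networks as NA instead of relying on NaN
pct_explicit <- ifelse(n_people > 0, 100 * n_important / n_people, NA_real_)
print(pct_explicit)
```

Under this reading, the missing values correspond to participants who did not report any people, which would explain why the same count recurs in the other percentage tables built on `n_people`.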

2.9.3.2 Among these people, who do you like to socialize with?

# extract number of people to socialize with / participant
.n_socialize <- socialize %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_soc <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_socialize, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_socialize = n.y) %>%
  mutate(pct = 100 * n_socialize / n_people)

kable(t(as.matrix(summary(.n_people_soc$n_socialize))),
  caption = "Number of people with whom to socialize per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of people with whom to socialize per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	3	3.7	5	32

kable(t(as.matrix(summary(.n_people_soc$pct))),
  caption = "% of people with whom to socialize among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of people with whom to socialize among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	71.4	100	84.5	100	100	60

2.9.3.3 Among these people, who do you meet often with but do not necessarily feel close to?

# extract number of not so close people / participant
.n_not_close <- not_close %>% dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_not_close <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_not_close, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_not_close = n.y) %>%
  mutate(pct = 100 * n_not_close / n_people)

kable(t(as.matrix(summary(.n_people_not_close$n_not_close))),
  caption = "Number of not so close people per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of not so close people per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	0.7	1	15

kable(t(as.matrix(summary(.n_people_not_close$pct))),
  caption = "% of not so close people among social contact per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of not so close people among social contact per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	0	0	15.6	25	100	60

2.9.3.4 Among these people, who knows whom?

# extract number of who knows who relationships
.n_relat <- relationship %>%
  filter(relationship_type == 1) %>%
  dplyr::count(interact_id)
.n_people <- people %>% dplyr::count(interact_id)

.n_people_relat <- left_join(veritas_main[c("interact_id")], .n_people, by = "interact_id") %>%
  left_join(.n_relat, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), 0)) %>%
  dplyr::rename(n_people = n.x, n_relat = n.y) %>%
  mutate(pct = 100 * n_relat * 2 / (n_people * (n_people - 1))) # potential number of relationships = N x (N -1) / 2

kable(t(as.matrix(summary(.n_people_relat$n_relat))),
  caption = "Number of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Number of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	0.9	0	41

kable(t(as.matrix(summary(.n_people_relat$pct))),
  caption = "% of relationships « who knows who » per participant",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

% of relationships « who knows who » per participant
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	0	0	11.3	0	100	206
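
The percentage above is the share of observed ties among the N × (N − 1) / 2 possible pairs of people. As a minimal sketch, the hypothetical helper `tie_density_pct()` below applies the same formula and returns `NA` for networks of fewer than two people, where no pair exists and the original expression evaluates to `0 / 0`:

```r
# Hypothetical helper: observed "who knows whom" ties as a % of possible pairs
tie_density_pct <- function(n_relat, n_people) {
  possible <- n_people * (n_people - 1) / 2 # N x (N - 1) / 2 pairs
  ifelse(possible > 0, 100 * n_relat / possible, NA_real_)
}

# Toy cases: fully connected triad, sparse network of four, one person, nobody
dens <- tie_density_pct(n_relat = c(3, 1, 0, 0), n_people = c(3, 4, 1, 0))
print(dens)
```

This is consistent with the larger number of missing values in this table than in the previous ones: participants with exactly one person are undefined here as well as those with none.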

2.10 Derived metrics

2.10.1 Existence of improvement and deterioration areas by participant

Combination of improvement and/or deterioration areas per participant

# cross tab of improvement vs deterioration areas
.improv <- improv[c("interact_id")] %>%
  mutate(improv = "Improvement")
.deter <- deter[c("interact_id")] %>%
  mutate(deter = "Deterioration")
.ct_impr_deter <- veritas_main[c("interact_id")] %>%
  transmute(interact_id = as.character(interact_id)) %>%
  left_join(.improv, by = "interact_id") %>%
  left_join(.deter, by = "interact_id") %>%
  mutate_all(~ replace(., is.na(.), "N/A"))

kable(table(.ct_impr_deter$improv, .ct_impr_deter$deter), caption = "Improvement vs. deterioration") %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left", row_label_position = "r") %>%
  column_spec(1, bold = T)

Improvement vs. deterioration
	Deterioration	N/A
Improvement	177	271
N/A	70	315
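
The cross tabulation above is built with two left joins. If a participant can draw more than one area of a given type (the source does not say whether this occurs), a join returns one row per area. A minimal sketch of a membership-based alternative, on hypothetical toy identifiers, keeps exactly one row per participant:

```r
# Hypothetical toy ids; p1 drew two improvement areas
all_ids <- c("p1", "p2", "p3", "p4", "p5")
improv_ids <- c("p1", "p1", "p2")
deter_ids <- c("p1", "p3")

# One presence flag per participant, whatever the number of areas drawn
flags <- data.frame(
  interact_id = all_ids,
  improv = ifelse(all_ids %in% improv_ids, "Improvement", "N/A"),
  deter = ifelse(all_ids %in% deter_ids, "Deterioration", "N/A")
)
ct <- table(flags$improv, flags$deter)
print(ct)
```

With this construction the four cells always sum to the number of participants.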

2.10.2 Transportation mode preferences

Based on the answers to the question "Usually, how do you go there? (Check all that apply.)".

# code  en
# 1 By car and you drive
# 2 By car and someone else drives
# 3 By taxi/Uber
# 4 On foot
# 5 By bike
# 6 By bus
# 7 By train
# 8 By subway
# 99    Other

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.tm <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category")

.tm_grouped <- .tm %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "By car (driver)" = sum(location_tmode_1),
    "By car (passenger)" = sum(location_tmode_2),
    "By taxi/Uber" = sum(location_tmode_3),
    "On foot" = sum(location_tmode_4),
    "By bike" = sum(location_tmode_5),
    "By bus" = sum(location_tmode_6),
    "By train" = sum(location_tmode_7),
    "By subway" = sum(location_tmode_8),
    "Other" = sum(location_tmode_99)
  )

kable(.tm_grouped, caption = "Transportation mode preferences") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Transportation mode preferences
description	N	By car (driver)	By car (passenger)	By taxi/Uber	On foot	By bike	By bus	By train	By subway	Other
2 [Other residence]	105	55	20	3	21	23	21	2	27	4
3 [Work]	740	193	35	18	216	276	212	14	319	56
4 [School/College/University]	149	18	8	0	58	42	57	2	79	7
5 [Supermarket]	2205	851	261	10	1212	333	134	0	77	13
6 [Public/farmer’s market]	427	153	71	2	186	130	50	0	59	6
7 [Bakery]	621	132	44	2	436	133	39	0	20	8
8 [Specialty food store]	711	166	56	2	511	165	53	0	20	1
9 [Convenience store/Dépanneur]	466	55	13	1	405	37	11	1	13	2
10 [Liquor store/SAQ]	697	245	86	2	385	113	40	1	35	4
11 [Bank]	464	141	12	1	315	86	36	1	22	4
12 [Hair salon/barbershop]	528	177	18	6	257	101	80	1	98	6
13 [Post office]	472	116	23	1	369	75	33	0	11	5
14 [Drugstore]	770	206	41	2	577	107	45	0	17	5
15 [Doctor/healthcare provider]	734	296	47	13	241	115	155	2	167	9
16 [Public transit stop]	1624	41	22	1	1466	65	0	0	0	177
17 [Leisure-time physical activity]	947	268	67	2	473	257	103	3	116	24
18 [Park]	1031	85	34	0	806	316	62	0	41	2
19 [Cultural activity]	704	183	69	13	232	119	186	6	310	11
20 [Volunteering place]	262	96	15	3	100	56	50	1	51	19
21 [Religious/spiritual activity]	62	16	5	1	34	6	11	0	14	3
22 [Restaurant, café, bar, etc.]	1579	335	190	14	1004	246	173	3	185	20
23 [Take-out]	491	117	47	3	305	38	26	0	15	39
24 [Walk]	921	58	26	1	828	104	35	2	48	2
25 [Other place]	641	274	69	4	288	133	106	2	85	6

# graph
.tm1 <- .tm %>%
  filter(location_tmode_1 == 1) %>%
  mutate(tm = "[1] By car (driver)")
.tm2 <- .tm %>%
  filter(location_tmode_2 == 1) %>%
  mutate(tm = "[2] By car (passenger)")
.tm3 <- .tm %>%
  filter(location_tmode_3 == 1) %>%
  mutate(tm = "[3] By taxi/Uber")
.tm4 <- .tm %>%
  filter(location_tmode_4 == 1) %>%
  mutate(tm = "[4] On foot")
.tm5 <- .tm %>%
  filter(location_tmode_5 == 1) %>%
  mutate(tm = "[5] By bike")
.tm6 <- .tm %>%
  filter(location_tmode_6 == 1) %>%
  mutate(tm = "[6] By bus")
.tm7 <- .tm %>%
  filter(location_tmode_7 == 1) %>%
  mutate(tm = "[7] By train")
.tm8 <- .tm %>%
  filter(location_tmode_8 == 1) %>%
  mutate(tm = "[8] By subway")
.tm99 <- .tm %>%
  filter(location_tmode_99 == 1) %>%
  mutate(tm = "[99] Other")
.tm <- bind_rows(.tm1, .tm2, .tm3, .tm4, .tm5, .tm6, .tm7, .tm8, .tm99)

# stacked bar chart of answers
ggplot(data = .tm) +
  geom_bar(aes(x = fct_rev(description), fill = tm), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Transport modes") +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of transportation mode by location category", x = element_blank()) +
  coord_flip() +
  theme(legend.position = "bottom", legend.justification = c(0, 0), legend.text = element_text(size = 8)) +
  guides(fill = guide_legend(nrow = 3))
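
The graph data above is prepared by filtering each check-all-that-apply column in turn and stacking the results, so a location with several modes checked contributes one row per mode. The same wide-to-long step can be sketched as a loop over a named vector of labels. The toy data frame below is hypothetical and carries only three of the nine mode columns:

```r
# Hypothetical toy data: one row per location, one 0/1 column per mode
tm_wide <- data.frame(
  description = c(" 3 [Work]", " 3 [Work]", " 5 [Supermarket]"),
  location_tmode_1 = c(1, 0, 1),
  location_tmode_4 = c(0, 1, 1),
  location_tmode_5 = c(0, 1, 0)
)

# Column name -> legend label
mode_labels <- c(
  location_tmode_1 = "[1] By car (driver)",
  location_tmode_4 = "[4] On foot",
  location_tmode_5 = "[5] By bike"
)

# Keep the rows where a mode is checked, tag them, then stack all modes
tm_long <- do.call(rbind, lapply(names(mode_labels), function(col) {
  checked <- tm_wide[which(tm_wide[[col]] == 1), "description", drop = FALSE]
  checked$tm <- rep(mode_labels[[col]], nrow(checked))
  checked
}))
rownames(tm_long) <- NULL
print(table(tm_long$description, tm_long$tm))
```

Keeping the labels in a single named vector means a new mode only needs one extra entry rather than a new filter block.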

2.10.3 Visiting places alone

Based on the answers to the question "Do you usually go to this place alone or with other people?".

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.alone <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category") %>%
  mutate(location_alone_recode = case_when(
    location_alone == 1 ~ 1,
    location_alone == 2 ~ 0
  ))

.alone_grouped <- .alone %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), "Visited alone" = sum(location_alone_recode),
    "Visited alone (%)" = round(sum(location_alone_recode) * 100.0 / n(), digits = 1)
  )

kable(.alone_grouped, caption = "Visiting places alone") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visiting places alone
description	N	Visited alone	Visited alone (%)
2 [Other residence]	105	NA	NA
3 [Work]	740	249	33.6
4 [School/College/University]	149	101	67.8
5 [Supermarket]	2205	1532	69.5
6 [Public/farmer’s market]	427	231	54.1
7 [Bakery]	621	442	71.2
8 [Specialty food store]	711	509	71.6
9 [Convenience store/Dépanneur]	466	415	89.1
10 [Liquor store/SAQ]	697	504	72.3
11 [Bank]	464	430	92.7
12 [Hair salon/barbershop]	528	504	95.5
13 [Post office]	472	421	89.2
14 [Drugstore]	770	657	85.3
15 [Doctor/healthcare provider]	734	628	85.6
16 [Public transit stop]	1624	1417	87.3
17 [Leisure-time physical activity]	947	588	62.1
18 [Park]	1031	396	38.4
19 [Cultural activity]	704	185	26.3
20 [Volunteering place]	262	153	58.4
21 [Religious/spiritual activity]	62	35	56.5
22 [Restaurant, café, bar, etc.]	1579	389	24.6
23 [Take-out]	491	291	59.3
24 [Walk]	921	463	50.3
25 [Other place]	641	318	49.6
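
The `NA` values in the first row of the table are consistent with at least one location in that category having a `location_alone` value other than 1 or 2: the recode then yields `NA`, and `sum()` propagates it. The sketch below, on hypothetical toy answers, shows the effect and one possible NA-tolerant summary. Dividing by the number of valid answers rather than by all rows is a design choice of this sketch, not something the source prescribes.

```r
# Hypothetical toy answers: 1 = alone, 2 = with someone, NA = no valid answer
location_alone <- c(1, 2, 1, NA, 2)
alone_recode <- ifelse(location_alone == 1, 1,
  ifelse(location_alone == 2, 0, NA)
)

# A single NA makes the plain sum missing
print(sum(alone_recode))

# NA-tolerant count and percentage among valid answers
n_alone <- sum(alone_recode, na.rm = TRUE)
n_valid <- sum(!is.na(alone_recode))
pct_alone <- round(100 * n_alone / n_valid, digits = 1)
print(c(n_alone = n_alone, n_valid = n_valid, pct_alone = pct_alone))
```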

# stacked bar chart of answers
ggplot(data = .alone) +
  geom_bar(aes(x = fct_rev(description), fill = factor(location_alone)), position = "fill") +
  scale_fill_brewer(palette = "Set3", name = "Visiting places", labels = c("N/A", "Alone", "With someone")) +
  scale_y_continuous(labels = percent) +
  labs(y = "Proportion of places visited alone", x = element_blank()) +
  coord_flip()

2.10.4 Visit frequency

Based on the answers to the question "How often do you go there?".

loc_labels <- data.frame(location_category = c(2:26), description = c(
  " 2 [Other residence]",
  " 3 [Work]",
  " 4 [School/College/University]",
  " 5 [Supermarket]",
  " 6 [Public/farmer’s market]",
  " 7 [Bakery]",
  " 8 [Specialty food store]",
  " 9 [Convenience store/Dépanneur]",
  "10 [Liquor store/SAQ]",
  "11 [Bank]",
  "12 [Hair salon/barbershop]",
  "13 [Post office]",
  "14 [Drugstore]",
  "15 [Doctor/healthcare provider]",
  "16 [Public transit stop]",
  "17 [Leisure-time physical activity]",
  "18 [Park]",
  "19 [Cultural activity]",
  "20 [Volunteering place]",
  "21 [Religious/spiritual activity]",
  "22 [Restaurant, café, bar, etc.]",
  "23 [Take-out]",
  "24 [Walk]",
  "25 [Other place]",
  "26 [Social contact residence]"
))

# extract and summary stats
.freq <- locations %>%
  st_set_geometry(NULL) %>%
  filter(location_category != 1) %>%
  left_join(loc_labels, by = "location_category")

.freq_grouped <- .freq %>%
  group_by(description) %>%
  dplyr::summarise(
    N = n(), min = min(location_freq_visit),
    max = max(location_freq_visit),
    mean = mean(location_freq_visit),
    median = median(location_freq_visit),
    sd = sd(location_freq_visit)
  )

kable(.freq_grouped,
  caption = "Visit frequency (expressed in times/year)",
  digits = 1
) %>%
  kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Visit frequency (expressed in times/year)
description	N	min	max	mean	median	sd
2 [Other residence]	105	6	364	92.1	52	71.7
3 [Work]	740	0	1040	197.8	260	98.5
4 [School/College/University]	149	0	364	142.9	104	95.1
5 [Supermarket]	2205	0	1664	58.8	52	70.4
6 [Public/farmer’s market]	427	0	520	39.3	24	47.6
7 [Bakery]	621	2	260	43.2	24	46.3
8 [Specialty food store]	711	1	1196	39.8	24	61.5
9 [Convenience store/Dépanneur]	466	1	520	58.1	36	67.3
10 [Liquor store/SAQ]	697	1	260	26.1	12	28.1
11 [Bank]	464	1	260	23.6	12	26.9
12 [Hair salon/barbershop]	528	1	36	6.5	5	4.3
13 [Post office]	472	1	312	12.4	6	19.2
14 [Drugstore]	770	1	260	31.9	24	33.2
15 [Doctor/healthcare provider]	734	0	208	4.1	2	11.8
16 [Public transit stop]	1624	0	780	110.6	52	134.8
17 [Leisure-time physical activity]	947	3	364	93.6	52	78.7
18 [Park]	1031	1	1456	72.0	36	100.6
19 [Cultural activity]	704	1	365	16.5	6	30.1
20 [Volunteering place]	262	0	365	58.1	24	78.3
21 [Religious/spiritual activity]	62	2	728	110.0	52	141.1
22 [Restaurant, café, bar, etc.]	1579	1	520	25.0	12	39.4
23 [Take-out]	491	2	208	20.3	12	26.2
24 [Walk]	921	1	728	94.1	52	109.1
25 [Other place]	641	1	728	62.8	24	93.2

# graph
ggplot(data = .freq) +
  geom_boxplot(aes(x = fct_rev(description), y = location_freq_visit)) +
  scale_y_continuous(limits = c(0, 365)) +
  labs(y = "Visits/year (Frequency over 1 visit/day not shown)", x = element_blank()) +
  coord_flip()
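
In ggplot2, limits set on a continuous scale discard out-of-range observations before the box plot statistics are computed, so the boxes drawn above summarise only frequencies up to 365 visits/year. The base R sketch below illustrates the effect of such a cut on the five box plot statistics, using hypothetical toy frequencies:

```r
# Hypothetical toy visit frequencies (times/year)
freq <- c(12, 24, 52, 104, 260, 520, 1040)

# Five box plot statistics on all values, then after dropping values over 365
stats_all <- grDevices::boxplot.stats(freq)$stats
stats_cut <- grDevices::boxplot.stats(freq[freq <= 365])$stats

print(rbind(all_values = stats_all, capped_at_365 = stats_cut))
print(sum(freq > 365)) # number of observations dropped by the cut
```

If the intent is only to zoom in, limits passed to the coordinate system (for instance the `ylim` argument of `coord_flip()`) keep all observations in the computation. It is worth checking on a test plot which axis the limit applies to once the coordinates are flipped.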

2.10.5 Spatial indicators: Camille Perchoux’s toolbox

Below is a list of indicators proposed by Camille Perchoux in her paper "Assessing patterns of spatial behavior in health studies: Their socio-demographic determinants and associations with transportation modes (the RECORD Cohort Study)".

-- Reading Camille tbx indics from Essence table
SELECT interact_id,
  n_acti_places, n_weekly_vst, n_acti_types,
  cvx_perimeter, cvx_surface,
  min_length, max_length, median_length, 
  pct_visits_neighb, 
  n_acti_prn, pct_visits_prn, prn_area_km2
FROM essence_table.essence_perchoux_tbx
WHERE city_id = 'Montréal' AND wave_id = 1

2.10.5.1 Indicators related to the lifestyle

Indicators	Measurement approach
Number of activity places (`n_acti_places`)	Count of activity places
Number of visits to places per week (`n_weekly_vst`)	Sum, over the participant’s activity places excluding the residence, of the weekly visit frequency of each place
Number of activity types (`n_acti_types`)	6 types of activities considered: 1-Residential; 2-Occupation; 3-Shopping activities; 4-Services; 5-Transportation; 6-Leisure activities (NB: the original categories were as follows: 1-Residential; 2-work; 3-food and other services; 4-transport station/stop; 5-recreational activity; 6-social activity)

LLindic <- ess.tab.camille %>%
  select("interact_id", "n_acti_places", "n_weekly_vst", "n_acti_types")
.llmtx <- as.matrix(summary(LLindic[c("n_acti_places", "n_weekly_vst", "n_acti_types")], digits = 1))
.llmtx <- apply(.llmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.llmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
kable(.llmtx, caption = "Lifestyle statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Lifestyle statistics
	n_acti_places	n_weekly_vst	n_acti_types
Min.	4	3	3
1st Qu.	16	15	5
Median	21	22	6
Mean	22	24	6
3rd Qu.	26	30	6
Max.	56	85	6
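
The table above is obtained by parsing the character output of `summary()` with `strsplit()`. A minimal alternative sketch computes the same statistics numerically, which avoids string handling and leaves rounding to `kable()`. The helper name `summary_row()` and the toy values are hypothetical.

```r
# Hypothetical toy indicators (not the Essence table)
indic <- data.frame(
  n_acti_places = c(4, 16, 21, 26, 56),
  n_acti_types = c(3, 5, 6, 6, 6)
)

# Six summary statistics plus the number of missing values, as numbers
summary_row <- function(x) {
  q <- unname(quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE))
  c(
    "Min." = min(x, na.rm = TRUE), "1st Qu." = q[1], "Median" = q[2],
    "Mean" = mean(x, na.rm = TRUE), "3rd Qu." = q[3],
    "Max." = max(x, na.rm = TRUE), "NA's" = sum(is.na(x))
  )
}

# One column per indicator, one row per statistic
smry <- sapply(indic, summary_row)
print(smry)
```

Because every column returns the same seven entries, the result is always a regular numeric matrix, whether or not a given indicator has missing values.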

2.10.5.2 Indicators related to the geometry of the activity space

Indicators	Measurement approach
Perimeter of the convex hull (`cvx_perimeter`)	Perimeter of the smallest convex polygon containing all the activity locations of the participant (unit: km)
Surface of the convex hull (`cvx_surface`)	Surface of the smallest convex polygon containing all the activity locations of the participant (unit: km²)
Minimal road network distance from the residence to an activity place (`min_length`)	Minimal distance from the residence to an activity place using the road network (in meters)
Maximal road network distance from the residence to an activity place (`max_length`)	Maximal distance from the residence to an activity place using the road network (in meters)
Median road network distance from the residence to all activity places (`median_length`)	Median distance from home to all activity places using the road network (in meters)

ASindic <- ess.tab.camille %>%
  select("interact_id", "cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")
.asmtx <- as.matrix(summary(ASindic[c("cvx_perimeter", "cvx_surface", "min_length", "max_length", "median_length")], digits = 1))
.asmtx <- apply(.asmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.asmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ")
.asmtx[is.na(.asmtx)] <- 0
kable(.asmtx, caption = "Activity space statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Activity space statistics
	cvx_perimeter	cvx_surface	min_length	max_length	median_length
Min.	2	2e-01	0	1e+03	191
1st Qu.	16	1e+01	71	6e+03	883
Median	23	2e+01	158	9e+03	1374
Mean	49	2e+02	261	2e+04	2090
3rd Qu.	41	7e+01	278	2e+04	2374
Max.	1629	5e+04	5243	2e+05	59098
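
The two convex hull indicators can be illustrated with base R. The sketch below assumes planar coordinates already expressed in km (hypothetical toy points, not participant data) and is not the routine used to fill the Essence table. It finds the hull with `chull()`, then derives the perimeter from consecutive vertex distances and the surface from the shoelace formula.

```r
# Hypothetical activity locations in projected coordinates (km)
pts <- cbind(x = c(0, 4, 4, 0, 2, 1), y = c(0, 0, 3, 3, 1, 2))

# Ordered hull vertices; interior points are left out
hull <- pts[grDevices::chull(pts), , drop = FALSE]

# Each vertex paired with the next one, closing the polygon
nxt <- hull[c(2:nrow(hull), 1), , drop = FALSE]

# Perimeter: sum of edge lengths; surface: shoelace formula
cvx_perimeter <- sum(sqrt(rowSums((nxt - hull)^2)))
cvx_surface <- abs(sum(hull[, "x"] * nxt[, "y"] - nxt[, "x"] * hull[, "y"])) / 2
print(c(cvx_perimeter = cvx_perimeter, cvx_surface = cvx_surface))
```

With longitude/latitude coordinates the points would first need to be projected to a metric coordinate system for these formulas to give distances and areas.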

2.10.5.3 Indicators related to the importance of the residential neighborhood

Indicators	Measurement approach
Percentage of visits to places in the residential neighborhood (`pct_visits_neighb`)	Count of visits to places within the 500 m road network buffer centered on the residence divided by the total number of visits to places
Number of activity locations in the PRN (`n_acti_prn`)	Count of activity locations in the PRN
Percentage of visits in the PRN (`pct_visits_prn`)	Count of visits to places in the PRN divided by the total number of visits to places
Surface of the PRN (`prn_area_km2`)	Unit: km²

RNindic <- ess.tab.camille %>%
  select("interact_id", "pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")
RNindic <- RNindic[RNindic$interact_id %in% prn$interact_id, ]
.rnmtx <- as.matrix(summary(RNindic[c("pct_visits_neighb", "n_acti_prn", "pct_visits_prn", "prn_area_km2")], digits = 1))
.rnmtx <- apply(.rnmtx, 1:2, function(x) strsplit(as.character(x), ":")[[1]][2])
rownames(.rnmtx) <- c("Min.   ", "1st Qu.", "Median ", "Mean   ", "3rd Qu.", "Max.   ", "N/A.   ")
.rnmtx[is.na(.rnmtx)] <- 0
kable(.rnmtx, caption = "Residential neighbourhood statistics") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

Residential neighbourhood statistics
	pct_visits_neighb	n_acti_prn	pct_visits_prn	prn_area_km2
Min.	0	0	0	3e-04
1st Qu.	13	4	31	1e+00
Median	27	9	52	3e+00
Mean	30	10	50	4e+00
3rd Qu.	41	14	71	5e+00
Max.	94	37	100	5e+01
N/A.	513	0	0	0

Only participants with a valid PRN are considered in the residential neighbourhood indicators (697 participants).

2.10.6 Social indicators: Alexandre Naud’s toolbox

See Alex’s document for a more comprehensive presentation of the social indicators.

-- Reading Alex tbx indics from Essence table
SELECT interact_id,
  people_degree, 
  socialize_size, socialize_meet, socialize_chat,
  important_size, group_degree, simmelian
FROM essence_table.essence_naud_social
WHERE city_id = 'Montréal' AND wave_id = 1

2.10.6.1 Number of people in the network (`people_degree`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = people_degree))

kable(t(as.matrix(summary(ess.tab.alex$people_degree))), caption = "people_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

people_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	2	3	4.503001	6	35

2.10.6.2 Simmelian Brokerage (`simmelian`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = simmelian))

kable(t(as.matrix(summary(ess.tab.alex$simmelian))), caption = "simmelian") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

simmelian
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
1	3	6	8.067963	10.30303	39.14286	130

2.10.6.3 Number of people with whom the participant likes to socialize (`socialize_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = socialize_size))

kable(t(as.matrix(summary(ess.tab.alex$socialize_size))), caption = "socialize_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	3	3.663865	5	32

2.10.6.4 Weekly face-to-face interactions among people with whom the participant likes to socialize (`socialize_meet`)

ggplot(filter(ess.tab.alex, socialize_meet < 100)) +
  geom_histogram(aes(x = socialize_meet)) +
  annotate(geom = "text", x = 75, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_meet))), caption = "socialize_meet") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_meet
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
0	156	468	63843.63	1040	52000572	4

2.10.6.5 Weekly ICT interactions among people with whom the participant likes to socialize (`socialize_chat`)

ggplot(filter(ess.tab.alex, socialize_chat < 100)) +
  geom_histogram(aes(x = socialize_chat)) +
  annotate(geom = "text", x = 55, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$socialize_chat))), caption = "socialize_chat") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

socialize_chat
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	156	388	626014.2	728	520000116

2.10.6.6 Number of people with whom the participant discusses important matters (`important_size`)

ggplot(ess.tab.alex) +
  geom_histogram(aes(x = important_size))

kable(t(as.matrix(summary(ess.tab.alex$important_size))), caption = "important_size") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

important_size
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	1	2	2.87515	4	26

2.10.6.7 Number of people in all groups (`group_degree`)

ggplot(filter(ess.tab.alex, group_degree < 100)) +
  geom_histogram(aes(x = group_degree)) +
  annotate(geom = "text", x = 20, y = 100, label = "X-axis: values over 100 not displayed", alpha = .5)

kable(t(as.matrix(summary(ess.tab.alex$group_degree))), caption = "group_degree") %>% kable_styling(bootstrap_options = "striped", full_width = T, position = "left")

group_degree
Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0	0	3.989196	6	110

INTERACT Montréal Participant VERITAS Summary - W1

Benoit THIERRY

04 April, 2024
