class: logo-slide

---

class: title-slide

## The Tidyverse

### Applications of Data Science - Class 1

### Giora Simchoni

#### `gsimchoni@gmail.com and add #dsapps in subject`

### Stat. and OR Department, TAU

### 2023-02-23

---

layout: true

<div class="my-footer">
  <span>
    <a href="https://dsapps-2023.github.io/Class_Slides/" target="_blank">Applications of Data Science
    </a>
  </span>
</div>

---

class: section-slide

# I don't need to know about wrangling data, I get by.

---

# So, what's wrong with Excel?

(MS Excel is one amazing piece of software. But it lacks:)

- Structure (or rather, structure is up to the user)
- Types for variables
- Automation (you could learn Excel VBA, but the horror)
- Reproducibility
- Open Source
- Extensibility
- Speed and Scale
- Modeling (there *is* a t-test, but the horror)

[Excel might be the most dangerous software on the planet](https://www.forbes.com/sites/timworstall/2013/02/13/microsofts-excel-might-be-the-most-dangerous-software-on-the-planet/#667084f5633d) (Forbes)

[Excel horror stories](https://eusprig.org/research-info/horror-stories/)

---

# So, what's wrong with base R?

(Base R is one amazing piece of software. But it lacks:)

- Consistency:
  - Function names
  - Function argument names
  - Function argument order
  - Function return types (sometimes the same function!)
- Meaningful errors and warnings
- Good choices of default values for arguments
- Speed
- Good and easy visualizations
- One other thing

---

### (In) Consistency - Example 1: Strings

.font80percent[

```r
# split a string by pattern: strsplit(string, pattern)
strsplit("Who dis?", " ")
```

```
## [[1]]
## [1] "Who"  "dis?"
```

```r
# find if a pattern exists in a string: grepl(pattern, string)
grepl("di", "Who dis?")
```

```
## [1] TRUE
```

```r
# substitute a pattern in a string: sub(pattern, replace, string)
sub("di", "thi", "Who dis?")
```

```
## [1] "Who this?"
```

```r
# length of a string: nchar(string); length of object: length(obj)
c(nchar("Who dis?"), length("Who dis?"))
```

```
## [1] 8 1
```

]

---

### (In) Consistency - Example 2: Models

```r
n <- 10000
x1 <- runif(n)
x2 <- runif(n)
t <- 1 + 2 * x1 + 3 * x2
y <- rbinom(n, 1, 1 / (1 + exp(-t)))
```

```r
glm(y ~ x1 + x2, family = "binomial")
```

```r
glmnet(as.matrix(cbind(x1, x2)), as.factor(y), family = "binomial")
```

```r
randomForest(as.factor(y) ~ x1 + x2)
```

```r
gbm(y ~ x1 + x2, data = data.frame(x1 = x1, x2 = x2, y = y))
```

😱

---

### (Un) Meaningful Errors - Example

```r
df <- data.frame(Education = 1:5, Ethnicity = c(2, 4, 5, 2, 1))
table(df$Eduction, df$Ethnicity)
```

<pre style="color: red;"><code>## Error in table(df$Eduction, df$Ethnicity): all arguments must have the same length
</code></pre>

---

### (Bad) Default Values - Example

![](images/bad_args_table.png)

```r
# In R 3.6... in R 4.0 this was fixed!
df <- read.csv("../data/bad_args_test.csv")
df$col3
```

```
## [1] a b c d
## Levels: a b c d
```

```r
df <- read.csv("../data/bad_args_test.csv", stringsAsFactors = FALSE)
df$col3
```

```
## [1] "a" "b" "c" "d"
```

---

### (No) Speed - Example

```r
file_path <- "../data/mediocre_file.csv"
df <- read.csv(file_path)
dim(df)
```

```
## [1] 9180   14
```

```r
library(microbenchmark)
library(readr)  # read_csv() and cols() come from readr

microbenchmark(
  read_base = read.csv(file_path),
  read_tidy = read_csv(file_path, col_types = cols()),
  read_dt = data.table::fread(file_path),
  times = 10)
```

```
## Unit: milliseconds
##       expr       min        lq     mean   median        uq      max neval
##  read_base 42.623101 43.625400 45.69589 44.70140 45.254901  56.8816    10
##  read_tidy 28.173101 28.844001 40.23915 30.37425 31.458502 131.9949    10
##    read_dt  8.418402  8.626601 15.49107  8.81210  9.441102  74.9264    10
```

---

class: section-slide

# Detour: The OKCupid Dataset

---

## The OKCupid Dataset

- ~60K active OKCupid users scraped in June 2012
- 35K Male, 25K Female (less awareness of non-binary identities back then)
- Answers to questions like:
  - Body Type
  - Diet
  - Substance Abuse
  - Education
  - Do you like pets?
- Open questions, e.g. "On a typical Friday night I am..."
- And the more boring demographic details like age, height, location, sign, religion etc.
- See [here](https://github.com/rudeboybert/JSE_OkCupid/blob/master/okcupid_codebook_revised.txt) for the full codebook

---

### BTW

<img src="images/okcupid_issue.png" style="width: 80%" />

See [here](https://www.tandfonline.com/doi/full/10.1080/26939169.2021.1930812).

---

class: section-slide

# End of Detour

---

### (Not) Good Visualizations - Example

.font80percent[

```r
okcupid <- read_csv("~/okcupid.csv.zip", col_types = cols())
okcupid$income[okcupid$income == -1] <- NA
okcupid$height_cm <- okcupid$height * 2.54
```

```r
plot(okcupid$height_cm, log10(okcupid$income + 1),
     col = c("red", "green")[as.factor(okcupid$sex)])
```

<img src="images/Viz-Base-1.png" width="70%" />

]

---

```r
ggplot(okcupid, aes(height_cm, log10(income + 1), color = sex)) +
  geom_point()
```

<pre style="color: red;"><code>## Warning: Removed 48442 rows containing missing values (`geom_point()`).
</code></pre><img src="images/Viz-Tidy-1.png" width="70%" />

---

## One other thing

Manager: "Give me the average income of women respondents above age 30 grouped by sexual orientation!"
You:

```r
mean_bi <- mean(okcupid$income[okcupid$sex == "f" &
                                 okcupid$age > 30 &
                                 okcupid$orientation == "bisexual"], na.rm = TRUE)
mean_gay <- mean(okcupid$income[okcupid$sex == "f" &
                                  okcupid$age > 30 &
                                  okcupid$orientation == "gay"], na.rm = TRUE)
mean_straight <- mean(okcupid$income[okcupid$sex == "f" &
                                       okcupid$age > 30 &
                                       okcupid$orientation == "straight"], na.rm = TRUE)

data.frame(orientation = c("bisexual", "gay", "straight"),
           income_mean = c(mean_bi, mean_gay, mean_straight))
```

```
##   orientation income_mean
## 1    bisexual   133421.05
## 2         gay    86489.36
## 3    straight    85219.74
```

---

Or the slightly better you:

```r
mean_income_function <- function(orientation) {
  mean(okcupid$income[okcupid$sex == "f" &
                        okcupid$age > 30 &
                        okcupid$orientation == orientation], na.rm = TRUE)
}

mean_bi <- mean_income_function("bisexual")
mean_gay <- mean_income_function("gay")
mean_straight <- mean_income_function("straight")

data.frame(orientation = c("bisexual", "gay", "straight"),
           income_mean = c(mean_bi, mean_gay, mean_straight))
```

```
##   orientation income_mean
## 1    bisexual   133421.05
## 2         gay    86489.36
## 3    straight    85219.74
```

---

Or the even better you:

```r
orientations <- c("bisexual", "gay", "straight")
income_means <- numeric(3)

for (i in seq_along(orientations)) {
  income_means[i] <- mean_income_function(orientations[i])
}

data.frame(orientation = orientations, income_mean = income_means)
```

```
##   orientation income_mean
## 1    bisexual   133421.05
## 2         gay    86489.36
## 3    straight    85219.74
```

---

Or the best you:

```r
okcupid_females_over30 <- with(okcupid, okcupid[sex == "f" & age > 30, ])

aggregate(okcupid_females_over30$income,
          by = list(orientation = okcupid_females_over30$orientation),
          FUN = mean, na.rm = TRUE)
```

```
##   orientation         x
## 1    bisexual 133421.05
## 2         gay  86489.36
## 3    straight  85219.74
```

<br>

Manager: "What? Why would bisexual women have a higher income than straight or gay women? Could you add the median, trimmed mean, standard error and n?"
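A rough sketch of what that follow-up might look like if you stay in base R. It runs on a small hypothetical `toy` data frame standing in for the filtered OKCupid rows; the names `toy`, `summarise_income` and `flat` are illustrative assumptions, not part of the deck's data or code:

```r
# Hypothetical toy data (NOT the real OKCupid rows), one NA on purpose
toy <- data.frame(
  orientation = c("bisexual", "bisexual", "bisexual", "gay", "gay",
                  "straight", "straight", "straight"),
  income = c(20000, 80000, NA, 30000, 50000, 40000, 50000, 60000)
)

# All requested statistics for one group, dropping missing incomes first
summarise_income <- function(x) {
  x <- x[!is.na(x)]
  c(mean = mean(x),
    median = median(x),
    trimmed_mean = mean(x, trim = 0.1),
    se = sd(x) / sqrt(length(x)),
    n = length(x))
}

# na.pass keeps rows with NA income so the function above decides how to handle them
res <- aggregate(income ~ orientation, data = toy,
                 FUN = summarise_income, na.action = na.pass)

# aggregate() returns the statistics as a single matrix column; flatten it by hand
flat <- do.call(data.frame, res)
flat
```

Having to write a custom summary function and then flatten the matrix column manually is part of what makes this route clunky.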
You: 😱

---

class: section-slide

# The Tidyverse

---

## What *is* The Tidyverse?

> The [tidyverse](https://www.tidyverse.org/) is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.

- `tibble`: the `data.frame` re-imagined
- `readr`: importing/exporting (mostly rectangular) data for humans
- `dplyr` + `tidyr`: a grammar of data manipulation
- `purrr`: functional programming in R
- `stringr`: string manipulation
- `ggplot2`: a grammar of graphics

---

The above can all be installed and loaded under the `tidyverse` package:

```r
library(tidyverse)
```

Many more:

- `lubridate`: manipulating dates
- `tidymodels`: tidy modeling/statistics
- `rvest`: web scraping
- `tidytext`: tidy text analysis (life saver)
- `tidygraph` + `ggraph`: manipulating and plotting networks
- `glue`: print like a boss
- countless `gg` extensions (`ggmosaic`, `ggbeeswarm`, `gganimate`, `ggridges` etc.)

---

## What's so great about the Tidyverse?

- Tidy Data
- Consistency (in function names, args, return types, documentation)
- The Pipe
- Speed (C++ under the hood)
- `ggplot2`
- The Community

---

## Tidy Data

- Each variable must have its own column.
- Each observation must have its own row.
- Each value must have its own cell.

<br>

<img src="images/tidy_data.png" style="width: 90%" />

---

### Which one of these datasets is tidy? (I)

```r
table1
```

```
## # A tibble: 315 × 4
##    religion      yob n_straight n_total
##    <chr>       <dbl>      <dbl>   <dbl>
##  1 atheist      1950         26      29
##  2 buddhist     1950          6       6
##  3 christian    1950         28      32
##  4 hindu        1950          0       0
##  5 jewish       1950         21      24
##  6 muslim       1950          0       0
##  7 unspecified  1950         71      76
##  8 atheist      1951         31      33
##  9 buddhist     1951         11      11
## 10 christian    1951         23      24
## # … with 305 more rows
```

---

### Which one of these datasets is tidy? (II)

```r
table2
```

```
## # A tibble: 630 × 4
##    religion    yob type         n
##    <chr>     <dbl> <chr>    <dbl>
##  1 atheist    1950 straight    26
##  2 atheist    1950 total       29
##  3 buddhist   1950 straight     6
##  4 buddhist   1950 total        6
##  5 christian  1950 straight    28
##  6 christian  1950 total       32
##  7 hindu      1950 straight     0
##  8 hindu      1950 total        0
##  9 jewish     1950 straight    21
## 10 jewish     1950 total       24
## # … with 620 more rows
```

---

### Which one of these datasets is tidy? (III)

```r
table3
```

```
## # A tibble: 315 × 3
##    religion      yob pct_straight
##    <chr>       <dbl> <chr>
##  1 atheist      1950 26/29
##  2 buddhist     1950 6/6
##  3 christian    1950 28/32
##  4 hindu        1950 0/0
##  5 jewish       1950 21/24
##  6 muslim       1950 0/0
##  7 unspecified  1950 71/76
##  8 atheist      1951 31/33
##  9 buddhist     1951 11/11
## 10 christian    1951 23/24
## # … with 305 more rows
```

---

### Which one of these datasets is tidy? (IV)

```r
table4
```

```
## # A tibble: 7 × 91
##   religion    n_total_1950 n_total_1951 n_total_1952 n_total_1953 n_total_1954
##   <chr>              <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
## 1 atheist               29           33           34           37           40
## 2 buddhist               6           11           14           16           11
## 3 christian             32           24           37           47           37
## 4 hindu                  0            0            0            1            1
## 5 jewish                24           29           27           23           25
## 6 muslim                 0            0            0            0            0
## 7 unspecified           76           79           83           97           83
## # … with 85 more variables: n_total_1955 <dbl>, n_total_1956 <dbl>,
## #   n_total_1957 <dbl>, n_total_1958 <dbl>, n_total_1959 <dbl>,
## #   n_total_1960 <dbl>, n_total_1961 <dbl>, n_total_1962 <dbl>,
## #   n_total_1963 <dbl>, n_total_1964 <dbl>, n_total_1965 <dbl>,
## #   n_total_1966 <dbl>, n_total_1967 <dbl>, n_total_1968 <dbl>,
## #   n_total_1969 <dbl>, n_total_1970 <dbl>, n_total_1971 <dbl>,
## #   n_total_1972 <dbl>, n_total_1973 <dbl>, n_total_1974 <dbl>, …
```

---

### Why Tidy?

> Happy families are all alike; every unhappy family is unhappy in its own way. (Leo Tolstoy)

<br>

> It allows R’s vectorised nature to shine. (Hadley Wickham)

---

### A Tidy dataset will be much easier to transform

```r
table1$pct_straight <- table1$n_straight / table1$n_total
table1
```

```
## # A tibble: 315 × 5
##    religion      yob n_straight n_total pct_straight
##    <chr>       <dbl>      <dbl>   <dbl>        <dbl>
##  1 atheist      1950         26      29        0.897
##  2 buddhist     1950          6       6        1
##  3 christian    1950         28      32        0.875
##  4 hindu        1950          0       0      NaN
##  5 jewish       1950         21      24        0.875
##  6 muslim       1950          0       0      NaN
##  7 unspecified  1950         71      76        0.934
##  8 atheist      1951         31      33        0.939
##  9 buddhist     1951         11      11        1
## 10 christian    1951         23      24        0.958
## # … with 305 more rows
```

---

### A Tidy dataset will be much easier to plot

```r
ggplot(table1, aes(x = yob, y = pct_straight, color = religion)) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
```

<img src="images/Table1-Plot-1.png" width="70%" />

---

class: section-slide

# Detour: The `tibble`

---

### The `tibble`: the `data.frame` re-imagined

- Prints nicer:

```r
tib1 <- tibble(day = lubridate::today() + runif(1e3) * 30,
               type = sample(letters, 1e3, replace = TRUE),
               quantity = sample(seq(0, 100, 10), 1e3, replace = TRUE))
tib1
```

```
## # A tibble: 1,000 × 3
##    day        type  quantity
##    <date>     <chr>    <dbl>
##  1 2023-03-17 j           90
##  2 2023-03-15 o           40
##  3 2023-02-28 q           60
##  4 2023-02-28 c            0
##  5 2023-03-11 v           90
##  6 2023-03-06 s           30
##  7 2023-03-23 h           50
##  8 2023-02-25 q           10
##  9 2023-03-04 x           70
## 10 2023-03-23 m           90
## # … with 990 more rows
```

---

```r
df1 <- data.frame(day = lubridate::today() + runif(1e3) * 30,
                  type = sample(letters, 1e3, replace = TRUE),
                  quantity = sample(seq(0, 100, 10), 1e3, replace = TRUE))
df1
```

```
##             day type quantity
## 1    2023-03-07    d       80
## 2    2023-03-07    u       90
## 3    2023-03-02    c       60
## 4    2023-03-11    h        0
## 5    2023-03-01    j       90
## 6    2023-03-21    t      100
## 7    2023-03-06    g       70
## 8    2023-03-09    r        0
## 9    2023-03-24    n       70
## 10   2023-03-22    f       40
## 11   2023-03-18    a       70
## 12   2023-03-19    n      100
## 13   2023-03-15    o       10
## 14   2023-03-20    u       60
## 15   2023-03-07    i       70
## 16   2023-03-12    o       50
## 17   2023-03-07    h       70
## 18   2023-03-11
r 70 ## 19 2023-03-18 r 50 ## 20 2023-03-07 h 20 ## 21 2023-03-05 i 30 ## 22 2023-03-20 k 70 ## 23 2023-03-10 s 80 ## 24 2023-03-16 y 70 ## 25 2023-02-25 e 40 ## 26 2023-03-23 c 30 ## 27 2023-03-10 m 100 ## 28 2023-03-08 a 100 ## 29 2023-03-21 z 90 ## 30 2023-02-28 d 50 ## 31 2023-03-16 u 40 ## 32 2023-03-02 o 100 ## 33 2023-03-04 l 60 ## 34 2023-03-08 n 80 ## 35 2023-02-26 s 40 ## 36 2023-03-12 v 60 ## 37 2023-03-13 u 100 ## 38 2023-03-04 h 10 ## 39 2023-03-07 x 20 ## 40 2023-03-23 l 60 ## 41 2023-03-07 h 70 ## 42 2023-03-24 h 40 ## 43 2023-03-11 h 70 ## 44 2023-03-15 i 0 ## 45 2023-03-05 h 50 ## 46 2023-02-25 c 60 ## 47 2023-02-24 l 20 ## 48 2023-03-12 k 50 ## 49 2023-03-10 t 70 ## 50 2023-03-08 u 30 ## 51 2023-03-04 y 40 ## 52 2023-03-21 h 30 ## 53 2023-03-18 p 50 ## 54 2023-03-20 d 70 ## 55 2023-03-24 f 0 ## 56 2023-02-24 k 20 ## 57 2023-03-16 q 30 ## 58 2023-02-25 x 70 ## 59 2023-03-23 p 10 ## 60 2023-03-17 g 70 ## 61 2023-03-23 a 70 ## 62 2023-03-05 n 30 ## 63 2023-03-18 p 30 ## 64 2023-03-15 n 60 ## 65 2023-03-08 n 0 ## 66 2023-02-27 j 10 ## 67 2023-03-08 k 100 ## 68 2023-03-18 h 100 ## 69 2023-03-02 b 80 ## 70 2023-03-22 t 70 ## 71 2023-03-08 a 10 ## 72 2023-03-07 q 50 ## 73 2023-03-15 l 0 ## 74 2023-02-23 g 60 ## 75 2023-03-12 e 30 ## 76 2023-03-19 e 20 ## 77 2023-03-19 b 90 ## 78 2023-03-14 e 90 ## 79 2023-03-15 c 20 ## 80 2023-03-23 c 10 ## 81 2023-03-19 o 30 ## 82 2023-03-12 z 0 ## 83 2023-03-18 t 100 ## 84 2023-03-09 s 60 ## 85 2023-03-11 o 10 ## 86 2023-02-26 h 30 ## 87 2023-03-23 f 20 ## 88 2023-03-07 r 0 ## 89 2023-03-19 o 100 ## 90 2023-03-01 p 20 ## 91 2023-03-01 i 100 ## 92 2023-03-18 a 10 ## 93 2023-03-06 b 90 ## 94 2023-03-20 o 60 ## 95 2023-03-15 v 70 ## 96 2023-03-22 j 60 ## 97 2023-03-01 k 10 ## 98 2023-03-10 r 90 ## 99 2023-03-22 q 70 ## 100 2023-03-01 p 20 ## 101 2023-03-03 w 80 ## 102 2023-03-17 o 80 ## 103 2023-03-21 f 40 ## 104 2023-03-05 s 60 ## 105 2023-02-28 z 30 ## 106 2023-03-12 u 80 ## 107 2023-03-05 j 90 ## 108 2023-03-11 v 10 ## 
109 2023-03-10 k 0 ## 110 2023-02-24 o 70 ## 111 2023-02-25 s 40 ## 112 2023-03-13 x 20 ## 113 2023-03-04 z 60 ## 114 2023-02-24 m 100 ## 115 2023-03-17 q 10 ## 116 2023-03-15 g 50 ## 117 2023-03-24 x 100 ## 118 2023-02-27 v 10 ## 119 2023-02-27 y 70 ## 120 2023-03-06 d 40 ## 121 2023-02-24 a 30 ## 122 2023-03-16 p 70 ## 123 2023-03-10 c 100 ## 124 2023-02-28 x 90 ## 125 2023-02-26 s 70 ## 126 2023-03-21 v 90 ## 127 2023-03-16 x 40 ## 128 2023-02-23 h 40 ## 129 2023-03-21 y 70 ## 130 2023-02-24 u 100 ## 131 2023-02-26 v 40 ## 132 2023-03-22 h 10 ## 133 2023-03-15 k 60 ## 134 2023-02-27 k 30 ## 135 2023-02-24 m 30 ## 136 2023-03-04 j 100 ## 137 2023-02-25 p 40 ## 138 2023-03-08 r 90 ## 139 2023-02-28 a 0 ## 140 2023-03-23 l 0 ## 141 2023-03-12 n 100 ## 142 2023-03-02 c 60 ## 143 2023-03-13 s 80 ## 144 2023-03-17 h 100 ## 145 2023-03-11 v 80 ## 146 2023-03-18 x 80 ## 147 2023-03-20 q 100 ## 148 2023-03-05 j 30 ## 149 2023-03-13 n 70 ## 150 2023-03-14 z 100 ## 151 2023-03-20 f 0 ## 152 2023-03-03 v 60 ## 153 2023-02-24 j 100 ## 154 2023-02-24 q 50 ## 155 2023-03-18 h 60 ## 156 2023-03-02 d 30 ## 157 2023-03-15 v 40 ## 158 2023-03-13 f 100 ## 159 2023-03-21 w 40 ## 160 2023-03-17 r 30 ## 161 2023-02-23 h 30 ## 162 2023-03-21 s 0 ## 163 2023-03-02 v 10 ## 164 2023-03-04 h 80 ## 165 2023-02-28 e 100 ## 166 2023-03-12 z 90 ## 167 2023-03-16 q 10 ## 168 2023-03-17 j 20 ## 169 2023-03-14 i 70 ## 170 2023-03-12 n 50 ## 171 2023-03-15 h 10 ## 172 2023-03-03 b 30 ## 173 2023-03-15 f 50 ## 174 2023-03-13 f 80 ## 175 2023-03-03 g 0 ## 176 2023-03-18 d 20 ## 177 2023-03-20 e 100 ## 178 2023-03-13 x 80 ## 179 2023-03-10 s 40 ## 180 2023-03-23 a 20 ## 181 2023-03-11 q 50 ## 182 2023-03-01 s 40 ## 183 2023-03-17 y 80 ## 184 2023-03-23 y 10 ## 185 2023-03-16 w 10 ## 186 2023-02-24 v 10 ## 187 2023-03-01 k 20 ## 188 2023-03-24 b 80 ## 189 2023-03-03 p 30 ## 190 2023-03-18 d 10 ## 191 2023-03-24 i 30 ## 192 2023-02-26 h 60 ## 193 2023-03-20 y 20 ## 194 2023-03-13 d 40 ## 195 2023-03-12 
s 90 ## 196 2023-02-23 v 80 ## 197 2023-02-28 w 100 ## 198 2023-03-02 v 90 ## 199 2023-03-19 i 0 ## 200 2023-03-11 s 10 ## 201 2023-03-09 v 40 ## 202 2023-02-25 h 0 ## 203 2023-03-11 h 40 ## 204 2023-03-12 s 100 ## 205 2023-02-25 o 40 ## 206 2023-03-18 e 10 ## 207 2023-03-15 b 0 ## 208 2023-03-21 n 60 ## 209 2023-02-27 s 10 ## 210 2023-02-25 a 40 ## 211 2023-02-27 e 40 ## 212 2023-02-25 x 20 ## 213 2023-03-24 u 100 ## 214 2023-03-05 j 90 ## 215 2023-03-18 z 0 ## 216 2023-03-06 h 10 ## 217 2023-03-09 y 50 ## 218 2023-03-02 p 50 ## 219 2023-02-26 t 10 ## 220 2023-03-19 o 50 ## 221 2023-03-15 q 30 ## 222 2023-02-25 l 100 ## 223 2023-02-23 v 40 ## 224 2023-02-28 m 60 ## 225 2023-03-04 o 10 ## 226 2023-03-18 u 0 ## 227 2023-03-03 h 60 ## 228 2023-02-23 f 90 ## 229 2023-02-27 a 30 ## 230 2023-03-04 a 60 ## 231 2023-03-24 e 10 ## 232 2023-03-20 c 20 ## 233 2023-03-03 i 70 ## 234 2023-03-07 p 40 ## 235 2023-02-27 m 40 ## 236 2023-03-04 f 70 ## 237 2023-03-18 k 10 ## 238 2023-03-09 l 100 ## 239 2023-03-18 n 70 ## 240 2023-03-16 u 50 ## 241 2023-03-12 w 80 ## 242 2023-03-19 k 30 ## 243 2023-03-20 g 0 ## 244 2023-03-03 x 100 ## 245 2023-03-10 n 20 ## 246 2023-03-11 t 30 ## 247 2023-03-23 g 30 ## 248 2023-03-10 d 20 ## 249 2023-02-24 f 20 ## 250 2023-03-20 f 40 ## 251 2023-03-04 u 80 ## 252 2023-03-18 u 70 ## 253 2023-03-06 s 50 ## 254 2023-03-14 r 50 ## 255 2023-03-24 i 50 ## 256 2023-03-18 n 60 ## 257 2023-03-08 q 0 ## 258 2023-03-03 t 40 ## 259 2023-03-12 b 10 ## 260 2023-03-01 p 50 ## 261 2023-03-22 w 90 ## 262 2023-02-24 r 60 ## 263 2023-03-13 w 10 ## 264 2023-03-22 s 60 ## 265 2023-03-08 a 40 ## 266 2023-03-16 f 100 ## 267 2023-03-13 d 80 ## 268 2023-02-25 k 60 ## 269 2023-02-25 b 10 ## 270 2023-03-21 o 90 ## 271 2023-03-06 c 80 ## 272 2023-02-23 w 0 ## 273 2023-03-07 q 70 ## 274 2023-03-13 y 80 ## 275 2023-03-11 w 90 ## 276 2023-03-08 q 0 ## 277 2023-03-16 e 0 ## 278 2023-02-25 g 20 ## 279 2023-03-12 u 0 ## 280 2023-03-22 e 90 ## 281 2023-03-12 w 30 ## 282 2023-03-07 h 
40 ## 283 2023-03-20 r 80 ## 284 2023-03-15 d 100 ## 285 2023-03-22 p 60 ## 286 2023-03-03 w 30 ## 287 2023-03-02 d 60 ## 288 2023-03-05 f 50 ## 289 2023-02-24 x 50 ## 290 2023-02-24 i 90 ## 291 2023-03-13 u 50 ## 292 2023-03-09 g 10 ## 293 2023-02-27 g 50 ## 294 2023-03-14 q 70 ## 295 2023-03-03 w 20 ## 296 2023-03-16 z 30 ## 297 2023-03-20 j 80 ## 298 2023-03-24 b 90 ## 299 2023-03-21 k 50 ## 300 2023-03-20 z 50 ## 301 2023-03-12 q 50 ## 302 2023-03-16 v 0 ## 303 2023-03-21 d 40 ## 304 2023-03-23 d 50 ## 305 2023-02-28 o 40 ## 306 2023-03-05 e 50 ## 307 2023-03-02 t 30 ## 308 2023-03-02 m 0 ## 309 2023-03-06 r 60 ## 310 2023-03-17 e 90 ## 311 2023-03-23 h 20 ## 312 2023-03-06 p 100 ## 313 2023-03-23 k 60 ## 314 2023-03-01 t 100 ## 315 2023-03-12 g 100 ## 316 2023-03-20 k 0 ## 317 2023-03-20 r 30 ## 318 2023-03-07 p 50 ## 319 2023-03-21 i 90 ## 320 2023-02-26 t 0 ## 321 2023-03-08 w 100 ## 322 2023-03-09 y 30 ## 323 2023-03-08 q 30 ## 324 2023-02-26 m 60 ## 325 2023-03-04 t 0 ## 326 2023-03-24 d 30 ## 327 2023-03-08 n 10 ## 328 2023-03-03 a 50 ## 329 2023-02-23 v 80 ## 330 2023-02-26 e 50 ## 331 2023-03-06 q 0 ## 332 2023-03-03 d 40 ## 333 2023-02-28 i 10 ## 334 2023-02-23 g 30 ## 335 2023-03-08 o 30 ## 336 2023-03-03 u 90 ## 337 2023-03-02 o 10 ## 338 2023-03-13 t 40 ## 339 2023-03-11 a 10 ## 340 2023-03-12 r 0 ## 341 2023-03-18 p 10 ## 342 2023-03-03 j 100 ## 343 2023-02-25 w 90 ## 344 2023-02-23 s 30 ## 345 2023-02-26 f 70 ## 346 2023-03-05 r 30 ## 347 2023-03-23 v 60 ## 348 2023-03-19 y 10 ## 349 2023-02-25 s 90 ## 350 2023-02-27 z 40 ## 351 2023-03-19 x 30 ## 352 2023-03-06 k 40 ## 353 2023-02-24 h 0 ## 354 2023-03-16 h 90 ## 355 2023-03-24 h 90 ## 356 2023-03-14 a 80 ## 357 2023-03-12 p 40 ## 358 2023-03-01 a 90 ## 359 2023-03-20 j 0 ## 360 2023-03-19 e 60 ## 361 2023-03-04 x 10 ## 362 2023-03-20 c 40 ## 363 2023-03-24 o 10 ## 364 2023-03-11 h 20 ## 365 2023-03-06 f 60 ## 366 2023-03-24 c 10 ## 367 2023-03-02 s 60 ## 368 2023-03-18 x 0 ## 369 2023-02-25 o 20 
## 370 2023-03-15 t 60 ## 371 2023-02-24 o 40 ## 372 2023-03-06 s 30 ## 373 2023-03-20 l 80 ## 374 2023-03-07 m 0 ## 375 2023-03-02 r 60 ## 376 2023-03-17 b 90 ## 377 2023-03-16 a 100 ## 378 2023-03-06 j 80 ## 379 2023-03-21 o 40 ## 380 2023-03-17 d 50 ## 381 2023-02-25 f 30 ## 382 2023-02-27 k 40 ## 383 2023-03-17 o 40 ## 384 2023-03-21 m 20 ## 385 2023-02-25 h 50 ## 386 2023-02-28 c 0 ## 387 2023-03-13 e 80 ## 388 2023-03-14 o 20 ## 389 2023-03-08 q 70 ## 390 2023-03-11 m 50 ## 391 2023-03-02 i 60 ## 392 2023-03-11 t 20 ## 393 2023-03-14 m 0 ## 394 2023-03-08 n 90 ## 395 2023-02-27 c 80 ## 396 2023-03-12 o 30 ## 397 2023-03-17 q 90 ## 398 2023-03-02 t 100 ## 399 2023-03-07 b 60 ## 400 2023-03-17 y 50 ## 401 2023-03-01 j 60 ## 402 2023-02-23 h 0 ## 403 2023-03-06 a 50 ## 404 2023-03-05 o 60 ## 405 2023-03-17 n 40 ## 406 2023-02-25 u 90 ## 407 2023-03-16 d 100 ## 408 2023-03-22 z 80 ## 409 2023-02-28 q 40 ## 410 2023-03-11 d 40 ## 411 2023-03-19 q 0 ## 412 2023-03-02 r 80 ## 413 2023-03-21 a 90 ## 414 2023-02-27 m 20 ## 415 2023-03-19 v 100 ## 416 2023-03-04 m 10 ## 417 2023-03-10 i 100 ## 418 2023-03-08 n 60 ## 419 2023-03-05 q 20 ## 420 2023-03-17 y 20 ## 421 2023-03-18 y 70 ## 422 2023-03-02 u 60 ## 423 2023-03-12 k 40 ## 424 2023-03-16 n 100 ## 425 2023-02-27 r 30 ## 426 2023-03-07 l 40 ## 427 2023-03-05 i 80 ## 428 2023-02-27 i 30 ## 429 2023-03-03 q 20 ## 430 2023-03-18 n 30 ## 431 2023-03-17 y 30 ## 432 2023-02-23 f 0 ## 433 2023-03-23 l 100 ## 434 2023-03-04 t 80 ## 435 2023-02-27 l 50 ## 436 2023-02-27 g 70 ## 437 2023-02-26 n 20 ## 438 2023-03-18 p 20 ## 439 2023-02-27 p 50 ## 440 2023-03-23 w 50 ## 441 2023-03-02 t 20 ## 442 2023-03-21 u 30 ## 443 2023-03-19 o 40 ## 444 2023-03-08 i 90 ## 445 2023-03-15 s 0 ## 446 2023-03-12 t 40 ## 447 2023-03-01 j 100 ## 448 2023-02-25 j 0 ## 449 2023-03-21 c 20 ## 450 2023-03-13 h 80 ## 451 2023-02-23 x 60 ## 452 2023-03-08 x 100 ## 453 2023-03-12 a 80 ## 454 2023-03-06 d 60 ## 455 2023-03-21 l 90 ## 456 2023-03-02 q 
70 ## 457 2023-03-05 c 70 ## 458 2023-03-09 h 70 ## 459 2023-03-19 z 90 ## 460 2023-03-21 d 80 ## 461 2023-03-12 t 80 ## 462 2023-03-08 d 100 ## 463 2023-02-26 s 60 ## 464 2023-03-12 h 0 ## 465 2023-03-03 d 40 ## 466 2023-03-03 g 90 ## 467 2023-03-18 g 80 ## 468 2023-02-27 g 40 ## 469 2023-03-14 c 40 ## 470 2023-03-20 f 40 ## 471 2023-02-25 n 100 ## 472 2023-03-10 z 0 ## 473 2023-02-23 s 80 ## 474 2023-03-17 l 40 ## 475 2023-03-01 m 70 ## 476 2023-03-04 q 0 ## 477 2023-03-22 n 0 ## 478 2023-03-22 b 0 ## 479 2023-03-16 o 0 ## 480 2023-02-27 j 30 ## 481 2023-03-05 s 100 ## 482 2023-03-10 e 30 ## 483 2023-03-13 z 60 ## 484 2023-03-04 f 0 ## 485 2023-03-01 g 0 ## 486 2023-02-25 i 60 ## 487 2023-03-11 e 60 ## 488 2023-03-12 v 60 ## 489 2023-03-20 d 90 ## 490 2023-02-27 g 70 ## 491 2023-02-24 i 80 ## 492 2023-03-23 h 30 ## 493 2023-02-25 v 0 ## 494 2023-03-11 i 10 ## 495 2023-02-28 n 90 ## 496 2023-03-05 m 90 ## 497 2023-03-12 e 60 ## 498 2023-03-17 j 10 ## 499 2023-03-13 s 50 ## 500 2023-03-08 a 90 ## 501 2023-03-23 r 60 ## 502 2023-03-06 m 50 ## 503 2023-03-24 l 60 ## 504 2023-02-23 r 20 ## 505 2023-02-24 d 30 ## 506 2023-03-22 t 0 ## 507 2023-03-11 a 90 ## 508 2023-03-16 b 30 ## 509 2023-03-11 a 10 ## 510 2023-03-20 v 60 ## 511 2023-03-07 j 10 ## 512 2023-02-28 q 70 ## 513 2023-03-23 t 10 ## 514 2023-03-23 j 40 ## 515 2023-03-04 z 0 ## 516 2023-02-28 d 80 ## 517 2023-03-24 f 50 ## 518 2023-03-11 q 10 ## 519 2023-03-24 j 60 ## 520 2023-02-25 a 80 ## 521 2023-03-13 v 70 ## 522 2023-03-08 d 30 ## 523 2023-03-04 g 10 ## 524 2023-02-24 x 90 ## 525 2023-03-13 a 50 ## 526 2023-03-12 i 10 ## 527 2023-03-05 j 60 ## 528 2023-03-06 g 100 ## 529 2023-02-23 r 70 ## 530 2023-03-05 k 30 ## 531 2023-03-08 e 40 ## 532 2023-02-23 z 80 ## 533 2023-03-18 h 80 ## 534 2023-03-02 v 80 ## 535 2023-03-16 v 0 ## 536 2023-02-27 z 50 ## 537 2023-03-23 n 10 ## 538 2023-02-24 e 90 ## 539 2023-02-27 h 20 ## 540 2023-03-04 k 90 ## 541 2023-03-11 r 90 ## 542 2023-02-26 m 30 ## 543 2023-03-16 g 90 ## 
544 2023-03-21 r 50 ## 545 2023-03-11 a 10 ## 546 2023-02-27 k 0 ## 547 2023-02-25 f 20 ## 548 2023-03-18 a 10 ## 549 2023-03-02 y 80 ## 550 2023-02-25 k 70 ## 551 2023-03-09 n 20 ## 552 2023-03-03 f 50 ## 553 2023-02-25 q 90 ## 554 2023-03-11 g 100 ## 555 2023-03-02 j 60 ## 556 2023-03-04 j 100 ## 557 2023-03-15 c 60 ## 558 2023-02-24 z 60 ## 559 2023-03-05 v 100 ## 560 2023-03-09 p 20 ## 561 2023-02-25 g 40 ## 562 2023-03-01 o 80 ## 563 2023-03-08 v 20 ## 564 2023-03-05 n 10 ## 565 2023-02-27 l 10 ## 566 2023-02-28 p 50 ## 567 2023-03-15 p 20 ## 568 2023-03-14 v 100 ## 569 2023-02-23 z 0 ## 570 2023-03-09 n 40 ## 571 2023-03-09 t 20 ## 572 2023-03-08 g 70 ## 573 2023-03-02 r 90 ## 574 2023-03-24 p 90 ## 575 2023-03-14 c 50 ## 576 2023-03-11 i 90 ## 577 2023-03-11 f 20 ## 578 2023-02-23 l 90 ## 579 2023-03-02 b 80 ## 580 2023-03-15 t 30 ## 581 2023-02-24 l 90 ## 582 2023-03-10 a 10 ## 583 2023-03-08 i 60 ## 584 2023-03-09 x 40 ## 585 2023-03-14 y 40 ## 586 2023-03-01 a 90 ## 587 2023-03-03 n 90 ## 588 2023-03-22 t 100 ## 589 2023-03-12 c 50 ## 590 2023-03-17 o 10 ## 591 2023-03-02 c 60 ## 592 2023-02-24 m 100 ## 593 2023-02-27 w 90 ## 594 2023-03-21 i 50 ## 595 2023-03-09 y 40 ## 596 2023-02-23 i 10 ## 597 2023-03-22 f 90 ## 598 2023-03-19 s 50 ## 599 2023-03-02 n 70 ## 600 2023-02-25 d 30 ## 601 2023-03-24 r 0 ## 602 2023-03-08 k 100 ## 603 2023-03-23 b 100 ## 604 2023-03-01 p 60 ## 605 2023-03-01 h 0 ## 606 2023-03-02 b 70 ## 607 2023-03-18 z 10 ## 608 2023-03-20 r 10 ## 609 2023-03-23 a 80 ## 610 2023-03-24 c 40 ## 611 2023-02-26 r 100 ## 612 2023-03-22 a 70 ## 613 2023-03-23 z 0 ## 614 2023-02-23 e 0 ## 615 2023-03-08 z 30 ## 616 2023-03-16 e 0 ## 617 2023-03-11 v 80 ## 618 2023-03-04 y 30 ## 619 2023-03-23 m 10 ## 620 2023-02-26 r 80 ## 621 2023-03-17 f 90 ## 622 2023-03-12 d 50 ## 623 2023-03-03 r 60 ## 624 2023-03-05 m 20 ## 625 2023-03-08 k 40 ## 626 2023-03-01 v 100 ## 627 2023-03-07 r 30 ## 628 2023-03-03 j 80 ## 629 2023-03-02 y 10 ## 630 2023-03-14 z 
40 ## 631 2023-02-28 g 70 ## 632 2023-03-08 c 40 ## 633 2023-03-21 p 90 ## 634 2023-02-27 z 50 ## 635 2023-03-13 j 30 ## 636 2023-02-25 u 50 ## 637 2023-03-17 y 60 ## 638 2023-02-26 z 100 ## 639 2023-03-18 x 70 ## 640 2023-03-01 l 40 ## 641 2023-03-05 r 10 ## 642 2023-03-16 v 100 ## 643 2023-02-28 u 0 ## 644 2023-03-15 h 70 ## 645 2023-03-09 d 20 ## 646 2023-03-20 t 30 ## 647 2023-03-07 m 20 ## 648 2023-03-11 z 10 ## 649 2023-03-05 g 90 ## 650 2023-03-24 l 60 ## 651 2023-03-02 b 0 ## 652 2023-03-04 r 10 ## 653 2023-03-08 x 80 ## 654 2023-03-03 v 70 ## 655 2023-03-18 v 70 ## 656 2023-02-26 h 90 ## 657 2023-03-21 j 60 ## 658 2023-03-08 q 0 ## 659 2023-03-06 w 60 ## 660 2023-03-05 k 70 ## 661 2023-03-11 o 80 ## 662 2023-03-08 c 70 ## 663 2023-03-23 z 40 ## 664 2023-03-11 r 40 ## 665 2023-03-02 r 80 ## 666 2023-03-02 u 20 ## 667 2023-03-10 r 90 ## 668 2023-03-07 k 30 ## 669 2023-03-15 y 40 ## 670 2023-03-11 s 60 ## 671 2023-03-07 f 10 ## 672 2023-03-16 n 90 ## 673 2023-03-20 w 70 ## 674 2023-03-17 t 100 ## 675 2023-02-26 v 90 ## 676 2023-02-27 q 90 ## 677 2023-02-26 b 20 ## 678 2023-03-13 s 30 ## 679 2023-03-04 r 80 ## 680 2023-02-24 w 20 ## 681 2023-02-28 e 70 ## 682 2023-03-19 n 30 ## 683 2023-03-19 e 80 ## 684 2023-03-24 s 20 ## 685 2023-03-11 l 40 ## 686 2023-03-21 e 20 ## 687 2023-02-26 s 0 ## 688 2023-03-06 m 20 ## 689 2023-03-05 i 100 ## 690 2023-02-23 c 50 ## 691 2023-03-02 n 100 ## 692 2023-03-16 m 40 ## 693 2023-03-05 v 0 ## 694 2023-03-23 n 60 ## 695 2023-03-22 a 20 ## 696 2023-03-14 f 70 ## 697 2023-03-15 l 50 ## 698 2023-02-23 s 90 ## 699 2023-03-24 x 80 ## 700 2023-03-11 e 70 ## 701 2023-03-12 j 100 ## 702 2023-03-18 l 100 ## 703 2023-03-12 r 0 ## 704 2023-02-27 a 60 ## 705 2023-03-13 s 20 ## 706 2023-03-12 h 60 ## 707 2023-03-22 e 70 ## 708 2023-03-21 c 40 ## 709 2023-03-05 y 70 ## 710 2023-03-06 y 40 ## 711 2023-02-23 x 20 ## 712 2023-03-21 h 40 ## 713 2023-03-09 p 60 ## 714 2023-03-24 j 90 ## 715 2023-03-04 h 70 ## 716 2023-03-05 u 80 ## 717 2023-02-26 
i 20 ## 718 2023-03-19 i 100 ## 719 2023-02-26 g 90 ## 720 2023-03-15 i 0 ## 721 2023-03-11 m 0 ## 722 2023-03-19 z 20 ## 723 2023-03-10 t 0 ## 724 2023-02-24 b 30 ## 725 2023-03-17 c 70 ## 726 2023-02-28 v 40 ## 727 2023-03-02 x 40 ## 728 2023-03-10 b 60 ## 729 2023-03-07 i 0 ## 730 2023-02-28 y 50 ## 731 2023-03-23 q 80 ## 732 2023-03-18 m 20 ## 733 2023-03-24 d 0 ## 734 2023-03-12 l 40 ## 735 2023-03-06 g 40 ## 736 2023-03-23 n 50 ## 737 2023-03-16 f 30 ## 738 2023-03-07 j 40 ## 739 2023-03-19 k 80 ## 740 2023-03-07 j 70 ## 741 2023-02-27 s 0 ## 742 2023-02-24 s 90 ## 743 2023-02-24 u 30 ## 744 2023-03-08 r 90 ## 745 2023-03-16 v 100 ## 746 2023-03-12 a 0 ## 747 2023-02-26 o 80 ## 748 2023-03-01 w 60 ## 749 2023-03-06 k 60 ## 750 2023-03-18 e 60 ## 751 2023-03-21 l 50 ## 752 2023-03-01 g 0 ## 753 2023-03-03 p 0 ## 754 2023-02-26 a 70 ## 755 2023-03-19 a 40 ## 756 2023-03-09 j 20 ## 757 2023-03-14 w 70 ## 758 2023-03-14 m 60 ## 759 2023-03-06 r 90 ## 760 2023-03-12 v 20 ## 761 2023-02-23 b 70 ## 762 2023-03-23 k 10 ## 763 2023-02-26 a 80 ## 764 2023-03-18 n 60 ## 765 2023-02-26 f 30 ## 766 2023-03-23 i 80 ## 767 2023-03-12 p 80 ## 768 2023-03-01 x 90 ## 769 2023-03-06 l 0 ## 770 2023-03-10 g 100 ## 771 2023-03-01 t 90 ## 772 2023-03-18 j 100 ## 773 2023-03-14 e 60 ## 774 2023-03-10 n 90 ## 775 2023-03-06 f 50 ## 776 2023-03-15 f 70 ## 777 2023-03-19 r 70 ## 778 2023-02-23 p 0 ## 779 2023-03-05 g 70 ## 780 2023-03-10 h 90 ## 781 2023-03-19 u 0 ## 782 2023-03-20 n 0 ## 783 2023-03-12 q 90 ## 784 2023-03-16 q 10 ## 785 2023-03-13 p 100 ## 786 2023-02-25 y 20 ## 787 2023-03-05 r 100 ## 788 2023-03-01 f 0 ## 789 2023-02-23 a 0 ## 790 2023-03-11 j 20 ## 791 2023-03-23 k 0 ## 792 2023-03-13 m 40 ## 793 2023-03-08 p 70 ## 794 2023-03-01 a 10 ## 795 2023-03-17 c 80 ## 796 2023-03-16 j 50 ## 797 2023-03-12 b 30 ## 798 2023-03-13 u 40 ## 799 2023-03-03 r 70 ## 800 2023-03-17 i 40 ## 801 2023-03-20 m 100 ## 802 2023-03-12 s 50 ## 803 2023-03-11 l 80 ## 804 2023-03-16 c 30 ## 
805 2023-03-15 j 40 ## 806 2023-03-04 c 90 ## 807 2023-03-03 a 0 ## 808 2023-03-15 d 90 ## 809 2023-03-06 h 20 ## 810 2023-03-07 n 10 ## 811 2023-03-11 j 90 ## 812 2023-02-26 r 50 ## 813 2023-02-28 w 40 ## 814 2023-03-10 b 40 ## 815 2023-02-25 s 60 ## 816 2023-03-15 f 40 ## 817 2023-03-23 f 40 ## 818 2023-03-07 f 50 ## 819 2023-03-17 k 10 ## 820 2023-03-09 y 10 ## 821 2023-03-20 r 30 ## 822 2023-03-05 w 100 ## 823 2023-03-01 y 100 ## 824 2023-02-23 o 80 ## 825 2023-03-02 c 90 ## 826 2023-03-18 u 70 ## 827 2023-03-22 y 50 ## 828 2023-03-16 s 50 ## 829 2023-03-24 r 40 ## 830 2023-03-21 r 80 ## 831 2023-03-10 l 40 ## 832 2023-03-09 t 20 ## 833 2023-03-23 b 20 ## 834 2023-03-11 k 40 ## 835 2023-03-24 a 70 ## 836 2023-03-06 g 90 ## 837 2023-03-23 r 70 ## 838 2023-03-06 x 10 ## 839 2023-03-20 k 100 ## 840 2023-02-23 i 40 ## 841 2023-03-14 f 30 ## 842 2023-02-23 c 20 ## 843 2023-03-07 e 40 ## 844 2023-03-19 c 50 ## 845 2023-03-05 p 0 ## 846 2023-03-17 r 40 ## 847 2023-02-26 r 0 ## 848 2023-03-23 d 60 ## 849 2023-03-20 f 90 ## 850 2023-03-02 b 60 ## 851 2023-03-10 n 50 ## 852 2023-03-02 o 50 ## 853 2023-03-15 u 50 ## 854 2023-03-21 q 100 ## 855 2023-03-17 t 40 ## 856 2023-02-24 n 0 ## 857 2023-02-28 m 90 ## 858 2023-03-18 n 20 ## 859 2023-03-18 b 50 ## 860 2023-03-07 n 10 ## 861 2023-02-28 c 60 ## 862 2023-02-24 c 40 ## 863 2023-03-16 n 60 ## 864 2023-03-20 z 50 ## 865 2023-03-23 y 50 ## 866 2023-03-11 q 40 ## 867 2023-03-14 n 40 ## 868 2023-03-03 p 40 ## 869 2023-03-15 b 60 ## 870 2023-03-20 j 80 ## 871 2023-03-06 i 10 ## 872 2023-03-07 m 80 ## 873 2023-03-22 w 60 ## 874 2023-03-15 s 10 ## 875 2023-03-13 p 90 ## 876 2023-03-12 h 90 ## 877 2023-03-09 k 0 ## 878 2023-03-10 t 30 ## 879 2023-03-18 b 100 ## 880 2023-03-15 f 30 ## 881 2023-03-22 w 70 ## 882 2023-03-06 m 70 ## 883 2023-03-07 f 30 ## 884 2023-03-05 d 40 ## 885 2023-02-26 p 0 ## 886 2023-03-11 a 100 ## 887 2023-03-16 o 100 ## 888 2023-02-27 u 10 ## 889 2023-02-24 b 50 ## 890 2023-03-10 g 10 ## 891 2023-03-17 i 20 
## 892 2023-03-05 c 20
## 893 2023-02-25 a 50
## 894 2023-03-14 v 10
## 895 2023-03-06 h 20
## 896 2023-03-23 e 10
## 897 2023-03-08 p 50
## 898 2023-03-14 q 40
## 899 2023-02-24 j 70
## 900 2023-03-18 d 50
## 901 2023-02-26 z 100
## 902 2023-03-07 n 100
## 903 2023-03-22 b 80
## 904 2023-03-16 r 80
## 905 2023-03-19 t 30
## 906 2023-03-15 d 80
## 907 2023-03-18 f 80
## 908 2023-02-25 o 10
## 909 2023-03-15 y 50
## 910 2023-02-28 p 40
## 911 2023-03-01 j 60
## 912 2023-03-07 e 0
## 913 2023-02-28 v 90
## 914 2023-02-24 t 70
## 915 2023-03-17 q 100
## 916 2023-02-25 o 0
## 917 2023-03-09 u 0
## 918 2023-03-22 z 100
## 919 2023-03-21 g 100
## 920 2023-03-06 m 20
## 921 2023-02-27 j 50
## 922 2023-03-21 p 0
## 923 2023-03-14 i 60
## 924 2023-03-22 o 30
## 925 2023-03-22 b 30
## 926 2023-03-23 m 90
## 927 2023-03-03 e 50
## 928 2023-03-12 o 0
## 929 2023-03-16 j 80
## 930 2023-02-23 j 50
## 931 2023-03-04 a 80
## 932 2023-03-11 t 90
## 933 2023-03-10 a 40
## 934 2023-02-28 f 90
## 935 2023-03-06 u 20
## 936 2023-03-15 o 40
## 937 2023-03-03 v 10
## 938 2023-03-04 c 30
## 939 2023-03-11 a 20
## 940 2023-02-24 z 50
## 941 2023-03-10 s 50
## 942 2023-03-06 s 40
## 943 2023-03-16 r 70
## 944 2023-03-15 j 70
## 945 2023-02-27 r 80
## 946 2023-03-10 h 10
## 947 2023-03-07 x 100
## 948 2023-03-16 t 80
## 949 2023-03-23 x 90
## 950 2023-03-03 t 50
## 951 2023-03-01 q 90
## 952 2023-03-05 k 50
## 953 2023-03-18 o 70
## 954 2023-03-03 u 60
## 955 2023-03-16 h 40
## 956 2023-03-19 n 0
## 957 2023-03-15 x 90
## 958 2023-03-03 i 20
## 959 2023-03-08 d 90
## 960 2023-03-11 z 50
## 961 2023-03-08 b 10
## 962 2023-03-05 l 50
## 963 2023-03-12 q 0
## 964 2023-03-03 c 30
## 965 2023-03-23 w 0
## 966 2023-03-10 w 30
## 967 2023-03-14 x 60
## 968 2023-03-22 n 60
## 969 2023-03-17 h 30
## 970 2023-03-13 h 0
## 971 2023-02-25 x 30
## 972 2023-03-16 d 50
## 973 2023-03-04 w 30
## 974 2023-03-06 k 0
## 975 2023-03-19 g 50
## 976 2023-03-11 o 10
## 977 2023-03-17 p 80
## 978 2023-02-27 q 60
## 
979 2023-03-24 g 0
## 980 2023-03-15 o 0
## 981 2023-02-23 h 30
## 982 2023-03-12 o 70
## 983 2023-03-20 e 80
## 984 2023-02-26 z 0
## 985 2023-03-17 h 100
## 986 2023-03-08 k 60
## 987 2023-03-07 q 40
## 988 2023-03-09 g 60
## 989 2023-02-23 h 10
## 990 2023-03-23 u 20
## 991 2023-03-20 f 50
## 992 2023-03-13 g 80
## 993 2023-03-02 i 0
## 994 2023-02-28 u 70
## 995 2023-03-24 v 40
## 996 2023-03-10 b 80
## 997 2023-03-19 j 100
## 998 2023-03-13 x 100
## 999 2023-03-01 p 30
## 1000 2023-03-15 l 90
```

---

- Warns you when you make mistakes (!):

```r
tib1$quanitty
```

<pre style="color: red;"><code>## Warning: Unknown or uninitialised column: `quanitty`.
</code></pre>

```
## NULL
```

```r
df1$quanitty
```

```
## NULL
```

---

- Can also create via `tribble()`:

```r
tribble(
  ~a, ~b, ~c,
  "a", 1, 2.2,
  "b", 2, 4.3,
  "c", 3, 3.4
)
```

```
## # A tibble: 3 × 3
##   a         b     c
##   <chr> <dbl> <dbl>
## 1 a         1   2.2
## 2 b         2   4.3
## 3 c         3   3.4
```

---

- Can build on top of variables during creation:

```r
tibble(x = 1:5, y = x^2)
```

```
## # A tibble: 5 × 2
##       x     y
##   <int> <dbl>
## 1     1     1
## 2     2     4
## 3     3     9
## 4     4    16
## 5     5    25
```

```r
data.frame(x = 1:5, y = x^2)
```

<pre style="color: red;"><code>## Error in data.frame(x = 1:5, y = x^2): object 'x' not found
</code></pre>

---

- Will never turn your strings into factors, will never change your column names:

```r
tib1 <- readr::read_csv("../data/another_bad_test_args.csv", col_types = cols())
colnames(tib1)
```

```
## [1] "col 1" "col 2" "col 3"
```

```r
tib1$`col 3`
```

```
## [1]  2  3 NA  4
```

```r
df1 <- read.csv("../data/another_bad_test_args.csv")
colnames(df1)
```

```
## [1] "col.1" "col.2" "col.3"
```

```r
df1$col.3
```

```
## [1]  2  3 NA  4
```

---

Though one ought to remember a `tibble` is still a `data.frame`:

```r
class(tib1)
```

```
## [1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"
```

```r
class(df1)
```

```
## [1] "data.frame"
```

---
class: section-slide

# End of Detour

---

## Consistency - Example: `stringr`

> a cohesive set of
functions designed to make working with strings as easy as possible.

```r
strings_vec <- c("I'm feeling fine", "I'm perfectly OK", "Nothing is wrong!")
str_length(strings_vec)
```

```
## [1] 16 16 17
```

```r
str_c(strings_vec, collapse = ", ")
```

```
## [1] "I'm feeling fine, I'm perfectly OK, Nothing is wrong!"
```

```r
str_sub(strings_vec, 1, 3)
```

```
## [1] "I'm" "I'm" "Not"
```

---

```r
str_detect(strings_vec, "I'm")
```

```
## [1]  TRUE  TRUE FALSE
```

```r
str_replace(strings_vec, "I'm", "You're")
```

```
## [1] "You're feeling fine" "You're perfectly OK" "Nothing is wrong!"
```

```r
str_split("Do you know regex?", " ")
```

```
## [[1]]
## [1] "Do"     "you"    "know"   "regex?"
```

```r
str_extract(strings_vec, "[aeiou]")
```

```
## [1] "e" "e" "o"
```

```r
str_count(strings_vec, "[A-Z]")
```

```
## [1] 1 3 1
```

---

## The Pipe

Remember this?

```r
mean_bi <- mean(okcupid$income[okcupid$sex == "f" &
                                 okcupid$age > 30 &
                                 okcupid$orientation == "bisexual"],
                na.rm = TRUE)
mean_gay <- mean(okcupid$income[okcupid$sex == "f" &
                                  okcupid$age > 30 &
                                  okcupid$orientation == "gay"],
                 na.rm = TRUE)
mean_straight <- mean(okcupid$income[okcupid$sex == "f" &
                                       okcupid$age > 30 &
                                       okcupid$orientation == "straight"],
                      na.rm = TRUE)
data.frame(orientation = c("bisexual", "gay", "straight"),
           income_mean = c(mean_bi, mean_gay, mean_straight))
```

```
##   orientation income_mean
## 1    bisexual   133421.05
## 2         gay    86489.36
## 3    straight    85219.74
```

---

Doesn't this make much more sense?

```r
okcupid %>%
  filter(sex == "f", age > 30) %>%
  group_by(orientation) %>%
  summarize(income_mean = mean(income, na.rm = TRUE))
```

```
## # A tibble: 3 × 2
##   orientation income_mean
##   <chr>             <dbl>
## 1 bisexual        133421.
## 2 gay              86489.
## 3 straight         85220.
```

- Read as:
  - Take the OKCupid data,
  - Keep only women over the age of 30,
  - And for each group of sexual orientation,
  - Give me the average income

---

- Make verbs, not nouns
- Can always access the output of the previous stage with "`.`":

```r
okcupid %>%
  filter(str_count(essay0) > median(str_count(.$essay0), na.rm = TRUE))
```

- Operates not just on data frames or tibbles:

```r
strings_vec %>% str_to_title()
```

```
## [1] "I'm Feeling Fine" "I'm Perfectly Ok" "Nothing Is Wrong!"
```

- No intermediate objects
- Don't strive to make the longest possible pipe (though it is a fun experiment)
- Tools exist for debugging (e.g. `%T>%`, the ViewPipeSteps package, ...)

---

And if you also want the median and the group size `n`:

```r
okcupid %>%
  filter(sex == "f", age > 30) %>%
  group_by(orientation) %>%
  summarize(income_mean = mean(income, na.rm = TRUE),
            income_median = median(income, na.rm = TRUE),
            n = n())
```

```
## # A tibble: 3 × 4
##   orientation income_mean income_median     n
##   <chr>             <dbl>         <dbl> <int>
## 1 bisexual        133421.         50000   652
## 2 gay              86489.         40000   664
## 3 straight         85220.         60000 10436
```

---

And if you want this for the age as well:

```r
okcupid %>%
  filter(sex == "f", age > 30) %>%
  group_by(orientation) %>%
  summarize(across(c(income, age),
                   list(mean = ~mean(.x, na.rm = TRUE),
                        median = ~median(.x, na.rm = TRUE))))
```

```
## # A tibble: 3 × 5
##   orientation income_mean income_median age_mean age_median
##   <chr>             <dbl>         <dbl>    <dbl>      <dbl>
## 1 bisexual        133421.         50000     37.8         36
## 2 gay              86489.         40000     40.4         38
## 3 straight         85220.         60000     40.7         38
```

Now *this* is a language for Data Science. But we're getting ahead of ourselves.

---

### In fact!
This became so popular that since R 4.1 there is a built-in pipe operator:

```r
okcupid |>
  filter(sex == "f", age > 30) |>
  group_by(orientation) |>
  summarize(across(c(income, age),
                   list(mean = \(x) mean(x, na.rm = TRUE),
                        median = \(x) median(x, na.rm = TRUE))))
```

```
## # A tibble: 3 × 5
##   orientation income_mean income_median age_mean age_median
##   <chr>             <dbl>         <dbl>    <dbl>      <dbl>
## 1 bisexual        133421.         50000     37.8         36
## 2 gay              86489.         40000     40.4         38
## 3 straight         85220.         60000     40.7         38
```

.insight[
💡 What else changed in R 4.1?
]

---

## `ggplot2`

<img src="images/ave_mariah_ggridges.png" style="width: 80%" />

.font80percent[
[Ave Mariah / Giora Simchoni](http://giorasimchoni.com/2017/12/10/2017-12-10-ave-mariah/)
]

---

## `ggplot2`

<img src="images/soviet_space_dogs.png" style="width: 50%" />

.font80percent[
[Soviet Space Dogs / David Smale](https://davidsmale.netlify.com/portfolio/soviet-space-dogs-part-2/)
]

---

## `ggplot2`

<img src="images/tennisBig4.gif" style="width: 70%" />

.font80percent[
[Federer, Nadal, Djokovic and Murray, Love.
/ Giora Simchoni](http://giorasimchoni.com/2017/05/01/2017-05-01-federer-nadal-djokovic-and-murray-love/)
]

---

## `ggplot2`

<img src="images/washington_heat.png" style="width: 40%" />

.font80percent[
[NYT-style urban heat island maps / Katie Jolly](https://www.katiejolly.io/blog/2019-08-28/nyt-urban-heat)
]

---

## `ggplot2`

<img src="images/marriage_by_state.png" style="width: 90%" />

.font80percent[
[A map of marriage rates, state by state / Unknown](https://www.r-graph-gallery.com/328-hexbin-map-of-the-usa.html)
]

---

## `ggplot2`

<img src="images/calendar_graph.png" style="width: 70%" />

.font80percent[
[Calendar-based graphics for visualizing people’s daily schedules / Earo Wang](https://pdf.earo.me/calendar-vis.pdf)
]

---

## The Community

.pull-left[

- 100% Open Source on GitHub
- Cheatsheets for everything
- Documentation for humans, package websites, webinars, free books (start with [R4DS](https://r4ds.had.co.nz/))
- [RStudio Community forum](https://community.rstudio.com/)
- [RLadies](https://rladies.org/) worldwide branches .font80percent[(who will pick up the 🥊 and create RLadies TLV?)]
- Very strong on Twitter [#rstats](https://twitter.com/search?q=%23rstats)

]

.pull-right[

<a href="https://rstudio.com/resources/cheatsheets/"><img src="images/stringr_cheatsheet.png" style="width: 100%" /></a>

]