In data-raw/quote-stats.R I calculated the length of each quote in characters, words, and sentences. Some quotes seem very long to me and should perhaps be reviewed manually.
library(statquotes)
library(stringr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
...
qt <- get_quotes()
text <- qt$text

Count characters, words, and sentences:
stats <- data.frame(
qid = qt$qid,
chars = str_count(text, boundary("character")),
words = str_count(text, boundary("word")),
sent = str_count(text, boundary("sentence")),
txt = substr(qt$text, 1, 40)
)

Which are the longest?
stats |>
dplyr::slice_max(words, n=12) |>
dplyr::arrange(qid)
#>    qid chars words sent txt
#> 1  297   860   151   10 The goals in statistics are to use data
#> 2  336   996   171    6 It is difficult to understand why statis
#> 3  349  1034   158    7 Scholars feel the need to present tables
#> 4  371  1155   194    9 It's not easy to select more than a few
#> 5  394   818   149    6 It was always important for the biometri
#> 6  426  1157   178   11 In contrast to the logical development a
#> 7  442  1071   161    7 An important distinction needs to be mad
#> 8  472  1015   146    7 In marked contrast to what is advocated
#> 9  521   920   172    6 What is the probability of obtaining a d
#> 10 524   988   170    8 We admit with Sir Winston Churchill that
#> 11 531  1145   191    8 An important part of the explanation [of
#> 12 602   926   156    6 An observation is judged significant, if

Created on 2023-10-08 with reprex v2.0.2
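
As a possible complement to taking the top 12, a fixed word cutoff could be used to flag quotes for manual review. The sketch below is only an illustration: the toy data frame stands in for the result of `get_quotes()` (only the `qid` and `text` columns are assumed), and `word_limit` is a hypothetical threshold, not something defined in the package.

```r
library(stringr)
library(dplyr)

# Toy stand-in for get_quotes(); only qid and text columns are assumed
qt <- data.frame(
  qid = 1:3,
  text = c(
    "Short quote.",
    "Another brief one. Two sentences.",
    strrep("word ", 200)   # artificially long quote
  )
)

# Hypothetical cutoff for "too long"; adjust to taste
word_limit <- 140

# Count words per quote and keep only those above the cutoff
flagged <- qt |>
  mutate(words = str_count(text, boundary("word"))) |>
  filter(words > word_limit) |>
  select(qid, words)

print(flagged)
```

A fixed cutoff keeps the review list stable as new quotes are added, whereas `slice_max()` always returns the same number of rows regardless of how long the quotes actually are.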