Some quotes seem to be excessively long #35

@friendly

In data-raw/quote-stats.R I calculated the length of each quote in characters, words, and sentences. Some quotes seem very long to me and should perhaps be reviewed manually.

library(statquotes)
library(stringr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
 ...
qt <- get_quotes()
text <- qt$text

Count characters, words, and sentences:

stats <- data.frame(
  qid = qt$qid,
  chars = str_count(text, boundary("character")),
  words = str_count(text, boundary("word")),
  sent = str_count(text, boundary("sentence")),
  txt = substr(qt$text, 1, 40)
)
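
For anyone unfamiliar with `boundary()`, here is a minimal sketch of the same three counts on toy input. The strings and the `demo_text` / `demo_stats` names are made up for illustration and are not from the package; only `str_count()` and `boundary()` come from stringr.

```r
library(stringr)

# Toy strings (hypothetical, not taken from statquotes)
demo_text <- c(
  "Hello world. How are you?",
  "This is one sentence."
)

# Same three counts as in the issue, applied to the toy strings.
# boundary("word") counts words only, skipping spaces and punctuation.
demo_stats <- data.frame(
  chars = str_count(demo_text, boundary("character")),
  words = str_count(demo_text, boundary("word")),
  sent  = str_count(demo_text, boundary("sentence"))
)

demo_stats
```

The sentence count relies on ICU's sentence-break rules, so abbreviations or unusual punctuation inside a quote may shift `sent` by one or two.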

Which are the longest?

stats |>
  dplyr::slice_max(words, n = 12) |>
  dplyr::arrange(qid)
#>    qid chars words sent                                      txt
#> 1  297   860   151   10 The goals in statistics are to use data 
#> 2  336   996   171    6 It is difficult to understand why statis
#> 3  349  1034   158    7 Scholars feel the need to present tables
#> 4  371  1155   194    9 It's not easy to select more than a few 
#> 5  394   818   149    6 It was always important for the biometri
#> 6  426  1157   178   11 In contrast to the logical development a
#> 7  442  1071   161    7 An important distinction needs to be mad
#> 8  472  1015   146    7 In marked contrast to what is advocated 
#> 9  521   920   172    6 What is the probability of obtaining a d
#> 10 524   988   170    8 We admit with Sir Winston Churchill that
#> 11 531  1145   191    8 An important part of the explanation [of
#> 12 602   926   156    6 An observation is judged significant, if
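
As an alternative to a fixed top-12, a word-count cutoff could be used to build the review list. The sketch below is only an illustration: the 150-word cutoff is an assumed value, `stats_demo` is a stand-in for `stats` using three rows copied from the output above, and qid 9999 is a made-up short quote added so the filter has something to drop.

```r
library(dplyr)

# Stand-in for `stats`: three rows from the output above plus one
# hypothetical short quote (qid 9999 does not exist in the package).
stats_demo <- data.frame(
  qid   = c(297, 371, 531, 9999),
  words = c(151, 194, 191, 12)
)

# Assumed cutoff, chosen for illustration only
word_cutoff <- 150

# Keep quotes above the cutoff, longest first
to_review <- stats_demo |>
  filter(words > word_cutoff) |>
  arrange(desc(words))

to_review$qid
```

With the real data, the resulting `qid` values could then be used to pull the full text from `get_quotes()` for manual review.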

Created on 2023-10-08 with reprex v2.0.2
