7 changes: 6 additions & 1 deletion NAMESPACE
@@ -1 +1,6 @@
exportPattern("^[[:alpha:]]+")
# Generated by roxygen2: do not edit by hand

export(meanimpute)
export(transform_log)
export(windsorize)
import(stats)
14 changes: 13 additions & 1 deletion R/meanimpute.R
@@ -1,6 +1,18 @@
#' Meanimputation
#' Meanimpute
#'
#' Replace NA's with mean value
#' @param x a numeric vector
#' @return new vector, where NA's are replaced by the mean of \code{x}
#' @examples
#' example_vector=c(1,5,NA,NA)
#' meanimpute(example_vector)
#' @export
meanimpute <- function(x) {

if(is.null(x)) {stop("Input vector cannot be NULL.")}
if(all(is.na(x))) {stop("Input vector should contain at least one numeric element.")}
Review comment (Member):

Being a bit picky here: the message should read "Input vector should contain at least one numeric element which is not NA".

if(any(is.numeric(x)==FALSE)) {stop("Input vector should contain at least one numeric element.")}

x[is.na(x)] <- mean(x, na.rm = TRUE)
x
}
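
For illustration, the review suggestion above could be applied roughly as in the following sketch. It is not part of the PR: the name `meanimpute_sketch` and the wording of the "must be numeric" message are assumptions. Since `is.numeric()` returns a single TRUE/FALSE for the whole vector, the sketch drops the `any(... == FALSE)` wrapper and uses `!` instead.

```r
# Hypothetical variant of meanimpute(); not the code under review.
meanimpute_sketch <- function(x) {
  if (is.null(x)) stop("Input vector cannot be NULL.")
  # Message wording follows the review comment.
  if (all(is.na(x))) {
    stop("Input vector should contain at least one numeric element which is not NA.")
  }
  # is.numeric() describes the whole vector, so a plain negation is enough.
  # The message text here is an assumption, chosen to match the test.
  if (!is.numeric(x)) stop("Input vector must be numeric.")
  # Replace every NA with the mean of the observed values.
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

meanimpute_sketch(c(2, 4, 6, NA))
```

With this ordering, an all-NA input still hits the NA message first, as in the PR.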
20 changes: 20 additions & 0 deletions R/transform_log.R
@@ -0,0 +1,20 @@
#' transform_log
#'
#' log-transformation of a numeric vector. For details about log-transformation please see your basic school math textbook.
Review comment (Member):

Please remove the last sentence and instead either

  1. explain why log transformations are typically used, and/or
  2. link to a web page that describes the transformation in detail.

#' @param x a numeric vector
#' @return log-transformed vector \code{x}
#' @examples
#' example_vector=c(1,2,3,4,5,6,7,8,9,10)
#' transform_log(example_vector)
#' @export

transform_log <- function(x){

if( is.null(x) ) stop("Input vector is not allowed to be NULL.")
if( any(is.na(x)) ) stop("There is at least one NA value in input vector.")
Review comment (Member):

The stop() could also be turned into a warning().

if( any(x <= 0) ) stop("There is at least one negative value.")
if( any(is.numeric(x) == FALSE) ) stop("There is at least one non-numeric value.")
Review comment (Member):

Better to use the negation operator here: !is.numeric(x)

y<-log(x)
return(y)

}
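
As a rough sketch of how the two review comments above might be combined (not the code under review), the variant below uses `!is.numeric(x)` and downgrades the NA check to a warning. The name `transform_log_sketch`, the choice to pass NAs through as NA, the reordering of the checks, and the "non-positive" message wording are all assumptions. The type check is moved ahead of the `x <= 0` comparison so that character input is rejected before any comparison happens.

```r
# Hypothetical variant of transform_log(); not the code under review.
transform_log_sketch <- function(x) {
  if (is.null(x)) stop("Input vector is not allowed to be NULL.")
  # is.numeric() returns one TRUE/FALSE for the whole vector.
  if (!is.numeric(x)) stop("There is at least one non-numeric value.")
  # Assumption: NAs only trigger a warning and stay NA in the result.
  if (anyNA(x)) warning("There is at least one NA value in input vector.")
  # The test x <= 0 also rejects zero, so the message says "non-positive".
  if (any(x <= 0, na.rm = TRUE)) stop("There is at least one non-positive value.")
  log(x)
}

transform_log_sketch(c(1, 1, 1))
```

Note that changing a message or turning an error into a warning would also require updating the corresponding `expect_error()` tests.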
36 changes: 31 additions & 5 deletions R/windsorize.R
@@ -1,10 +1,36 @@
#' Windsorize
#'
#' Do some windsorization.
#' Its purpose is to limit the influence of outliers in the following way: the (0.5 +- p/2)th quantiles are calculated, and all
#' values above (below) the upper (lower) quantile are replaced by that quantile.
#' @param x a numeric vector
#' @param p quantile
#' @return Windsorized vector \code{x}
#' @examples
#' example_vector=c(-1000,1,2,3,4,5,6,7,8,9,1000)
#' windsorize(example_vector, 0.9)
#'
#' example_vector=rnorm(100)
#' windsorize(example_vector, 0.9)
#' @export
#' @import stats

windsorize <- function(x, p = .90) {
q <- quantile(x, p)
x[x >= q] <- q
x

if(is.null(x)) {stop("Input vector cannot be NULL.")}
if(any(is.na(x))) {stop("There should be no NA's in input vector.")}
Review comment (Member):

  1. Why are NAs not allowed? What would be a workaround that still windsorizes the vector and keeps the NAs?
  2. Typo: "There should be no NA's in the input..."

if(all(is.numeric(x)==FALSE)) {stop("There should only numeric values in the input vector.")}
Review comment (Member):

Again, use the negation operator !.


if(is.na(p)==TRUE) {stop("Input quantile should be a number between 0 and 1 ")}
Review comment (Member):

The test does not match the error message.

if(is.numeric(p)==FALSE) {stop("Input quantile should be a number between 0 and 1 ")}
Review comment (Member):

Use the negation operator ! here as well.

if(p > 1) {stop("Input quantile should be a number between 0 and 1 ")}
if(p < 0) {stop("Input quantile should be a number between 0 and 1 ")}


q_u <- quantile(x, 0.5 + p/2)
x[x >= q_u] <- q_u

q_l <- quantile(x, 0.5 - p/2)
x[x <= q_l] <- q_l

return(x)
}
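
One possible answer to the reviewer's question about NAs, sketched under assumptions and not taken from the PR: compute the quantiles with `na.rm = TRUE` and clamp with `pmin()`/`pmax()`, which leave NA entries untouched. The name `windsorize_sketch` and the single combined check on `p` are assumptions. The sketch also computes both bounds from the original data; the PR derives the lower quantile after the upper tail has already been replaced, which can give a slightly different bound in small samples.

```r
# Hypothetical variant of windsorize(); not the code under review.
windsorize_sketch <- function(x, p = 0.90) {
  if (is.null(x)) stop("Input vector cannot be NULL.")
  if (!is.numeric(x)) stop("There should be only numeric values in the input vector.")
  # One combined test, so the condition and the message describe the same thing.
  if (!is.numeric(p) || length(p) != 1 || is.na(p) || p < 0 || p > 1) {
    stop("Input quantile should be a number between 0 and 1.")
  }
  # Both bounds come from the original data; NAs are ignored here.
  bounds <- stats::quantile(x, c(0.5 - p / 2, 0.5 + p / 2),
                            na.rm = TRUE, names = FALSE)
  # Clamp to [lower, upper]; NA entries remain NA.
  pmin(pmax(x, bounds[1]), bounds[2])
}

windsorize_sketch(c(-1000, 1, 2, 3, NA, 5, 6, 7, 8, 9, 1000), 0.9)
```

For the NA-free example vector used in the PR's tests, this sketch gives the same result as the code under review.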

14 changes: 12 additions & 2 deletions man/meanimpute.Rd

Generated file; the diff is not rendered by default.

21 changes: 21 additions & 0 deletions man/transform_log.Rd

Generated file; the diff is not rendered by default.

18 changes: 17 additions & 1 deletion man/windsorize.Rd

Generated file; the diff is not rendered by default.

4 changes: 4 additions & 0 deletions tests/testthat.R
@@ -0,0 +1,4 @@
library(testthat)
library(datacleaner)

test_check("datacleaner")
7 changes: 7 additions & 0 deletions tests/testthat/test_correct_input_meanimpute.R
@@ -0,0 +1,7 @@
test_that("Incorrect input of meanimpute", {
#tests related to input vector
expect_error(meanimpute(NULL),"Input vector cannot be NULL.")
expect_error(meanimpute(c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA)),"Input vector should contain at least one numeric element.")
expect_error(meanimpute(c(1,2,3,4,"Dracula")), "Input vector should contain at least one numeric element.")

})
8 changes: 8 additions & 0 deletions tests/testthat/test_correct_input_transform_log.R
@@ -0,0 +1,8 @@
test_that("Input of transform_log() is correct.", {
#tests related to input vector
expect_error(transform_log(NULL), "Input vector is not allowed to be NULL.")
expect_error(transform_log(c(NA,NA,NA,6,NA,5,NA,NA,7,NA,NA)), "There is at least one NA value in input vector.")
expect_error(transform_log(c(1,2,3,4,5,6,7,8,"string",1000)), "There is at least one non-numeric value.")
expect_error(transform_log(c(1,2,3,4,5,6,7,8,-5)), "There is at least one negative value.")

})
10 changes: 10 additions & 0 deletions tests/testthat/test_correct_input_windsorization.R
@@ -0,0 +1,10 @@
test_that("Incorrect input", {
#tests related to input vector
expect_error(windsorize(NULL, .9), "Input vector cannot be NULL.")
expect_error(windsorize(c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA), .9), "There should be no NA's in input vector.")
expect_error(windsorize(c(-1000,1,2,3,4,5,6,7,8,"string",1000), .9), "There should only numeric values in the input vector.")
expect_error(windsorize(c(-1000,1,2,3,4,6,7,8,9,1000), "string"), "Input quantile should be a number between 0 and 1.")
expect_error(windsorize(c(-1000,1,2,3,4,6,7,8,9,1000), 2), "Input quantile should be a number between 0 and 1")
expect_error(windsorize(c(-1000,1,2,3,4,6,7,8,9,1000), -2), "Input quantile should be a number between 0 and 1")

})
3 changes: 3 additions & 0 deletions tests/testthat/test_correct_result_meanimpute.R
@@ -0,0 +1,3 @@
test_that("NA's in vector are correctly replaced by mean", {
expect_equal(meanimpute(c(2,4,6,NA)), c(2,4,6,4))
})
3 changes: 3 additions & 0 deletions tests/testthat/test_correct_result_transform_log.R
@@ -0,0 +1,3 @@
test_that("Vector is correctly log-transformed", {
expect_equal(transform_log(c(1,1,1)), c(0,0,0))
})
3 changes: 3 additions & 0 deletions tests/testthat/test_correct_result_windsorization.R
@@ -0,0 +1,3 @@
test_that("vector is correctly windsorized", {
expect_equal(windsorize(c(-1000,1,2,3,4,5,6,7,8,9,1000),0.9), c(-499.5,1,2,3,4,5,6,7,8,9,504.5) )
})