incorrect results from empirical_cdf? #18

@mikoontz

Hello!

I'm getting unexpected results from mltools::empirical_cdf() that differ from a base R implementation as well as the Emcdf::emcdf() implementation. Am I missing something that would make the {mltools} implementation correct and the other ones wrong?

Here's a reproducible example:

library(data.table)
library(mltools)
library(Emcdf)
library(purrr)

set.seed(1235)
(data <- as.matrix(data.frame(x = rnorm(n = 5), y = rnorm(n = 5), z = rnorm(n = 5))))

The data look like this:

              x          y          z
[1,] -0.6979879 1.69819652 -0.9403661
[2,] -1.2848539 0.04784562  1.0849639
[3,]  0.9899590 0.65486241 -0.7501569
[4,]  0.1117758 1.36528367 -0.4216928
[5,]  0.1142077 0.40257296 -0.8231759

And the three ECDF computations being compared look like this:

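dt <- data.table(data)
# Emcdf object built from the data (second argument: number of threads)
obj <- initF(data, 2)

# Base R: share of observations that are <= each row in all three coordinates
(base_R <- pmap_dbl(dt, .f = function(x, y, z) {
  mean(data[, "x"] <= x & data[, "y"] <= y & data[, "z"] <= z)
}))
# Emcdf: joint ECDF evaluated at every observed row
(Emcdf_package <- emcdf(obj, data))
# mltools: joint ECDF with the observed rows passed as upper bounds
(mltools_package <- empirical_cdf(dt, ubounds = dt)$CDF)

# Side-by-side comparison, one row per observation
(results <- data.frame(base_R, Emcdf_package, mltools_package))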

Which yields:

  base_R Emcdf_package mltools_package
1    0.2           0.2             0.2
2    0.2           0.2             0.2
3    0.4           0.4             0.2
4    0.2           0.2             0.2
5    0.2           0.2             0.2
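
As a sanity check on the one row where the results disagree, the joint ECDF can be recomputed with base R alone. This is only a sketch: the values are hard-coded from the printed matrix above (so they are rounded), and the names `dominated` and `ecdf_manual` are introduced here purely for illustration.

```r
# Values copied from the printed matrix above (rounded as displayed).
data <- matrix(
  c(-0.6979879, 1.69819652, -0.9403661,
    -1.2848539, 0.04784562,  1.0849639,
     0.9899590, 0.65486241, -0.7501569,
     0.1117758, 1.36528367, -0.4216928,
     0.1142077, 0.40257296, -0.8231759),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("x", "y", "z"))
)

# dominated[j, i] is TRUE when row j is <= row i in every coordinate.
dominated <- sapply(seq_len(nrow(data)), function(i) {
  apply(data, 1, function(row) all(row <= data[i, ]))
})

# Joint ECDF at each observed point: share of rows dominated by that point.
ecdf_manual <- colMeans(dominated)
print(ecdf_manual)

# Rows that count towards the ECDF value at row 3.
print(which(dominated[, 3]))
```

Under this definition, rows 3 and 5 are the only observations that are less than or equal to row 3 in all three coordinates, which gives 2/5 = 0.4, in line with the base R and Emcdf columns.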