Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R-devel changes set operations leading to a need for an update #132

Closed
eddelbuettel opened this issue Sep 9, 2024 · 19 comments
Closed

R-devel changes set operations leading to a need for an update #132

eddelbuettel opened this issue Sep 9, 2024 · 19 comments
Assignees

Comments

@eddelbuettel
Copy link
Owner

eddelbuettel commented Sep 9, 2024

Fresh in via email from Vienna:

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_nanotime.html.

Specifically, please see the ERRORs shown for the r-devel checks.

These are from ongoing efforts to "improve" the set operations: whereas
they previously always transformed their operands via as.vector(), they
now no longer do do when both operands are of the same kind and appear
to be vector-like. In addition, non-NULL operands are now always
canonicalized via applying unique() and dropping names.

The former seems to cause problems in your code: can you please fix "as
necessary"?

Note that results of set operations are best compared using setequal().

Please correct before 2024-09-30 to safely retain your package on CRAN.

I am building rocker/r-devel weekly, the last one is currently dated Sep 8 ie was rebuilt last night.

@lsilvest can you take a look? I will try to repro here as well.

@eddelbuettel
Copy link
Owner Author

A first attempt to reproduce under very current r-devel works. Details follow.

  1. I start for convenience from docker pull rocker/r-devel which shows the output that follows. and launch that container from the nanotime source repository (options -v $(pwd):/work to mount outer current dir as /work inside container and -w /work to start in /work).
root@ec707d7c7bd1:/work/tests# RDscript --version
Rscript (R) version 4.5.0 Under development (unstable) (2024-09-08 r87106)
root@ec707d7c7bd1:/work/tests# 
  1. I run apt update -qq and then apt install the r-cran-* packages for cctz, date, bit64, data.table, zoo, xts as well as tinytable. They end up (as "distro packaged") in /usr/lib/R/site-library.

  2. For a reason I no longer recall I once decided R-devel in that container should not access that directory. So I use an editor (vi) to edit /usr/local/lib/R/etc/Renviron and add the directory /usr/lib/R/site-library in the third slot in the last line. Check with RDscript -e '.libPaths()'

  3. From the mounted source directory I install using R-devel: RD CMD INSTALL . (note the RD; an initial ./cleanup is good too.

  4. The run RDscript tests/tinytest.R to see one fail on nanoival or run RDscript -e 'tinytest::run_test_file("tests/test_nanoival.R")'

@eddelbuettel
Copy link
Owner Author

The crux seems to be that

> setdiff(nanotime(1:10), nanotime(2:9))
 [1] 1970-01-01T00:00:00.000000001+00:00 1970-01-01T00:00:00.000000002+00:00
 [3] 1970-01-01T00:00:00.000000003+00:00 1970-01-01T00:00:00.000000004+00:00
 [5] 1970-01-01T00:00:00.000000005+00:00 1970-01-01T00:00:00.000000006+00:00
 [7] 1970-01-01T00:00:00.000000007+00:00 1970-01-01T00:00:00.000000008+00:00
 [9] 1970-01-01T00:00:00.000000009+00:00 1970-01-01T00:00:00.000000010+00:00
>

is now 'bad'. The hint by @kurthornik to rely on setequal() instead does not really work yet as we need to compute a difference.

> setequal(nanotime(1:10), nanotime(2:9))
[1] FALSE
>

While setdiff() croaks, casting works

> setdiff(as.numeric(nanotime(1:10)), as.numeric(nanotime(2:9)))
[1]  1 10
> setdiff(bit64::as.integer64(nanotime(1:10)), bit64::as.integer64(nanotime(2:9)))
integer64
[1] 1  10
> 

so I guess we need a custom setdiff.nanotime here, or its ops equivalent for S4?

@eddelbuettel
Copy link
Owner Author

Hm, but we have

##' @rdname set_operations
setMethod("intersect",
          c("nanotime", "nanotime"),
          function(x, y) {
              res <- callNextMethod()
              oldClass(res) <- "integer64"
              new("nanotime", res)
          })

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Sep 9, 2024

This works:

##' @rdname set_operations
setMethod("setdiff",
          c("nanotime", "nanotime"),
          function(x, y) {
              #res <- callNextMethod()
              #oldClass(res) <- "integer64"
              res <- setdiff(as.integer64(x), as.integer64(y))
              new("nanotime", res)
          })

but it is above my S4 paygrade why callNextMethod() does not. @kurthornik, any idea?

root@ec707d7c7bd1:/work# RDscript -e 'library(nanotime); setdiff(nanotime(1:10), nanotime(2:9))'
[1] 1970-01-01T00:00:00.000000001+00:00 1970-01-01T00:00:00.000000010+00:00
root@ec707d7c7bd1:/work# 

and with that

root@ec707d7c7bd1:/work# RDscript tests/tinytest.R 
test_data.frame.R.............   16 tests OK 0.2s
test_data.table.R.............   16 tests OK 100ms
test_nanoduration.R...........  252 tests OK 0.2s
test_nanoival.R...............  304 tests OK 0.6s
test_nanoperiod.R.............  336 tests OK 0.2s
test_nanotime.R...............  267 tests OK 0.2s
test_ops.R....................   58 tests OK 59ms
test_xts.R....................    0 tests    1ms [Exited at #43: Skip xts tests for now]
test_zoo.R....................    1 tests OK 8ms
All ok, 1250 results (1.5s)
root@ec707d7c7bd1:/work# 

@kurthornik
Copy link

Dirk, not really.
I would have thought that using
callNextMethod(as.integer64(x), as.integer64(y))
would work too, but I just verified that it does not, whereas
setdiff(as.integer64(x), as.integer64(y))
indeed makes the checks pass again.
However, unfortunately only for R-devel ...

@eddelbuettel
Copy link
Owner Author

Thanks for the follow-up, Kurt. What is very, very confusing to me is that we have three essentially identical dispatchers there for intersect(), union() and setdiff() and the first two work and the third does not? Hm -- definitely weird. In our current main branch:

nanotime/R/nanoival.R

Lines 796 to 825 in 519c4b7

## provide 'nanotime'-'nanotime' set operations and document here
## ---------------------
##' @rdname set_operations
setMethod("intersect",
c("nanotime", "nanotime"),
function(x, y) {
res <- callNextMethod()
oldClass(res) <- "integer64"
new("nanotime", res)
})
##' @rdname set_operations
setMethod("union",
c("nanotime", "nanotime"),
function(x, y) {
res <- callNextMethod()
oldClass(res) <- "integer64"
new("nanotime", res)
})
##' @rdname set_operations
setMethod("setdiff",
c("nanotime", "nanotime"),
function(x, y) {
res <- callNextMethod()
oldClass(res) <- "integer64"
new("nanotime", res)
})

@eddelbuettel
Copy link
Owner Author

Could it be that the dispatches to bit64 and its integer64 "somehow" get lost? When I run the commands in setdiff by hand all is well too:

> library(nanotime)
> u <- as.vector(nanotime(1:10))
> v <- as.vector(nanotime(2:9))
>
> !duplicated(unclass(u)) & (match(u, v, 0L) == 0L)      # inner expression
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
> u[!duplicated(unclass(u)) & (match(u, v, 0L) == 0L)]    # works as subset
[1] 4.940656e-324 4.940656e-323
> as.nanotime(u[!duplicated(unclass(u)) & (match(u, v, 0L) == 0L)])   # reclassed
[1] 1970-01-01T00:00:00+00:00 1970-01-01T00:00:00+00:00
> 

@kurthornik
Copy link

kurthornik commented Sep 9, 2024 via email

@eddelbuettel
Copy link
Owner Author

My bad for being asleep in an emacs buffer with R-release...

The missing unique() seems like a viable avenue. I am sure we will hear from @lsilvest soon too.

@eddelbuettel
Copy link
Owner Author

Well well well:

Unstaged changes (2)
modified   R/nanoival.R
@@ -814,6 +814,14 @@ setMethod("union",
               new("nanotime", res)
           })
 
+setMethod("unique",
+          "nanotime",
+          function(x) {
+              res <- callNextMethod()
+              oldClass(res) <- "integer64"
+              new("nanotime", res)
+          })
+
 ##' @rdname set_operations
 setMethod("setdiff",
           c("nanotime", "nanotime"),

leads to

root@ec707d7c7bd1:/work# RDscript -e 'library(nanotime); setdiff(nanotime(1:10), nanotime(2:9))'
[1] 1970-01-01T00:00:00.000000001+00:00 1970-01-01T00:00:00.000000010+00:00
root@ec707d7c7bd1:/work# 

which is what we want. No side-effects as far as I can tell from a quick look:

root@ec707d7c7bd1:/work# RDscript tests/tinytest.R 
test_data.frame.R.............   16 tests OK 0.1s
test_data.table.R.............   16 tests OK 98ms
test_nanoduration.R...........  252 tests OK 0.2s
test_nanoival.R...............  304 tests OK 0.6s
test_nanoperiod.R.............  336 tests OK 0.2s
test_nanotime.R...............  267 tests OK 0.2s
test_ops.R....................   58 tests OK 58ms
test_xts.R....................    0 tests    1ms [Exited at #43: Skip xts tests for now]
test_zoo.R....................    1 tests OK 9ms
All ok, 1250 results (1.5s)
root@ec707d7c7bd1:/work# 

@eddelbuettel
Copy link
Owner Author

Sadly with this (new method for unique) approach we end up in the same quagmire we were in last week: passes interactovely, passes the tests yet RD CMD check croaks:

  ----- FAILED[data]: test_nanoival.R<1105--1105>
   call| expect_identical(setdiff(nanotime(1:10), nanotime(2:9)), c(nanotime(1), 
   call| -->    nanotime(10)))
   diff| Numeric: lengths (2, 10) differ
  Error: 1 out of 1250 tests failed
  Execution halted

Uggh.

@eddelbuettel
Copy link
Owner Author

The following tests ok under r-release and r-devel, but it is of 'not pure' as we should S4 dispatch 🙄

##' @rdname set_operations
setMethod("setdiff",
          c("nanotime", "nanotime"),
          function(x, y) {
              ##res <- callNextMethod()
              res <- setdiff(as.integer64(x), as.integer64(y))
              oldClass(res) <- "integer64"
              new("nanotime", res)
          })

@lsilvest
Copy link
Collaborator

@eddelbuettel, when you define the unique method above, you do add the exportMethods(unique) in the NAMESPACE file, right?

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Sep 10, 2024

I had not yet let me add that -- good catch.

Which promptly gets me into trouble:

Error: package or namespace load failed for ‘nanotime’:
 Function found when exporting methods from the namespace ‘nanotime’ which is not S4 generic: ‘unique

... because I had the function commented out. My bad.

This leads to

  • two failures under r-release (see below)
  • no issues under r-devel

Under r-release:

  ----- FAILED[data]: test_nanoival.R<750--750>
   call| expect_identical(intersect(nanotime(1:10), nanotime(5:15)), nanotime(5:10))
   diff| Numeric: lengths (6, 10) differ
  ----- FAILED[data]: test_nanoival.R<861--861>
   call| expect_identical(union(nanotime(1:2), nanotime(8:10)), c(nanotime(1:2), 
   call| -->    nanotime(8:10)))
   diff| Numeric: lengths (5, 2) differ
  Error: 2 out of 1250 tests failed
  Execution halted

PS Scratch that. That was a side effect from temp. directory issues.

@eddelbuettel
Copy link
Owner Author

So with a little break for the morning run I am now at a state where r-release and r-devel pass checks.

But it is kinda ugly:

R code

##' @rdname set_operations
setMethod("setdiff",
          c("nanotime", "nanotime"),
          function(x, y) {
              res <- if (getRversion() >= "4.5.0") setdiff(as.integer64(x), as.integer64(y)) else callNextMethod()
              oldClass(res) <- "integer64"
              new("nanotime", res)
          })

and

if (getRversion() > "4.5.0")  {
##' @rdname nanotime
setMethod("unique",
          "nanotime",
          function(x) {
              res <- callNextMethod()
              oldClass(res) <- "integer64"
              new("nanotime", res)
          })
}

NAMESPACE

if (getRversion() > "4.5.0") exportMethods(unique)

I cannot get it to work without the ugly-ish dispatch to setdiff() for integer64.

@kurthornik Any sage advice?

@eddelbuettel
Copy link
Owner Author

Given the time limit we have to address this, and the nod in email by @lsilvest, I plan to upload this in a day or two to get us out of the deadline. We can probably revisit later as needed, this is all conditional on R 4.5.0 or later anyway.

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Sep 15, 2024

Guess I used 'fixes #132' instead of 'closes #132' in just-merged PR #133 so I get to do this by hand. May as well wait til the update is at CRAN.

@eddelbuettel
Copy link
Owner Author

Addressed in release 0.3.10 now on CRAN.

@kurthornik
Copy link

Great. thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants