Time namespaces

By Jonathan Corbet
September 21, 2018

The kernel's namespace abstraction allows different groups of processes to have different views of the system. This feature is most often used with containers; it allows each container to have its own view of the set of running processes, the network environment, the filesystem hierarchy, and more. One aspect of the system that remains universal, though, is the concept of the system time. The recently posted time namespace patch set (from Dmitry Safonov with a lot of work by Andrei Vagin) seeks to change that.

Creating a virtualized view of the system time is not a new concept; Jeff Dike posted an implementation back in 2006 to support his user-mode Linux project. Those patches were not merged at the time but, since then, the use of containers has taken off and the interest has increased. One might view time as a universal concept, but there are use cases for a per-container notion of time; they can be as simple as testing software at different points in time. The driving force behind this patch set, though, is likely to be problems associated with the checkpointing of processes and migrating them between physical hosts. When a process is restarted, it should have a consistent view of time, and that may require applying some adjustments at restart time.

The implementation is straightforward enough. Each time namespace contains a set of offsets to be added to the system's notion of the current time. The kernel maintains a number of clocks with different characteristics (documented here), each of which can have a different offset. Some of these clocks, such as CLOCK_MONOTONIC, have an undefined start point that will vary from one running system to the next, so they will need their own offsets to maintain consistent behavior for a container that has been migrated. System calls that adjust the system time will, when called outside of the root time namespace, adjust the namespace-specific offsets instead.
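The offset scheme described above can be modeled in a few lines of user-space Python. This is only a sketch of the idea, not the kernel implementation: the `TimeNamespace` class, its method names, and the fake host clock are all hypothetical and chosen for illustration.

```python
import time


class TimeNamespace:
    """Toy user-space model of a time namespace: one offset per clock."""

    def __init__(self, read_host_clock=time.clock_gettime):
        # read_host_clock(clock_id) returns the host's value for that clock.
        self._read_host_clock = read_host_clock
        self._offsets = {}  # clock_id -> offset added to the host value

    def clock_gettime(self, clock_id):
        """Return the host clock plus this namespace's offset for it."""
        return self._read_host_clock(clock_id) + self._offsets.get(clock_id, 0)

    def clock_settime(self, clock_id, value):
        """Setting the time changes only the namespace offset, not the host."""
        self._offsets[clock_id] = value - self._read_host_clock(clock_id)


# Demonstration with a fake host clock so the numbers are reproducible.
host = {"monotonic": 1000}
ns = TimeNamespace(read_host_clock=lambda clock_id: host[clock_id])

ns.clock_settime("monotonic", 50)  # the container believes the clock reads 50
host["monotonic"] += 10            # ten seconds pass on the host
container_view = ns.clock_gettime("monotonic")
host_view = host["monotonic"]
```

In this model, setting the time inside the namespace leaves the host clock untouched; only the stored offset changes, and both views keep advancing at the same rate.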

There is one small complication, in that some of the time-related system calls are implemented as virtual system calls on some architectures for performance reasons. Querying the current time can be a frequent operation, so it can be worth the trouble to answer such queries without actually entering the kernel. Making the virtual system calls aware of time namespaces requires making the clock offsets available to user space; the good news is that there is a small piece of the address space called the "VVAR page" (even though it is larger than one page) meant to hold just this kind of data. The time namespace work adds another page to this VVAR region to hold the time offsets, allowing calls like gettimeofday() to continue to work without entering the kernel.

Namespace maintainer Eric Biederman has expressed support for time namespaces, but he has also suggested some changes. His observation is that the timekeeper structure used within the kernel to implement the various clocks already contains a set of offsets relating those clocks to the hardware's idea of the current time. Rather than adding a second layer of offsets, he suggested, each namespace could be given its own timekeeper structure and the offsets found there could be tweaked instead. That might add to the complexity of the implementation, but this approach would have some advantages. Most of the kernel's current timekeeping code would just work with namespaces, allowing better testing overall with fewer special cases. Integrating namespaces at this level would also allow each container to run its own NTP process, and different containers could, for example, use different leap-second policies.

Biederman raised the possibility of security issues if time namespaces can be used to manipulate dates on files in filesystems, though he was not sure if that actually mattered. He also suggested that access to the realtime clock (the hardware clock that, in the end, drives the system's timekeeping) should perhaps be left out of the time namespace until it is clear that there are actual use cases for it. If that use case does arise, he said, some thought will have to be given to how the realtime clock, which is a global resource, should be presented to non-root namespaces.

There are, in other words, a few details remaining to be worked out regarding how time namespaces will work. There do not, however, appear to be any real obstacles to a solution, so chances are good that the kernel's collection of namespaces will be enhanced by time namespaces sometime in the not-too-distant future. Given how long the idea has been around, one might say it's about time.


Time namespaces

Posted Sep 21, 2018 20:33 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link] (8 responses)

The assumption that "containers" are surely not part of a distributed system running on different internet nodes which is based on the notion of "the time" as commonly understood by people who don't seriously believe this was an evil conspiracy of railway operators of about 150 years ago seems a bit ... optimistic.

It's also difficult to imagine an application which could break because it resumes execution at some unspecified time in future relative to when it was stopped. This is a commonplace situation in a preemptive multitasking system, after all. What could conceivably cause problems here is moving an application between systems whose ideas of "the time" differ because of "run ntpdate from cron" disease or ignoring clock drift altogether, IOW, the application suddenly finds itself in the past of its earlier state.

Time namespaces

Posted Sep 21, 2018 20:47 UTC (Fri) by corbet (editor, #1) [Link] (1 responses)

I guess I wasn't clear enough on that...having a bad week, it seems.

System times are based on internal clocks that will vary across systems, even when time synchronization is in place. If you're not careful, a migrated process has a high probability of seeing CLOCK_MONOTONIC going backward, for example, which is going to create confusion. People do have reasons for doing this kind of work...
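A minimal sketch of that failure mode, with made-up numbers: CLOCK_MONOTONIC typically counts from an arbitrary point such as boot, so a process checkpointed on a long-running host and restored on a freshly booted one would see the clock jump backward unless an offset is applied at restore time. The function name and the values below are hypothetical.

```python
def restore_offset(saved_monotonic, destination_monotonic):
    """Offset to add on the destination so the clock resumes where it stopped."""
    return saved_monotonic - destination_monotonic


# Hypothetical values, in seconds.
saved = 500_000          # CLOCK_MONOTONIC seen by the process at checkpoint
destination_now = 1_200  # CLOCK_MONOTONIC on the destination host at restore

# Without an offset, the restored process reads the destination clock directly.
naive_view = destination_now
went_backward = naive_view < saved

# With a per-namespace offset, the process continues from its saved value.
offset = restore_offset(saved, destination_now)
adjusted_view = destination_now + offset
```

Here `went_backward` is true for the unadjusted clock, while the adjusted view picks up exactly at the checkpointed value and moves forward from there.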

Time namespaces

Posted Sep 21, 2018 21:10 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link]

That was just me being dense. Thanks for the additional explanation.

Time namespaces

Posted Sep 23, 2018 16:08 UTC (Sun) by kiryl (subscriber, #41516) [Link] (3 responses)

> It's also difficult to imagine an application which could break because it resumes execution at some unspecified time in future relative to when it was stopped.

All sorts of timeouts can fire just because the application was resumed a few hours in the future.

Time namespaces

Posted Sep 23, 2018 16:50 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link] (2 responses)

Well, yes. But if they were scheduled for some absolute time, they should fire. It's not generally possible to stop a real-time bound task and restart it much later without wreaking some havoc on it.

Time namespaces

Posted Sep 23, 2018 17:36 UTC (Sun) by kiryl (subscriber, #41516) [Link] (1 responses)

Of course it's possible. ntpd is able to adjust time from wrong to right in a safer manner.

Time namespaces

Posted Sep 23, 2018 18:52 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link]

Quoting the ntpd documentation (-x option)

Normally, the time is slewed if the offset is less than the step threshold, which is 128 ms by default, and stepped if above the threshold. This option sets the threshold to 600 s, which is well within the accuracy window to set the clock manually. Note: Since the slew rate of typical Unix kernels is limited to 0.5 ms/s, each second of adjustment requires an amortization interval of 2000 s. Thus, an adjustment as much as 600 s will take almost 14 days to complete.

Leaving this non-possibility aside, this doesn't help with a real-time bound task. E.g., using an example I'm familiar with, an IKEv1 ISAKMP SA usually has a fixed, negotiated lifetime and there's another party to it. It's not possible to stop the task managing the SA and later restart it in a virtual past because the lifetime of the SA will end on time, regardless of any local clock fudging. The outcome will be a VPN communication breakdown until the 'confused' IKE task has again caught up with the real universe outside of it.
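The figures in the quoted ntpd documentation can be checked with a few lines of arithmetic. This sketch takes only the 0.5 ms/s slew rate and the 600 s threshold from the quote; the function and variable names are made up for illustration.

```python
SLEW_RATE_MS_PER_S = 0.5  # maximum slew rate from the quoted documentation
SECONDS_PER_DAY = 86_400


def slew_duration_seconds(offset_seconds, slew_rate_ms_per_s=SLEW_RATE_MS_PER_S):
    """Wall-clock seconds needed to slew away an offset at the given rate."""
    offset_ms = offset_seconds * 1000
    return offset_ms / slew_rate_ms_per_s


per_second = slew_duration_seconds(1)   # time to amortize one second of offset
total = slew_duration_seconds(600)      # time to amortize the full 600 s
days = total / SECONDS_PER_DAY
```

One second of offset takes 2000 s to amortize, and 600 s takes 1,200,000 s, a little under 14 days, which matches the quote.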

Time namespaces

Posted Sep 24, 2018 7:54 UTC (Mon) by k8to (guest, #15413) [Link]

I'm not sure I'm understanding you properly. I see programs break all the time if they get descheduled longer than an engineer thought was reasonable.

I also see programs break just because the system clock jumps ahead for some reason (administration, hardware flaws, etc), or because the delta in the wallclock time between two systems is stable but over "what is reasonable".

Maybe you weren't talking about these scenarios? They're bad software for sure, but in my experience most software is bad software.

Time namespaces

Posted Sep 27, 2018 14:24 UTC (Thu) by cew5550 (guest, #122770) [Link]

Exactly true and what I was thinking as well. About the only real issue is going back in time - that can break things.

Time namespaces

Posted Sep 21, 2018 21:31 UTC (Fri) by amarao (subscriber, #87073) [Link] (6 responses)

I just realized a scenario where time namespaces are excellent: access to devices with expired SSL certificates. Yes, you shouldn't trust expired certificates. But what if some device has a hardcoded certificate that expired in 2016? The device (e.g. an IP camera made of chainizium) is good and can be used, but its certificate expired in 2016... Currently it's a pain in the arse to use such stuff, and every new browser version makes it harder and harder. With time namespaces I can just run a browser instance in a namespace with eternal 2015 and use such a device with no issues.

Time namespaces

Posted Sep 21, 2018 21:47 UTC (Fri) by dtlin (subscriber, #36537) [Link]

If you just need to adjust the time for a single browser, libfaketime would likely be easier.

Chrome explicitly tries to guard against wrong client time and might not cooperate with your time tweaking either way, though.

//chromium/src/components/network_time/network_time_tracker.cc

// Network time queries are enabled on all desktop platforms except ChromeOS,
// which uses tlsdated to set the system time.

//chromium/src/components/ssl_errors/error_classification.cc

  if (now_system < build_time - base::TimeDelta::FromDays(2)) {
    build_time_state = CLOCK_STATE_PAST;
  } else if (now_system > build_time + base::TimeDelta::FromDays(365)) {
    build_time_state = CLOCK_STATE_FUTURE;
  }
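To see how the check quoted above would interact with a namespace pinned to 2015, here is a rough Python rendering of the same comparison. It is a sketch only: the function name, the string return values, the "CLOCK_STATE_OK" fallback, and the build date are assumptions made for illustration, not taken from Chromium.

```python
from datetime import datetime, timedelta


def classify_clock(now_system, build_time):
    """Mirror the quoted comparison of system time against the build time."""
    if now_system < build_time - timedelta(days=2):
        return "CLOCK_STATE_PAST"
    if now_system > build_time + timedelta(days=365):
        return "CLOCK_STATE_FUTURE"
    # Assumption: the quoted fragment does not show the default state.
    return "CLOCK_STATE_OK"


build_time = datetime(2018, 9, 1)        # hypothetical browser build date
pinned_namespace = datetime(2015, 6, 1)  # "eternal 2015" inside the namespace

state_in_namespace = classify_clock(pinned_namespace, build_time)
state_near_build = classify_clock(datetime(2018, 9, 21), build_time)
```

Under these assumptions, a browser built in 2018 and run with its clock held in 2015 falls more than two days before the build time, so the check would classify the clock as being in the past.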

Expired Certificates

Posted Sep 23, 2018 13:43 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link] (3 responses)

There's no technical reason why one shouldn't trust an expired certificate. 'A certificate' is a public key plus some metainformation which both have been digitally signed utilizing some usually unrelated private key (if the private key corresponding to the public key in the certificate has been used, the certificate is said to be self-signed). The owner of the certificate will also have the secret private key corresponding to the public key and hence, someone who has access to the certificate can create messages only the certificate owner can decrypt. It's considered prudent to change encryption keys regularly, that's why certificates "expire". But that's just encouraging a key change (which implies generating a new certificate) and doesn't enforce it: Unless the private key has been compromised, there's no need to stop using it.

Expired Certificates

Posted Sep 23, 2018 14:51 UTC (Sun) by Sesse (subscriber, #53779) [Link] (2 responses)

Other reasons why certificates expire include that the domain may have been transferred to another entity. And if somebody manages to generate a bad certificate somehow, one wants to limit the amount of damage that can be done.

Expired certificates are also generally not part of OCSP, so it's hard to revoke them in practice.

Expired Certificates

Posted Sep 23, 2018 16:57 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link] (1 responses)

A certificate has two time attributes called "not before" and "not after" which form the bounds of "certificate lifetime". That's a property of the certificate and has absolutely no relation to "domain ownership". In case of a domain changing owner, not that this would be applicable to the camera case, old certificates would probably be revoked, that is, put on a special "this certificate isn't considered valid anymore" list published by a CA (simplification).

'Bad certificates' would also usually be dealt with by revocation. Standard lifetime of commercial certificates is a year and "Oh well, the guy who pretends to be your bank in order to rob your account will be forced to stop next year!" wouldn't exactly be fit-for-purpose as security policy here.

Expired Certificates

Posted Sep 24, 2018 12:30 UTC (Mon) by KaiRo (subscriber, #1987) [Link]

One of the problems is that most certificate checks in software do not check revocation information, and even the sources for revocation information (CRLs, OCSP, etc.) have various issues, including privacy leaks and more. That's one reason why expiry has more weight than it should have in theory, because it puts a time limit on the issues around revocation.

Time namespaces

Posted Sep 24, 2018 14:49 UTC (Mon) by rriggs (guest, #11598) [Link]

The primary reason I would use time namespaces is to test code around DST changes, leap seconds, leap years, etc. It's rather difficult to mock that stuff because a lot of code acquires the time from system calls. I've done it by overriding libc functions in an LD_PRELOADed library, but that doesn't provide the same coverage that changing the actual system time does.

Time namespaces

Posted Sep 27, 2018 0:11 UTC (Thu) by mhelsley-vmw (guest, #122101) [Link] (1 responses)

I wonder if this could also be useful when kicking off reproducible builds. Lots of software/package/container build scripts use the current time, at some granularity, to stamp (parts of) a build. Being able to set a specific build time without having to modify thousands of bespoke build scripts might be handy for anyone who wants to verify that builds can indeed be reproduced.

Time namespaces

Posted Sep 27, 2018 21:01 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Well, then your build also needs to go at the same speed as the other builds. If I build on an RPi, I'm going to get a different embedded timestamp at the start and end compared to a Xeon running QEMU.

Time namespaces

Posted Oct 8, 2018 18:50 UTC (Mon) by yxejamir (guest, #103429) [Link]

Would any of the proposed changes allow freezing time inside a namespace? It's somewhat related to the question of reproducible builds raised before, because it would guarantee that all timestamps are the same.