Skip to content

experiment with speeding up Time.local #13968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

catlee
Copy link
Contributor

@catlee catlee commented Jul 22, 2025

On macOS, methods like Time.local can be up to hundreds of times slower when run in a forked child process than when run in the parent process. This simple benchmark demonstrates the issue:

require "benchmark/ips"

def time_local
  Time.local(2025,7,21,14,23)
end

Benchmark.ips do |x|
  x.report("non-forked") { time_local }
  x.report("forked") do |times|
    pid = fork do
      times.times { time_local }
    end
    Process.wait(pid)
  end
  x.compare!
end
ruby 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [arm64-darwin20]
Warming up --------------------------------------
          non-forked     7.873k i/100ms
              forked   131.000 i/100ms
Calculating -------------------------------------
          non-forked     77.203k (± 1.0%) i/s   (12.95 μs/i) -    393.650k in   5.099458s
              forked      1.248k (± 2.1%) i/s  (801.54 μs/i) -      6.288k in   5.042401s

Comparison:
          non-forked:    77202.9 i/s
              forked:     1247.6 i/s - 61.88x  slower

This impacts all kinds of things from logging, parsing times, working with zipfiles with rubyzip, etc.

This PR implements a small cache for offsets from UTC for any given 15-minute quarters derived from the value passed to rb_localtime_r - this avoids multiple calls to libc's localtime_r for the same time values within the same 15-minute block of time. As far as I can tell there are no timezones past or present which have an offset that doesn't fall on a 15-minute boundary (e.g. +0/15/30/45 minutes).

The cache is cleared if the timezone is changed.

@catlee catlee force-pushed the catlee/time_local branch 2 times, most recently from deefc8c to e2066b2 Compare July 22, 2025 05:57
@catlee catlee force-pushed the catlee/time_local branch from e2066b2 to e8595b9 Compare July 22, 2025 06:02
@nobu
Copy link
Member

nobu commented Jul 22, 2025

That benchmark looks including fork and waitpid times and doesn't feel fair enough.

Rather, I feel that this problem should be reported to Apple.
Might be intentional since fork is declared as deprecated, though.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

void
localtime_benchmark(const char *cond)
{
    struct timespec ts, te;
    clock_gettime(CLOCK_MONOTONIC , &ts);
    for (long n = 0; n < 10000; ++n) {
        time_t lt = 1753075380;
        struct tm t;
        localtime_r(&lt, &t);
    }
    clock_gettime(CLOCK_MONOTONIC , &te);
    time_t diff = te.tv_sec - ts.tv_sec;
    unsigned long ndiff = te.tv_nsec;
    if (te.tv_nsec < ts.tv_nsec) {
        --diff;
        ndiff += 1000000000UL;
    }
    ndiff -= ts.tv_nsec;
    printf("%s: %ld.%.9lu\n", cond, diff, ndiff);
}

int
main(void)
{
    localtime_benchmark("non-forked");
    pid_t forked = fork();
    if (forked) {
        int stat;
        waitpid(forked, &stat, 0);
    }
    else {
        localtime_benchmark("    forked");
    }
    return 0;
}
non-forked: 0.001993000
    forked: 2.986874000

time.c Outdated
Comment on lines 831 to 876
/* Days in each month (non-leap year) */
static const int days_in_month[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

/* Check if year is a leap year */
static inline int
is_leap_year(int year)
{
return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Get days in month, accounting for leap years */
static inline int
get_days_in_month(int year, int month)
{
if (month == 2 && is_leap_year(year)) {
return 29;
}
return days_in_month[month - 1];
}

/* Calculate day of week from year/month/day */
static int
calculate_wday(int year, int month, int day)
{
/* Zeller's congruence algorithm */
if (month < 3) {
month += 12;
year--;
}
int k = year % 100;
int j = year / 100;
int h = (day + (13 * (month + 1)) / 5 + k + k / 4 + j / 4 - 2 * j) % 7;
/* Convert to tm_wday format (Sunday = 0) */
return (h + 6) % 7;
}

/* Calculate day of year from year/month/day */
static int
calculate_yday(int year, int month, int day)
{
int yday = 0;
for (int m = 1; m < month; m++) {
yday += get_days_in_month(year, m);
}
return yday + day - 1; /* tm_yday is 0-based */
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like duplicates to the existing functions.

@catlee
Copy link
Contributor Author

catlee commented Jul 22, 2025

Rather, I feel that this problem should be reported to Apple.

Indeed. I did some searching around and found a few references which make me think this won't be fixed:

https://developer.apple.com/forums/thread/747499
https://developer.apple.com/forums/thread/737464?answerId=764686022#764686022
https://stackoverflow.com/questions/27932330/why-is-tzset-a-lot-slower-after-forking-on-mac-os-x

And from fork's manpage:

CAVEATS
There are limits to what you can do in the child process. To be totally safe you should restrict yourself to only executing async-signal
safe operations until such time as one of the exec functions is called. All APIs, including global data symbols, in any framework or library
should be assumed to be unsafe after a fork() unless explicitly documented to be safe or async-signal safe. If you need to use these frame-
works in the child process, you must exec. In this situation it is reasonable to exec yourself.

AIUI there is IPC signaling in place to notify processes of timezone changes, and these don't work after fork() before exec().

time.c Outdated
Comment on lines 840 to 854
/* Calculate day of week from year/month/day */
static int
calculate_wday(int year, int month, int day)
{
/* Zeller's congruence algorithm */
if (month < 3) {
month += 12;
year--;
}
int k = year % 100;
int j = year / 100;
int h = (day + (13 * (month + 1)) / 5 + k + k / 4 + j / 4 - 2 * j) % 7;
/* Convert to tm_wday format (Sunday = 0) */
return (h + 6) % 7;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use calc_wday.

@@ -703,6 +704,11 @@ static struct vtm *localtimew(wideval_t timew, struct vtm *result);

static int leap_year_p(long y);
#define leap_year_v_p(y) leap_year_p(NUM2LONG(modv((y), INT2FIX(400))))
static int calc_tm_yday(long tm_year, int tm_mon, int tm_mday);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static int calc_wday(int year_mod400, int month, int day);

time.c Outdated
tm->tm_mday = day;

/* Recalculate wday and yday */
tm->tm_wday = calculate_wday(year, month, day);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tm->tm_wday = calculate_wday(year, month, day);
tm->tm_wday = calc_wday(year % 400, month, day);

Comment on lines +788 to +795
/* Cache key: 15-minute precision */
typedef struct {
int year;
int month;
int day;
int hour;
int quarter; /* 0-3 representing 00, 15, 30, 45 minute intervals */
} offset_cache_key;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would be possible to pack the cache keys, so that the validness can be checked atomically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I think now we can probably just use our time_t value/15 for our cache key?

I haven't quite convinced myself that it's safe to use 15 minute quarters to cache by..... There are some historical timezones with odd offsets. I don't know of any yet where a dst boundary doesn't happen on a 15 minute boundary in UTC time though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps modern time zones do not have such odd UTC offsets, but "zdump(1)" refers to "Europe/Astrahan".

$ LC_TIME=C /bin/date -z Europe/Astrakhan -j {-f,+}%Y-%m-%dT%H:%M:%S%Z 1924-04-30T00:00:00GMT
1924-04-30T03:12:12LMT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be OK to abandon such cases, i.e., do not cache if the gmtoff % (15 * 60) != 0.

Comment on lines +2256 to +2268
AC_ARG_ENABLE(macos-localtime-cache,
AS_HELP_STRING([--enable-macos-localtime-cache],
[enable localtime cache optimization for macOS to improve forked process performance]),
[macos_localtime_cache=$enableval], [macos_localtime_cache=no])
AS_IF([test "$macos_localtime_cache" = yes], [
AS_CASE(["$target_os"],
[darwin*], [
AC_DEFINE(ENABLE_MACOS_LOCALTIME_CACHE, 1, [Enable macOS localtime cache optimization])
AC_MSG_NOTICE([macOS localtime cache optimization enabled])
],
[AC_MSG_WARN([--enable-macos-localtime-cache is only supported on macOS, ignoring])]
)
])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be limited to macOS in the future.
And do you have any reason not to enable it on macOS by the default?

Suggested change
AC_ARG_ENABLE(macos-localtime-cache,
AS_HELP_STRING([--enable-macos-localtime-cache],
[enable localtime cache optimization for macOS to improve forked process performance]),
[macos_localtime_cache=$enableval], [macos_localtime_cache=no])
AS_IF([test "$macos_localtime_cache" = yes], [
AS_CASE(["$target_os"],
[darwin*], [
AC_DEFINE(ENABLE_MACOS_LOCALTIME_CACHE, 1, [Enable macOS localtime cache optimization])
AC_MSG_NOTICE([macOS localtime cache optimization enabled])
],
[AC_MSG_WARN([--enable-macos-localtime-cache is only supported on macOS, ignoring])]
)
])
AC_ARG_ENABLE(localtime-cache,
AS_HELP_STRING([--enable-localtime-cache],
[enable localtime cache optimization to improve forked process performance]),
[localtime_cache=$enableval],
[AS_CASE(["$target_os"], [darwin*], [localtime_cache=yes], [localtime_cache=no])])
AS_IF([test "$localtime_cache" = yes], [
AC_DEFINE(ENABLE_LOCALTIME_CACHE, 1, [Enable localtime cache optimization])
AC_MSG_NOTICE([localtime cache optimization enabled])
])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants