-
-
Notifications
You must be signed in to change notification settings - Fork 8k
Improve the cache when getting font metrics #28831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This tripped me up. "Please look separately at (i) commit 1, (ii) commit 1+2". is the way to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to sum up the differences:
- API: option 1 takes a weak ref, option 2 takes the renderer itself. - API-wise 2 is slightly more comfortable. The user does not have to care for weak refs. Also, the weakref-handling is better contained.
- Behavior: The renderer cache in option 2 is unbound. Can that be an issue for long-running processes?
- Both compared to the original implementation: We create separate caches per renderer. Before we had one large cache for renderer+settings. I assume this does not make a relevant difference
lib/matplotlib/text.py
Outdated
@functools.lru_cache(4096) | ||
def _text_metrics(text, fontprop, ismath, dpi): | ||
# dpi is unused, but participates in cache invalidation (via the renderer). | ||
return renderer_ref().get_text_width_height_descent(text, fontprop, ismath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In option 2: don't we have to raise a RuntimeError (as in option 1) if the weakref does not resolve (renderer_ref()
is None)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but if we are being called then we have (on behalf of the user) recently called _get_text_metrics_with_cache
with a hard-ref so we are sure it is alive. That said, I'll put it back in.
This helper should probably be made __
private as well?
🤦🏻 yeah, that makes sense, the second diff in going to be weird. I thought putting them in the same PR would be simpler than two PRs and easier to deal with diffs in the comments of the issue. Currently the only place we call the matplotlib/lib/matplotlib/text.py Lines 65 to 70 in 64d45cb
_get_text_metrics_with_cache is matplotlib/lib/matplotlib/text.py Lines 372 to 384 in 64d45cb
option 2 is unbounded, but it is tied to the lifetime of the renderers it should not grow past the number of renderers the user implicitly keeps alive by the number of In option 1 we keep some number of renderers alive (25) which is either going to cause incorrect misses for someone who has more than 25 figures alive (good idea or not, someone might really want this) and keep the cache too long in cases where the user is churning through figures one at a time. I think the pro of option 1 is that it is "simpler" by using layered LRU cache, but option 2 is more complex but technically better. In either case the two-tiered cache is the main "fix". |
e93968c
to
b2f73a6
Compare
That's clever. 👍 I hadn't realized that. In this case I'm for option 2 plus a good comment why we do this and how it works 😃. |
lib/matplotlib/text.py
Outdated
# use mutable default to carry hidden global state | ||
def _get_text_metrics_function(inp_renderer, _cache=weakref.WeakKeyDictionary()): | ||
if (_text_metrics := _cache.get(inp_renderer, None)) is None: | ||
renderer_ref = weakref.ref(inp_renderer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually need to explicitly create a weakref here? Doesn't using a WeakKeyDictionary basically do this for you for free already (i.e. aren't you having two layers of weakref'ing here?)? (not sure...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if we do not the closure holds a hard reference mediated in the value side of the weakkey dict and they become immortal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I missed the fact that you are caching a closure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the closure can be made more explicit, something like (untested)
cache = WeakKeyDictionary() # Can be hidden as an attribute or a private parameter...
def _get_text_metrics_with_cache(renderer, text, fp, ismath, dpi):
if renderer not in cache:
cache[renderer] = functools.lru_cache(4096)(
functools.partial(_weak_text_metrics, weakref.ref(renderer)))
return cache[renderer](text, fp.copy(), ismath, dpi)
def _weak_text_metrics(renderer_ref, text, fp, ismath, dpi):
return renderer_ref().get_text_width_height_descent(text, fp, ismath)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a some comments. I think that the closure is marginally clearer than the use of partial
.
lib/matplotlib/text.py
Outdated
|
||
|
||
# use mutable default to carry hidden global state | ||
def _get_text_metrics_function(inp_renderer, _cache=weakref.WeakKeyDictionary()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hiding the cache in a hidden mutable kwarg seems a bit weird to me (but sure, why not); I would rather have defined it as a custom attribute on the function (_get_text_metrics_function._cache = WeakKeyDictionary()
). Just a stylistic choice, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When put into a keyword, it's easier to access it inside the function. - But needs documentation 😄
lib/matplotlib/text.py
Outdated
@functools.lru_cache(4096) | ||
def _text_metrics(text, fontprop, ismath, dpi): | ||
# dpi is unused, but participates in cache invalidation (via the renderer). | ||
if (lcl_renderer := renderer_ref()) is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I assume lcl
here is local
as well? Probably can fit the two letters here too...
db121ec
to
a0e0809
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice documentation 👍
@@ -919,6 +919,7 @@ def call(*args, **kwargs): | |||
|
|||
|
|||
def test_metrics_cache2(): | |||
plt.close('all') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests should start with no figure open due to the test fixture, so this should be unnecessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a long shot that something was getting not cleared because the assert that is failing is before we do anything in this test.
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
1e50b6f
to
b047764
Compare
We have now had 2 nearly identical reports tied to this. @QuLogic Does this conflict with all of your font work? |
I don't believe I've made any changes here, and though I would like to replace it with FreeType's caching mechanism, I haven't really done any work on that. |
Please look at each of the first commit (option 1) and then the combination of both commits (net to option 2). Once we pick which one we like better I'll drop the other commit and fix / add tests.We settled on option 2 in the discussion so I squashed option 1 out of existence.
Fixes #28827 fixes #30400