-
-
Notifications
You must be signed in to change notification settings - Fork 32.5k
gh-81719: Add private members to zipfileZipFile to make it easier to subclass #137101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
gh-81719: Add private members to zipfileZipFile to make it easier to subclass #137101
Conversation
dd4fbc8
to
54b7e31
Compare
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
I have updated the code.
I was thinking that we can keep these attributes private, at least as a start, while downstream projects try to extend the code.
Once we have more projects using the code, we can look into making this public and defining a public interface.
I have made the requested changes; please review again |
Thanks for making the requested changes! @ZeroIntensity: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now please add a test case. If you're confused about that in any way, feel free to ask me or take a look at the devguide. We should also add an entry to the "What's New in Python 3.15" page.
Once we have more projects using the code, we can look into making this public and defining a public interface.
Well, we don't really want people to be using a private interface. If we encourage people to use it (i.e., by documenting it), then we still have to follow all the usual backwards compatibility rules, even if it's "private". If we don't document it, then people won't really know to use it.
Misc/NEWS.d/next/Library/2025-07-25-14-27-29.gh-issue-81719.hWp7Mn.rst
Outdated
Show resolved
Hide resolved
Thanks. I will clean the code and add automated tests. |
Thanks Peter for the review. Docs was updated here https://cpython-previews--137101.org.readthedocs.build/en/137101/whatsnew/3.15.html#zipfile I now see that we don't have auto-doc enabled for zipfile Should I update the documentation page at The current changes are targeted to develpers that want to extend the ZipFile code, So, if you want to extend the code for ZipFile, you will have to read the source code. So, maybe leave Doc/library/zipfile.rst as it is, since it is targeted to "end users" Have My bad for forgetting about the tests. I tried to write the minimum amount of test to cover all the new code.
Let me know if you prefer to split the current tests into separate test for each use case. I have made the requested changes; please review again |
Thanks for making the requested changes! @ZeroIntensity: please review the changes made to this pull request. |
Lib/test/test_zipfile/test_core.py
Outdated
""" | ||
Test in which the files inside the archive are not compressed. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring isn't needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I have removed it. No problem.
I had to read the other tests to decide where I should add the new tests and I thought that this comment can help.
Lib/test/test_zipfile/test_core.py
Outdated
@@ -676,6 +680,104 @@ def test_add_file_after_2107(self): | |||
self.assertEqual(zinfo.date_time, (2107, 12, 31, 23, 59, 59)) | |||
|
|||
|
|||
class CustomZipInfo(zipfile.ZipInfo): | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here; please just avoid adding docstrings. This is all private code and keeping them up to date tends to be extra maintenance that we don't want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to document why this class was use and decide if you can reuse it in another test.
I have removed the docstring.
Lib/test/test_zipfile/test_core.py
Outdated
destination = io.BytesIO() | ||
with zipfile.ZipFile( | ||
destination, 'w', zipinfo_class=self.CustomZipInfo) as zipfp: | ||
# It can write using the specific custom classe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo:
# It can write using the specific custom classe. | |
# It can write using the specific custom class. |
Lib/test/test_zipfile/test_core.py
Outdated
# When creating a new member using just the name, | ||
# the custom ZipInfo is used internally. | ||
with zipfp.open('other-member.txt', mode='w') as memberfp: | ||
self.assertIsInstance(memberfp._zinfo, self.CustomZipInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid using private members in tests. If we change how they work, it adds additional maintenance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Make sense.
Lib/test/test_zipfile/test_core.py
Outdated
# When writing from an external file, the file is created using | ||
# the custom ZipInfo | ||
with temp_dir() as source_dir: | ||
source_file = pathlib.Path(source_dir).joinpath('source.txt') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to just use Path(source_dir) / 'source.txt'
here.
Lib/test/test_zipfile/test_core.py
Outdated
with temp_dir() as extract_dir: | ||
zipfp.extract(dir_info, path=extract_dir) | ||
zipfp.extract('dir-as-text/', path=extract_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this part of the test is necessary. We aren't stressing anything on the custom classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to test this change
def _extract_member(self, member, targetpath, pwd):
"""Extract the ZipInfo object 'member' to a physical
file on the path targetpath.
"""
- if not isinstance(member, ZipInfo):
+ if not isinstance(member, self._ZipInfo):
member = self.getinfo(member)
There is not much public API to check for this method.
I have updated the assertions as an end to end test.
Sorry, forgot to mention this in the review. Yeah, please update the documentation. Docstrings are generally very broad and can go out of date quickly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
I will wait for the test to pass and then I will request a new review.
I tried to add some "narrative" documentation.
I find it hard, as there is a lot of private API
For example in ZipFile.writestr we have
zinfo = self._ZipInfo(zinfo_or_arcname)._for_archive(self)
Maybe as a followup, we should refactor to use a public method and document the ZipInfo and ZipExtFile public API.
I tried to add tests to cover all changes... but the tests are kind of useless with class CustomZipInfo(zipfile.ZipInfo):
as CustomZipInfo
will return true for both isinstance(obj, ZipInfo)
or isinstance(obj, self._ZipInfo)
I tried to write the tests using delegation for ZipInfo...but I was not to write sometime useful
I got something like this...but I a not sure if this will not make it hard to maintain the tests
class CustomZipInfo:
"""
Support for testing extending and subclassing ZipFile.
"""
def __init__(self, *args, **kwargs):
self._delegated = zipfile.ZipInfo(*args, **kwargs)
def _for_archive(self, archive):
self._delegated._for_archive(archive)
return self
def __getattribute__(self, name):
if name in ['_delegated', '_for_archive']:
return object.__getattribute__(self, name)
return getattr(self._delegated, name)
def __setattribute__(self, name, value):
if name == '_delegated':
object.__setattribute__(self, name, value)
return
return setattr(self._delegated, name, value)
@classmethod
def from_file(cls, filename, arcname=None, *, strict_timestamps=True):
if isinstance(filename, os.PathLike):
filename = os.fspath(filename)
st = os.stat(filename)
isdir = stat.S_ISDIR(st.st_mode)
mtime = time.localtime(st.st_mtime)
date_time = mtime[0:6]
if not strict_timestamps and date_time[0] < 1980:
date_time = (1980, 1, 1, 0, 0, 0)
elif not strict_timestamps and date_time[0] > 2107:
date_time = (2107, 12, 31, 23, 59, 59)
# Create ZipInfo instance to store file information
if arcname is None:
arcname = filename
arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
while arcname[0] in (os.sep, os.altsep):
arcname = arcname[1:]
if isdir:
arcname += '/'
zinfo = cls(arcname, date_time)
zinfo.external_attr = (st.st_mode & 0xFFFF) << 16 # Unix attributes
if isdir:
zinfo.file_size = 0
zinfo.external_attr |= 0x10 # MS-DOS directory flag
else:
zinfo.file_size = st.st_size
return zinfo
Lib/test/test_zipfile/test_core.py
Outdated
with temp_dir() as extract_dir: | ||
zipfp.extract(dir_info, path=extract_dir) | ||
zipfp.extract('dir-as-text/', path=extract_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to test this change
def _extract_member(self, member, targetpath, pwd):
"""Extract the ZipInfo object 'member' to a physical
file on the path targetpath.
"""
- if not isinstance(member, ZipInfo):
+ if not isinstance(member, self._ZipInfo):
member = self.getinfo(member)
There is not much public API to check for this method.
I have updated the assertions as an end to end test.
Lib/test/test_zipfile/test_core.py
Outdated
# When creating a new member using just the name, | ||
# the custom ZipInfo is used internally. | ||
with zipfp.open('other-member.txt', mode='w') as memberfp: | ||
self.assertIsInstance(memberfp._zinfo, self.CustomZipInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Make sense.
Lib/test/test_zipfile/test_core.py
Outdated
@@ -676,6 +680,104 @@ def test_add_file_after_2107(self): | |||
self.assertEqual(zinfo.date_time, (2107, 12, 31, 23, 59, 59)) | |||
|
|||
|
|||
class CustomZipInfo(zipfile.ZipInfo): | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to document why this class was use and decide if you can reuse it in another test.
I have removed the docstring.
Lib/test/test_zipfile/test_core.py
Outdated
""" | ||
Test in which the files inside the archive are not compressed. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I have removed it. No problem.
I had to read the other tests to decide where I should add the new tests and I thought that this comment can help.
I have made the requested changes; please review again |
Thanks for making the requested changes! @ZeroIntensity: please review the changes made to this pull request. |
Misc/NEWS.d/next/Library/2025-07-25-14-27-29.gh-issue-81719.hWp7Mn.rst
Outdated
Show resolved
Hide resolved
Feedback @serhiy-storchaka from from the issue discussion #136741 (comment)
|
@serhiy-storchaka thanks for your feedback. The I think that for this PR the focus is whether we want public init arguments or private class variables Regarding
This looks like a refactoring that should be done in a separate issue/PR. I have also mentioned usage of private API in
|
I really don't want to encourage using private class or instance variables to expose use of |
(side note: it looks like there's a typo in the email address that was configured for your last few commits, which is preventing GitHub from associating them with your account. I think there's a missing |
Fixes #81719
This is the bare minimum changes to allow subclassing ZipFile to implement a custom encryption method.
I am using this code to implement WinZIP AES.
This is my first contribution to CPyton , so I try to keep it minimum.
Maybe the private method should not be documented... but just leave a comment in the code.
I am using this code to implement WinZip AES on top of this patch
There are a few other changes that will make it easier to reuse code... but those are not criticial.
For example, the current code validates the Zip password outside of the
_ZipDecrypter
code.This means, that if you implement a differrent ZipDecrypter that has a different password validation method, you can't just rewrite
ZipExtFile._init_decrypter()
and reuseZipExtFile.__init__
Let me know if you are happy for a bit more of refactoring
Thanks