Skip to content

gh-81719: Add private members to zipfileZipFile to make it easier to subclass #137101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

adiroiban
Copy link

@adiroiban adiroiban commented Jul 25, 2025

Fixes #81719

This is the bare minimum changes to allow subclassing ZipFile to implement a custom encryption method.

I am using this code to implement WinZIP AES.

This is my first contribution to CPyton , so I try to keep it minimum.


Maybe the private method should not be documented... but just leave a comment in the code.


I am using this code to implement WinZip AES on top of this patch

There are a few other changes that will make it easier to reuse code... but those are not criticial.

For example, the current code validates the Zip password outside of the _ZipDecrypter code.
This means, that if you implement a differrent ZipDecrypter that has a different password validation method, you can't just rewrite ZipExtFile._init_decrypter() and reuse ZipExtFile.__init__

Let me know if you are happy for a bit more of refactoring

Thanks

@python-cla-bot
Copy link

python-cla-bot bot commented Jul 25, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@adiroiban adiroiban force-pushed the 81719-zipfile-refactor-for-subclass branch from dd4fbc8 to 54b7e31 Compare July 25, 2025 13:51
@bedevere-app
Copy link

bedevere-app bot commented Jul 25, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Copy link
Author

@adiroiban adiroiban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

I have updated the code.


I was thinking that we can keep these attributes private, at least as a start, while downstream projects try to extend the code.

Once we have more projects using the code, we can look into making this public and defining a public interface.

@adiroiban adiroiban requested a review from ZeroIntensity July 25, 2025 15:24
@adiroiban
Copy link
Author

I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Jul 25, 2025

Thanks for making the requested changes!

@ZeroIntensity: please review the changes made to this pull request.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now please add a test case. If you're confused about that in any way, feel free to ask me or take a look at the devguide. We should also add an entry to the "What's New in Python 3.15" page.

Once we have more projects using the code, we can look into making this public and defining a public interface.

Well, we don't really want people to be using a private interface. If we encourage people to use it (i.e., by documenting it), then we still have to follow all the usual backwards compatibility rules, even if it's "private". If we don't document it, then people won't really know to use it.

@adiroiban
Copy link
Author

Thanks. I will clean the code and add automated tests.

@adiroiban
Copy link
Author

adiroiban commented Jul 27, 2025

Thanks Peter for the review.

Docs was updated here https://cpython-previews--137101.org.readthedocs.build/en/137101/whatsnew/3.15.html#zipfile


I now see that we don't have auto-doc enabled for zipfile

https://cpython-previews--137101.org.readthedocs.build/en/137101/library/zipfile.html#zipfile.ZipFile

Should I update the documentation page at Doc/library/zipfile.rst?

The current changes are targeted to develpers that want to extend the ZipFile code,
and not to developers that want to use the code.

So, if you want to extend the code for ZipFile, you will have to read the source code.

So, maybe leave Doc/library/zipfile.rst as it is, since it is targeted to "end users"

Have zipinfo_class documentated only inside the source code.


My bad for forgetting about the tests.

I tried to write the minimum amount of test to cover all the new code.

./python -m pytest Lib/test/test_zipfile/ --cov zipfile --cov-report=xml
./python -m diff_cover.diff_cover_tool coverage.xml 
-------------
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
-------------
Lib/zipfile/__init__.py (100%)
-------------
Total:   12 lines
Missing: 0 lines
Coverage: 100%
-------------

Let me know if you prefer to split the current tests into separate test for each use case.
I tried to split them based on the main functionality: read, write and extract


I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Jul 27, 2025

Thanks for making the requested changes!

@ZeroIntensity: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from ZeroIntensity July 27, 2025 12:22
Comment on lines +489 to +491
"""
Test in which the files inside the archive are not compressed.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring isn't needed.

@@ -676,6 +680,104 @@ def test_add_file_after_2107(self):
self.assertEqual(zinfo.date_time, (2107, 12, 31, 23, 59, 59))


class CustomZipInfo(zipfile.ZipInfo):
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here; please just avoid adding docstrings. This is all private code and keeping them up to date tends to be extra maintenance that we don't want.

destination = io.BytesIO()
with zipfile.ZipFile(
destination, 'w', zipinfo_class=self.CustomZipInfo) as zipfp:
# It can write using the specific custom classe.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo:

Suggested change
# It can write using the specific custom classe.
# It can write using the specific custom class.

# When creating a new member using just the name,
# the custom ZipInfo is used internally.
with zipfp.open('other-member.txt', mode='w') as memberfp:
self.assertIsInstance(memberfp._zinfo, self.CustomZipInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid using private members in tests. If we change how they work, it adds additional maintenance.

# When writing from an external file, the file is created using
# the custom ZipInfo
with temp_dir() as source_dir:
source_file = pathlib.Path(source_dir).joinpath('source.txt')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to just use Path(source_dir) / 'source.txt' here.

Comment on lines +776 to +778
with temp_dir() as extract_dir:
zipfp.extract(dir_info, path=extract_dir)
zipfp.extract('dir-as-text/', path=extract_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this part of the test is necessary. We aren't stressing anything on the custom classes.

@ZeroIntensity
Copy link
Member

Should I update the documentation page at Doc/library/zipfile.rst?

Sorry, forgot to mention this in the review. Yeah, please update the documentation. Docstrings are generally very broad and can go out of date quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor zipfile to ease subclassing and enhancement
2 participants