feat: serve cached digest if available #462
Conversation
MickaelCa commented on Jul 27, 2025
- Added methods to upload metadata alongside digest files to S3.
- Implemented S3-based digest caching mechanism for improved efficiency.
- Refactored digest storage logic to support both S3 and local storage.
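The caching flow the bullets describe can be sketched as follows. This is a minimal illustration, not the PR's implementation: the storage backend here is a plain dict standing in for S3, and `generate_s3_file_path` mirrors the helper named in the diff with an assumed signature.

```python
from typing import Callable, Optional

# Plain dict standing in for the S3 bucket in this sketch.
_fake_s3: dict[str, bytes] = {}

def generate_s3_file_path(source: str, user_name: str, repo_name: str, commit: str) -> str:
    # Keying by repo identity plus resolved commit means a new commit
    # naturally misses the cache and triggers a fresh ingest.
    return f"digests/{user_name}/{repo_name}/{commit}.txt"

def serve_digest(source: str, user_name: str, repo_name: str,
                 commit: str, build: Callable[[], bytes]) -> bytes:
    key = generate_s3_file_path(source, user_name, repo_name, commit)
    cached: Optional[bytes] = _fake_s3.get(key)
    if cached is not None:
        return cached      # cache hit: serve without re-cloning
    digest = build()       # cache miss: ingest, then upload
    _fake_s3[key] = digest
    return digest
```

Because the key includes the resolved commit SHA, the cache never serves a stale digest for a branch that has moved forward.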
⚙️ Preview environment for PR #462 is available at:
GDPR
If we ingest a repo hello_world and then try to ingest it again with include/exclude patterns, is the cache not used?
Anyway, LGTM!
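One way to address the question above would be to fold the filter patterns into the cache key, so ingests of the same commit with different include/exclude patterns do not collide. This is a hedged sketch, not the PR's behavior: the `cache_key` helper and its pattern fingerprint are illustrative assumptions.

```python
import hashlib

def cache_key(user_name: str, repo_name: str, commit: str,
              include: list[str], exclude: list[str]) -> str:
    # Fingerprint the sorted patterns so equivalent filter sets map to
    # the same key, and different filter sets map to different keys.
    fingerprint = hashlib.sha256(
        ("\0".join(sorted(include)) + "\1" + "\0".join(sorted(exclude))).encode()
    ).hexdigest()[:12]
    return f"digests/{user_name}/{repo_name}/{commit}-{fingerprint}.txt"
```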
clone_config = query.extract_clone_config()
commit_sha = await resolve_commit(clone_config, token=token)
query.commit = commit_sha

# Generate S3 file path using the resolved commit
s3_file_path = generate_s3_file_path(
    source=query.url,
    user_name=cast("str", query.user_name),
    repo_name=cast("str", query.repo_name),
    commit=commit_sha,
Suggested change:
- clone_config = query.extract_clone_config()
- commit_sha = await resolve_commit(clone_config, token=token)
- query.commit = commit_sha
- # Generate S3 file path using the resolved commit
- s3_file_path = generate_s3_file_path(
-     source=query.url,
-     user_name=cast("str", query.user_name),
-     repo_name=cast("str", query.repo_name),
-     commit=commit_sha,
+ clone_config = query.extract_clone_config()
+ query.commit = await resolve_commit(clone_config, token=token)
+ # Generate S3 file path using the resolved commit
+ s3_file_path = generate_s3_file_path(
+     source=query.url,
+     user_name=cast("str", query.user_name),
+     repo_name=cast("str", query.repo_name),
+     commit=query.commit,
except ClientError as err:
    # Object doesn't exist if we get a 404 error
    error_code = err.response.get("Error", {}).get("Code")
    if error_code == "404":
Use fastapi.status.HTTP_404_NOT_FOUND instead of string validation.
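One caveat on the suggestion above: botocore puts the S3 error code into `err.response["Error"]["Code"]` as a *string*, while `fastapi.status.HTTP_404_NOT_FOUND` is the integer 404, so the constant needs a `str()` cast before comparing. A minimal sketch, with a hand-built response dict standing in for botocore's `ClientError` payload:

```python
HTTP_404_NOT_FOUND = 404  # same value as fastapi.status.HTTP_404_NOT_FOUND

def is_missing_object(response: dict) -> bool:
    # botocore error codes are strings, so cast the constant.
    error_code = response.get("Error", {}).get("Code")
    return error_code == str(HTTP_404_NOT_FOUND)
```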
except ClientError as err:
    # Object doesn't exist if we get a 404 error
    error_code = err.response.get("Error", {}).get("Code")
    if error_code == "404":
Same here.