Fix repair_metadata OOM on large repositories #1189
Could you please squash the commits into one? Otherwise, it looks good!
025a2f3 to 9cda1db
Done @jobselko. Just one commit.
gerrod3
left a comment
We can accept this change and backport it. In the future we will need to look at our util methods and refactor them; they've grown a bit absurd.
Reduced peak memory consumption of repair_metadata by lowering batch size from 1000 to 250,
eliminating double S3 reads for wheel files, and closing artifact file handles after each
iteration. This fixes "Worker has gone missing" errors on repositories with 1000+ packages.
Simplify this, too verbose. Keep it on one line.
Large repositories (1000+ packages) cause workers to OOM during repair_metadata. Three changes reduce peak memory:
- Reduce BULK_SIZE from 1000 to 250, flushing batches 4x more often
- Copy artifact to temp file once via helper, reuse for both content data extraction and metadata extraction (eliminates double S3 read)
- Extract metadata bytes while temp file exists, pass bytes through the metadata batch instead of file paths

Closes pulp#1188
JIRA: PULP-1573
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
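For readers following the second and third changes, here is a minimal sketch of the "copy once, extract everything while the temp file exists" pattern. It is not the code from this PR: `artifact_to_tempfile`, `process_artifact`, `read_wheel_metadata`, and `build_demo_wheel` are hypothetical names, and an in-memory stream stands in for the remote (S3) read.

```python
import hashlib
import io
import os
import shutil
import tempfile
import zipfile
from contextlib import contextmanager


@contextmanager
def artifact_to_tempfile(source, filename="artifact.whl"):
    """Copy a binary stream to a temp file once (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, filename)
        with open(path, "wb") as tmp:
            shutil.copyfileobj(source, tmp)  # the only read of the source
        yield path
    # The directory and file are removed when the context exits.


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash the local copy in chunks so the file is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_wheel_metadata(path):
    """Return the raw METADATA bytes from a wheel, or None if absent."""
    with zipfile.ZipFile(path) as whl:
        for name in whl.namelist():
            if name.endswith(".dist-info/METADATA"):
                return whl.read(name)
    return None


def process_artifact(source):
    """Do both extractions against one local copy, then return plain values."""
    with artifact_to_tempfile(source) as path:
        digest = sha256_of(path)
        metadata_bytes = read_wheel_metadata(path)
    # Only small values leave the block; no path to a deleted file escapes.
    return digest, metadata_bytes


def build_demo_wheel():
    """Build a tiny wheel-like zip in memory, for illustration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as whl:
        whl.writestr("demo-1.0.dist-info/METADATA", "Name: demo\nVersion: 1.0\n")
    return buf.getvalue()
```

Returning bytes rather than a path means the batch never refers to a temp file that has already been cleaned up.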
9cda1db to 208ab04
Backport to 3.27: 💚 backport PR created ✅ Backport PR branch: Backported as #1196 🤖 @patchback
Fix repair_metadata OOM on large repositories (cherry picked from commit 7aaf53f)
Backport to 3.28: 💚 backport PR created ✅ Backport PR branch: Backported as #1197 🤖 @patchback
Fix repair_metadata OOM on large repositories (cherry picked from commit 7aaf53f)
…f2a46f21be5363e677f503d6ea86fc9/pr-1189 [PR #1189/7aaf53fc backport][3.27] Fix repair_metadata OOM on large repositories
…f2a46f21be5363e677f503d6ea86fc9/pr-1189 [PR #1189/7aaf53fc backport][3.28] Fix repair_metadata OOM on large repositories
Summary
- Reduce BULK_SIZE from 1000 to 250, flushing batches 4x more often to cap peak memory

Fixes #1188
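The batch-size change can be pictured with a small sketch of flush-on-threshold batching. Only the value 250 comes from this PR; `process_in_batches` and its `flush` callback are hypothetical and do not reflect the actual structure of repair_metadata.

```python
BULK_SIZE = 250  # value from this PR; the rest of this block is illustrative


def process_in_batches(items, flush, bulk_size=BULK_SIZE):
    """Accumulate items and hand them to `flush` every `bulk_size` items.

    At most `bulk_size` items are held at once, so a smaller value lowers
    peak memory at the cost of more flush calls.
    """
    batch = []
    flushed = 0
    for item in items:
        batch.append(item)
        if len(batch) >= bulk_size:
            flush(batch)
            flushed += len(batch)
            batch = []  # start a fresh list so the flushed one can be freed
    if batch:
        # Flush the final partial batch so no trailing items are lost.
        flush(batch)
        flushed += len(batch)
    return flushed
```

With 1000 items this calls `flush` four times, where a batch size of 1000 would call it once. The final partial flush is the boundary case worth testing.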
Test plan
- test_repair.py tests pass (metadata repair command, endpoint, artifact repair)
- test_metadata_repair_batch_boundary passes with reduced BULK_SIZE
- Run repair-python-metadata.py --env stage --domain <large-domain> to verify no OOM

JIRA: PULP-1573
🤖 Generated with Claude Code