Open your uploads folder after a theme migration and you will likely see three, four, sometimes eight versions of every image you uploaded. Before you start deleting, it is worth knowing that most of what you are looking at is not a problem. WordPress put it there on purpose.
This guide distinguishes the files WordPress generates automatically, which you should leave alone, from the real duplicates worth removing, and covers how to clean the latter without breaking a single image in the process.
What are you really seeing in your WordPress media library?
Most sites that appear to have duplicate image problems have two completely different issues mixed together. Treating them as the same thing leads to broken layouts or, at best, a cleanup that changes nothing meaningful.
True duplicates are independent uploads of the same file. Same pixels, same content, uploaded more than once — by different team members, via repeated imports, or after a migration that copied the library across. These waste storage and create confusion. They are the actual target.
Generated sizes are not duplicates at all. When you upload a single image, WordPress automatically creates resized derivatives: thumbnail (150×150 by default), medium (up to 300×300), large (up to 1024×1024), and any custom sizes registered by your active theme or plugins. These live in the same uploads/ directory and can make it look like you have hundreds of extra files. They are expected, managed internally by WordPress, and mostly safe to leave alone.
Regenerated sizes are what you see after changing themes, switching page builders, or installing a plugin that registers new image dimensions. WordPress does not remove the old sizes — it adds the new ones alongside them. A site that has changed themes twice may have three generations of derivatives for every image.
The most common misconception about duplicate images
A single upload of banner.jpg can produce banner.jpg, banner-150x150.jpg, banner-300x200.jpg, banner-768x512.jpg, banner-1024x683.jpg and more. None of these are duplicates. Deleting them manually, without regenerating the sizes WordPress still needs, will break image display on any template or block that references those specific dimensions.
The files worth removing are those uploaded independently more than once, not the files WordPress generated from a single upload.
Why do thumbnails and regenerated sizes appear, and when does it become a problem?
WordPress generates image sizes for practical reasons: responsive layouts require different resolutions for different screen widths, WooCommerce needs product thumbnails at precise dimensions, and gallery plugins have their own required formats. According to the WordPress developer documentation, a default installation has three configurable sizes (thumbnail, medium, large) plus the full-size original. Recent WordPress versions also generate a 768px-wide medium_large size and two larger sizes (1536×1536 and 2048×2048), and every active theme and plugin can register additional sizes on top of these.
The problem is accumulation. WordPress has no mechanism to remove sizes that are no longer in use. Every theme change, every plugin install that registers a new size, every Regenerate Thumbnails run adds files without clearing the old ones.
The size explosion scenarios
Theme or builder change. A theme built with Elementor registers four custom sizes. You switch to a block theme that registers three different ones. Both sets of files now exist for every image uploaded before and after the switch.
Sliders and gallery plugins. Many gallery and slider plugins register their own image sizes. Install three gallery plugins over the life of a site and you can easily double the number of derivative files in your uploads directory.
WooCommerce and product variations. WooCommerce registers its own sizes, including woocommerce_thumbnail, woocommerce_single, and woocommerce_gallery_thumbnail. Add a size-customisation plugin and the count climbs further. On a catalogue of 500 products with several images each, the difference between three and six registered sizes is thousands of extra files.
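The arithmetic behind that last figure is easy to check. The sketch below assumes four images per product, which is an illustrative number rather than one taken from a real catalogue, and it gives an upper bound because WordPress does not generate a size that is larger than the original image.

```python
def derivative_files(originals: int, registered_sizes: int) -> int:
    """Upper bound on generated files: one derivative per original per size."""
    return originals * registered_sizes


products = 500
images_per_product = 4  # assumption for illustration only
originals = products * images_per_product

# Extra files created when the number of registered sizes goes from 3 to 6.
extra = derivative_files(originals, 6) - derivative_files(originals, 3)
print(f"{originals} originals -> {extra} extra derivative files")
# prints "2000 originals -> 6000 extra derivative files"
```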
Migration from staging to production. Migrations copy the entire uploads directory, including every derivative. If staging and production run in parallel and the library is synced back after a rebuild, you can end up with multiple generations of sizes for the same original files.
How do you tell real duplicates from thumbnails?
Filename patterns are the fastest first diagnostic. WordPress appends the pixel dimensions to every generated size: image-150x150.jpg, image-1024x683.jpg. If the filename ends in a pattern like -[width]x[height] before the extension, it is a generated size, not a duplicate.
Visual cues in filenames
| Pattern | What it is |
|---|---|
| image.jpg | Original upload |
| image-150x150.jpg | WordPress thumbnail |
| image-300x200.jpg | Generated medium size |
| image-1024x683.jpg | Generated large size |
| image-copy.jpg | Likely a manual duplicate |
| image-1.jpg or image (1).jpg | Likely a re-upload |
| logo-v2.jpg alongside logo.jpg | Likely a renamed re-upload |
| product-photo_copy.jpg | Likely a duplicate from a feed import |
If two files have no dimension suffix, similar names, and similar file sizes, and were uploaded on different dates, you are probably looking at a real duplicate. Confirm with a hash-based tool before deleting.
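The filename patterns in the table can be turned into a quick first-pass classifier. This is a minimal sketch: the function name and the list of re-upload markers are illustrative assumptions, and the result is only a hint. A name like photo-2024.jpg would be flagged as a possible duplicate, which is why hash confirmation still matters.

```python
import re
from pathlib import Path

# The "-{width}x{height}" suffix WordPress appends to generated sizes.
GENERATED_SIZE = re.compile(r"-\d+x\d+$")
# Heuristic markers of a manual copy or re-upload (illustrative, not exhaustive).
REUPLOAD_HINT = re.compile(r"(-copy|_copy|-v\d+|-\d+| \(\d+\))$", re.IGNORECASE)


def classify(filename: str) -> str:
    """Return a rough label for a file based on its name alone."""
    stem = Path(filename).stem  # name without the extension
    if GENERATED_SIZE.search(stem):
        return "generated size"
    if REUPLOAD_HINT.search(stem):
        return "possible duplicate"
    return "original upload"


for name in ["image.jpg", "image-150x150.jpg", "image-copy.jpg", "image (1).jpg"]:
    print(f"{name}: {classify(name)}")
```

The generated-size check runs first on purpose, so that image-1-300x200.jpg is treated as a derivative of a re-upload rather than as a separate duplicate candidate.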
How do you know when it is a real duplicate?
Real duplicates share these traits: the same file size (or very close, if one copy was re-saved), the same dimensions, no -[width]x[height] suffix, more than one upload, and often slightly different filenames that suggest a rename before re-upload. Hash-based detection, which compares a fingerprint computed from the bytes of each file, is the reliable way to confirm that two files are identical. It matches only byte-identical copies, so a re-saved or re-compressed version of the same picture needs a visual check instead.
Cleanup strategy: what you should and should not delete
Delete a generated size only when you have confirmed nothing references it and you are prepared to regenerate the sizes that are still needed. The risk is not theoretical — any template, block, CSS background, or builder layout that points to a specific generated size will display a broken image if that file is gone.
Decision table
| File type | Action | Reason |
|---|---|---|
| True duplicate (hash-confirmed) | Safe to remove after replacing references | No longer needed once replaced |
| Generated size no longer in use | Caution — verify with a tool first | May still be referenced in builder or CSS |
| Generated size from active theme/plugin | Do not delete | Still in active use |
| Old regenerated size from previous theme | Caution — safe only if theme is fully gone | Could still be referenced in legacy content |
| Original upload | Never delete without confirmation | Source file for all derivatives |
| Registered media file with no source file | Safe to remove from library | Broken attachment, no file on disk |
Step 0: backup and staging
Take a full backup before any deletion — database and uploads directory. For high-traffic or complex sites, run the cleanup on staging first and push the result to production once verified. A backup you have never tested restoring is not a backup. Confirm the restore process works before you start.
Step 1: audit your image sizes
Before deleting anything, understand what sizes are registered and what is generating them. In Settings → Media, you can see WordPress’s default sizes. Additional sizes are registered by your active theme and plugins — check theme documentation or use a plugin that lists all registered sizes across the installation.
The goal is a clear picture of which sizes are still in active use and which are orphaned by a theme or plugin you no longer run. Mediapapa’s Library Health dashboard surfaces this as part of its recommended actions view, flagging oversized images and optimisation opportunities without requiring you to dig into settings manually.
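If you have filesystem access, a rough way to see which sizes actually exist on disk is to count the dimension suffixes across the uploads directory. The sketch below is an assumption-laden illustration: `tally_sizes` and `list_files` are hypothetical helper names, and proportionally scaled sizes will appear under several different heights (300x200, 300x225, and so on), so the counts are most meaningful for hard-cropped sizes such as 150x150.

```python
import os
import re
from collections import Counter

# Captures width and height from names like "banner-300x200.jpg".
SIZE_SUFFIX = re.compile(r"-(\d+)x(\d+)\.[A-Za-z0-9]+$")


def tally_sizes(filenames):
    """Count how many files carry each "-{width}x{height}" suffix."""
    counts = Counter()
    for name in filenames:
        match = SIZE_SUFFIX.search(name)
        if match:
            counts[f"{match.group(1)}x{match.group(2)}"] += 1
    return counts


def list_files(uploads_dir):
    """Yield every filename below uploads_dir (the path is supplied by you)."""
    for _root, _dirs, files in os.walk(uploads_dir):
        yield from files


sample = ["a.jpg", "a-150x150.jpg", "b-150x150.png", "a-300x200.jpg"]
for size, count in tally_sizes(sample).most_common():
    print(size, count)
```

On a real site you would call `tally_sizes(list_files(path_to_uploads))`. A size that appears thousands of times but matches nothing your current theme or plugins register is a candidate for closer inspection, not for immediate deletion.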
Step 2: reduce future bloat
Removing old files only solves the past. Preventing new accumulation requires reducing the number of sizes being generated going forward. If your theme registers custom sizes you are not using, consult the theme documentation for how to disable them. Plugins that add gallery or slider sizes can often be configured to use existing WordPress sizes instead of adding new ones. The fewer sizes registered, the fewer files generated on every new upload.
Step 3: clean old regenerated sizes
Work in small batches of 10 to 50 files. Verify the site after each batch — check the pages most likely to surface the affected images. Prioritise checking templates, home page, category pages, product pages if WooCommerce is active, and any page builder layouts. Pay attention to cached versions: clear cache and CDN after each batch to confirm you are seeing the current state of the site.
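If you are working from a list of candidate files, splitting it into fixed-size batches keeps each verification round manageable. A minimal sketch, assuming the candidate list has already been produced and checked; the filenames here are made up for the example.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 25) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical candidate list; nothing is deleted here.
candidates = [f"old-{n}-640x480.jpg" for n in range(60)]
for number, batch in enumerate(batches(candidates, size=25), start=1):
    print(f"Batch {number}: {len(batch)} files to review, then verify the site")
```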
Avoid bulk deletion tools that remove files without checking active usage first. The reliable approach is a tool that checks whether each file is still referenced before removing it.
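To illustrate what checking active usage means in its simplest form, the sketch below searches a set of text sources for each candidate filename. It assumes you have already exported the relevant text (post content, builder data, stylesheets) as strings, and `find_unreferenced` is a hypothetical helper. It does not detect references made by attachment ID or URL-encoded filenames, so treat its output as a shortlist to verify, not as permission to delete.

```python
def find_unreferenced(candidates, documents):
    """Return candidate filenames that appear in none of the documents."""
    haystack = "\n".join(documents)
    # Substring matching errs on the side of keeping files: a name that
    # appears inside a longer filename still counts as referenced.
    return [name for name in candidates if name not in haystack]


documents = [
    '<img src="/wp-content/uploads/2023/05/banner-300x200.jpg">',
    ".hero { background: url(uploads/2023/05/hero-1024x683.jpg); }",
]
candidates = ["banner-300x200.jpg", "banner-768x512.jpg", "hero-1024x683.jpg"]
print(find_unreferenced(candidates, documents))
# prints ['banner-768x512.jpg']
```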
Step 4: re-check key pages
After cleanup, check the home page, popular posts, product pages, header and footer builders, and any custom templates. If the site uses a CDN, purge the cache and check again. Check on mobile as well as desktop — some responsive breakpoints reference specific generated sizes that desktop views do not.
What about real duplicates (the same image uploaded more than once)?
Once the thumbnail confusion is resolved, the remaining duplicate problem is more straightforward: the same file exists in the library under two or more entries. The method for handling it depends on the size of the library.
Detection options
Manual (small libraries, under a few hundred files). Sort the media library list view by filename. Look for sequences, “(1)” suffixes, or naming patterns like -copy, -v2, -final. Open suspected duplicates in separate browser tabs and compare dimensions and file size. This works at small scale. It misses duplicates with completely different filenames.
Hash-based tools (most sites). A file hash is a binary fingerprint. Two files with identical content produce the same hash regardless of filename. Hash-based detection is the only reliable method for catching every exact duplicate in a large library. Mediapapa scans on upload and flags duplicates in the media library view and in the Library Health dashboard. You can also filter by “Duplicated medias” to see all affected files at once.
WP-CLI (large sites, developer access required). For libraries with tens of thousands of files, plugin-based scanning can be slow. A custom WP-CLI script can compute hashes server-side and output a list of duplicates without loading the admin. The script produces a list; deletion and reference replacement still need to be handled separately.
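The same idea can be sketched outside WordPress. The example below is written in Python rather than as a WP-CLI command, and it assumes direct read access to the uploads directory; `find_duplicate_groups` is a hypothetical name. It skips generated sizes, groups the remaining files by size first so that only plausible matches are hashed, and returns groups of byte-identical files. Like the script described above, it only produces a list.

```python
import hashlib
import os
import re
from collections import defaultdict

# Files named like "photo-300x200.jpg" are generated sizes and are skipped.
GENERATED_SIZE = re.compile(r"-\d+x\d+\.[^.]+$")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(uploads_dir):
    """Return lists of paths whose contents are byte-identical."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(uploads_dir):
        for name in files:
            if GENERATED_SIZE.search(name):
                continue
            path = os.path.join(root, name)
            by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicate_groups` with the path to your uploads directory returns one list per set of identical originals. Mapping those paths back to attachment IDs is a separate step that this sketch does not attempt.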
Safe removal workflow
Choose the canonical file by preferring the highest resolution, then the smallest file size at equivalent quality, then the oldest upload date, since older uploads are more likely to be referenced in existing content. Before deleting the duplicate, replace every reference to it, both URL and attachment ID, with the canonical version. Every page, post, builder layout, widget, or WooCommerce product gallery that referenced the duplicate should point to the file you kept before the duplicate is removed.
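The selection rule can be expressed as a single sort key. This is a sketch under assumptions: the `Upload` record and its fields are hypothetical stand-ins for values you would read from attachment metadata, and "equivalent quality" is not something the code can judge, so it simply prefers the smaller file among copies with the same pixel count.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Upload:
    attachment_id: int
    filename: str
    width: int
    height: int
    size_bytes: int
    uploaded: date


def choose_canonical(copies):
    """Prefer most pixels, then fewest bytes, then the earliest upload date."""
    return min(
        copies,
        key=lambda u: (-(u.width * u.height), u.size_bytes, u.uploaded),
    )


copies = [
    Upload(310, "logo-1.jpg", 1200, 800, 98_000, date(2024, 2, 11)),
    Upload(42, "logo.jpg", 1200, 800, 98_000, date(2022, 6, 3)),
    Upload(517, "logo-copy.jpg", 1200, 800, 98_000, date(2024, 9, 30)),
]
print(choose_canonical(copies).filename)
# prints "logo.jpg"
```

For hash-confirmed duplicates the dimensions and byte size are identical by definition, so in practice the upload date decides.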
Mediapapa’s Safe Replace handles this automatically. It locates every occurrence of the file across the site, updates the ID and URL, and deletes the duplicate only once the replacement is in place. No broken images, no manual find-and-replace in the database.
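For a sense of what replacing a reference involves, here is a sketch for a single piece of post HTML. It is not how Safe Replace works internally, which the source does not describe. It assumes the common `wp-image-{id}` class that WordPress adds to images inserted in the editor, and it assumes the canonical file has the same generated sizes on disk as the duplicate. Builder data stored as serialized arrays needs dedicated tooling, because a plain string replacement can corrupt it.

```python
import re


def repoint(content: str, old_url: str, new_url: str, old_id: int, new_id: int) -> str:
    """Point URLs (including sized variants) and the wp-image class at the kept file."""
    old_stem, old_ext = old_url.rsplit(".", 1)
    new_stem, new_ext = new_url.rsplit(".", 1)
    # Match the original URL and any "-{width}x{height}" variant of it.
    url_pattern = re.compile(
        re.escape(old_stem) + r"(-\d+x\d+)?\." + re.escape(old_ext)
    )
    updated = url_pattern.sub(
        lambda m: f"{new_stem}{m.group(1) or ''}.{new_ext}", content
    )
    # \b keeps wp-image-12 from matching inside wp-image-123.
    return re.sub(rf"\bwp-image-{old_id}\b", f"wp-image-{new_id}", updated)


base = "https://example.com/wp-content/uploads"
html = (
    f'<img class="wp-image-310 size-medium" src="{base}/2024/02/logo-1-300x200.jpg">'
    f'<a href="{base}/2024/02/logo-1.jpg">Full size</a>'
)
print(repoint(html, f"{base}/2024/02/logo-1.jpg", f"{base}/2022/06/logo.jpg", 310, 42))
```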
How to prevent duplicates from coming back
Cleanup is necessary once. Keeping the library clean long-term takes less time but requires a routine.
A simple monthly pass — 10 minutes — prevents accumulation from compounding. Filter for duplicates, review recent imports, check whether any new plugins installed since the last pass are adding unexpected sizes or re-uploading existing files. Tag or categorise uploads consistently so files are findable on the next search, reducing the instinct to re-upload something you cannot locate.
For team-managed sites, establish a naming convention before upload and require contributors to search before uploading. For WooCommerce imports, check the feed or CSV plugin settings — most can be set to skip files that already exist in the library.
Ten minutes monthly costs less than a multi-hour repair later. That is the case for treating media governance as an ongoing practice rather than a one-off cleanup.
FAQ
Are the thumbnails WordPress generates duplicate images?
No. Thumbnails and resized derivatives are generated automatically by WordPress from the original file. They are managed internally, attached to the original upload, and are not duplicates in any meaningful sense. Do not delete them manually unless you have confirmed they are no longer referenced anywhere.
Why do I have more image sizes after regenerating thumbnails?
When you run a thumbnail regeneration tool, WordPress creates the sizes currently registered by your active theme and plugins. If your configuration has changed since the original upload, for example through a new theme or a new plugin, the regeneration adds new sizes without removing the old ones. Only the sizes registered at the time of regeneration are created; the rest are leftovers from previous configurations.
Can I safely delete old image sizes?
With care, yes. The prerequisite is confirming that nothing still references the specific file you are deleting: no template, no builder layout, no CSS background, no content block. A tool that checks usage before deleting is far safer than manual deletion. The original upload should never be deleted this way; only derivative sizes that are provably unused.
Do duplicate images and unused sizes slow down my site?
Directly, the impact on page speed is limited: browsers load only the images referenced in the page, not everything in the uploads directory. Indirectly, a cleaner library reduces storage costs, speeds up media library queries in the admin, and removes the risk of accidentally referencing a sub-optimal file (wrong dimensions, unoptimised format) when the better version is elsewhere in the library. The Library Health score in Mediapapa reflects this: a clean, well-optimised library scores higher and surfaces fewer recommended actions.
Does WooCommerce create duplicate images?
WooCommerce is one of the most common sources of both duplicate images and size accumulation. Product CSV imports and feed syncs frequently re-upload images that already exist. WooCommerce also registers several custom image sizes; add a size-customisation plugin and the count climbs. After any import, run a hash-based duplicate check. Mediapapa’s duplicate detection covers WooCommerce product attachments as part of the full library scan.
What happens if I delete an image that is still in use?
Any page, post, widget, or builder layout referencing the deleted file’s URL or attachment ID will display a broken image. The content itself is not affected, but every instance needs to be manually re-linked to a working file. This is why replacing references before deleting is non-negotiable. Mediapapa’s Safe Replace handles the replacement before deletion; manual deletion without this step is the source of most cleanup-related breakage.
Does Mediapapa work with page builders and multisite?
Yes on both. Mediapapa’s duplicate detection and Safe Replace flow cover the full media library regardless of which editor or builder was used to embed the file. Multisite is supported on the Agency plan, with duplicate detection running per site within the network.
Further reading
- Understanding attachments and image sizes — WordPress.org documentation
- Detecting and deleting duplicate images — Mediapapa help docs
- Tracking media usage and Deletion Warnings — Mediapapa help docs




