What “crawled but not indexed” means
Crawled vs indexed: the difference
Crawling and indexing are related, but they are not the same.
- Crawled means Googlebot visited the URL and retrieved the page.
- Indexed means Google decided the page is eligible to appear in search results.
A page can be crawled many times and still remain outside the index. In Google Search Console, this often appears in the Page indexing report as a status related to discovery or exclusion, depending on the exact issue.
Why Google may crawl a page and still skip indexing
Google does not index every crawled URL. It evaluates whether the page adds enough unique value, whether it is the preferred version among duplicates, and whether technical signals support inclusion.
Common reasons include:
- The page is thin or low value
- The page is duplicated or near-duplicated
- Canonical signals point elsewhere
- A noindex tag or robots directive blocks indexing
- The page looks like a soft 404 or low-quality placeholder
Reasoning block: what to prioritize
Recommendation: Focus first on pages that are unique, commercially important, and internally linked, because those are most likely to benefit from reindexing.
Tradeoff: Requesting indexing before improving the page can waste time and may not change the outcome.
Limit case: If a page is intentionally excluded, duplicate by design, or low priority, leaving it out of the index may be the correct choice.
The most common reasons pages are crawled but not indexed
Thin or low-value content
Pages with very little original information often struggle to get indexed. This includes pages with short copy, generic descriptions, or content that does not answer a clear search intent.
Typical examples:
- Near-empty category pages
- Auto-generated pages with minimal text
- Location pages with only swapped city names
- Product pages with reused manufacturer copy
Google’s systems are designed to surface useful pages, not just accessible ones. If the page does not add enough unique value, it may be crawled and then skipped.
Duplicate or near-duplicate pages
Duplicate content is one of the most common causes of crawled but not indexed outcomes. If Google sees multiple URLs with the same or very similar content, it may choose one canonical version and ignore the rest.
This often happens with:
- Faceted navigation
- URL parameters
- Printer-friendly pages
- Session IDs
- Product variants
- CMS-generated duplicates
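At a small scale, you can spot near-duplicate pairs yourself before Google does. The sketch below uses Python's standard-library `difflib`; the 0.9 threshold is an arbitrary assumption you should tune, and `SequenceMatcher` gets slow on long texts, so feed it extracted main content rather than full HTML:

```python
from difflib import SequenceMatcher

def similarity(text_a, text_b):
    """Return a 0..1 similarity ratio between two page texts."""
    return SequenceMatcher(None, text_a, text_b).ratio()

def near_duplicates(pages, threshold=0.9):
    """pages: dict of url -> extracted main text.
    Returns pairs of URLs whose similarity meets or exceeds the threshold."""
    urls = sorted(pages)
    pairs = []
    for i, a in enumerate(urls):
        for b in urls[i + 1:]:
            if similarity(pages[a], pages[b]) >= threshold:
                pairs.append((a, b))
    return pairs
```

Pairs that come back here are candidates for consolidation, canonicalization, or a redirect, as discussed below.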
Canonicalization issues
A canonical tag tells search engines which version of a page should be treated as the primary one. If the canonical points to another URL, Google may crawl the page but index the canonical target instead.
This is especially important when:
- The user-declared canonical differs from Google’s chosen canonical
- Internal links point to non-preferred URLs
- Canonical tags are inconsistent across templates
- Pagination or parameter handling is unclear
For a deeper reference, see the glossary entry on the canonical tag.
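A quick self-canonical check can be scripted with the Python standard library. This is a minimal sketch: the helper names (`declared_canonical`, `is_self_canonical`) are illustrative, and the normalization here only ignores fragments and trailing slashes, so adapt it to your own URL conventions:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit

class CanonicalParser(HTMLParser):
    """Captures the href of <link rel="canonical"> if present."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel", "").lower() == "canonical":
            self.canonical = a.get("href")

def declared_canonical(html):
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical

def is_self_canonical(url, html):
    """True if the page's declared canonical matches its own URL,
    ignoring fragments and a trailing-slash difference."""
    canonical = declared_canonical(html)
    if canonical is None:
        return False
    def norm(u):
        return urlunsplit(urlsplit(u)._replace(fragment="")).rstrip("/")
    return norm(canonical) == norm(url)
```

Pages that fail this check are not necessarily wrong, but each one should be a deliberate choice rather than a template accident.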
Noindex or blocked resources
A page can be crawled and still excluded if it contains a noindex directive. This is one of the first things to check in technical SEO audits.
Also review whether important resources are blocked, such as:
- JavaScript required for rendering
- CSS affecting content visibility
- robots.txt rules that interfere with discovery
- Meta robots tags inherited from templates
If the page cannot be rendered properly, Google may not understand its value.
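Noindex can arrive via the meta tag or the X-Robots-Tag HTTP header, so check both. A minimal offline sketch (stdlib only; `is_noindexed` is a hypothetical helper name, and real HTTP header lookups should be case-insensitive):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.directives.append(a.get("content", "").lower())

def is_noindexed(html, headers=None):
    """True if the page carries noindex in a meta robots tag
    or in the X-Robots-Tag response header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    if any("noindex" in d for d in parser.directives):
        return True
    header = (headers or {}).get("X-Robots-Tag", "")
    return "noindex" in header.lower()
```

Run this against the rendered HTML, not just the raw source, since a template or tag manager can inject the directive after load.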
Soft 404s and quality signals
A soft 404 is a page that returns a normal status code but appears empty, irrelevant, or effectively missing. Google may crawl it and then decide it should not be indexed.
Signals that can contribute:
- “No results” pages with little context
- Out-of-stock pages with no alternatives
- Placeholder pages
- Pages with broken or misleading content
- Very low engagement or poor perceived usefulness
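You can approximate this judgment with a simple heuristic. The sketch below is an assumption-laden triage aid, not Google's logic: the word threshold and the error phrases are arbitrary and should be tuned to your templates:

```python
def looks_like_soft_404(status_code, main_text, min_words=40):
    """Heuristic: a 200 response whose main content is nearly empty
    or reads like an error/no-results page. min_words is an arbitrary
    threshold; tune it per template."""
    if status_code != 200:
        return False  # a real 404/410 is not a *soft* 404
    text = main_text.strip().lower()
    error_phrases = ("no results", "not found", "page unavailable", "0 items")
    if any(phrase in text for phrase in error_phrases):
        return True
    return len(text.split()) < min_words
```

Pages this flags should either be enriched with real content or be made to return an honest 404 or 410 status code.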
Comparison table: common causes and fixes
| Cause | Typical symptom | Best fix | Fix priority | When it does not apply |
|---|---|---|---|---|
| Thin or low-value content | Crawled repeatedly, not indexed, little unique text | Expand content depth and uniqueness | High | Pages intentionally minimal, such as utility pages |
| Duplicate content | Multiple URLs with similar content | Consolidate, canonicalize, or redirect | High | When duplicates are required for user experience and properly canonicalized |
| Canonical tag issue | Google indexes a different URL than expected | Align canonicals and internal links | High | If the alternate URL is the correct preferred version |
| Noindex or blocked resources | Crawled but excluded from index reports | Remove accidental noindex or unblock resources | High | When exclusion is intentional |
| Soft 404 | Crawled page behaves like a missing or empty page | Improve content or return proper status code | Medium to High | When the page is intentionally a dead end and should not rank |
How to diagnose the problem in Google Search Console
Check URL Inspection results
Start with the URL Inspection tool in Google Search Console. It gives you the most direct view of how Google sees a specific page.
Look for:
- Whether the URL is indexed
- The last crawl date
- The user-declared canonical
- The Google-selected canonical
- Any indexing blockers or warnings
If Google selected a different canonical than the one you intended, that is a strong signal that the page is being treated as a duplicate or secondary version.
Review Page indexing reports
The Page indexing report helps you identify patterns across many URLs. Instead of checking one page at a time, look for clusters.
Useful patterns include:
- Entire template types excluded
- Parameterized URLs not indexed
- A spike in “Crawled - currently not indexed”
- A large number of pages excluded after a site release
This is where technical SEO becomes operational. If the issue affects hundreds or thousands of URLs, the root cause is usually template-level rather than page-level.
Compare crawl date, canonical, and user-declared canonical
A useful diagnostic sequence is:
- Confirm the page was crawled recently
- Compare the user-declared canonical to the Google-selected canonical
- Check whether the page is blocked by noindex or robots rules
- Review content uniqueness and internal linking
If the page is crawled often but still not indexed, the issue is usually not discovery. It is evaluation.
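If you export inspection results in bulk, the sequence above can be applied as a small triage function. The field names here (`lastCrawlTime`, `googleCanonical`, and so on) are hypothetical, modeled loosely on what URL Inspection surfaces; map them to whatever your export actually contains:

```python
def diagnose(result):
    """result: a dict of inspection data with hypothetical field names.
    Applies the diagnostic sequence in order and returns the first
    likely cause found."""
    if not result.get("lastCrawlTime"):
        return "discovery: page has not been crawled"
    if result.get("robotsState") == "blocked":
        return "blocked by robots rules"
    if result.get("indexingState") == "noindex":
        return "excluded by a noindex directive"
    user = result.get("userCanonical")
    chosen = result.get("googleCanonical")
    if user and chosen and user != chosen:
        return f"treated as duplicate of {chosen}"
    return "evaluation: review content uniqueness and internal linking"
```

The ordering matters: technical blockers are checked before canonical conflicts, and content evaluation is the fallback diagnosis, mirroring the fix order recommended later in this article.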
Look for patterns across templates or sections
Do not treat every crawled but not indexed URL as a one-off. Look for shared traits:
- Same CMS template
- Same content length
- Same internal link depth
- Same canonical pattern
- Same parameter structure
This is the fastest way to separate isolated issues from systemic ones.
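One quick way to make those clusters visible is to group excluded URLs by leading path segment. This sketch assumes your URL structure encodes template type in the path, which is common but not universal:

```python
from collections import Counter
from urllib.parse import urlsplit

def excluded_by_section(urls, depth=1):
    """Group URLs by their first `depth` path segments so
    template-level clusters stand out, most affected first."""
    counts = Counter()
    for url in urls:
        segments = urlsplit(url).path.strip("/").split("/")
        counts["/" + "/".join(segments[:depth])] += 1
    return counts.most_common()
```

If one section dominates the output, start your investigation with that template rather than with individual pages.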
Evidence block: what Google documents
Source: Google Search Central documentation on indexing and canonicalization
Timeframe: Referenced as of 2026-03
Summary: Google states that crawling does not guarantee indexing, and canonical signals help determine which URL version should be indexed. Search Console’s Page indexing and URL Inspection tools are the primary diagnostics for these decisions.
What to fix first
Improve content depth and uniqueness
If a page is important but thin, start by making it genuinely useful.
Improve:
- Main content depth
- Original examples or data
- Clear headings and structure
- Supporting images, tables, or FAQs
- Specific answers to the target query
This is especially important for pages targeting competitive queries or commercial intent. Texta can help teams identify where content is too similar across templates and where indexable value is missing.
Resolve canonical and internal linking issues
If Google is choosing a different canonical, align the signals.
Do this by:
- Making the preferred URL self-canonical
- Updating internal links to point to the preferred version
- Avoiding mixed signals across sitemap, navigation, and canonicals
- Redirecting obsolete duplicates where appropriate
Internal links matter because they reinforce which URL is most important.
Remove accidental noindex or robots blocks
Check for accidental exclusions in:
- Meta robots tags
- HTTP headers
- CMS settings
- robots.txt
- Template inheritance
A single template-level mistake can suppress indexation across many pages.
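For the robots.txt part of that audit, Python's standard-library `robotparser` lets you test rules offline against a list of paths. A minimal sketch (the `Googlebot` user agent string and the helper name are the only assumptions here):

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Parse robots.txt content (as a string) and return the paths
    the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]
```

Run your most important URLs through this whenever robots.txt changes; an overly broad `Disallow` pattern is an easy way to suppress an entire section by accident.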
Consolidate duplicate URLs
If multiple URLs serve the same intent, choose one primary version and consolidate the rest.
Options include:
- 301 redirects
- Canonical tags
- Parameter handling
- Content merging
- URL normalization
Use redirects when the duplicate should not remain accessible. Use canonicals when multiple versions must exist for users but only one should be indexed.
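URL normalization can be sketched with the standard library. The tracking-parameter list below is an assumption; extend it for your own analytics stack, and be careful not to strip parameters that actually change page content:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters treated as tracking noise; an assumption to extend per site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "sessionid"}

def normalize(url):
    """Normalize a URL toward one preferred form: lowercase scheme and
    host, drop tracking parameters and fragments, strip a trailing slash."""
    parts = urlsplit(url)
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in TRACKING_PARAMS]
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))
```

Whatever normalized form you choose, the canonical tag, sitemap, and internal links should all emit that same form so the signals do not conflict.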
Request reindexing after changes
After fixing the root cause, use URL Inspection to request indexing. This can help Google recrawl the page faster, but it does not force inclusion.
That distinction matters. Reindexing requests are a trigger, not a guarantee.
Reasoning block: fix order
Recommendation: Fix technical blockers first, then improve content, then request indexing.
Tradeoff: If you request indexing too early, you may get another exclusion cycle without progress.
Limit case: If the page is already strong and only needs a fresh crawl, a request may be enough.
When crawled but not indexed is normal
Low-priority pages
Not every page needs to be indexed. Some pages are useful for users but not valuable in search.
Examples:
- Internal search results
- Filter combinations with little demand
- Utility pages
- Duplicate variants with no search intent
If the page is low priority by design, exclusion may be appropriate.
Fresh pages still in evaluation
New pages often go through an evaluation period. Google may crawl them before deciding whether they deserve long-term indexation.
This is common when:
- The site is new or low authority
- The page has few internal links
- The topic is highly competitive
- The content is similar to existing pages
Do not assume a delay means failure. Some pages simply need more signals.
Pages intentionally excluded from index
Some pages should not be indexed at all, including:
- Thank-you pages
- Login pages
- Admin pages
- Internal utility pages
- Duplicate print views
In these cases, crawled but not indexed is expected and desirable.
How to prevent it from happening again
Build indexable page templates
Indexability should be designed into the template, not patched later.
A strong template usually includes:
- Unique title and H1
- Substantial main content
- Clear, self-referencing canonical tag
- Internal links that point to the preferred URL
- Structured data where relevant
- No accidental noindex directives
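Most of that checklist can be verified automatically against a rendered template. The sketch below is a rough audit using only the standard library; the 200-word threshold is an arbitrary assumption, and a production check would also cover canonical tags and structured data:

```python
from html.parser import HTMLParser

class TemplateAudit(HTMLParser):
    """Counts the basic on-page signals the checklist above asks for."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.words = 0
        self._in = None

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._in = tag
            if tag == "h1":
                self.h1_count += 1

    def handle_endtag(self, tag):
        if tag == self._in:
            self._in = None

    def handle_data(self, data):
        if self._in == "title":
            self.title += data
        self.words += len(data.split())

def audit(html, min_words=200):
    """Return a list of issues found in one rendered template."""
    a = TemplateAudit()
    a.feed(html)
    issues = []
    if not a.title.strip():
        issues.append("missing title")
    if a.h1_count != 1:
        issues.append(f"expected 1 h1, found {a.h1_count}")
    if a.words < min_words:
        issues.append("thin main content")
    return issues
```

Running a check like this in CI against each template catches indexability regressions before a release, which is far cheaper than diagnosing them in Search Console afterward.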
Strengthen internal linking
Pages that matter should be easy for both users and crawlers to find.
Best practices:
- Link important pages from hubs and category pages
- Use descriptive anchor text
- Avoid orphan pages
- Surface priority pages in navigation or related content modules
Use consistent canonicals
Canonical consistency reduces ambiguity.
Keep the following aligned:
- Canonical tag
- Internal links
- XML sitemap
- Redirect behavior
- Preferred URL format
If these signals conflict, Google may choose a different version than you intended.
Monitor indexation at scale
For larger sites, indexation should be monitored continuously.
Track:
- Indexed vs submitted URLs
- Excluded URL patterns
- Template-level changes
- Crawl spikes after releases
- Canonical mismatches
Texta can support this workflow by helping teams monitor visibility patterns and identify pages that are crawled, excluded, or ready to rank without requiring deep technical setup.
A practical decision framework
Fix now
Fix immediately if the page is:
- Commercially important
- Unique and valuable
- Meant to rank
- Blocked by noindex, robots, or canonical errors
- Part of a large template issue
Monitor
Monitor if the page is:
- Newly published
- Still earning signals
- Thin but planned for future expansion
- Not yet supported by strong internal links
Leave excluded
Leave it excluded if the page is:
- Intentionally private or utility-based
- Duplicate by design
- Low value for search
- Not aligned with your SEO strategy
Decision block: quick rule
If the page is important to revenue or demand capture, fix it. If it is merely discoverable but not strategically useful, monitor it. If it should not rank, exclude it on purpose.
Evidence-oriented checklist for SEO teams
Use this checklist when a page is crawled but not indexed:
- Confirm the URL is crawlable in Search Console
- Check whether the page is noindexed
- Compare user-declared and Google-selected canonicals
- Review content uniqueness and depth
- Look for duplicate URL patterns
- Check internal link prominence
- Inspect for soft 404 behavior
- Request indexing only after fixes are live
FAQ
What does crawled but not indexed mean in Google Search Console?
It means Google discovered and fetched the page but decided not to include it in the index, either for now or at all. The page can still be evaluated later if its quality or relevance improves.
Is crawled but not indexed a penalty?
Usually no. It is typically a quality, duplication, canonical, or prioritization issue rather than a manual penalty. In most cases, the page is being evaluated rather than punished.
How long does it take for a crawled page to get indexed?
It varies from days to weeks. High-value, unique pages with strong internal links tend to be indexed faster than thin or duplicate pages. There is no guaranteed timeline, so the best approach is to improve the signals that support indexation.
Should I request indexing again after fixing the page?
Yes, after fixing the underlying issue. A request can help recrawl, but it will not force indexing if the page still looks low value or duplicate. Use it as a final step, not the first one.
Can noindex cause crawled but not indexed?
Yes. If a page is marked noindex, Google may crawl it but will not index it. This is one of the first checks to make in any technical SEO audit.
How do I know whether the issue is sitewide or page-specific?
Look for patterns in Google Search Console. If many URLs share the same template, canonical setup, or content structure, the issue is likely sitewide. If only one page is affected, the problem is more likely page-specific.
CTA
Audit your indexation issues with Texta and see which pages are crawled, excluded, or ready to rank.
If you need a clearer view of what Google is doing with your pages, Texta helps you spot indexation patterns, prioritize fixes, and focus on the URLs most likely to matter for visibility and growth.