SEO challenges of a headless architecture
In a traditional WordPress architecture, plugins like Yoast SEO or Rank Math automatically generate meta tags, the sitemap, and structured data within the HTML rendered by the PHP theme. In headless mode, the WordPress theme is no longer used for rendering: these SEO elements must be implemented on the Next.js frontend.
Yoast and Rank Math remain useful on the back-end
In headless mode, Yoast SEO and Rank Math still serve a purpose: they expose metadata (SEO title, description, OG image) via the REST or GraphQL API. Editors keep filling in these fields in the WordPress editor. The Next.js frontend retrieves them via the API and injects them into the HTML.
Dynamic metadata with generateMetadata
Next.js (App Router) provides the generateMetadata function to define each page's meta tags from the data fetched from the WordPress API.
// app/articles/[slug]/page.tsx
import type { Metadata } from 'next';
// Note: in Next.js 15 and later, `params` is a Promise and must be awaited first
type Props = { params: { slug: string } };
export async function generateMetadata({ params }: Props): Promise<Metadata> {
  // Ask WordPress for the post matching the slug, limited to the fields we need
  const res = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?slug=${encodeURIComponent(params.slug)}&_fields=title,excerpt,yoast_head_json`
  );
  const [article] = res.ok ? await res.json() : [];
  // No matching post: return empty metadata and let the page handle the 404
  if (!article) return {};
  const seo = article.yoast_head_json;
  return {
    // Prefer the values entered in Yoast, fall back to the native WordPress fields
    title: seo?.title || article.title.rendered,
    description: seo?.description || article.excerpt.rendered.replace(/<[^>]*>/g, '').trim(),
    openGraph: {
      title: seo?.og_title || article.title.rendered,
      description: seo?.og_description,
      // og_image can be missing or empty, so guard every level
      images: seo?.og_image?.[0]?.url ? [{ url: seo.og_image[0].url }] : [],
      type: 'article',
    },
    twitter: {
      card: 'summary_large_image',
      title: seo?.twitter_title || seo?.og_title,
      description: seo?.twitter_description || seo?.og_description,
    },
    alternates: {
      canonical: seo?.canonical || `https://www.your-site.com/articles/${params.slug}`,
    },
  };
}
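The fallback description above only strips tags, while WordPress excerpts usually contain HTML entities (the default "read more" marker is `[&hellip;]`) and can run longer than search engines tend to display. Below is a minimal sketch of a helper that could replace the inline `replace` call. The name `toPlainDescription`, the small entity table, and the 160-character cap are assumptions made for illustration; none of them come from Next.js or WordPress.

```typescript
// Hypothetical helper: turns a WordPress `excerpt.rendered` HTML string
// into plain text suitable for a meta description.
const NAMED_ENTITIES: Record<string, string> = {
  amp: '&',
  lt: '<',
  gt: '>',
  quot: '"',
  nbsp: ' ',
  hellip: '…',
};

function toPlainDescription(html: string, maxLength = 160): string {
  const text = html
    .replace(/<[^>]*>/g, ' ') // drop tags, same regex as the original code
    .replace(/&#(\d+);/g, (_match, code: string) => String.fromCodePoint(Number(code)))
    .replace(/&([a-z]+);/g, (match, entityName: string) => NAMED_ENTITIES[entityName] ?? match)
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .trim();

  if (text.length <= maxLength) return text;

  // Cut at the last space before the limit so a word is never split
  const cut = text.slice(0, maxLength - 1);
  const lastSpace = cut.lastIndexOf(' ');
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '…';
}
```

In `generateMetadata`, the fallback would then read `seo?.description || toPlainDescription(article.excerpt.rendered)`.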
Dynamic XML sitemap
Next.js lets you generate an XML sitemap from the app/sitemap.ts file. This file queries the WordPress API to list every indexable page.
// app/sitemap.ts
import type { MetadataRoute } from 'next';
// Only the fields requested through _fields are typed here
type WpEntry = { slug: string; modified: string };
export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = 'https://www.your-site.com';
  // Fetch posts from WordPress. The REST API returns at most 100 items per
  // request, so sites with more content must paginate with the `page` parameter.
  const posts: WpEntry[] = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?per_page=100&_fields=slug,modified`
  ).then(res => res.json());
  // Fetch pages (same 100-item limit)
  const pages: WpEntry[] = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/pages?per_page=100&_fields=slug,modified`
  ).then(res => res.json());
  const postEntries = posts.map((post) => ({
    url: `${baseUrl}/articles/${post.slug}`,
    lastModified: new Date(post.modified),
    changeFrequency: 'weekly' as const,
    priority: 0.7,
  }));
  const pageEntries = pages.map((page) => ({
    url: `${baseUrl}/${page.slug}`,
    lastModified: new Date(page.modified),
    changeFrequency: 'monthly' as const,
    priority: 0.8,
  }));
  return [
    // Home page first, then static pages, then articles
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    ...pageEntries,
    ...postEntries,
  ];
}
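The sitemap code requests a single page of results, and the WordPress REST API caps `per_page` at 100, so a site with more than 100 posts would get an incomplete sitemap. One possible way to collect everything is to follow the `X-WP-TotalPages` response header, as sketched below. The helper name `fetchAllItems` and the `Fetcher` type are hypothetical; the fetch function is passed in as a parameter so the loop can be exercised with a stub.

```typescript
// Minimal shape of the response the helper relies on (a real fetch Response fits it)
type PageResponse = {
  ok: boolean;
  headers: { get(headerName: string): string | null };
  json(): Promise<unknown>;
};
type Fetcher = (url: string) => Promise<PageResponse>;

// Hypothetical helper: walks every page of a WordPress REST collection.
// `endpoint` must already contain a query string, e.g. ".../posts?_fields=slug,modified".
async function fetchAllItems<T>(endpoint: string, fetcher: Fetcher): Promise<T[]> {
  const items: T[] = [];
  let page = 1;
  let totalPages = 1;

  do {
    const res = await fetcher(`${endpoint}&per_page=100&page=${page}`);
    if (!res.ok) throw new Error(`WordPress API request failed on page ${page}`);

    // WordPress reports the number of pages in the X-WP-TotalPages header
    totalPages = Number(res.headers.get('X-WP-TotalPages')) || 1;
    items.push(...((await res.json()) as T[]));
    page += 1;
  } while (page <= totalPages);

  return items;
}
```

In `app/sitemap.ts`, each `fetch(...).then(res => res.json())` call could then become `fetchAllItems<WpEntry>(url, fetch)`, where `WpEntry` stands for whatever type describes the requested fields.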
robots.txt configuration
// app/robots.ts
import type { MetadataRoute } from 'next';
export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        // Applies to every crawler
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/admin/'],
      },
    ],
    sitemap: 'https://www.your-site.com/sitemap.xml',
  };
}
Block indexing of the WordPress back-end
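The WordPress server (e.g., admin.your-site.com) must not appear in search results. Add a robots.txt file on the WordPress domain with Disallow: / to stop crawlers from fetching its pages. Keep in mind that robots.txt blocks crawling rather than indexing: a blocked URL that is linked from elsewhere can still be listed without a snippet. Only the Next.js frontend should be indexed.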
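To verify that the back-end really serves a blocking robots.txt, a small check can be run against the file's content, for instance in a deployment script. The sketch below is a simplified parser written for this purpose; the function name `blocksAllCrawlers` is hypothetical and the logic does not implement the full robots.txt standard (it ignores `Allow` overrides, for example).

```typescript
// Hypothetical check: returns true when a robots.txt body tells every crawler
// ("User-agent: *") to stay away from the whole site ("Disallow: /").
function blocksAllCrawlers(robotsTxt: string): boolean {
  let inWildcardGroup = false;
  let previousWasUserAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // strip comments
    const separator = line.indexOf(':');
    if (separator === -1) continue;

    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();

    if (field === 'user-agent') {
      // Consecutive User-agent lines belong to the same group;
      // a User-agent line after a rule starts a new group
      if (!previousWasUserAgent) inWildcardGroup = false;
      if (value === '*') inWildcardGroup = true;
      previousWasUserAgent = true;
      continue;
    }

    previousWasUserAgent = false;
    if (inWildcardGroup && field === 'disallow' && value === '/') return true;
  }
  return false;
}
```

A script could download the robots.txt file from the WordPress domain and fail the deployment when `blocksAllCrawlers` returns `false`.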
JSON-LD structured data
Structured data (schema.org) lets search engines understand the type of content on a page. It is implemented as JSON-LD scripts rendered by Next.js components.
// components/ArticleJsonLd.tsx
type ArticleJsonLdProps = {
  title: string;
  description: string;
  url: string;
  imageUrl: string;
  datePublished: string;
  dateModified: string;
  authorName: string;
};
export function ArticleJsonLd(props: ArticleJsonLdProps) {
  const schema = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: props.title,
    description: props.description,
    url: props.url,
    image: props.imageUrl,
    datePublished: props.datePublished,
    dateModified: props.dateModified,
    author: { '@type': 'Person', name: props.authorName },
    publisher: {
      '@type': 'Organization',
      name: 'Your Site',
      logo: { '@type': 'ImageObject', url: 'https://www.your-site.com/logo.png' },
    },
  };
  // Escape "<" so a value containing "</script>" cannot close the tag early
  const json = JSON.stringify(schema).replace(/</g, '\\u003c');
  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: json }}
    />
  );
}
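Other content types follow the same pattern: build a plain object, then serialize it into the script tag. As an illustration, here is a minimal sketch of a builder for a BreadcrumbList schema. The `Crumb` type and the function name `buildBreadcrumbSchema` are hypothetical, and the crumbs would in practice come from the WordPress data (categories or parent pages).

```typescript
// Hypothetical input type: one entry per level of the breadcrumb trail
type Crumb = { label: string; url: string };

// Builds a schema.org BreadcrumbList object from an ordered list of crumbs
function buildBreadcrumbSchema(crumbs: Crumb[]) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BreadcrumbList',
    itemListElement: crumbs.map((crumb, index) => ({
      '@type': 'ListItem',
      position: index + 1, // positions start at 1, not 0
      name: crumb.label,
      item: crumb.url,
    })),
  };
}

// Example trail for an article page (placeholder URLs)
const exampleBreadcrumb = buildBreadcrumbSchema([
  { label: 'Home', url: 'https://www.your-site.com' },
  { label: 'Articles', url: 'https://www.your-site.com/articles' },
]);
```

The resulting object can be passed to a component built like `ArticleJsonLd`, which takes care of serializing it.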
Canonical URLs and redirects
Canonical URLs
Each page must declare its canonical URL to avoid duplicate-content issues. The alternates.canonical property in generateMetadata handles this declaration.
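Duplicate content often comes from small URL variations (tracking parameters, trailing slashes, doubled slashes) that all resolve to the same page. A minimal sketch of a helper that produces one canonical form per page is shown below. The name `canonicalUrl` and the chosen convention (no query string, no fragment, no trailing slash) are assumptions; adapt them to the URL format your site actually serves.

```typescript
// Hypothetical helper: builds a single canonical URL from a site origin and a path.
// Assumed convention: no query string, no fragment, no trailing slash.
function canonicalUrl(origin: string, path: string): string {
  const cleanOrigin = origin.replace(/\/+$/, '');
  const cleanPath = path
    .replace(/[?#].*$/, '') // drop query string and fragment
    .replace(/\/{2,}/g, '/') // collapse duplicate slashes
    .replace(/\/+$/, ''); // drop trailing slash

  // An empty path means the home page
  if (cleanPath === '') return cleanOrigin;

  return cleanPath.startsWith('/') ? `${cleanOrigin}${cleanPath}` : `${cleanOrigin}/${cleanPath}`;
}
```

In `generateMetadata`, the fallback canonical could then be written as `canonicalUrl('https://www.your-site.com', '/articles/' + params.slug)`.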
Redirects and 404 handling
Redirects (old URLs to new ones) are configured in next.config.js:
// next.config.js (redirects)
module.exports = {
  async redirects() {
    return [
      {
        // :slug captures one path segment and reuses it in the destination
        source: '/old-article/:slug',
        destination: '/articles/:slug',
        permanent: true, // 301
      },
    ];
  },
};
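For pages that do not exist, Next.js renders the app/not-found.tsx file with an HTTP 404 status code. In a dynamic route, call notFound() from next/navigation when WordPress returns no content for the requested slug; otherwise the route does not answer with a 404.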
Configure dynamic metadata
Implement generateMetadata in each layout or dynamic page. Fetch the SEO data from the WordPress API (Yoast or Rank Math fields) and map them onto the Next.js Metadata object.
Generate the sitemap and robots.txt
Create the app/sitemap.ts and app/robots.ts files. The sitemap queries the WordPress API to list every indexable URL with its last modification date.
Implement JSON-LD structured data
Add JSON-LD components for each content type (Article, BreadcrumbList, Organization, FAQ). Use the data fetched from WordPress to populate the schema.org fields.
Configure canonical URLs and redirects
Declare canonical URLs via generateMetadata and configure redirects in next.config.js to preserve the SEO of legacy URLs.
Add hreflang tags (if multilingual)
For a multilingual site, add hreflang tags in generateMetadata via the alternates.languages property to tell search engines which language versions of each page are available.
hreflang tags for multilingual
If your site is available in several languages, hreflang tags tell search engines which language version to serve based on the user's locale.
// In generateMetadata (slug comes from the route params)
return {
  alternates: {
    canonical: `https://www.your-site.com/articles/${slug}`,
    languages: {
      // One entry per language version of the same article
      'fr': `https://www.your-site.com/articles/${slug}`,
      'en': `https://www.your-site.com/en/articles/${slug}`,
    },
  },
};
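Writing the language map by hand becomes error-prone as locales are added. The sketch below generates it from a list of locales, following the URL scheme of the example above: the default locale lives at the root and the others are prefixed with their code. The function name `buildLanguageAlternates` is hypothetical, and the `x-default` entry is an optional addition that points search engines to a fallback version when no locale matches.

```typescript
// Hypothetical helper: builds the `alternates.languages` map for one article.
// Assumption: the default locale has no URL prefix, other locales use "/<locale>".
function buildLanguageAlternates(
  baseUrl: string,
  slug: string,
  locales: string[],
  defaultLocale: string,
): Record<string, string> {
  const languages: Record<string, string> = {};

  for (const locale of locales) {
    const prefix = locale === defaultLocale ? '' : `/${locale}`;
    languages[locale] = `${baseUrl}${prefix}/articles/${slug}`;
  }

  // Optional fallback for visitors whose language matches none of the locales
  languages['x-default'] = `${baseUrl}/articles/${slug}`;
  return languages;
}
```

The result can be assigned directly to `alternates.languages`, for example `buildLanguageAlternates('https://www.your-site.com', slug, ['fr', 'en'], 'fr')`.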
Open Graph and Twitter Cards
Open Graph tags control how your pages appear when shared on social networks (Facebook, LinkedIn). Twitter Cards do the same for X/Twitter. Both are configured in generateMetadata as illustrated in the dynamic metadata section.
Headless technical SEO checklist
- Unique title and description tags on every page
- Canonical URL declared on every page
- Dynamic XML sitemap accessible at /sitemap.xml
- robots.txt file configured (frontend indexable, backend blocked)
- JSON-LD structured data for each content type
- Open Graph and Twitter Card tags filled in
- hreflang tags if the site is multilingual
- 301 redirects for old URLs
- Custom 404 page with HTTP 404 status code
- SSR or SSG rendering so the HTML is complete at crawl time
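Part of this checklist can be automated by inspecting the HTML that the server returns. The sketch below reports which tags appear to be missing from a rendered page. The function name `auditSeoTags` is hypothetical, and because the checks are plain regular expressions rather than a real HTML parser, the output should be treated as a hint, not a verdict.

```typescript
// Hypothetical audit: returns the labels of the checklist items that
// could not be found in a rendered HTML string.
function auditSeoTags(html: string): string[] {
  const checks: Array<[string, RegExp]> = [
    ['title', /<title>[^<]+<\/title>/i],
    ['meta description', /<meta[^>]+name=["']description["'][^>]*>/i],
    ['canonical', /<link[^>]+rel=["']canonical["'][^>]*>/i],
    ['og:title', /<meta[^>]+property=["']og:title["'][^>]*>/i],
    ['twitter:card', /<meta[^>]+name=["']twitter:card["'][^>]*>/i],
    ['JSON-LD', /<script[^>]+type=["']application\/ld\+json["'][^>]*>/i],
  ];

  // Keep only the checks whose pattern is absent from the page
  return checks.filter(([, pattern]) => !pattern.test(html)).map(([label]) => label);
}
```

Running it on the raw server response (not on the DOM after client-side JavaScript has run) also confirms the last checklist item, since tags that only appear after hydration would be reported as missing.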
Tracking with Google Search Console
After deployment, submit your sitemap in Google Search Console and monitor:
- The number of indexed vs discovered pages
- Crawl errors (404, 5xx, redirect loops)
- Search performance (impressions, clicks, average position)
- Core Web Vitals (LCP, INP, CLS) measured on field data