Headless CMS, 7 min read, 9 sections

Technical SEO for headless architecture

Implementing technical SEO (metadata, sitemap, structured data, indexing) in a Next.js headless WordPress site.

Next Impact

2025-05-15

SEO challenges of a headless architecture

In a traditional WordPress architecture, plugins like Yoast SEO or Rank Math automatically generate meta tags, the sitemap, and structured data within the HTML rendered by the PHP theme. In headless mode, the WordPress theme is no longer used for rendering: these SEO elements must be implemented on the Next.js frontend.

Yoast and Rank Math remain useful on the back-end

In headless mode, Yoast SEO and Rank Math still serve a purpose: they expose metadata (SEO title, description, OG image) via the REST or GraphQL API. Editors keep filling in these fields in the WordPress editor. The Next.js frontend retrieves them via the API and injects them into the HTML.
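As a rough sketch of what the frontend works with, the types below list only the yoast_head_json fields used in this guide (the real payload carries more). The helper names stripHtml and normalizeSeo are hypothetical and exist only for illustration.

```typescript
// Subset of the Yoast `yoast_head_json` object used in this guide.
// Every field is optional because editors may leave them empty.
type YoastHeadJson = {
  title?: string;
  description?: string;
  og_title?: string;
  og_description?: string;
  og_image?: { url: string }[];
  canonical?: string;
};

// Minimal shape of a REST post requested with
// `_fields=title,excerpt,yoast_head_json`.
type WpPost = {
  title: { rendered: string };
  excerpt: { rendered: string };
  yoast_head_json?: YoastHeadJson;
};

// Hypothetical helper: removes HTML tags and collapses whitespace.
function stripHtml(html: string): string {
  return html.replace(/<[^>]*>/g, '').replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: resolves the values the frontend needs, falling back
// to the post's own title and excerpt when the Yoast fields are empty.
function normalizeSeo(post: WpPost) {
  const seo = post.yoast_head_json ?? {};
  return {
    title: seo.title || post.title.rendered,
    description: seo.description || stripHtml(post.excerpt.rendered),
    ogImage: seo.og_image?.[0]?.url ?? null,
    canonical: seo.canonical ?? null,
  };
}
```

Keeping the fallback logic in one place means every route that renders WordPress content resolves its SEO fields the same way.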

Dynamic metadata with generateMetadata

Next.js (App Router) provides the generateMetadata function to define each page's meta tags from the data fetched from the WordPress API.

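// app/articles/[slug]/page.tsx
import type { Metadata } from 'next';

// From Next.js 15 onward, `params` is a Promise and must be awaited before use.
type Props = { params: { slug: string } };

export async function generateMetadata({ params }: Props): Promise<Metadata> {
  // Query the post by slug, requesting only the fields needed for metadata
  const res = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?slug=${encodeURIComponent(params.slug)}&_fields=title,excerpt,yoast_head_json`
  );
  const [article] = res.ok ? await res.json() : [];

  // Unknown slug or API error: return empty metadata and let the page handle the 404
  if (!article) return {};

  const seo = article.yoast_head_json;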

  return {
    title: seo?.title || article.title.rendered,
    description: seo?.description || article.excerpt.rendered.replace(/<[^>]*>/g, ''),
    openGraph: {
      title: seo?.og_title || article.title.rendered,
      description: seo?.og_description,
      images: seo?.og_image?.[0] ? [{ url: seo.og_image[0].url }] : [],
      type: 'article',
    },
    twitter: {
      card: 'summary_large_image',
      title: seo?.twitter_title || seo?.og_title,
      description: seo?.twitter_description || seo?.og_description,
    },
    alternates: {
      canonical: seo?.canonical || `https://www.your-site.com/articles/${params.slug}`,
    },
  };
}
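Depending on how WordPress is configured, the canonical and Open Graph URLs returned by the API may point at the back-end domain rather than the frontend. The sketch below shows one way to rewrite them. It assumes the path is identical on both sides, and the helper name toFrontendUrl is hypothetical.

```typescript
// Hypothetical helper: keeps the path and query string of a URL returned by
// WordPress, but resolves them against the frontend origin.
function toFrontendUrl(wpUrl: string, frontendOrigin: string): string {
  const source = new URL(wpUrl);
  // Resolving an absolute path against the frontend origin drops the
  // back-end host and port entirely.
  return new URL(source.pathname + source.search, frontendOrigin).toString();
}

// Example: a canonical that points at the back-end domain
const canonical = toFrontendUrl(
  'https://admin.your-site.com/articles/hello-world/',
  'https://www.your-site.com'
);
```

If the frontend routes differ from the WordPress permalinks, build the canonical from the slug instead, as the fallback in generateMetadata does.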

Dynamic XML sitemap

Next.js lets you generate an XML sitemap from the app/sitemap.ts file. This file queries the WordPress API to list every indexable page.

// app/sitemap.ts
import { MetadataRoute } from 'next';

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = 'https://www.your-site.com';

  // Fetch posts from WordPress. per_page is capped at 100 by the REST API,
  // so sites with more than 100 posts must paginate with the page parameter.
  const posts = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?per_page=100&_fields=slug,modified`
  ).then(res => res.json());

  // Fetch pages (same 100-item cap)
  const pages = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/pages?per_page=100&_fields=slug,modified`
  ).then(res => res.json());

  const postEntries = posts.map((post: any) => ({
    url: `${baseUrl}/articles/${post.slug}`,
    lastModified: new Date(post.modified),
    changeFrequency: 'weekly' as const,
    priority: 0.7,
  }));

  const pageEntries = pages.map((page: any) => ({
    url: `${baseUrl}/${page.slug}`,
    lastModified: new Date(page.modified),
    changeFrequency: 'monthly' as const,
    priority: 0.8,
  }));

  return [
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    ...pageEntries,
    ...postEntries,
  ];
}
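The WordPress REST API returns at most 100 items per request and reports the number of pages in the X-WP-TotalPages response header. For larger sites, a pagination helper along these lines could replace the single fetch calls. The names buildPageUrls and fetchAll are hypothetical, and the sketch assumes a runtime with a global fetch.

```typescript
// Builds the request URLs needed to read a whole collection, given the
// page count reported by WordPress in the `X-WP-TotalPages` header.
function buildPageUrls(endpoint: string, totalPages: number, perPage = 100): string[] {
  const urls: string[] = [];
  for (let page = 1; page <= totalPages; page++) {
    const url = new URL(endpoint);
    url.searchParams.set('per_page', String(perPage));
    url.searchParams.set('page', String(page));
    urls.push(url.toString());
  }
  return urls;
}

// Hypothetical helper: fetches every page of a collection and flattens the results.
async function fetchAll<T>(endpoint: string): Promise<T[]> {
  // The first request tells us how many pages exist.
  const [firstUrl] = buildPageUrls(endpoint, 1);
  const firstRes = await fetch(firstUrl);
  if (!firstRes.ok) throw new Error(`WordPress API returned ${firstRes.status}`);

  const totalPages = Number(firstRes.headers.get('X-WP-TotalPages') ?? '1');
  const items = (await firstRes.json()) as T[];

  // Fetch the remaining pages sequentially to stay gentle on the back-end.
  for (const url of buildPageUrls(endpoint, totalPages).slice(1)) {
    const res = await fetch(url);
    if (!res.ok) throw new Error(`WordPress API returned ${res.status}`);
    items.push(...((await res.json()) as T[]));
  }
  return items;
}
```

In app/sitemap.ts, this would be called as fetchAll(`${process.env.WORDPRESS_API_URL}/wp/v2/posts?_fields=slug,modified`) for posts and again for pages.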

robots.txt configuration

// app/robots.ts
import { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/admin/'],
      },
    ],
    sitemap: 'https://www.your-site.com/sitemap.xml',
  };
}

Block indexing of the WordPress back-end

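The WordPress server (e.g., admin.your-site.com) must not be indexed by search engines. Add a robots.txt file on the WordPress domain with Disallow: / to stop compliant crawlers. Note that robots.txt blocks crawling rather than indexing: a blocked URL can still appear in results if other sites link to it. If that happens, serve an X-Robots-Tag: noindex header instead of the Disallow rule, because crawlers must be able to fetch a page to see the noindex directive. Only the Next.js frontend should be indexed.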

JSON-LD structured data

Structured data (schema.org) lets search engines understand the type of content on a page. It is implemented as JSON-LD scripts rendered by Next.js components.

// components/ArticleJsonLd.tsx
type ArticleJsonLdProps = {
  title: string;
  description: string;
  url: string;
  imageUrl: string;
  datePublished: string;
  dateModified: string;
  authorName: string;
};

export function ArticleJsonLd(props: ArticleJsonLdProps) {
  const schema = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: props.title,
    description: props.description,
    url: props.url,
    image: props.imageUrl,
    datePublished: props.datePublished,
    dateModified: props.dateModified,
    author: { '@type': 'Person', name: props.authorName },
    publisher: {
      '@type': 'Organization',
      name: 'Your Site',
      logo: { '@type': 'ImageObject', url: 'https://www.your-site.com/logo.png' },
    },
  };

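  // Escape "<" so that content such as "</script>" cannot close the tag early
  const json = JSON.stringify(schema).replace(/</g, '\\u003c');

  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: json }}
    />
  );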
}
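The same pattern extends to other schema.org types. As an illustration, the sketch below builds a BreadcrumbList object and serializes it for a JSON-LD script tag. The names Crumb, buildBreadcrumbSchema, and serializeJsonLd are hypothetical.

```typescript
// One step of the breadcrumb trail, from the home page down to the current page.
type Crumb = { name: string; url: string };

// Builds a schema.org BreadcrumbList. Positions are 1-based, as the vocabulary expects.
function buildBreadcrumbSchema(crumbs: Crumb[]) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BreadcrumbList',
    itemListElement: crumbs.map((crumb, index) => ({
      '@type': 'ListItem',
      position: index + 1,
      name: crumb.name,
      item: crumb.url,
    })),
  };
}

// Serializes a schema for a <script type="application/ld+json"> tag,
// escaping "<" so editorial content cannot close the script element early.
function serializeJsonLd(schema: object): string {
  return JSON.stringify(schema).replace(/</g, '\\u003c');
}

const breadcrumbJson = serializeJsonLd(
  buildBreadcrumbSchema([
    { name: 'Home', url: 'https://www.your-site.com' },
    { name: 'Articles', url: 'https://www.your-site.com/articles' },
  ])
);
```

The resulting string can be passed to dangerouslySetInnerHTML in the same way as in the ArticleJsonLd component.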

Canonical URLs and redirects

Canonical URLs

Each page must declare its canonical URL to avoid duplicate-content issues. The alternates.canonical property in generateMetadata handles this declaration.

Redirects and 404 handling

Redirects (old URLs to new ones) are configured in next.config.js:

// next.config.js — redirects
module.exports = {
  async redirects() {
    return [
      {
        source: '/old-article/:slug',
        destination: '/articles/:slug',
        permanent: true, // permanent redirect (Next.js sends 308, which search engines treat like 301)
      },
    ];
  },
};
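When a migration involves many one-off URL changes, the redirect list can be generated from a mapping instead of being written by hand. The sketch below is one possible approach. The legacyPaths mapping and the buildRedirects helper are hypothetical, and the paths are made up for illustration.

```typescript
// Same shape as the objects returned by redirects() in next.config.js.
type Redirect = { source: string; destination: string; permanent: boolean };

// Hypothetical mapping of legacy paths to their new location on the frontend.
const legacyPaths: Record<string, string> = {
  '/2023/05/hello-world': '/articles/hello-world',
  '/category/news': '/articles',
};

// Turns the mapping into permanent redirects, skipping entries that would
// redirect a path to itself (which would create a loop).
function buildRedirects(map: Record<string, string>): Redirect[] {
  return Object.entries(map)
    .filter(([source, destination]) => source !== destination)
    .map(([source, destination]) => ({ source, destination, permanent: true }));
}

const redirectList = buildRedirects(legacyPaths);
```

In next.config.js, the redirects() function would then return this list alongside any pattern-based rules.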

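For pages that are not found, Next.js renders the app/not-found.tsx file and returns an HTTP 404 status code. This applies to unmatched URLs and to routes that call notFound(), for example when the WordPress API returns no post for a slug.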

Configure dynamic metadata

Implement generateMetadata in each layout or dynamic page. Fetch the SEO data from the WordPress API (Yoast or Rank Math fields) and map them onto the Next.js Metadata object.

Generate the sitemap and robots.txt

Create the app/sitemap.ts and app/robots.ts files. The sitemap queries the WordPress API to list every indexable URL with its last modification date.

Implement JSON-LD structured data

Add JSON-LD components for each content type (Article, BreadcrumbList, Organization, FAQ). Use the data fetched from WordPress to populate the schema.org fields.

Configure canonical URLs and redirects

Declare canonical URLs via generateMetadata and configure redirects in next.config.js to preserve the SEO of legacy URLs.

Add hreflang tags (if multilingual)

For a multilingual site, add hreflang tags in generateMetadata via the alternates.languages property to tell search engines which language versions of each page are available.

hreflang tags for multilingual

If your site is available in several languages, hreflang tags tell search engines which language version to show each user based on their language and region.

// In generateMetadata (slug comes from params)
return {
  alternates: {
    canonical: `https://www.your-site.com/articles/${slug}`,
    languages: {
      'fr': `https://www.your-site.com/articles/${slug}`,
      'en': `https://www.your-site.com/en/articles/${slug}`,
    },
  },
};
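To avoid repeating URL patterns on every route, the languages object can be produced by a small helper. The sketch below assumes the URL scheme of the example above (no prefix for French, /en for English) and adds an x-default entry that points at the default language. The locales constant and the buildLanguageAlternates name are hypothetical.

```typescript
// Hypothetical locale configuration: each locale maps to its URL prefix.
// The default locale (fr) has no prefix, matching the example above.
const locales = { fr: '', en: '/en' } as const;

type Locale = keyof typeof locales;

// Builds the value for `alternates.languages`: one absolute URL per locale,
// plus `x-default` for users whose language matches none of them.
function buildLanguageAlternates(
  baseUrl: string,
  path: string,
  defaultLocale: Locale = 'fr'
): Record<string, string> {
  const languages: Record<string, string> = {};
  for (const [locale, prefix] of Object.entries(locales)) {
    languages[locale] = `${baseUrl}${prefix}${path}`;
  }
  languages['x-default'] = languages[defaultLocale];
  return languages;
}

const alternates = buildLanguageAlternates('https://www.your-site.com', '/articles/hello-world');
```

Each language version should also declare its own URL as canonical, so the English page would use the /en URL rather than the French one.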

Open Graph and Twitter Cards

Open Graph tags control how your pages appear when shared on social networks (Facebook, LinkedIn). Twitter Cards do the same for X/Twitter. Both are configured in generateMetadata as illustrated in the dynamic metadata section.

Headless technical SEO checklist

Unique title and description tags on every page
Canonical URL declared on every page
Dynamic XML sitemap accessible at /sitemap.xml
robots.txt file configured (frontend indexable, backend blocked)
JSON-LD structured data for each content type
Open Graph and Twitter Card tags filled in
hreflang tags if the site is multilingual
Permanent redirects (301 or 308) for old URLs
Custom 404 page with HTTP 404 status code
SSR or SSG rendering so the HTML is complete at crawl time

Tracking with Google Search Console

After deployment, submit your sitemap in Google Search Console and monitor:

The number of indexed vs discovered pages
Crawl errors (404, 5xx, redirect loops)
Search performance (impressions, clicks, average position)
Core Web Vitals (LCP, INP, CLS) measured on field data
