SEO challenges of a headless architecture

In a traditional WordPress architecture, plugins like Yoast SEO or Rank Math automatically generate meta tags, the sitemap, and structured data within the HTML rendered by the PHP theme. In headless mode, the WordPress theme is no longer used for rendering: these SEO elements must be implemented on the Next.js frontend.

Yoast and Rank Math remain useful on the back-end

In headless mode, Yoast SEO and Rank Math still serve a purpose: they expose metadata (SEO title, description, OG image) via the REST or GraphQL API. Editors keep filling in these fields in the WordPress editor. The Next.js frontend retrieves them via the API and injects them into the HTML.
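
To make the shape of that data more concrete, here is a minimal sketch of a partial type for the yoast_head_json object that Yoast adds to REST responses, together with a small normalizer. Only a handful of fields are covered, the WpPost, SeoFields and extractSeo names are invented for this example, and Rank Math is not covered, so treat it as a starting point rather than a complete schema.

```typescript
// Partial shape of Yoast's `yoast_head_json` (only the fields used here).
export type YoastHeadJson = {
  title?: string;
  description?: string;
  canonical?: string;
  og_title?: string;
  og_description?: string;
  og_image?: { url: string; width?: number; height?: number }[];
};

// Minimal post shape from /wp/v2/posts (hypothetical name, partial fields).
export type WpPost = {
  title: { rendered: string };
  excerpt: { rendered: string };
  yoast_head_json?: YoastHeadJson;
};

// Normalized fields the frontend needs (hypothetical helper type).
export type SeoFields = { title: string; description?: string; imageUrl?: string };

// Prefer the values editors entered in Yoast; fall back to the post's own title.
export function extractSeo(post: WpPost): SeoFields {
  const seo = post.yoast_head_json;
  return {
    title: seo?.title || post.title.rendered,
    description: seo?.description || seo?.og_description,
    imageUrl: seo?.og_image?.[0]?.url,
  };
}
```

Typing the payload this way lets the compiler flag a misspelled field name before it silently produces an empty meta tag.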

Dynamic metadata with generateMetadata

Next.js (App Router) provides the generateMetadata function to define each page's meta tags from the data fetched from the WordPress API.

// app/articles/[slug]/page.tsx
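import type { Metadata } from 'next';

// In Next.js 15 and later, params is a Promise and must be awaited first
type Props = { params: { slug: string } };

export async function generateMetadata({ params }: Props): Promise<Metadata> {
  // Query the post by slug, requesting only the fields needed for SEO
  const res = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?slug=${encodeURIComponent(params.slug)}&_fields=title,excerpt,yoast_head_json`
  );
  const [article] = res.ok ? await res.json() : [];

  // Unknown slug or API error: return empty metadata rather than throw
  if (!article) return {};

  const seo = article.yoast_head_json;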

  return {
    title: seo?.title || article.title.rendered,
    description: seo?.description || article.excerpt.rendered.replace(/<[^>]*>/g, ''),
    openGraph: {
      title: seo?.og_title || article.title.rendered,
      description: seo?.og_description,
      images: seo?.og_image?.[0]?.url ? [{ url: seo.og_image[0].url }] : [],
      type: 'article',
    },
    twitter: {
      card: 'summary_large_image',
      title: seo?.twitter_title || seo?.og_title,
      description: seo?.twitter_description || seo?.og_description,
    },
    alternates: {
      canonical: seo?.canonical || `https://www.your-site.com/articles/${params.slug}`,
    },
  };
}
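
The excerpt fallback above removes tags but leaves HTML entities and stray whitespace in the description. A minimal sketch of a more thorough cleanup follows. The toMetaDescription name, the short entity table and the 160-character default are assumptions made for this example (160 is a common rule of thumb, not a limit taken from any specification).

```typescript
// Named entities WordPress commonly emits in rendered excerpts (not exhaustive).
const NAMED_ENTITIES: Record<string, string> = {
  amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: ' ', hellip: '…',
};

// Turn `excerpt.rendered` HTML into plain text suitable for a meta description.
export function toMetaDescription(html: string, maxLength = 160): string {
  const text = html
    .replace(/<[^>]*>/g, ' ') // drop tags first, so decoded "<" characters survive
    .replace(/&#(\d+);/g, (_, code: string) => String.fromCodePoint(Number(code)))
    .replace(/&([a-z]+);/gi, (match, name: string) => NAMED_ENTITIES[name.toLowerCase()] ?? match)
    .replace(/\s+/g, ' ') // collapse runs of whitespace and newlines
    .trim();

  if (text.length <= maxLength) return text;

  // Cut at the last word boundary that fits, then add an ellipsis.
  const cut = text.slice(0, maxLength - 1);
  const lastSpace = cut.lastIndexOf(' ');
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '…';
}
```

In generateMetadata, the description fallback could then read toMetaDescription(article.excerpt.rendered) instead of the inline regular expression.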

Dynamic XML sitemap

Next.js lets you generate an XML sitemap from the app/sitemap.ts file. This file queries the WordPress API to list every indexable page.

// app/sitemap.ts
import { MetadataRoute } from 'next';

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = 'https://www.your-site.com';

  // Fetch posts from WordPress (per_page is capped at 100, so larger sites need pagination)
  const posts = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/posts?per_page=100&_fields=slug,modified`
  ).then(res => res.json());

  // Fetch pages (same 100-item cap per request)
  const pages = await fetch(
    `${process.env.WORDPRESS_API_URL}/wp/v2/pages?per_page=100&_fields=slug,modified`
  ).then(res => res.json());

  const postEntries = posts.map((post: any) => ({
    url: `${baseUrl}/articles/${post.slug}`,
    lastModified: new Date(post.modified),
    changeFrequency: 'weekly' as const,
    priority: 0.7,
  }));

  const pageEntries = pages.map((page: any) => ({
    url: `${baseUrl}/${page.slug}`,
    lastModified: new Date(page.modified),
    changeFrequency: 'monthly' as const,
    priority: 0.8,
  }));

  return [
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    ...pageEntries,
    ...postEntries,
  ];
}
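
The WordPress REST API returns at most 100 items per request, so a site with more content than that needs to walk through the result pages. The following sketch shows one way to do it, relying on the X-WP-TotalPages response header that WordPress sends with collection responses. The fetchAllFromWordPress helper and the FetchLike type are hypothetical names introduced for this example.

```typescript
// Minimal subset of the Fetch API this helper relies on, so it can be stubbed in tests.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  json(): Promise<unknown>;
}>;

// Fetch every item from a WordPress REST collection, 100 per request.
export async function fetchAllFromWordPress<T>(
  endpoint: string, // e.g. `${WORDPRESS_API_URL}/wp/v2/posts?_fields=slug,modified`
  fetchImpl: FetchLike = fetch,
): Promise<T[]> {
  const items: T[] = [];
  let page = 1;
  let totalPages = 1;

  do {
    const separator = endpoint.includes('?') ? '&' : '?';
    const res = await fetchImpl(`${endpoint}${separator}per_page=100&page=${page}`);
    if (!res.ok) throw new Error(`WordPress API responded with ${res.status}`);

    // WordPress reports the number of result pages in this response header.
    totalPages = Number(res.headers.get('X-WP-TotalPages')) || 1;
    items.push(...((await res.json()) as T[]));
    page += 1;
  } while (page <= totalPages);

  return items;
}
```

In app/sitemap.ts, each of the two fetch calls could then be replaced by a call to this helper, passing the endpoint without the per_page parameter.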

robots.txt configuration

// app/robots.ts
import { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/admin/'],
      },
    ],
    sitemap: 'https://www.your-site.com/sitemap.xml',
  };
}

Block indexing of the WordPress back-end

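The WordPress server (e.g., admin.your-site.com) must not appear in search results; only the Next.js frontend should be indexed. A robots.txt file on the WordPress domain with Disallow: / stops compliant crawlers from fetching its pages, but a blocked URL can still end up indexed if other sites link to it. To keep the back-end out of the index entirely, serve a noindex directive (for example an X-Robots-Tag: noindex response header) or put the host behind authentication. Crawlers only see a noindex directive on URLs they are allowed to fetch, so it cannot be combined with a blanket Disallow.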

JSON-LD structured data

Structured data (schema.org) lets search engines understand the type of content on a page. It is implemented as JSON-LD scripts rendered by Next.js components.

// components/ArticleJsonLd.tsx
type ArticleJsonLdProps = {
  title: string;
  description: string;
  url: string;
  imageUrl: string;
  datePublished: string;
  dateModified: string;
  authorName: string;
};

export function ArticleJsonLd(props: ArticleJsonLdProps) {
  const schema = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: props.title,
    description: props.description,
    url: props.url,
    image: props.imageUrl,
    datePublished: props.datePublished,
    dateModified: props.dateModified,
    author: { '@type': 'Person', name: props.authorName },
    publisher: {
      '@type': 'Organization',
      name: 'Your Site',
      logo: { '@type': 'ImageObject', url: 'https://www.your-site.com/logo.png' },
    },
  };

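  // Escape "<" so a "</script>" inside WordPress content cannot close the tag early
  const json = JSON.stringify(schema).replace(/</g, '\\u003c');

  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: json }}
    />
  );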
}
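
Other schema types follow the same pattern. As an illustration, here is a sketch of a builder for a BreadcrumbList, kept as a plain function so it can be tested without rendering anything. The Crumb type and the buildBreadcrumbJsonLd name are hypothetical.

```typescript
type Crumb = { name: string; url: string };

// Build a schema.org BreadcrumbList from an ordered list of crumbs (home page first).
export function buildBreadcrumbJsonLd(crumbs: Crumb[]) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BreadcrumbList',
    itemListElement: crumbs.map((crumb, index) => ({
      '@type': 'ListItem',
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  };
}
```

The returned object can be serialized into a script tag of type application/ld+json in the same way as the Article schema.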

Canonical URLs and redirects

Canonical URLs

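Each page must declare its canonical URL to avoid duplicate-content issues. The alternates.canonical property in generateMetadata handles this declaration. When the value comes from Yoast, check that it points to the frontend domain: if WordPress's site address is the back-end host, the canonical it generates will use that host too.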
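
One way to keep canonicals consistent is to rebuild them on the frontend origin whatever their source. The sketch below assumes that both hosts share the same path structure, that canonicals carry no query string or fragment, and that trailing slashes are dropped everywhere except on the home page. The canonicalUrl name is invented for this example.

```typescript
// Build the canonical URL for a path (or absolute URL) on the public frontend.
export function canonicalUrl(pathOrUrl: string, frontendOrigin: string): string {
  // Resolving against the frontend origin accepts both "/articles/x" and absolute URLs.
  const parsed = new URL(pathOrUrl, frontendOrigin);
  const origin = new URL(frontendOrigin).origin;

  // Keep only the path: this discards a back-end host, tracking parameters and fragments.
  const path = parsed.pathname.replace(/\/+$/, '') || '/';
  return path === '/' ? `${origin}/` : `${origin}${path}`;
}
```

For example, canonicalUrl(seo.canonical, 'https://www.your-site.com') would map a canonical generated on admin.your-site.com back onto the public domain.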

Redirects and 404 handling

Redirects (old URLs to new ones) are configured in next.config.js:

// next.config.js — redirects
module.exports = {
  async redirects() {
    return [
      {
        source: '/old-article/:slug',
        destination: '/articles/:slug',
        permanent: true, // 301
      },
    ];
  },
};

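For missing pages, Next.js renders the app/not-found.tsx file, both for URLs that match no route and when a page calls notFound() (for example when WordPress returns no post for the requested slug). The response carries an HTTP 404 status code, except for streamed responses, where the status may already have been sent as 200.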
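
When a migration involves many legacy URLs, the redirect list can be generated from a simple old-path to new-path map. The sketch below handles literal paths only (no :slug patterns), collapses chains so that each old URL redirects in a single hop, and rejects loops. The buildRedirects name and the Redirect type are hypothetical; the returned objects use the same source, destination and permanent keys as the configuration above.

```typescript
type Redirect = { source: string; destination: string; permanent: boolean };

// Turn an old-path → new-path map into redirect entries, collapsing chains
// (A → B → C becomes A → C) and throwing on loops.
export function buildRedirects(map: Record<string, string>): Redirect[] {
  const isRedirected = (path: string) => Object.prototype.hasOwnProperty.call(map, path);

  return Object.keys(map).map((source) => {
    const seen = new Set<string>([source]);
    let destination = map[source];

    // Follow the chain until the destination is no longer itself redirected.
    while (isRedirected(destination)) {
      if (seen.has(destination)) {
        throw new Error(`Redirect loop detected starting at ${source}`);
      }
      seen.add(destination);
      destination = map[destination];
    }
    return { source, destination, permanent: true };
  });
}
```

The result can be returned directly from the redirects() function in next.config.js.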

Configure dynamic metadata

Implement generateMetadata in each layout or dynamic page. Fetch the SEO data from the WordPress API (Yoast or Rank Math fields) and map them onto the Next.js Metadata object.

Generate the sitemap and robots.txt

Create the app/sitemap.ts and app/robots.ts files. The sitemap queries the WordPress API to list every indexable URL with its last modification date.

Implement JSON-LD structured data

Add JSON-LD components for each content type (Article, BreadcrumbList, Organization, FAQ). Use the data fetched from WordPress to populate the schema.org fields.

Configure canonical URLs and redirects

Declare canonical URLs via generateMetadata and configure redirects in next.config.js to preserve the SEO of legacy URLs.

Add hreflang tags (if multilingual)

For a multilingual site, add hreflang tags in generateMetadata via the alternates.languages property to tell search engines which language versions of each page are available.

hreflang tags for multilingual

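If your site is available in several languages, hreflang tags tell search engines which language version to serve based on the user's language and region. Each language version should list every alternate, itself included, so that the annotations are reciprocal.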

// In generateMetadata
return {
  alternates: {
    canonical: `https://www.your-site.com/articles/${slug}`,
    languages: {
      'fr': `https://www.your-site.com/articles/${slug}`,
      'en': `https://www.your-site.com/en/articles/${slug}`,
    },
  },
};
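
With more than two languages, writing this map by hand becomes error-prone. Here is a sketch of a helper that builds it from a list of locales, assuming (as in the snippet above) that the default locale has no URL prefix and every other locale lives under /<locale>/. It also adds an x-default entry, which is an addition to the original example; the buildLanguageAlternates name is invented.

```typescript
// Build the language → URL map for hreflang from a list of locales.
export function buildLanguageAlternates(
  path: string, // e.g. "/articles/my-post"
  locales: string[], // e.g. ["fr", "en"]
  defaultLocale: string, // locale served without a URL prefix
  baseUrl: string, // e.g. "https://www.your-site.com"
): Record<string, string> {
  const languages: Record<string, string> = {};
  for (const locale of locales) {
    const prefix = locale === defaultLocale ? '' : `/${locale}`;
    languages[locale] = `${baseUrl}${prefix}${path}`;
  }
  // x-default tells search engines which version to use when no locale matches.
  languages['x-default'] = `${baseUrl}${path}`;
  return languages;
}
```

The resulting map has the same key/value shape as the languages object shown above.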

Open Graph and Twitter Cards

Open Graph tags control how your pages appear when shared on social networks (Facebook, LinkedIn). Twitter Cards do the same for X/Twitter. Both are configured in generateMetadata as illustrated in the dynamic metadata section.

Headless technical SEO checklist

  • Unique title and description tags on every page
  • Canonical URL declared on every page
  • Dynamic XML sitemap accessible at /sitemap.xml
  • robots.txt file configured (frontend indexable, backend blocked)
  • JSON-LD structured data for each content type
  • Open Graph and Twitter Card tags filled in
  • hreflang tags if the site is multilingual
  • 301 redirects for old URLs
  • Custom 404 page with HTTP 404 status code
  • SSR or SSG rendering so the HTML is complete at crawl time

Tracking with Google Search Console

After deployment, submit your sitemap in Google Search Console and monitor:

  • The number of indexed vs discovered pages
  • Crawl errors (404, 5xx, redirect loops)
  • Search performance (impressions, clicks, average position)
  • Core Web Vitals (LCP, INP, CLS) measured on field data