@danielhancockHX danielhancockHX commented Oct 27, 2025

Add Pagination Support for API Data Fetching

Problem

Hapi has been experiencing OOM (out of memory) crashes several times a day due to SEO data requests from SSG deployments. To protect hapi and provide stable service for customer traffic, a hard limit was imposed on the number of entities returned from all /seo routes (https://github.com/holidayextras/hapi/pull/10190).

SSG builds now fail because hapi returns only a limited subset of the data, which breaks deployments.

Related Links:

Solution

This PR adds automatic pagination support to the main data fetcher in pageData.js. The SSG will now automatically fetch all pages of data from APIs that return paginated responses, ensuring all required data is retrieved despite the new limits.

Key Changes

Core Implementation (src/components/pageData.js)

  1. New Helper Methods

    • getPaginationLimit() - Centralized limit calculation (config → env → default 100)
    • getDataArrayLength() - Safely gets array length accounting for repeater field
    • hasMorePages() - Determines if more data needs to be fetched
    • fetchAllPages() - Recursively fetches all pages of data
    • fetchSinglePage() - Legacy single-request behavior (when pagination disabled)
  2. Modified callAPI() Method

    • Now routes to pagination or single-request strategies
    • Refactored to eliminate code duplication
  3. Enhanced prepareRequest() Method

    • Automatically adds ?offset=0&limit=100 pagination parameters
    • Increments offset on subsequent requests
  4. Updated getDataForPage() Method

    • Returns raw data instead of extracted data for pagination merging
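The helper methods listed above could look roughly like the sketch below. The function names come from this PR, but the bodies and signatures are assumptions made for illustration, not the code in pageData.js. In particular, "config → env → default 100" is read here as a fallback chain in which the per-file config wins over the environment variable.

```javascript
// Hypothetical sketch: names mirror the PR's helper list, bodies are assumed.
const DEFAULT_LIMIT = 100;

// Resolve the page size: per-file config first, then the environment, then 100.
function getPaginationLimit(dataSource = {}, env = process.env) {
  const fromConfig = Number(dataSource.pagination && dataSource.pagination.limit);
  if (Number.isInteger(fromConfig) && fromConfig > 0) return fromConfig;
  const fromEnv = Number(env.SSG_PAGINATION_LIMIT);
  if (Number.isInteger(fromEnv) && fromEnv > 0) return fromEnv;
  return DEFAULT_LIMIT;
}

// Count the items on one page; when a repeater is named, the array lives
// under that key of the response rather than at the top level.
function getDataArrayLength(response, repeater) {
  const items = repeater ? response && response[repeater] : response;
  return Array.isArray(items) ? items.length : 0;
}

// A full page suggests there may be more data; a short page means we are done.
function hasMorePages(response, repeater, limit) {
  return getDataArrayLength(response, repeater) >= limit;
}
```

Passing `env` as a parameter is only a convenience for testing the sketch without touching `process.env`.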

Documentation (README.md)

  • Added comprehensive "Pagination Support" section
  • Configuration options and examples
  • Testing instructions using SSG_PAGINATION_LIMIT environment variable
  • How pagination works and query parameter details

Features

  • Enabled by default - No markdown file changes required
  • Offset-based pagination - Uses ?offset=0&limit=100 matching hapi's standard
  • Recursive fetching - Automatically retrieves all pages until complete
  • Progress logging - Console output shows pagination progress during builds
  • Backward compatible - Can be disabled per-file with pagination.enabled: false
  • Environment testing - SSG_PAGINATION_LIMIT variable for testing with small page sizes
  • No assumptions - Works with any API response structure
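As a rough illustration of "enabled by default, disabled per-file", the routing in callAPI() might reduce to a check like the one below. The `isPaginationEnabled` helper and the `strategies` argument are hypothetical and exist only to keep the sketch self-contained; the real method calls its own fetchAllPages() and fetchSinglePage().

```javascript
// Hypothetical sketch: pagination is on unless the file sets pagination.enabled: false.
function isPaginationEnabled(dataSource = {}) {
  return !(dataSource.pagination && dataSource.pagination.enabled === false);
}

// Assumed shape: `strategies` carries the two fetchers so the routing can be shown in isolation.
async function callAPI(dataSource, strategies) {
  return isPaginationEnabled(dataSource)
    ? strategies.fetchAllPages(dataSource)
    : strategies.fetchSinglePage(dataSource);
}
```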

Usage

Default Behaviour (No Changes Needed)

Existing markdown files work as-is:

dataSource:
  host: 'api.holidayextras.com'
  query: '/seo/pages?site=uk'
  repeater: 'data'
  pageNameField: 'pageName'

The SSG automatically appends the offset and limit query parameters to each request: offset=0&limit=100, then offset=100&limit=100, and so on.
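Because the query in this example already carries ?site=uk, the pagination parameters have to be merged into the existing query string rather than appended with a second "?". A minimal sketch of that step, using Node's built-in URL class; `withPagination` is a hypothetical name and not necessarily how prepareRequest() does it:

```javascript
// Hypothetical helper: add offset/limit to a query that may already have parameters.
function withPagination(query, offset, limit) {
  // URL needs an absolute base to parse a path-only string; the base is discarded below.
  const url = new URL(query, 'https://placeholder.invalid');
  url.searchParams.set('offset', String(offset));
  url.searchParams.set('limit', String(limit));
  return url.pathname + url.search;
}

console.log(withPagination('/seo/pages?site=uk', 0, 100));
// /seo/pages?site=uk&offset=0&limit=100
console.log(withPagination('/seo/pages?site=uk', 100, 100));
// /seo/pages?site=uk&offset=100&limit=100
```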

Custom Limit

dataSource:
  host: 'api.holidayextras.com'
  query: '/seo/pages?site=uk'
  repeater: 'data'
  pageNameField: 'pageName'
  pagination:
    limit: 50

Testing with Small Pages

SSG_PAGINATION_LIMIT=10 npm run build

Console output will show:

Fetching next page for seo.md... (10 items so far)
Fetching next page for seo.md... (20 items so far)
Finished fetching all pages for seo.md. Total items: 247

Technical Details

  • Pagination Type: Offset-based only (simplified from original implementation)
  • Default Limit: 100 items per request
  • Query Parameters: offset and limit (hardcoded to match hapi standard)
  • Stop Condition: Stops fetching when a page returns fewer items than the limit (an empty page if the total is an exact multiple of the limit)
  • Data Merging: Accumulates data from all pages before processing
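The stop condition and data merging described above can be sketched as a small recursive function. This is an illustration under stated assumptions, not the PR's implementation: `fetchPage(offset, limit)` is a hypothetical callback that resolves to one parsed API response, and the repeater array is assumed to sit at the top level of that response.

```javascript
// Hypothetical sketch of recursive offset-based fetching with accumulation.
async function fetchAllPages(fetchPage, { repeater = 'data', limit = 100 } = {}, offset = 0, merged = []) {
  const response = await fetchPage(offset, limit);
  const items = Array.isArray(response && response[repeater]) ? response[repeater] : [];
  const all = merged.concat(items);
  // A short (or empty) page means the API has nothing further to return.
  if (items.length < limit) return { ...response, [repeater]: all };
  // Otherwise request the next window and keep accumulating.
  return fetchAllPages(fetchPage, { repeater, limit }, offset + limit, all);
}

// Usage with an in-memory stand-in for the API (25 rows, 10 per page).
const rows = Array.from({ length: 25 }, (_, i) => ({ pageName: `page-${i}` }));
const fakeApi = async (offset, limit) => ({ data: rows.slice(offset, offset + limit) });
fetchAllPages(fakeApi, { limit: 10 }).then((result) => console.log(result.data.length)); // 25
```

With this stop condition, a data set whose size is an exact multiple of the limit costs one extra request that returns an empty page. A meta.total field, as mentioned under Future Enhancements, would be one way to avoid that.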

Breaking Changes

None - fully backward compatible.

Testing

  • ✅ Code transpiled successfully with Babel
  • ✅ No linter errors
  • ✅ Linked and tested with ssg-hx-uk structure
  • ⚠️ Full integration testing requires Node 20 (ssg-hx-uk compatibility)

Deployment Plan

  1. Merge this PR
  2. Update SSG projects to use new version
  3. Deploy - pagination will automatically handle hapi's limits

Future Enhancements

If other APIs require different pagination styles:

  • Add configurable parameter names (skip/take, start/count, etc.)
  • Add page-based pagination support (?page=1)
  • Add support for meta.total response fields for optimisation
