Some entities imported via search:import are not indexed (missing records) #372

@quentint

  • Symfony version: v6.2.7
  • Algolia Search Bundle version: 6.0.0
  • Algolia Client Version: N/A
  • Language Version: PHP 8.1.14 (cli)

Description

When importing entities with search:import, the logs report the expected indexed counts, but when browsing the index, some records are missing.

Here is the command output:

> bin/console search:import
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index
Indexed 160 / 160 App\Entity\MediaTranslation entities into quentin_media index
Done!

I'd then expect my index to contain 14 * 500 + 160 = 7160 items, but only 5216 exist:

[image: screenshot of the index record count]

Clearing the index and importing again yields a different record count (within about ±5%).
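The expected total can be cross-checked directly from the command output. This is a minimal sketch, assuming the `Indexed N / M ...` line format shown above; the `expected_total` helper is hypothetical and not part of the bundle:

```python
import re

# Matches lines such as:
# "Indexed 500 / 500 App\Entity\MediaTranslation entities into quentin_media index"
LINE_RE = re.compile(r"^Indexed (\d+) / (\d+) ")


def expected_total(output: str) -> int:
    """Sum the 'indexed' count reported by each batch line (hypothetical helper)."""
    total = 0
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            total += int(match.group(1))
    return total


# Rebuild the output above: 14 full batches followed by one batch of 160.
batch = "Indexed {n} / {n} App\\Entity\\MediaTranslation entities into quentin_media index"
output = "\n".join([batch.format(n=500)] * 14 + [batch.format(n=160), "Done!"])

total = expected_total(output)
print(total)         # 7160
print(total - 5216)  # 1944 records unaccounted for in this run
```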

Here's my configuration:

algolia_search:
    prefix: '%algolia_search_prefix%'
    indices:
        - name: media
          class: App\Entity\MediaTranslation

And here's the index settings file (created with `search:settings:backup`):

{
    "minWordSizefor1Typo": 4,
    "minWordSizefor2Typos": 8,
    "hitsPerPage": 20,
    "maxValuesPerFacet": 100,
    "version": 2,
    "searchableAttributes": [
        "unordered(media.id)",
        "unordered(title)",
        "unordered(tags)",
        "unordered(description)",
        "unordered(features)",
        "unordered(goals)",
        "unordered(more)"
    ],
    "numericAttributesToIndex": null,
    "attributesToRetrieve": null,
    "unretrievableAttributes": null,
    "optionalWords": null,
    "attributesForFaceting": [
        "locale",
        "media.type",
        "status",
        "filterOnly(tags)",
        "filterOnly(title)"
    ],
    "attributesToSnippet": null,
    "attributesToHighlight": null,
    "paginationLimitedTo": 1000,
    "attributeForDistinct": null,
    "exactOnSingleWordQuery": "attribute",
    "ranking": [
        "typo",
        "geo",
        "words",
        "filters",
        "proximity",
        "attribute",
        "exact",
        "custom"
    ],
    "customRanking": null,
    "separatorsToIndex": "",
    "removeWordsIfNoResults": "none",
    "queryType": "prefixLast",
    "highlightPreTag": "<em>",
    "highlightPostTag": "<\/em>",
    "snippetEllipsisText": "",
    "alternativesAsExact": [
        "ignorePlurals",
        "singleWordSynonym"
    ],
    "sortFacetValuesBy": "count",
    "renderingContent": {
        "facetOrdering": {
            "facets": {
                "order": [
                    "locale",
                    "media.type",
                    "status"
                ]
            },
            "values": {
                "locale": {
                    "sortRemainingBy": "alpha"
                },
                "media.type": {
                    "sortRemainingBy": "alpha"
                },
                "status": {
                    "sortRemainingBy": "alpha"
                }
            }
        }
    }
}

I tried changing the batchSize, but the issue remained.
I used to have an index_if in there, but removing it did not fix the issue either.

When running the search:import command and regularly refreshing the index on the Algolia dashboard, the "No. records" evolves like so (that's only an example, values change if I re-run this on a clear index):

  • 500
  • 1,000
  • 1,500
  • 2,000
  • 2,253
  • 2,525
  • 3,025
  • (...)

As you can see, things look OK at first, but then get erratic around the 2,000/2,500 mark.
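The size of each jump makes the shortfall easier to see. The sketch below uses only the counts listed above and assumes each dashboard observation follows exactly one batch of 500, which may not hold if refreshes do not line up with batch boundaries:

```python
# Record counts observed on the dashboard during one run (from the list above).
observed = [500, 1000, 1500, 2000, 2253, 2525, 3025]
BATCH_SIZE = 500  # assumed: the batch size reported by search:import


def batch_deltas(counts):
    """Return the increase in record count between consecutive observations."""
    previous = 0
    deltas = []
    for count in counts:
        deltas.append(count - previous)
        previous = count
    return deltas


deltas = batch_deltas(observed)
print(deltas)  # [500, 500, 500, 500, 253, 272, 500]

# Shortfall per observation, under the one-batch-per-observation assumption.
shortfalls = [BATCH_SIZE - d for d in deltas]
print(shortfalls)  # [0, 0, 0, 0, 247, 228, 0]
```

Under that assumption, the fifth and sixth observations are short by 247 and 228 records, while the batches around them land in full.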

Steps To Reproduce

Unfortunately this is hard to reproduce, because I can't pinpoint the origin of the issue (and the randomness makes it even stranger) 🙁
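One way to narrow this down might be to compare the object IDs sent for indexing with the IDs actually present in the index. The sketch below assumes both lists can be exported somehow; the `diagnose` helper and the sample data are hypothetical. Duplicates are worth ruling out because records sharing an objectID replace one another in an Algolia index:

```python
from collections import Counter


def diagnose(sent_ids, indexed_ids):
    """Compare the object IDs sent for indexing with those found in the index."""
    counts = Counter(sent_ids)
    # IDs sent more than once would overwrite each other in the index.
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    # IDs that were sent but never showed up in the index.
    missing = sorted(set(sent_ids) - set(indexed_ids))
    return {"duplicates": duplicates, "missing": missing}


# Hypothetical data, for illustration only.
sent = ["1", "2", "2", "3", "4"]
indexed = ["1", "2", "4"]

report = diagnose(sent, indexed)
print(report)  # {'duplicates': ['2'], 'missing': ['3']}
```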

I tried looking at the Symfony logs to see if some error appeared there, but found nothing.

What could prevent records from appearing in my index?
