Conversation
* Initial plan * Implement search improvements for birth places endpoint - Add accent/space/hyphen normalization in all search methods - Add 5-digit code search support - Remove date constraints for country searches to allow historical lookups (e.g., Algeria before 1962) - Add findByCode methods to repositories for code-based searches Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Add comprehensive tests for birth place endpoint improvements Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Refactor search normalization to use trait with improved accent handling Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Fix search normalization to rely on MySQL collation for accents - Simplified normalization to only handle spaces and hyphens - MySQL's utf8mb4_unicode_ci collation handles accent-insensitive matching - Fixed SQL query order to normalize database column consistently - Improved comments to clarify normalization approach Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Refactor normalization: replace hyphens with spaces, simplify SQL - Changed normalization to replace hyphens with spaces instead of removing them - Simplified SQL queries: do normalization only in SQL, not in PHP - Fixed wildcard patterns to use %search% for proper matching - Updated tests to use fixture data (Paris, Saint-Denis, Ars, Inde) - Removed 'FontenaySaint' test case (shouldn't work without spaces) - Fixed historical countries test to use simple date format - Fixed jsonapi test assertion (doesn't use hydra:member) - Improved test assertions to verify actual search behavior Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Add REPLACE custom DQL function for Doctrine - Created ReplaceFunction to support REPLACE() in DQL queries - Registered the function in doctrine.yaml - Fixes "Expected known function, got 'REPLACE'" errors Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Fix test fixtures and format negotiation - Updated testBirthPlaceSearchByCode to use code 75056 (Paris) which exists in LoadInseeCommune fixtures - Updated testGetOneBirthPlaceByCode to use Paris for non-dated test, Ars for dated test - Fixed testBirthPlaceFormatSuffix to use Accept header instead of .jsonapi URL suffix - API Platform 3.x uses header-based format negotiation, not URL suffixes Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * Fix Cim11sTest: use correct filter parameter name - Changed cim10Code to cim10_code in test - The API filter property is defined as 'cim10_code' (underscore) - Test was using camelCase which didn't match, causing filter to not apply - Fixes "Failed asserting that actual size 3 matches expected size 1" error Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> * fixes * fixes * fixes --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com> Co-authored-by: Bastien <bastien@instamed.fr>
There was a problem hiding this comment.
Pull request overview
This pull request enhances the birthplace search functionality by adding support for searching by 5-digit codes, improving search normalization with hyphen-to-space conversion, and enabling searches for historical countries without strict date constraints.
Changes:
- Added code-based search capability using 5-digit pattern matching for both current and historical INSEE data
- Implemented hyphen-to-space normalization in search queries via a custom Doctrine REPLACE function
- Removed date constraints from historical country name searches while keeping them for code-based searches
- Renamed Cim11 API filter property from 'cim10_code' to 'cim10Code' for consistency
- Added production workflow trigger for branches matching 'main-*' pattern
- Enhanced test coverage with new tests for accent normalization, code search, and historical country searches
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/Functional/SpecialtyTest.php | Cleaned up documentation by removing empty comments and test group annotation |
| tests/Functional/BirthPlaceTest.php | Updated existing tests with new fixtures, added comprehensive tests for search normalization, code-based search, and historical countries; improved code style |
| src/StateProvider/BirthPlacesProvider.php | Changed date parameter retrieval to use request query instead of context filters; added code parsing logic to strip file extensions |
| src/Service/BirthPlaceService.php | Added 5-digit code detection to route searches to code-based or name-based repository methods |
| src/Repository/SearchNormalizationTrait.php | Introduced new trait for search term normalization (though the method is currently unused) |
| src/Repository/InseePaysRepository.php | Added hyphen-to-space normalization and new findByCode method using REPLACE function |
| src/Repository/InseePays1943Repository.php | Removed date constraints from name searches; added findByCodeAndDate method with date constraints retained |
| src/Repository/InseeCommuneRepository.php | Added hyphen-to-space normalization and new findByCode method using REPLACE function |
| src/Repository/InseeCommune1943Repository.php | Added hyphen-to-space normalization and new findByCodeAndDate method |
| src/Entity/Cim11.php | Changed API filter property from 'cim10_code' to 'cim10Code' for naming consistency |
| src/Doctrine/Functions/ReplaceFunction.php | Created custom Doctrine DQL function to support SQL REPLACE operations |
| config/packages/doctrine.yaml | Registered the new REPLACE custom function |
| .github/workflows/production-release.yaml | Added 'main-*' branch pattern to production deployment triggers |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * Normalize search term by replacing hyphens with spaces. | ||
| * | ||
| * Note: Accent normalization is handled by MySQL's collation (utf8mb4_unicode_ci) | ||
| * which is accent-insensitive by default, so we don't need to remove accents here. | ||
| */ | ||
| private function normalizeSearchTerm(string $search): string | ||
| { | ||
| // Replace hyphens with spaces for flexible matching | ||
| return str_replace('-', ' ', $search); | ||
| } |
There was a problem hiding this comment.
The normalizeSearchTerm method defined in this trait is never used in any of the repositories that include it (InseeCommuneRepository, InseePaysRepository, InseeCommune1943Repository, InseePays1943Repository). All repositories perform the normalization inline using str_replace directly in their query methods. Consider either removing this unused method or refactoring the repositories to use it consistently.
| /** | |
| * Normalize search term by replacing hyphens with spaces. | |
| * | |
| * Note: Accent normalization is handled by MySQL's collation (utf8mb4_unicode_ci) | |
| * which is accent-insensitive by default, so we don't need to remove accents here. | |
| */ | |
| private function normalizeSearchTerm(string $search): string | |
| { | |
| // Replace hyphens with spaces for flexible matching | |
| return str_replace('-', ' ', $search); | |
| } |
|
|
||
| $dateOfBirth = $request->query->get('dateOfBirth'); | ||
|
|
||
| $code = explode('.', $uriVariables['code'])[0]; |
There was a problem hiding this comment.
The code parsing logic using explode('.', $uriVariables['code'])[0] assumes that codes may contain a file extension suffix (e.g., "75056.json"). However, this behavior is not documented or validated. If the intent is to handle format suffixes in the URI, this should be done at the API Platform routing/format negotiation level, not in the state provider. Additionally, there's no validation to ensure that the split actually produces a valid code - if the code already doesn't contain a dot, this works fine, but the logic could be clearer about its intent.
| $code = explode('.', $uriVariables['code'])[0]; | |
| $code = (string) $uriVariables['code']; | |
| $dotPosition = strpos($code, '.'); | |
| if ($dotPosition !== false) { | |
| $code = substr($code, 0, $dotPosition); | |
| if ($code === '') { | |
| throw new RuntimeException('Invalid "code" format in URI variables'); | |
| } | |
| } |
|
|
||
| #[ORM\Entity(repositoryClass: Cim11Repository::class)] | ||
| #[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10_code'])] | ||
| #[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code'])] |
There was a problem hiding this comment.
The change from 'cim10_code' (with underscore) to 'cim10Code' (camelCase) in the ApiFilter properties is a breaking API change. Any clients currently using the 'cim10_code' query parameter will need to update their code to use 'cim10Code' instead. This should be documented in release notes or, if backward compatibility is required, both parameter names should be supported during a deprecation period.
| #[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code'])] | |
| #[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code', 'cim10_code' => 'cim10Code'])] |
| // Normalize hyphens to spaces at the SQL level for flexible matching | ||
| // MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci) | ||
| return $this->createQueryBuilder('p') | ||
| ->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search') | ||
| ->setParameter('search', '%' . str_replace('-', ' ', $search) . '%') |
There was a problem hiding this comment.
Using the REPLACE function directly in SQL (e.g., 'REPLACE(p.libelleCog, ...)') prevents the database from using indexes on the libelleCog column, which could significantly impact query performance on large datasets. Consider creating a computed/generated column with pre-normalized values or using full-text search capabilities if performance becomes an issue. The same concern applies to all repository methods using REPLACE in WHERE clauses.
| // Normalize hyphens to spaces at the SQL level for flexible matching | |
| // MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci) | |
| return $this->createQueryBuilder('p') | |
| ->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search') | |
| ->setParameter('search', '%' . str_replace('-', ' ', $search) . '%') | |
| // Normalize hyphens to spaces at the PHP level for flexible matching | |
| // MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci) | |
| $normalized = str_replace('-', ' ', $search); | |
| $patternWithSpaces = '%' . $normalized . '%'; | |
| $patternWithHyphens = '%' . str_replace(' ', '-', $normalized) . '%'; | |
| return $this->createQueryBuilder('p') | |
| ->where('p.libelleCog LIKE :patternWithSpaces OR p.libelleCog LIKE :patternWithHyphens') | |
| ->setParameter('patternWithSpaces', $patternWithSpaces) | |
| ->setParameter('patternWithHyphens', $patternWithHyphens) |
| * Test jsonapi format via Accept header. | ||
| * | ||
| * @throws ClientExceptionInterface | ||
| * @throws RedirectionExceptionInterface | ||
| * @throws ServerExceptionInterface | ||
| * @throws TransportExceptionInterface | ||
| */ | ||
| public function testBirthPlaceFormatSuffix(): void | ||
| { | ||
| // Test with Accept header for jsonapi format | ||
| $response = $this->get('birth_places', [ | ||
| 'search' => 'paris', | ||
| 'limit' => 10, | ||
| ], false, ['Accept' => 'application/vnd.api+json']); | ||
|
|
||
| self::assertResponseStatusCodeSame(Response::HTTP_OK); | ||
| // JSONAPI format uses different structure than JSON-LD | ||
| self::assertIsArray($response); |
There was a problem hiding this comment.
The test name 'testBirthPlaceFormatSuffix' is misleading as it tests Accept header format negotiation, not URL format suffixes. The test also only verifies that the response is an array and has a 200 status, without validating any JSON:API specific structure (like 'data', 'included', etc.). Consider renaming the test to 'testBirthPlaceJsonApiFormat' and adding assertions to verify the JSON:API structure is correct.
| * Test jsonapi format via Accept header. | |
| * | |
| * @throws ClientExceptionInterface | |
| * @throws RedirectionExceptionInterface | |
| * @throws ServerExceptionInterface | |
| * @throws TransportExceptionInterface | |
| */ | |
| public function testBirthPlaceFormatSuffix(): void | |
| { | |
| // Test with Accept header for jsonapi format | |
| $response = $this->get('birth_places', [ | |
| 'search' => 'paris', | |
| 'limit' => 10, | |
| ], false, ['Accept' => 'application/vnd.api+json']); | |
| self::assertResponseStatusCodeSame(Response::HTTP_OK); | |
| // JSONAPI format uses different structure than JSON-LD | |
| self::assertIsArray($response); | |
| * Test JSON:API format via Accept header. | |
| * | |
| * @throws ClientExceptionInterface | |
| * @throws RedirectionExceptionInterface | |
| * @throws ServerExceptionInterface | |
| * @throws TransportExceptionInterface | |
| */ | |
| public function testBirthPlaceJsonApiFormat(): void | |
| { | |
| // Test with Accept header for JSON:API format | |
| $response = $this->get('birth_places', [ | |
| 'search' => 'paris', | |
| 'limit' => 10, | |
| ], false, ['Accept' => 'application/vnd.api+json']); | |
| self::assertResponseStatusCodeSame(Response::HTTP_OK); | |
| // Basic JSON:API structure assertions | |
| self::assertIsArray($response); | |
| self::assertArrayHasKey('data', $response, 'JSON:API document must contain a top-level "data" member.'); | |
| self::assertIsArray($response['data'], '"data" member must be an array.'); | |
| if (!empty($response['data'])) { | |
| foreach ($response['data'] as $resource) { | |
| self::assertIsArray($resource, 'Each resource in "data" must be an array.'); | |
| self::assertArrayHasKey('type', $resource, 'JSON:API resource objects must contain a "type" member.'); | |
| self::assertArrayHasKey('id', $resource, 'JSON:API resource objects must contain an "id" member.'); | |
| self::assertArrayHasKey('attributes', $resource, 'JSON:API resource objects must contain an "attributes" member.'); | |
| } | |
| } |
| public function testBirthPlaceFormatSuffix(): void | ||
| { | ||
| // Test with Accept header for jsonapi format | ||
| $response = $this->get('birth_places', [ | ||
| 'search' => 'paris', | ||
| 'limit' => 10, | ||
| ], false, ['Accept' => 'application/vnd.api+json']); | ||
|
|
||
| self::assertResponseStatusCodeSame(Response::HTTP_OK); | ||
| // JSONAPI format uses different structure than JSON-LD | ||
| self::assertIsArray($response); | ||
| } |
There was a problem hiding this comment.
The test is named 'testBirthPlaceFormatSuffix' but it tests Accept header format negotiation (JSON:API via 'application/vnd.api+json'), not URL format suffixes. Meanwhile, the actual format suffix handling logic (explode('.', $uriVariables['code'])[0] in BirthPlacesProvider.php line 91) has no corresponding test. If format suffixes in URLs are supported (e.g., '/birth_places/75056.json'), add a test that verifies this works correctly. If they're not supported, either remove the explode logic or clarify why it's needed.
| $code = explode('.', $uriVariables['code'])[0]; | ||
|
|
There was a problem hiding this comment.
After extracting the code using explode, there's no validation that the resulting code is valid (e.g., exactly 5 digits for communes/countries). An input like '123.json' would result in code='123', which is invalid but would still be passed to the service. Consider adding validation after the explode operation to ensure the code matches expected patterns before querying the database.
| $code = explode('.', $uriVariables['code'])[0]; | |
| $code = explode('.', (string) $uriVariables['code'])[0]; | |
| if (!preg_match('/^\d{5}$/', $code)) { | |
| throw new RuntimeException('Invalid "code" format'); | |
| } |
| public function searchByNameAndDate(string $search, DateTime $date): array | ||
| { | ||
| // Normalize hyphens to spaces at the SQL level for flexible matching | ||
| // MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci) | ||
| return $this->createQueryBuilder('p') | ||
| ->where('p.libelleCog LIKE :search') | ||
| ->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search') | ||
| // Date constraints removed to allow historical country searches (e.g., Algeria before 1962) | ||
| ->setParameter('search', '%' . str_replace('-', ' ', $search) . '%') | ||
| ->getQuery() | ||
| ->getResult(); |
There was a problem hiding this comment.
The comment and docblock state that date constraints were removed to allow searching for historical countries (e.g., Algeria before 1962), but this behavior change is inconsistent with findByCodeAndDate which still enforces date constraints (lines 50-51). This creates an inconsistency where searching by name ignores dates while searching by code respects them. Either both methods should handle dates consistently, or the difference in behavior should be clearly documented and justified.
| push: | ||
| branches: | ||
| - main | ||
| - main-* |
There was a problem hiding this comment.
Adding 'main-' to the production release workflow triggers means that any branch starting with 'main-' will trigger production deployments. This could be risky if developers create branches like 'main-feature' or 'main-hotfix' for development purposes. Consider whether this pattern is intentional and documented. If the intent is to support multiple main branches for different environments, consider using more specific patterns like 'main-v' or documenting the naming convention clearly.
No description provided.