Skip to content

Main birthplace#39

Merged
Xusifob merged 2 commits intomainfrom
main-birthplace
Feb 4, 2026
Merged

Main birthplace#39
Xusifob merged 2 commits intomainfrom
main-birthplace

Conversation

@Xusifob
Copy link
Member

@Xusifob Xusifob commented Feb 4, 2026

No description provided.

Copilot AI and others added 2 commits January 29, 2026 08:19
* Initial plan

* Implement search improvements for birth places endpoint

- Add accent/space/hyphen normalization in all search methods
- Add 5-digit code search support
- Remove date constraints for country searches to allow historical lookups (e.g., Algeria before 1962)
- Add findByCode methods to repositories for code-based searches

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Add comprehensive tests for birth place endpoint improvements

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Refactor search normalization to use trait with improved accent handling

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Fix search normalization to rely on MySQL collation for accents

- Simplified normalization to only handle spaces and hyphens
- MySQL's utf8mb4_unicode_ci collation handles accent-insensitive matching
- Fixed SQL query order to normalize database column consistently
- Improved comments to clarify normalization approach

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Refactor normalization: replace hyphens with spaces, simplify SQL

- Changed normalization to replace hyphens with spaces instead of removing them
- Simplified SQL queries: do normalization only in SQL, not in PHP
- Fixed wildcard patterns to use %search% for proper matching
- Updated tests to use fixture data (Paris, Saint-Denis, Ars, Inde)
- Removed 'FontenaySaint' test case (shouldn't work without spaces)
- Fixed historical countries test to use simple date format
- Fixed jsonapi test assertion (doesn't use hydra:member)
- Improved test assertions to verify actual search behavior

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Add REPLACE custom DQL function for Doctrine

- Created ReplaceFunction to support REPLACE() in DQL queries
- Registered the function in doctrine.yaml
- Fixes "Expected known function, got 'REPLACE'" errors

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Fix test fixtures and format negotiation

- Updated testBirthPlaceSearchByCode to use code 75056 (Paris) which exists in LoadInseeCommune fixtures
- Updated testGetOneBirthPlaceByCode to use Paris for non-dated test, Ars for dated test
- Fixed testBirthPlaceFormatSuffix to use Accept header instead of .jsonapi URL suffix
- API Platform 3.x uses header-based format negotiation, not URL suffixes

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* Fix Cim11sTest: use correct filter parameter name

- Changed cim10Code to cim10_code in test
- The API filter property is defined as 'cim10_code' (underscore)
- Test was using camelCase which didn't match, causing filter to not apply
- Fixes "Failed asserting that actual size 3 matches expected size 1" error

Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>

* fixes

* fixes

* fixes

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Xusifob <8103268+Xusifob@users.noreply.github.com>
Co-authored-by: Bastien <bastien@instamed.fr>
Copilot AI review requested due to automatic review settings February 4, 2026 07:40
@Xusifob Xusifob merged commit 69bee75 into main Feb 4, 2026
10 checks passed
@Xusifob Xusifob deleted the main-birthplace branch February 4, 2026 07:41
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the birthplace search functionality by adding support for searching by 5-digit codes, improving search normalization with hyphen-to-space conversion, and enabling searches for historical countries without strict date constraints.

Changes:

  • Added code-based search capability using 5-digit pattern matching for both current and historical INSEE data
  • Implemented hyphen-to-space normalization in search queries via a custom Doctrine REPLACE function
  • Removed date constraints from historical country name searches while keeping them for code-based searches
  • Renamed Cim11 API filter property from 'cim10_code' to 'cim10Code' for consistency
  • Added production workflow trigger for branches matching 'main-*' pattern
  • Enhanced test coverage with new tests for accent normalization, code search, and historical country searches

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/Functional/SpecialtyTest.php Cleaned up documentation by removing empty comments and test group annotation
tests/Functional/BirthPlaceTest.php Updated existing tests with new fixtures, added comprehensive tests for search normalization, code-based search, and historical countries; improved code style
src/StateProvider/BirthPlacesProvider.php Changed date parameter retrieval to use request query instead of context filters; added code parsing logic to strip file extensions
src/Service/BirthPlaceService.php Added 5-digit code detection to route searches to code-based or name-based repository methods
src/Repository/SearchNormalizationTrait.php Introduced new trait for search term normalization (though the method is currently unused)
src/Repository/InseePaysRepository.php Added hyphen-to-space normalization and new findByCode method using REPLACE function
src/Repository/InseePays1943Repository.php Removed date constraints from name searches; added findByCodeAndDate method with date constraints retained
src/Repository/InseeCommuneRepository.php Added hyphen-to-space normalization and new findByCode method using REPLACE function
src/Repository/InseeCommune1943Repository.php Added hyphen-to-space normalization and new findByCodeAndDate method
src/Entity/Cim11.php Changed API filter property from 'cim10_code' to 'cim10Code' for naming consistency
src/Doctrine/Functions/ReplaceFunction.php Created custom Doctrine DQL function to support SQL REPLACE operations
config/packages/doctrine.yaml Registered the new REPLACE custom function
.github/workflows/production-release.yaml Added 'main-*' branch pattern to production deployment triggers

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +7 to +17
/**
* Normalize search term by replacing hyphens with spaces.
*
* Note: Accent normalization is handled by MySQL's collation (utf8mb4_unicode_ci)
* which is accent-insensitive by default, so we don't need to remove accents here.
*/
private function normalizeSearchTerm(string $search): string
{
// Replace hyphens with spaces for flexible matching
return str_replace('-', ' ', $search);
}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normalizeSearchTerm method defined in this trait is never used in any of the repositories that include it (InseeCommuneRepository, InseePaysRepository, InseeCommune1943Repository, InseePays1943Repository). All repositories perform the normalization inline using str_replace directly in their query methods. Consider either removing this unused method or refactoring the repositories to use it consistently.

Suggested change
/**
* Normalize search term by replacing hyphens with spaces.
*
* Note: Accent normalization is handled by MySQL's collation (utf8mb4_unicode_ci)
* which is accent-insensitive by default, so we don't need to remove accents here.
*/
private function normalizeSearchTerm(string $search): string
{
// Replace hyphens with spaces for flexible matching
return str_replace('-', ' ', $search);
}

Copilot uses AI. Check for mistakes.

$dateOfBirth = $request->query->get('dateOfBirth');

$code = explode('.', $uriVariables['code'])[0];
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code parsing logic using explode('.', $uriVariables['code'])[0] assumes that codes may contain a file extension suffix (e.g., "75056.json"). However, this behavior is not documented or validated. If the intent is to handle format suffixes in the URI, this should be done at the API Platform routing/format negotiation level, not in the state provider. Additionally, there's no validation to ensure that the split actually produces a valid code - if the code already doesn't contain a dot, this works fine, but the logic could be clearer about its intent.

Suggested change
$code = explode('.', $uriVariables['code'])[0];
$code = (string) $uriVariables['code'];
$dotPosition = strpos($code, '.');
if ($dotPosition !== false) {
$code = substr($code, 0, $dotPosition);
if ($code === '') {
throw new RuntimeException('Invalid "code" format in URI variables');
}
}

Copilot uses AI. Check for mistakes.

#[ORM\Entity(repositoryClass: Cim11Repository::class)]
#[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10_code'])]
#[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code'])]
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from 'cim10_code' (with underscore) to 'cim10Code' (camelCase) in the ApiFilter properties is a breaking API change. Any clients currently using the 'cim10_code' query parameter will need to update their code to use 'cim10Code' instead. This should be documented in release notes or, if backward compatibility is required, both parameter names should be supported during a deprecation period.

Suggested change
#[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code'])]
#[ApiFilter(Cim11Filter::class, properties: ['search', 'ids', 'cim10Code', 'cim10_code' => 'cim10Code'])]

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +32
// Normalize hyphens to spaces at the SQL level for flexible matching
// MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci)
return $this->createQueryBuilder('p')
->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search')
->setParameter('search', '%' . str_replace('-', ' ', $search) . '%')
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the REPLACE function directly in SQL (e.g., 'REPLACE(p.libelleCog, ...)') prevents the database from using indexes on the libelleCog column, which could significantly impact query performance on large datasets. Consider creating a computed/generated column with pre-normalized values or using full-text search capabilities if performance becomes an issue. The same concern applies to all repository methods using REPLACE in WHERE clauses.

Suggested change
// Normalize hyphens to spaces at the SQL level for flexible matching
// MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci)
return $this->createQueryBuilder('p')
->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search')
->setParameter('search', '%' . str_replace('-', ' ', $search) . '%')
// Normalize hyphens to spaces at the PHP level for flexible matching
// MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci)
$normalized = str_replace('-', ' ', $search);
$patternWithSpaces = '%' . $normalized . '%';
$patternWithHyphens = '%' . str_replace(' ', '-', $normalized) . '%';
return $this->createQueryBuilder('p')
->where('p.libelleCog LIKE :patternWithSpaces OR p.libelleCog LIKE :patternWithHyphens')
->setParameter('patternWithSpaces', $patternWithSpaces)
->setParameter('patternWithHyphens', $patternWithHyphens)

Copilot uses AI. Check for mistakes.
Comment on lines +381 to +398
* Test jsonapi format via Accept header.
*
* @throws ClientExceptionInterface
* @throws RedirectionExceptionInterface
* @throws ServerExceptionInterface
* @throws TransportExceptionInterface
*/
public function testBirthPlaceFormatSuffix(): void
{
// Test with Accept header for jsonapi format
$response = $this->get('birth_places', [
'search' => 'paris',
'limit' => 10,
], false, ['Accept' => 'application/vnd.api+json']);

self::assertResponseStatusCodeSame(Response::HTTP_OK);
// JSONAPI format uses different structure than JSON-LD
self::assertIsArray($response);
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name 'testBirthPlaceFormatSuffix' is misleading as it tests Accept header format negotiation, not URL format suffixes. The test also only verifies that the response is an array and has a 200 status, without validating any JSON:API specific structure (like 'data', 'included', etc.). Consider renaming the test to 'testBirthPlaceJsonApiFormat' and adding assertions to verify the JSON:API structure is correct.

Suggested change
* Test jsonapi format via Accept header.
*
* @throws ClientExceptionInterface
* @throws RedirectionExceptionInterface
* @throws ServerExceptionInterface
* @throws TransportExceptionInterface
*/
public function testBirthPlaceFormatSuffix(): void
{
// Test with Accept header for jsonapi format
$response = $this->get('birth_places', [
'search' => 'paris',
'limit' => 10,
], false, ['Accept' => 'application/vnd.api+json']);
self::assertResponseStatusCodeSame(Response::HTTP_OK);
// JSONAPI format uses different structure than JSON-LD
self::assertIsArray($response);
* Test JSON:API format via Accept header.
*
* @throws ClientExceptionInterface
* @throws RedirectionExceptionInterface
* @throws ServerExceptionInterface
* @throws TransportExceptionInterface
*/
public function testBirthPlaceJsonApiFormat(): void
{
// Test with Accept header for JSON:API format
$response = $this->get('birth_places', [
'search' => 'paris',
'limit' => 10,
], false, ['Accept' => 'application/vnd.api+json']);
self::assertResponseStatusCodeSame(Response::HTTP_OK);
// Basic JSON:API structure assertions
self::assertIsArray($response);
self::assertArrayHasKey('data', $response, 'JSON:API document must contain a top-level "data" member.');
self::assertIsArray($response['data'], '"data" member must be an array.');
if (!empty($response['data'])) {
foreach ($response['data'] as $resource) {
self::assertIsArray($resource, 'Each resource in "data" must be an array.');
self::assertArrayHasKey('type', $resource, 'JSON:API resource objects must contain a "type" member.');
self::assertArrayHasKey('id', $resource, 'JSON:API resource objects must contain an "id" member.');
self::assertArrayHasKey('attributes', $resource, 'JSON:API resource objects must contain an "attributes" member.');
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +388 to 399
public function testBirthPlaceFormatSuffix(): void
{
// Test with Accept header for jsonapi format
$response = $this->get('birth_places', [
'search' => 'paris',
'limit' => 10,
], false, ['Accept' => 'application/vnd.api+json']);

self::assertResponseStatusCodeSame(Response::HTTP_OK);
// JSONAPI format uses different structure than JSON-LD
self::assertIsArray($response);
}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is named 'testBirthPlaceFormatSuffix' but it tests Accept header format negotiation (JSON:API via 'application/vnd.api+json'), not URL format suffixes. Meanwhile, the actual format suffix handling logic (explode('.', $uriVariables['code'])[0] in BirthPlacesProvider.php line 91) has no corresponding test. If format suffixes in URLs are supported (e.g., '/birth_places/75056.json'), add a test that verifies this works correctly. If they're not supported, either remove the explode logic or clarify why it's needed.

Copilot uses AI. Check for mistakes.
Comment on lines +91 to +92
$code = explode('.', $uriVariables['code'])[0];

Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After extracting the code using explode, there's no validation that the resulting code is valid (e.g., exactly 5 digits for communes/countries). An input like '123.json' would result in code='123', which is invalid but would still be passed to the service. Consider adding validation after the explode operation to ensure the code matches expected patterns before querying the database.

Suggested change
$code = explode('.', $uriVariables['code'])[0];
$code = explode('.', (string) $uriVariables['code'])[0];
if (!preg_match('/^\d{5}$/', $code)) {
throw new RuntimeException('Invalid "code" format');
}

Copilot uses AI. Check for mistakes.
Comment on lines 33 to +42
public function searchByNameAndDate(string $search, DateTime $date): array
{
// Normalize hyphens to spaces at the SQL level for flexible matching
// MySQL collations are typically accent-insensitive by default (utf8mb4_unicode_ci)
return $this->createQueryBuilder('p')
->where('p.libelleCog LIKE :search')
->where('REPLACE(p.libelleCog, \'-\', \' \') LIKE :search')
// Date constraints removed to allow historical country searches (e.g., Algeria before 1962)
->setParameter('search', '%' . str_replace('-', ' ', $search) . '%')
->getQuery()
->getResult();
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment and docblock state that date constraints were removed to allow searching for historical countries (e.g., Algeria before 1962), but this behavior change is inconsistent with findByCodeAndDate which still enforces date constraints (lines 50-51). This creates an inconsistency where searching by name ignores dates while searching by code respects them. Either both methods should handle dates consistently, or the difference in behavior should be clearly documented and justified.

Copilot uses AI. Check for mistakes.
push:
branches:
- main
- main-*
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding 'main-' to the production release workflow triggers means that any branch starting with 'main-' will trigger production deployments. This could be risky if developers create branches like 'main-feature' or 'main-hotfix' for development purposes. Consider whether this pattern is intentional and documented. If the intent is to support multiple main branches for different environments, consider using more specific patterns like 'main-v' or documenting the naming convention clearly.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments