Copilot AI commented Oct 2, 2025

Problem

Previously, the PubMed format was generated for every article regardless of its document type. According to the NCBI PubMed documentation, however, PubMed accepts only specific article types for submission.

As a result, unnecessary PubMed XML files were produced for ineligible document types such as book reviews, abstracts, and news items.

Solution

This PR implements filtering logic to only generate PubMed format for eligible article types:

Eligible article types:

  • research-article
  • review-article
  • case-report
  • editorial
  • letter
  • brief-report
  • rapid-communication
  • reply
  • article-commentary
  • correction
  • retraction
  • addendum
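
As a rough illustration, the eligibility list above could be expressed as a constant plus a membership check. This is a minimal sketch, not the code in article/choices.py: the use of a frozenset and the helper name `is_pubmed_eligible` are assumptions made for the example.

```python
# Hypothetical sketch: the real PUBMED_ARTICLE_TYPES lives in
# article/choices.py and may be a list or tuple rather than a frozenset.
PUBMED_ARTICLE_TYPES = frozenset(
    {
        "research-article",
        "review-article",
        "case-report",
        "editorial",
        "letter",
        "brief-report",
        "rapid-communication",
        "reply",
        "article-commentary",
        "correction",
        "retraction",
        "addendum",
    }
)


def is_pubmed_eligible(article_type):
    """Return True when article_type is one of the eligible types.

    Hypothetical helper for illustration; the PR performs this check inline.
    """
    return article_type in PUBMED_ARTICLE_TYPES
```

A frozenset gives constant-time lookups and prevents accidental mutation, but any container supporting `in` would behave the same way for this check.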

Changes

  1. article/choices.py: Added PUBMED_ARTICLE_TYPES constant containing the list of eligible article types based on NCBI guidelines

  2. article/models.py: Modified ArticleFormat.generate_formats() to check the article type before generating PubMed format:

    # Only generate PubMed format for eligible article types
    if article.article_type in choices.PUBMED_ARTICLE_TYPES:
        cls.generate(
            user,
            article,
            "pubmed",
            article.sps_pkg_name + ".xml",
            pubmed.pipeline_pubmed,
            indexed_check=False,
        )

  3. article/tests.py: Added a test suite (PubMedArticleTypeFilteringTest) with 4 test methods to verify that:

    • PubMed format is generated for eligible article types
    • PubMed format is NOT generated for ineligible article types
    • Other formats (PMC, CrossRef) continue to work as expected
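
The behaviour those tests check can be sketched without Django, using a stand-in for the generation step. Everything here is an assumption made for illustration: `generate_formats` is a simplified stand-in rather than ArticleFormat.generate_formats(), the eligible set is abbreviated, and the order in which the other formats are generated is invented.

```python
from types import SimpleNamespace
from unittest import mock

# Abbreviated, hypothetical stand-in for choices.PUBMED_ARTICLE_TYPES.
ELIGIBLE_TYPES = {"research-article", "editorial"}


def generate_formats(article, generate):
    """Simplified stand-in for the real method; not the project's code."""
    # Formats other than PubMed are generated unconditionally.
    generate(article, "crossref")
    generate(article, "pmc")
    # PubMed is generated only for eligible article types.
    if article.article_type in ELIGIBLE_TYPES:
        generate(article, "pubmed")


def generated_formats(article_type):
    """Return the format names requested for an article of the given type."""
    article = SimpleNamespace(article_type=article_type)
    generate = mock.Mock()
    generate_formats(article, generate)
    # Each recorded call is generate(article, format_name).
    return [call.args[1] for call in generate.call_args_list]
```

Calling `generated_formats("book-review")` in this sketch yields only the non-PubMed formats, which mirrors the second and third checks listed above.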

Impact

  • Before: PubMed format generated for ALL articles
  • After: PubMed format only generated for eligible article types
  • Other formats (CrossRef, PMC) remain unaffected
  • Reduces unnecessary processing and file generation
  • Ensures compliance with PubMed submission guidelines

Fixes the requirement from issue: scieloorg/PC-Programs#3374

Original prompt

This section details the original issue you should resolve

<issue_title>Select the articles for PubMed format generation</issue_title>
<issue_description>### Task description
Only for the following document types:
https://www.ncbi.nlm.nih.gov/books/NBK3828/#publisherhelp.What_types_of_articles_are

Request originally registered in: scieloorg/PC-Programs#3374</issue_description>

Comments on the Issue (you are @copilot in this section)

@robertatakenaka @scieloorg/scielo-ps-developers please add here examples of compressed (zip) PubMed XML files, original and corrected. That way we can address this issue more precisely.
Fixes #1081

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Selecionar os artigos para a geração do formato PubMed" to "Add article type filtering for PubMed format generation" on Oct 2, 2025
Copilot AI requested a review from robertatakenaka October 2, 2025 11:19