feat: add reusable CSV/XLS export utility and filtered download routes to journal page classes #1417

Open
Copilot wants to merge 4 commits into main from copilot/add-download-csv-functionality

Contributor

Copilot AI commented Mar 23, 2026

What does this PR do?

Extracts the journal CSV/XLS export logic into a reusable utility module and adds download endpoints directly to the three page classes (ListPageJournal, ListPageJournalByPublisher, ListPageJournalByCategory) via RoutablePageMixin and a shared JournalDownloadMixin. Downloads respect the filters the user applied on the page (search_term, start_with_letter, tab) and export only the filtered journals.

Main changes:

  • core/home/utils/export_journals.py (new): get_scielo_journals_data(filters=None), generate_csv_response(), generate_xls_response(): reusable functions with the columns journals, scielo_url, publisher. get_scielo_journals_data accepts optional Q filters to restrict the queryset
  • core/home/models.py: JournalDownloadMixin with download-csv/ and download-xls/ routes, using a get_export_filters(request) template method that each page class overrides with its own filter logic (including search by publisher name in ListPageJournalByPublisher)
  • core/home/views.py: standalone views refactored to use the utility (kept for backward compatibility of the global URLs)
  • download_button_csv_xls.html: uses {% routablepageurl page 'download_xls' %} with request.GET.urlencode to forward the current filter parameters to the download URLs
  • core/home/tests.py: tests for generate_csv_response, generate_xls_response and data filtering via get_scielo_journals_data

Where could the review start?

core/home/utils/export_journals.py → core/home/models.py (JournalDownloadMixin class and get_export_filters in the page classes) → core/templates/home/include/download_button_csv_xls.html

How could this be tested manually?

  1. Open any journal listing page (by title, by publisher, by category)
  2. Apply filters (search by term, filter by letter, active/discontinued tab)
  3. Click the download button (↓ icon) in the table header
  4. Select CSV or XLS
  5. Verify that the downloaded file contains only the filtered journals, with the columns journals, scielo_url, publisher
  6. Repeat without filters to verify that all journals are exported
  7. Confirm that the previous global URLs (/download-csv-journals-page-scielo-org/, /download-xls-journals-page-scielo-org/) still work

Any background context you want to provide?

The previous implementation duplicated the export logic in standalone view functions and ignored the request filters. JournalDownloadMixin centralizes the download methods and is inherited by the three classes, which removes the duplication. Each page class overrides get_export_filters(request) so the export applies the same filters used by the page listing. The download template forwards the current query parameters (request.GET.urlencode) so the filters are preserved in the download. The global URLs in config/urls.py were kept so existing links do not break.
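As a rough illustration of the template-method arrangement described above, here is a minimal plain-Python sketch. It is not the PR's code: the classes below are stand-ins that only borrow the names from this PR, predicates replace Django Q objects, export_rows is a hypothetical helper, and the sample journals are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Journal:
    title: str
    publisher: str


class JournalDownloadMixin:
    """Shared export logic; subclasses only decide which rows qualify."""

    def get_export_filters(self, params):
        # Hook: return a list of predicates built from the query parameters.
        return []

    def export_rows(self, params, journals):
        # Hypothetical helper: keep journals that satisfy every predicate.
        predicates = self.get_export_filters(params)
        return [j for j in journals if all(p(j) for p in predicates)]


class ListPageJournal(JournalDownloadMixin):
    def get_export_filters(self, params):
        term = params.get("search_term", "").lower()
        if not term:
            return []
        return [lambda j: term in j.title.lower()]


class ListPageJournalByPublisher(JournalDownloadMixin):
    def get_export_filters(self, params):
        term = params.get("search_term", "").lower()
        if not term:
            return []
        # This page also matches the search term against the publisher name.
        return [lambda j: term in j.title.lower() or term in j.publisher.lower()]


# Made-up sample data, for illustration only.
JOURNALS = [
    Journal("Acta Example", "Publisher One"),
    Journal("Revista Example", "Acta Publisher"),
]
params = {"search_term": "acta"}
by_title = ListPageJournal().export_rows(params, JOURNALS)
by_publisher = ListPageJournalByPublisher().export_rows(params, JOURNALS)
```

With the same query parameters, the two stand-in pages export different rows because only the hook differs; the shared export path stays in one place.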

Screenshots

N/A: file download functionality, with no visual change beyond the link URLs.

What are the relevant tickets?

Develop a function to download a CSV file with journal data.

References

Original prompt

This section details on the original issue you should resolve

<issue_title>Develop a function to download a CSV file with journal data</issue_title>
<issue_description>### Task description
In the page classes (ListPageJournalByPublisher, ListPageJournal and ListPageJournalByCategory), implement functionality to download the journal listing in CSV and XLSX formats.

The generated files must contain the following columns:

  • journals
  • scielo_url
  • publisher

Before implementing, check whether the project already has a utility function that exports data in these formats, and prefer reusing it. If none exists, create a reusable solution (e.g., a helper function/service) and use it in the three classes, avoiding code duplication.</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits March 23, 2026 13:05
…nal page classes

- Create core/home/utils/export_journals.py with reusable get_scielo_journals_data, generate_csv_response, and generate_xls_response functions
- Add RoutablePageMixin and download-csv/download-xls routes to ListPageJournal, ListPageJournalByPublisher, and ListPageJournalByCategory
- Refactor existing standalone download views to use the utility functions
- Update download_button_csv_xls.html template to use routablepageurl tags
- Add tests for generate_csv_response and generate_xls_response
- Update existing tests to import from new utility location

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Agent-Logs-Url: https://github.com/scieloorg/core/sessions/6c34ec03-b2a2-4ba6-b03f-32ba5018f3cc
- Create JournalDownloadMixin to eliminate code duplication across three page classes
- Improve error messages in generate_csv_response and generate_xls_response to specify the format

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Agent-Logs-Url: https://github.com/scieloorg/core/sessions/6c34ec03-b2a2-4ba6-b03f-32ba5018f3cc
Copilot AI changed the title from "[WIP] Implement download function for CSV of journal listings" to "feat: add reusable CSV/XLS export utility and download routes to journal page classes" Mar 23, 2026
Copilot AI requested a review from robertatakenaka March 23, 2026 13:10
class JournalDownloadMixin:
    @re_path(r"^download-csv/$", name="download_csv")
    def download_csv(self, request):
        journals_data = get_scielo_journals_data()
Member

@copilot this should take into account that the items were filtered using the request arguments

Contributor Author

Fixed in 05eed2e. The download endpoints now respect request filters:

  • get_scielo_journals_data(filters=None) accepts optional Q filters
  • JournalDownloadMixin uses a template method get_export_filters(request) — each page class overrides it to build the same filters used in get_context
  • ListPageJournalByPublisher.get_export_filters includes publisher name search (matching the page's own filter logic)
  • Download template forwards current query params via request.GET.urlencode

…ith_letter, tab)

- get_scielo_journals_data() now accepts optional Q filters parameter
- JournalDownloadMixin uses template method get_export_filters(request)
- Each page class overrides get_export_filters with its own filter logic
- Download template forwards current query parameters to download URLs
- Added tests for filtered data retrieval

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Agent-Logs-Url: https://github.com/scieloorg/core/sessions/426fb4c6-9c21-4ab8-a3a9-21eaa4055951
Copilot AI changed the title from "feat: add reusable CSV/XLS export utility and download routes to journal page classes" to "feat: add reusable CSV/XLS export utility and filtered download routes to journal page classes" Mar 23, 2026
Copilot AI requested a review from robertatakenaka March 23, 2026 16:55
@robertatakenaka robertatakenaka marked this pull request as ready for review March 23, 2026 17:58
Copilot AI review requested due to automatic review settings March 23, 2026 17:58
Contributor

Copilot AI left a comment

Pull request overview

This PR extracts the journals CSV/XLS export logic into a shared utility module and adds filtered download endpoints directly to journal listing page classes via a reusable Wagtail routable mixin, preserving current-page filters in the exported data.

Changes:

  • Added core/home/utils/export_journals.py with helpers to fetch journal data and generate CSV/XLS download responses.
  • Added JournalDownloadMixin + routable download routes to journal list page classes, using per-page filter construction (get_export_filters).
  • Updated the download button include to call the page’s routable download endpoints and forward current querystring filters; refactored legacy standalone views to use the utility and added tests.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| core/home/utils/export_journals.py | New reusable utility to fetch/export journal data as CSV/XLS |
| core/home/models.py | Adds routable download endpoints + per-page export filtering hooks |
| core/home/views.py | Refactors legacy global download endpoints to reuse utility |
| core/templates/home/include/download_button_csv_xls.html | Switches links to routable endpoints and forwards current filters |
| core/home/tests.py | Adds/updates tests for export utility and filtered data retrieval |

Comment on lines +247 to +253
    def get_export_filters(self, request):
        search_term = request.GET.get("search_term", "")
        starts_with_letter = request.GET.get("start_with_letter", "")
        active_or_discontinued = list(request.GET.get("tab", ""))
        return default_journal_filter(
            search_term, starts_with_letter, active_or_discontinued
        )
Copilot AI Mar 23, 2026

list(request.GET.get("tab", "")) turns a tab value like "active" into a list of characters, which will make Q(status__in=active_or_discontinued) effectively never match. Use request.GET.getlist("tab") (if the UI can send multiple values) or normalize a single tab value into a one-element list (e.g., [tab] if tab else []) before passing to default_journal_filter.
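To make the character-list problem concrete, here is a small standalone sketch. It uses urllib.parse.parse_qs as a stand-in for Django's request.GET (an assumption made so the snippet runs without Django), and tab_values is a hypothetical helper name, not something in this PR.

```python
from urllib.parse import parse_qs

# What the reviewed code does: list() on a string yields single characters.
broken = list("active")


def tab_values(query_string):
    """Return every non-empty `tab` value as a list of whole strings."""
    # parse_qs always maps each key to a list of values, much like
    # QueryDict.getlist does in Django.
    return [v for v in parse_qs(query_string).get("tab", []) if v]


single = tab_values("search_term=acta&tab=active")
multiple = tab_values("tab=active&tab=discontinued")
missing = tab_values("search_term=acta")
```

Here broken holds six one-letter strings, so a status__in lookup built from it would compare statuses against "a", "c", "t" and so on, while the helper returns whole values and an empty list when no tab is given.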

Comment on lines +271 to +284
    def get_export_filters(self, request):
        search_term = request.GET.get("search_term", "")
        starts_with_letter = request.GET.get("start_with_letter", "")
        active_or_discontinued = list(request.GET.get("tab", ""))
        filters = Q(status__in=SCIELO_STATUS_CHOICES)
        if search_term:
            filters &= Q(journal__title__icontains=search_term) | Q(
                journal__owner_history__institution__institution__institution_identification__name__icontains=search_term
            )
        if starts_with_letter:
            filters &= Q(journal__title__istartswith=starts_with_letter)
        if active_or_discontinued:
            filters &= Q(status__in=active_or_discontinued)
        return filters
Copilot AI Mar 23, 2026

Same issue as above: list(request.GET.get("tab", "")) produces a character list and breaks the status__in filter. Switch to request.GET.getlist("tab") or wrap the single tab string into a list; otherwise the export for the “tab” filter will be incorrect.


logger = logging.getLogger(__name__)

HEADERS = ["journals", "scielo_url", "publisher"]
Copilot AI Mar 23, 2026

The exported column headers are journals/publisher, but the internal data keys are title/owner. This mismatch makes the utility harder to reuse correctly and increases the chance of future mistakes. Consider aligning the dict keys to the export contract (e.g., use journals and publisher everywhere) and then read those same keys in both generate_csv_response and generate_xls_response.
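One possible way to keep the dict keys and the export contract in sync is to let csv.DictWriter read rows by header name. The sketch below is illustrative only: rows_to_csv is a hypothetical helper (the PR's generate_csv_response returns an HTTP response instead), and the sample row is made up.

```python
import csv
import io

HEADERS = ["journals", "scielo_url", "publisher"]


def rows_to_csv(rows):
    """Serialize dict rows whose keys match HEADERS exactly."""
    buffer = io.StringIO()
    # DictWriter looks each value up by header name, so a row keyed with
    # "title" or "owner" raises ValueError instead of exporting silently.
    writer = csv.DictWriter(buffer, fieldnames=HEADERS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample row, for illustration only.
sample = [
    {
        "journals": "Example Journal",
        "scielo_url": "https://example.org/j/ej",
        "publisher": "Example Publisher",
    }
]
csv_text = rows_to_csv(sample)
```

Because the writer is driven by HEADERS, the same list could plausibly feed the XLS writer as well, leaving a single place to change if the column contract changes.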

Comment on lines +40 to +44
{
    "title": title,
    "scielo_url": scielo_url,
    "owner": owner,
}
Copilot AI Mar 23, 2026

The exported column headers are journals/publisher, but the internal data keys are title/owner. This mismatch makes the utility harder to reuse correctly and increases the chance of future mistakes. Consider aligning the dict keys to the export contract (e.g., use journals and publisher everywhere) and then read those same keys in both generate_csv_response and generate_xls_response.

Comment on lines +61 to +63
writer.writerow(
    [journal.get("title"), journal.get("scielo_url"), journal.get("owner")]
)
Copilot AI Mar 23, 2026

The exported column headers are journals/publisher, but the internal data keys are title/owner. This mismatch makes the utility harder to reuse correctly and increases the chance of future mistakes. Consider aligning the dict keys to the export contract (e.g., use journals and publisher everywhere) and then read those same keys in both generate_csv_response and generate_xls_response.

"collection__domain",
"journal__owner_history__institution__institution__institution_identification__name",
"issn_scielo",
)
Copilot AI Mar 23, 2026

Export rows can be duplicated if the joins (notably journal__owner_history...) produce multiple rows per SciELOJournal. This will lead to repeated lines in CSV/XLS exports. A concrete mitigation is applying distinct() to the values() queryset (e.g., qs.values(...).distinct()), or otherwise selecting a single “current” owner record before exporting.

Suggested change
- )
+ ).distinct()
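A plain-Python sketch of what a DISTINCT over the selected columns does and does not fix. This is not ORM code: distinct_rows is a hypothetical stand-in for the database's DISTINCT, and the rows are made-up examples of what a join through the owner history might return.

```python
def distinct_rows(rows):
    """Drop rows that are identical in every column, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up join result: the same journal repeated once per owner history entry.
joined = [
    {"title": "Example Journal", "owner": "Publisher A"},
    {"title": "Example Journal", "owner": "Publisher A"},
    {"title": "Example Journal", "owner": "Publisher B"},
]
deduplicated = distinct_rows(joined)
```

The exact duplicate disappears, but the journal still appears twice because the owner differs. That is the case where picking a single "current" owner record, as mentioned above, would be needed in addition to distinct().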

Comment on lines +8 to +11
<a class="dropdown-item" href="{% routablepageurl page 'download_xls' %}{% if request.GET.urlencode %}?{{ request.GET.urlencode }}{% endif %}">XLS</a>
</li>
<li>
- <a class="dropdown-item" href="{% url 'download_csv_journals_page_scielo_org' %}">CSV</a>
+ <a class="dropdown-item" href="{% routablepageurl page 'download_csv' %}{% if request.GET.urlencode %}?{{ request.GET.urlencode }}{% endif %}">CSV</a>
Copilot AI Mar 23, 2026

These links now depend on a page variable being present in the template context. If this include is ever rendered outside a Wagtail Page context (or the calling template doesn’t pass page explicitly), it will error at render time. If this include is intended to be reusable, consider adding a safe fallback to the legacy global URLs when page is missing, or require/ensure page is explicitly provided by the caller.

Comment on lines +47 to +49
    except Exception as e:
        logger.error(f"Error fetching scielo journals data: {e}")
        return []
Copilot AI Mar 23, 2026

Logging the exception message without a traceback makes diagnosing production issues harder. Prefer logger.exception("Error fetching scielo journals data") (or logger.error(..., exc_info=True)) so the traceback is captured.
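The difference can be seen with the standard library alone. In this sketch the logger name, fetch_journals, and the two wrapper functions are hypothetical; only the log message text comes from the reviewed code.

```python
import io
import logging

# Capture log output in memory so the two variants can be compared.
stream = io.StringIO()
logger = logging.getLogger("export_journals_demo")
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))


def fetch_journals():
    # Hypothetical failure standing in for a database error.
    raise RuntimeError("database unavailable")


def get_data_message_only():
    try:
        return fetch_journals()
    except Exception as e:
        # Current behaviour: only the exception message is recorded.
        logger.error(f"Error fetching scielo journals data: {e}")
        return []


def get_data_with_traceback():
    try:
        return fetch_journals()
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Error fetching scielo journals data")
        return []


get_data_message_only()
without_traceback = stream.getvalue()

stream.seek(0)
stream.truncate(0)
get_data_with_traceback()
with_traceback = stream.getvalue()
```

Both variants still return an empty list to the caller; only the second leaves the failing file and line in the log.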

Comment on lines +91 to +95
    @re_path(r"^download-csv/$", name="download_csv")
    def download_csv(self, request):
        filters = self.get_export_filters(request)
        journals_data = get_scielo_journals_data(filters)
        return generate_csv_response(journals_data)
Copilot AI Mar 23, 2026

The key new behavior is “export respects the same filters as the listing page” via get_export_filters(request) + routable endpoints. There are good unit tests for the export utility, but there’s no coverage shown for the routable download routes applying real request query params (e.g., search_term, start_with_letter, tab) end-to-end. Add an integration-style test that requests the page’s download-csv/ (and/or download-xls/) with query parameters and asserts the response contains only the filtered journal(s).

Development

Successfully merging this pull request may close these issues.

Develop a function to download a CSV file with journal data

3 participants