Skip to content

Conversation

@kennethrioja
Copy link
Contributor

Summary of changes

  • From the basis of lib/ingestors/ical_ingestor.rb I extended it to be able to ingest:
  1. Direct .ics links (e.g., https://indico.cern.ch/event/1596327/event.ics or https://indico.cern.ch/category/19377/events.ics)
  2. Indico /event links (e.g., https://indico.cern.ch/event/1596327/)
  3. Indico /category links (e.g., https://indico.cern.ch/category/19377/)

Motivation and context

training.cern.ch needs indico ingestion

Screenshots

✅ 100% test covered
image

Checklist

  • I have read and followed the CONTRIBUTING guide.
  • I confirm that I have the authority necessary to make this contribution on behalf of its copyright owner and agree
    to license it to the TeSS codebase under the
    BSD license.

@fbacall fbacall requested a review from Copilot October 16, 2025 11:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR extends the iCalendar ingestor to support direct .ics links and Indico event/category URLs. The changes refactor the existing IcalIngestor class to handle multiple URL formats, extract reusable functionality into concerns, and add comprehensive test coverage for the new URL conversion logic.

Key Changes:

  • Refactored IcalIngestor to process direct .ics files, Indico /event URLs, and Indico /category URLs
  • Extracted sitemap parsing logic into a reusable SitemapHelpers concern
  • Created IcalIngestorExportUrl concern to handle URL-to-export conversion for different URL formats

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
lib/ingestors/ical_ingestor.rb Core refactoring: split monolithic methods into smaller, focused methods; integrated new concerns for sitemap and URL handling
lib/ingestors/concerns/sitemap_helpers.rb New concern: extracts sitemap parsing (XML/TXT) into reusable module
lib/ingestors/concerns/ical_ingestor_export_url.rb New concern: handles conversion of various URL formats to their .ics export equivalents
test/unit/ingestors/ical_ingestor_test.rb Added comprehensive tests for error handling, URL conversion logic, and edge cases

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

def process_description(input)
return input if input.nil?

desc = input.to_s.gsub('', '<br />')
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gsub pattern is empty (''), which will not match anything. This should likely be /\\R/ to match line breaks, as in the original implementation.

Suggested change
desc = input.to_s.gsub('', '<br />')
desc = input.to_s.gsub(/<br\s*\/?>/i, "\n")

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@kennethrioja kennethrioja Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed – process_description is useless. I deleted it.

events = Icalendar::Event.parse(open_url(file_url, raise: true).set_encoding('utf-8'))
# Assigns to event: venue, online, city.
def assign_location_info(event, location)
return if location.blank? || !location.present?
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition location.blank? || !location.present? is redundant. blank? and present? are opposites, so this simplifies to just return if location.blank?.

Suggested change
return if location.blank? || !location.present?
return if location.blank?

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@kennethrioja kennethrioja Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 110 to 114
[
location['suburb'],
location['postcode'],
location['country']
]
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The process_location method attempts to call location['suburb'] but location is a string (from calevent.location), not a hash. This will cause a TypeError. The original code called convert_location(calevent.location) which likely returned a hash.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@kennethrioja kennethrioja Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the old code logic. I am now checking whether location is an array before returning this array of location variables, see changes here.

Comment on lines 195 to 200
def ingestor.assign_basic_info(*)
raise StandardError, 'test failure'
end

ingestor.send(:process_calevent, calevent)

Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Reopening the singleton class to stub a method is fragile. Consider using a mocking framework like Mocha or Minitest's stub mechanism for cleaner and more maintainable test doubles.

Suggested change
def ingestor.assign_basic_info(*)
raise StandardError, 'test failure'
end
ingestor.send(:process_calevent, calevent)
ingestor.stub(:assign_basic_info, ->(*) { raise StandardError, 'test failure' }) do
ingestor.send(:process_calevent, calevent)
end

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@kennethrioja kennethrioja Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 33 to 35
uri.path = File.join(uri.path, 'event.ics') unless uri.path.end_with?('/event.ics')
elsif uri.path&.include?('category')
uri.path = File.join(uri.path, 'events.ics') unless uri.path.end_with?('/events.ics')
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using File.join can produce incorrect paths when uri.path ends with a trailing slash, as it may create paths like /event//event.ics. Consider using uri.path.chomp('/') + '/event.ics' or similar string manipulation to ensure proper path formatting.

Suggested change
uri.path = File.join(uri.path, 'event.ics') unless uri.path.end_with?('/event.ics')
elsif uri.path&.include?('category')
uri.path = File.join(uri.path, 'events.ics') unless uri.path.end_with?('/events.ics')
uri.path = uri.path.chomp('/') + '/event.ics' unless uri.path.end_with?('/event.ics')
elsif uri.path&.include?('category')
uri.path = uri.path.chomp('/') + '/events.ics' unless uri.path.end_with?('/events.ics')

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not true, when I register : https://indico.cern.ch/category/19377/ with trailing slash, export_url == https://indico.cern.ch/category/19377/events.ics no //...

@kennethrioja
Copy link
Contributor Author

f3f16b9 -> adds the possibility to use Indico API token to have an authenticated access to restricted events, only works for one Indico domain.

Copy link
Member

@fbacall fbacall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Indico ingestor should be a separate class, maybe a subclass of the Ical ingestor.

@kennethrioja
Copy link
Contributor Author

Tests are in the making

Copy link
Member

@fbacall fbacall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor change


# Reads either a sitemap.{xml|txt} or a single URL
# Returns a list of URLs from 1 to n URLs
def get_sources(source_url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this method to parse_sitemap?

Just so it doesn't get confused with TeSS' concept of "sources"

@fbacall fbacall merged commit 3d70654 into ElixirTeSS:master Dec 18, 2025
7 checks passed
@kennethrioja kennethrioja deleted the indico branch December 19, 2025 08:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants