Skip to content

Conversation

@dfsm
Copy link
Contributor

@dfsm dfsm commented Sep 28, 2025

The original scrapePageDropdown method is still in place as a fallback, but the new data atributes will be used first.

See: https://forums.somethingawful.com/showthread.php?threadid=3944917&pagenumber=3#post548658773

  • The tests using the updated showthread2.html fixture (testIgnoredPost and testWeirdSizeTags) are using the new data attributes: [PageNav] Using data attributes: page 17/18
  • The tests with older fixtures are falling back to the select menu method: [PageNav] Using select menu fallback: page X/Y

Changes Made:

  1. Updated scrapePageDropdown function (AwfulCore/Sources/AwfulCore/Scraping/Helpers.swift:144-169):
    - First attempts to read the new data-current-page and data-total-pages attributes
    - Falls back to the old select menu counting method for backward compatibility
    - Added logging to show which method is being used: [PageNav] Using data attributes or [PageNav] Using select menu fallback
  2. Added new PageNavigationData struct and scrapePageNavigationData function (Helpers.swift:111-142):
    - Captures all four new data attributes: current-page, total-pages, base-url, and per-page
    - Available for future use when you need the base URL or per-page values
  3. Updated test fixture (showthread2.html):
    - Replaced this file with a modern sample (taken from the same thread and roughly the same page)
    - Tests confirm the new attributes are being read correctly

Test Results:

  • All 60 tests pass successfully
  • Logging shows the new data attributes are being used when present
  • Fallback to select menu works for older fixtures without the attributes

The implementation ensures the app won't break when the site changes the inner HTML of the pages div, as warned by the admin.

The original scrapePageDropdown method is still in place as a fallback, but the new data atributes will be used first.

See: https://forums.somethingawful.com/showthread.php?threadid=3944917&pagenumber=3#post548658773

  - The tests using the updated showthread2.html fixture (testIgnoredPost and testWeirdSizeTags) are using the new data attributes: [PageNav] Using data  attributes: page 17/18
  - The tests with older fixtures are falling back to the select menu method: [PageNav] Using select menu fallback: page X/Y

  Changes Made:

  1. Updated scrapePageDropdown function
  (AwfulCore/Sources/AwfulCore/Scraping/Helpers.swift:144-169):
    - First attempts to read the new data-current-page and data-total-pages attributes
    - Falls back to the old select menu counting method for backward compatibility
    - Added logging to show which method is being used: [PageNav] Using data attributes or [PageNav] Using select menu fallback
  2. Added new PageNavigationData struct and scrapePageNavigationData function
  (Helpers.swift:111-142):
    - Captures all four new data attributes: current-page, total-pages, base-url, and per-page
    - Available for future use when you need the base URL or per-page values
  3. Updated test fixture (showthread2.html):
    - Replaced this file with a modern sample (taken from the same thread and roughly the same page)
    - Tests confirm the new attributes are being read correctly

  Test Results:
  - All 60 tests pass successfully
  - Logging shows the new data attributes are being used when present
  - Fallback to select menu works for older fixtures without the attributes

  The implementation ensures the app won't break when the site changes the inner HTML of the pages div, as warned by the admin.
@dfsm dfsm requested a review from nolanw September 28, 2025 00:04
Copy link
Member

@nolanw nolanw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple nits, ok by me if you wanna merge as is though

let baseURL = pageDiv["data-base-url"]
let perPage = pageDiv["data-per-page"].flatMap { Int($0) }

print("[PageNav] Using data attributes: page \(currentPage)/\(totalPages)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this helpful once it's all working?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the print but will keep returning baseURL and perPage, even though they aren't currently used for anything. They could be useful and the data is there, so might as well return it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep that's cool

let totalPages = Int(totalPagesStr) {
print("[PageNav] Using data attributes: page \(currentPage)/\(totalPages)")
return (pageNumber: currentPage, pageCount: totalPages)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit redundant with scrapePageNavigationData, is there a sensible way to call one from the other? It's ok if not, I'm not opposed to some redundancy if it makes likely changes easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the fallback method. Was a "just in case" idea that isn't really needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right on

…olely use the new scrapePageNavigationData()
@nolanw nolanw merged commit a624e3e into main Oct 17, 2025
1 check failed
@nolanw nolanw deleted the page-select-changes branch October 17, 2025 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants