EHR Link Crawler Test #997

labkey-martyp · 2025-07-28T11:53:17Z

Rationale

Core EHR test that crawls links in the EHR folder and its subfolders to check for broken links. It helps find CSP violations and provides a new test to ensure that links expected to work do not regress.

Related Pull Requests

Changes

Add testCrawlEhrLinks
Add skip entries for links known to cause trouble in test setups
Update EHRApp test skip links


labkey-tchad · 2025-07-28T22:13:14Z

ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.java

+            }
+
+            // scope this to admin, ehr folder and subfolders
+            if (!decodedHref.contains(getContainerPath()) && !decodedHref.startsWith("/admin"))


Crawler.ControllerActionId will parse out the controller, action, and containerPath of a URL (new Crawler.ControllerActionId(href)). Note that it does not include a leading slash like getContainerPath() does.
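To illustrate the leading-slash point, here is a minimal sketch of a scope check that normalizes both container paths before comparing them. `LinkScope`, `normalize`, and `isInContainer` are hypothetical names for illustration, not part of the LabKey test framework; the sketch assumes the container path of the link has already been parsed out of the URL by some other means.

```java
// Hypothetical helper for illustration only; not a LabKey API.
final class LinkScope
{
    // Drop a single leading slash so "/EHR/Colony" and "EHR/Colony" compare equal.
    static String normalize(String containerPath)
    {
        return containerPath.startsWith("/") ? containerPath.substring(1) : containerPath;
    }

    // True when the target container is the root container itself or one of its subfolders.
    static boolean isInContainer(String targetContainerPath, String rootContainerPath)
    {
        String target = normalize(targetContainerPath);
        String root = normalize(rootContainerPath);
        return target.equals(root) || target.startsWith(root + "/");
    }
}
```

Comparing on a folder boundary also avoids a false match that a plain `contains` check would allow, such as a sibling folder whose name merely starts with the same text.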

labkey-tchad · 2025-07-28T22:13:55Z

ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.java

+            }
+
+            // scope this to admin, ehr folder and subfolders
+            if (!decodedHref.contains(getContainerPath()) && !decodedHref.startsWith("/admin"))


Why does this crawl admin actions? Our regular crawler should definitely hit those.

I meant only to validate that links in the EHR pointing to admin are valid URLs. I just added admin to the default URLs to skip crawling, but the test will continue to validate the links to there from the EHR.


labkey-tchad · 2025-07-28T22:52:32Z

ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.java

+
+            try
+            {
+                openLinkInNewWindowOrThrow(anchor);


What sort of links don't open successfully? This seems like an error state.
If there are links that we expect to not open, we should filter them out earlier.
Waiting for a thing that won't happen can waste a lot of time (switchToWindow waits ten seconds for the new tab to appear).

Opening these links at all in this phase seems inefficient. validatePageLinks also navigates to the links, so we're hitting each link twice.

So our pages have lots of anchor tags on them that aren't links, and the ones that are links can have an href, onclick handler, or click listener added in a variety of ways in JS. Instead of trying to pre-determine which anchor tags are actual links, this clicks on the anchor tags and sees if a new window opens. That's why I added the throw immediately in openLinkInNewWindow, to not wait ten seconds for a new window.
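One way to picture the "click and see if a new window opens" check without any waiting is to compare the set of window handles before and after the click. The sketch below is an assumption about how that could look, not the code in this PR: `WindowHandles.newHandle` is a hypothetical pure helper, and in a Selenium test the two sets would come from `WebDriver.getWindowHandles()`.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration only.
final class WindowHandles
{
    // Returns a handle present in 'after' but not in 'before', if any.
    // An empty result means the click did not open a new window.
    static Optional<String> newHandle(Set<String> before, Set<String> after)
    {
        Set<String> added = new HashSet<>(after);
        added.removeAll(before);
        return added.stream().findFirst();
    }
}
```

A caller could throw as soon as the result is empty instead of polling for a tab that will never appear.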

We are hitting each link twice. I'm trying to do somewhat of a breadth-first search, validating the links on one page at a time (easier logging and troubleshooting) and iterating through the anchor tags to see which are actual links. Once an anchor is determined to be an actual link, its URL gets queued up for crawling. I don't think it will work to queue up a bunch of anchor tag WebElements and then navigate away; the test needs to determine which anchor tags are actual links before crawling. I'm seeing this test take 7-12 min. A little longer, but it's pretty comprehensive.

Skipping the dataset view is making these run a couple minutes faster.
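The breadth-first approach described in this thread can be sketched roughly as follows. This is a simplified model under stated assumptions, not the test's implementation: the `site` map is a hypothetical stand-in for the browser (page URL to the link URLs found on that page), and `skipCrawl` is a hypothetical predicate for URLs such as admin pages or the dataset view that are visited once for validation but never expanded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of the crawl order; no browser involved.
final class BreadthFirstCrawl
{
    static List<String> crawl(String start, Map<String, List<String>> site, Predicate<String> skipCrawl)
    {
        List<String> visitOrder = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty())
        {
            String page = queue.remove();
            visitOrder.add(page); // the page is visited (validated) exactly once
            if (skipCrawl.test(page))
                continue; // validated, but its own links are not followed
            for (String link : site.getOrDefault(page, List.of()))
            {
                if (seen.add(link)) // queue each URL only the first time it is found
                    queue.add(link);
            }
        }
        return visitOrder;
    }
}
```

Processing one page fully before moving to the next keeps the log grouped by page, which matches the troubleshooting motivation given above.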


labkey-tchad · 2025-07-31T22:59:00Z

ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.java

+    private String validLink(WebElement anchor)
+    {
+        String href = anchor.getDomAttribute("href");
+        if (href != null && !href.startsWith("#") && !href.equalsIgnoreCase("undefined"))


<a href="undefined"> definitely seems like an error case. Maybe the NightlyTestServer module property needs to be set or ehrBegin.html should handle it being unset?

I did update the WNPRC test to set NightlyTestServer; unfortunately, it's not the only place where this happens. The issue here is that Crawler.ControllerActionId doesn't handle this and throws a low-level exception (an NPE or something, I can't remember). This will still throw an error; it just doesn't try to parse apart the href.
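The distinction made here (an "undefined" href is still an error, it just should not reach the URL parser) could be modeled as a small classification step that runs before any parsing. The names `HrefFilter`, `Kind`, and `classify` are hypothetical and the three categories are an assumption for illustration; the actual `validLink` method in the PR may behave differently.

```java
// Hypothetical classifier for illustration only.
final class HrefFilter
{
    enum Kind { NOT_A_LINK, BROKEN, CANDIDATE }

    static Kind classify(String href)
    {
        // Anchors with no href, an empty href, or an in-page fragment are not crawlable links.
        if (href == null || href.trim().isEmpty() || href.startsWith("#"))
            return Kind.NOT_A_LINK;
        // A literal "undefined" usually means a script concatenated an unset value; report it, don't parse it.
        if (href.equalsIgnoreCase("undefined"))
            return Kind.BROKEN;
        return Kind.CANDIDATE;
    }
}
```

Only `CANDIDATE` values would then be handed to a URL parser, so a malformed href surfaces as a clear test failure rather than a low-level exception.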

labkey-martyp added 11 commits July 20, 2025 11:11

Link crawler test

d342fda

remove unused lookup

f6de2fa

decode url

56ed54f

Decoding

b0cf8a8

Better messages

9f11597

rename

644a8e5

test fix

41a2c78

testing

a9c18e1

fix ehr app test

f808ae3

Longer load wait and cleanup

8f917f6

comments

ea28d70

labkey-martyp requested review from labkey-danield and labkey-jeckels July 28, 2025 11:53

This was referenced Jul 28, 2025

Link crawler fixes LabKey/snprcEHRModules#835

Merged

Link crawler set up LabKey/nircEHRModules#559

Merged

Link crawler test fixes LabKey/wnprc-modules#855

Merged

Link crawler test LabKey/onprcEHRModules#1402

Merged

labkey-danield requested a review from labkey-tchad July 28, 2025 17:09

Update test

43197a6

labkey-tchad reviewed Jul 28, 2025


labkey-martyp and others added 6 commits July 29, 2025 14:28

Update ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.…

b0e22ac

…java Co-authored-by: Trey Chadick <tchad@labkey.com>

Update ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.…

890742e

…java Co-authored-by: Trey Chadick <tchad@labkey.com>

Update ehr/test/src/org/labkey/test/tests/ehr/AbstractGenericEHRTest.…

15a7bff

…java Co-authored-by: Trey Chadick <tchad@labkey.com>

Code review feedback

0dcd01f

skip crawling datasets view

2f1d99a

Handle "undefined" href

c389b07

labkey-martyp requested a review from labkey-tchad July 31, 2025 03:45

labkey-tchad approved these changes Jul 31, 2025


verify link function

d871bfa

labkey-martyp merged commit 46a1422 into release25.7-SNAPSHOT Aug 3, 2025
6 of 7 checks passed

labkey-martyp deleted the 25.7_fb_ehr_link_crawler branch August 3, 2025 23:22
