Add additional best practice on hosting GTFS files #567
While I am supportive, the statements have to do with reuse and open data. Is that within the scope of the best practices? |
There are other best practices related to how the file is hosted, like not requiring a login and supporting the file modification date.
I use CURLOPT_TIMECONDITION and CURLOPT_TIMEVALUE to set an If-Modified-Since HTTP header, which works for most sources. |
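
For readers who have not used conditional requests, here is a minimal sketch of the same idea in Python's standard library rather than libcurl. The function names, the user agent string, and the example URL are hypothetical and only for illustration; the point is that the consumer sends the timestamp of its last copy and the server answers 304 Not Modified when nothing has changed.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_modified_epoch,
                              user_agent="example-gtfs-fetcher/0.1"):
    """Build a GET request carrying an If-Modified-Since header.

    last_modified_epoch is the Unix timestamp of the copy we already hold.
    The user agent string is a made-up placeholder.
    """
    headers = {
        # HTTP dates are always expressed in GMT, e.g.
        # "Thu, 01 Jan 1970 00:00:00 GMT".
        "If-Modified-Since": formatdate(last_modified_epoch, usegmt=True),
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def fetch_if_modified(url, last_modified_epoch, timeout=30):
    """Return the feed bytes, or None if the server replies 304 Not Modified."""
    request = build_conditional_request(url, last_modified_epoch)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # The copy we already have is still current.
            return None
        raise
```

A call such as `fetch_if_modified("https://example.com/gtfs.zip", 1700000000)` would then download the file only when the server reports a newer modification date, which only works if the host supports that header.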
|
I wonder if the current use of negative phrasing (e.g., "should not") might come across as a bit strong, especially considering that the final decision rests with data producers. In some regions, producers may have "compelling reasons" (e.g., political considerations). Would it make sense to rephrase these statements using more positive, recommendation-based language? This might better respect the position of producers while still conveying the best practices valued by the community. |
|
If you are creating a standard, you must use standard language, not "do whatever you want to do". |
|
"Should not" is a phrase with a specific definition in the standard. |
|
I would suggest that any recommendation made to serving GTFS feeds should apply to GTFS-rt as well. (I am thinking, for example, of the GTFS-rt endpoint that sporadically responds with a Cloudflare CAPTCHA...) |
I think "should not" is a good fit here, quoting from RFC 2119:
I agree. Adding it to this PR might help to bring awareness. |
I had fewer problems with GTFS Realtime, probably because the feeds are often hosted on separate web servers rather than on the transit agency's main site, but I ran into one with a geographic block yesterday. I have added similar best practices for GTFS Realtime. (I noticed the document is really inconsistent about putting a dash between "real" and "time".) |
|
I'm going to call for a vote. The voting period will end on Aug 21, 2025 at 23:59:59 UTC. |
|
-1 OpenGeo |
|
+1 MOTIS |
|
+1 Transit @skinkie are you suggesting we don't mention open data best practices in the GTFS best practices at all, or that we should re-use existing open data best practices? I tried to look, and The Open Definition looks like it's the closest to what GTFS needs, but it's also very broad and strict, probably stricter than we need it to be. |
|
@gcamp either we follow strict "open data" rules and best practices, or we stick with being a transit standard. While most of us share the first goal too, we are here for a transit standard. Using git blame, I cannot see where this line actually came from: "The URL should be directly accessible without requiring a login to access the feed." It is also an open activism claim. |
|
+1 Transitous |
|
+1 Trillium @skinkie The spec's Best Practices already make value statements regarding open data, so a precedent has already been established. This change just adds stipulations to the already-existing best practice for publishing GTFS openly. If your data is to be considered "public" and "directly...[and] openly downloadable" (https://gtfs.org/documentation/schedule/schedule-best-practices/), and thus in compliance with the Best Practices, it cannot simultaneously restrict access, whether based on geography or otherwise. |
|
+1 SkedGo |
|
+1 Aubin |
|
Thank you all for your contributions. |
My problem is that @isabelle-dr added this without any prior discussion. Hence this is mixing things, and I think it would be even better if it were removed. To repeat my previous argument: this is not part of a technical specification. I am fully supportive of open data, but not as a "requirement" in a technical specification. |
|
I agree with @skinkie on the delineation between the technical spec and the practice of hosting data in an open way... but isn't that why this is in the "best practices" and not in the spec itself? This seems like a reasonable best practice.
Not trying to be stubborn. But from this point https://gtfs.org/documentation/schedule/schedule-best-practices/#practice-recommendations-organized-by-file onward the best practice becomes a guideline, with examples. Read: better than the "schema".
The whole page is best practices, which would be guidelines by definition. The spot you point out is just where best practices for individual files begin. The best practices used to be a separate document but were merged into the specification to give them more prominence. |
|
A note on adding the statement about data publishing into the Official Spec from the Best Practices. Previous discussions:
Although this statement was recently added to the spec, it has long existed in the GTFS Best Practices (prior to MobilityData). We see the integration of widely adopted Best Practices into the official spec as more of a consolidation than a new addition. |
|
This pull request has been automatically marked as stale because of lack of recent activity. It may be closed manually after one month of inactivity. Thank you for your contributions. |
I sometimes run into problems on TransSee where my system that automatically downloads GTFS files with libcurl fails because the web server hosting the files is blocking it.
Two of the reasons this happens are that the web server only allows actual browsers to download files, and that the server denies access to IP addresses outside the United States. I am in Canada.
This change addresses this by adding best practices advising against this.
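
To illustrate what such a block looks like from the consumer side, here is a hedged sketch in Python (not the libcurl code TransSee uses). A blocked download often arrives as an HTML error or challenge page instead of the archive, so one simple check is whether the payload is a ZIP file containing at least one `.txt` member. The function name `looks_like_gtfs_zip` is hypothetical, and the check is a heuristic, not a GTFS validator.

```python
import io
import zipfile


def looks_like_gtfs_zip(payload: bytes) -> bool:
    """Return True if payload is a ZIP archive with at least one .txt member.

    This is only a heuristic for spotting block pages served in place of the
    feed; it does not verify that the archive is a valid GTFS dataset.
    """
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        # HTML block pages, CAPTCHAs, and error bodies end up here.
        return False
    with zipfile.ZipFile(buffer) as archive:
        return any(name.endswith(".txt") for name in archive.namelist())
```

A downloader could run this on the response body and log the HTTP status and user agent whenever it returns False, which makes it easier to report the problem to the agency hosting the feed.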