This repository was archived by the owner on Jul 16, 2025. It is now read-only.


@codersquirrelbln (Member) commented Aug 27, 2024

Issue Summary:
This PR addresses the need to refactor our application by introducing a repositories table in the database to store GitHub repositories associated with users. This enhancement will provide a consistent and reliable dataset for repository information, enabling improved performance and more robust features. Additionally, we are implementing a background job to keep the data in this table synchronized with the latest updates from the GitHub API.

Reason for Change:
Currently, our application fetches repository data directly from the GitHub API whenever needed. While functional, this approach presents several challenges:

  • API Rate Limits: Frequent API requests risk exceeding GitHub's rate limits, which can lead to delays or failed data retrieval.
  • Performance: Continuous data fetching from the API on every request slows down the application, negatively impacting user experience.
  • Consistency: Without a stored copy, the application has no single dataset to rely on. Storing repository data in our database ensures uniformity across the application and facilitates building more sophisticated features.

Changes:

  • Create Repositories Table:
    Introduce a new repositories table to store essential data about each repository:
    • id: Unique identifier for the repository.
    • full_name: Full name of the repository (e.g., user/repo-name).
    • name: Name of the repository.
    • owner_login: GitHub username of the repository owner.
    • html_url: URL to the repository on GitHub.
    • description: Brief description of the repository.
    • private: Boolean indicating whether the repository is private.
    • fork: Boolean indicating whether the repository is a fork.
    • created_on_github_at, updated_on_github_at, pushed_to_github_at: Timestamps reported by GitHub for the repository. The column names are adjusted so they do not conflict with the automatically created created_at and updated_at columns.
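
To illustrate the timestamp renaming, here is a minimal sketch of how a GitHub REST API repository payload could be mapped onto the columns listed above. The helper names `repository_attributes` and `parse_github_time` are hypothetical and not taken from this PR, and the sketch assumes that `id` stores GitHub's own numeric repository id.

```ruby
require "time"

# Hypothetical helper (not part of this PR): parses GitHub's ISO 8601
# timestamp strings and passes nil through unchanged.
def parse_github_time(value)
  value && Time.iso8601(value)
end

# Hypothetical helper (not part of this PR): maps a repository payload
# (a Hash with string keys, as parsed from the GitHub REST API) onto the
# column names of the repositories table.
def repository_attributes(payload)
  {
    id: payload["id"], # assumption: the column holds GitHub's numeric id
    full_name: payload["full_name"],
    name: payload["name"],
    owner_login: payload.dig("owner", "login"),
    html_url: payload["html_url"],
    description: payload["description"],
    private: payload["private"],
    fork: payload["fork"],
    # GitHub's created_at / updated_at / pushed_at are renamed so they do not
    # collide with the created_at / updated_at columns Rails maintains itself.
    created_on_github_at: parse_github_time(payload["created_at"]),
    updated_on_github_at: parse_github_time(payload["updated_at"]),
    pushed_to_github_at: parse_github_time(payload["pushed_at"])
  }
end
```

Keeping the mapping in one place means the migration, the sync job, and any tests would share a single definition of which GitHub field feeds which column.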

  • Modify Associations:
    • Update the User model to reflect the new has_many :repositories association.
    • Update the Campaign model to include a belongs_to :repository association.
    • Ensure consistency in the Contribution model's relationships, with careful handling of dependent associations.

  • Implement Background Job for Syncing Data:
    • Add the sidekiq gem, in conjunction with Redis.
    • Introduce a background job that periodically syncs the repository data in the database with the latest information from GitHub.
    • Sync Frequency: The sync runs every day at midnight (open to iteration; we might later add an option for users to manually trigger a sync for immediate updates).

    The job will:
    • Fetch the latest repository data for each user.
    • Update existing records in the repositories table.
    • Add new repositories that the user may have created.
    • Handle deletions or modifications to repositories.
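
The four steps above amount to reconciling what is stored with what GitHub returns. A minimal, framework-free sketch of that reconciliation might look like the following. The names `SyncPlan` and `plan_repository_sync` are hypothetical, and the sketch assumes records are plain Hashes keyed by `:id`; the actual job in this PR may be structured differently.

```ruby
# Hypothetical value object (not part of this PR) describing what a sync
# run would need to change for one user.
SyncPlan = Struct.new(:to_create, :to_update, :to_delete_ids, keyword_init: true)

# existing: Hash of repository id => attributes currently stored for a user
# fetched:  Array of attribute Hashes freshly built from the GitHub API
def plan_repository_sync(existing, fetched)
  fetched_ids = fetched.map { |attrs| attrs[:id] }

  # Repositories GitHub returns that we have never stored.
  to_create = fetched.reject { |attrs| existing.key?(attrs[:id]) }

  # Repositories we already store whose attributes have changed.
  to_update = fetched.select do |attrs|
    existing.key?(attrs[:id]) && existing[attrs[:id]] != attrs
  end

  # Repositories we store that GitHub no longer returns.
  to_delete_ids = existing.keys - fetched_ids

  SyncPlan.new(to_create: to_create, to_update: to_update, to_delete_ids: to_delete_ids)
end
```

In the real job, a Sidekiq job's `perform` method could compute such a plan per user and then apply it through the Repository model. Separating the planning from the database writes keeps the comparison logic testable without Redis or a database.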

  • Refactor Existing Code:
    • Update controllers and helpers to utilize the new repositories table instead of fetching data directly from the GitHub API.
      Files to be updated include:
      • HomeController: Modify how repository data is fetched and displayed.
      • GithubApiHelper: Refactor to support both direct API calls and database interactions.
      • User model: Update associations to reflect the new relationship with repositories.
      • Additional models, controllers, or helpers that interact with repository data.
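
One way a helper could support both database reads and direct API calls is to read from the stored data first and only call the API when nothing is stored. The sketch below shows that idea under stated assumptions: `RepositoryReader`, `find_by_full_name`, and `fetch_repository` are hypothetical names, and the fallback order is an assumption rather than something this PR specifies.

```ruby
# Hypothetical sketch (not part of this PR): database-first lookup with an
# API fallback.
class RepositoryReader
  # store:      any object responding to #find_by_full_name(full_name),
  #             returning stored attributes or nil
  # api_client: any object responding to #fetch_repository(full_name)
  def initialize(store:, api_client:)
    @store = store
    @api_client = api_client
  end

  # Returns the stored record when present; otherwise asks the API.
  def repository(full_name)
    @store.find_by_full_name(full_name) || @api_client.fetch_repository(full_name)
  end
end
```

Injecting the store and the API client keeps the lookup order in one place and lets controllers stay unaware of where the data came from.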

Tasks:

  • Create migration for the repositories table.
  • Update User model to include has_many :repositories association.
  • Update Campaign model to include belongs_to :repository association.
  • Ensure Contribution model's relationships remain consistent with the new associations.
  • Implement the background job for syncing GitHub data to the repositories table.
  • Refactor HomeController to fetch and display repository data from the repositories table.
  • Refactor GithubApiHelper to support both direct API calls and database interactions.
  • Update any other relevant models, controllers, or helpers that interact with repository data.
  • Thoroughly test all changes to ensure proper functionality.

Expected Outcome:

  • Improved performance by reducing the number of API calls.
  • Enhanced consistency and reliability of repository data within the application.
  • Simplified extension of features related to repositories and campaigns.
  • This refactor will lay a solid foundation for future development, ensuring our application can handle GitHub data more effectively.

closes #46

@codersquirrelbln added the enhancement (New feature or request), testing, and database (implications for database - structure and otherwise) labels Aug 27, 2024
@codersquirrelbln self-assigned this Aug 27, 2024
@mauricevogel (Contributor) left a comment

Just had a brief look over the campaigns controller and here is already some feedback related to it!

@codersquirrelbln merged commit 05ad197 into main Feb 22, 2025
2 checks passed
@codersquirrelbln deleted the create-repos-table branch February 22, 2025 23:38


Development

Successfully merging this pull request may close these issues.

Implement Repositories Table and Background Job for GitHub Data Sync

3 participants