abdulrahim2002 commented Apr 24, 2025

The API get_developer_stats fetches the closed issues and applies the random forest algorithm to them to obtain the domain labels for each closed issue. It also gets the developer associated with each closed issue from the { "user", "login" } and { "user", "id" } fields in the returned JSON.

Then it builds a dataframe in the form:

------------------------------------------------------ 
| developer_name | developer_id | domain | frequency | 
------------------------------------------------------

This tells us how many issues a particular developer has solved in a particular domain. This information helps determine a developer's expertise. For example, a developer who has solved many issues relating to 'database' can be assumed to be an expert in issues that carry the 'database' label on that particular repository.
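
The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `build_developer_stats`, `predict_domains`, `fake_classifier`, and the sample issues are hypothetical, the random forest classifier is replaced by a stub, and plain dicts stand in for the dataframe rows.

```python
from collections import Counter


def build_developer_stats(closed_issues, predict_domains):
    """Count how often each developer appears per predicted domain.

    closed_issues: list of dicts shaped like GitHub issue JSON.
    predict_domains: callable returning a list of domain labels for one
        issue (hypothetical stand-in for the random forest classifier).
    """
    counts = Counter()
    for issue in closed_issues:
        user = issue.get("user") or {}
        name, dev_id = user.get("login"), user.get("id")
        if name is None:
            continue  # skip issues without an associated user
        for domain in predict_domains(issue):
            counts[(name, dev_id, domain)] += 1

    rows = [
        {"developer_name": n, "developer_id": i, "domain": d, "frequency": f}
        for (n, i, d), f in counts.items()
    ]
    # Sort by developer, then by descending frequency, then by domain name.
    rows.sort(key=lambda r: (r["developer_name"], -r["frequency"], r["domain"]))
    return rows


# Made-up sample data, for illustration only.
sample_issues = [
    {"title": "Backup fails", "user": {"login": "alice", "id": 1}},
    {"title": "Restore fails", "user": {"login": "alice", "id": 1}},
    {"title": "Sort order wrong", "user": {"login": "bob", "id": 2}},
]


def fake_classifier(issue):
    # Stub: the real labels would come from the trained model.
    return ["Databases"] if "fails" in issue["title"] else ["Data Sorting"]


stats = build_developer_stats(sample_issues, fake_classifier)
```

The resulting list of rows has the same four columns as the table above and could be passed to a dataframe constructor if one is needed.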

@abdulrahim2002
Author

Sample output:

| developer_name | developer_id | Domain | frequency |
|----------------|--------------|--------|-----------|
| wanling0000 | 164749591 | Data Structure-Data Sorting | 1 |
| wanling0000 | 164749591 | Event Handling | 1 |
| wanling0000 | 164749591 | Software Development and IT Operations-Configuration Management | 1 |
| tim-bsm | 125408219 | Application-Version Control | 1 |
| tim-bsm | 125408219 | Databases-Backup and Recovery | 1 |
| tim-bsm | 125408219 | Software Development and IT Operations-Configuration Management | 1 |
| subhramit | 74734844 | Event Handling | 16 |
| subhramit | 74734844 | Software Development and IT Operations-Configuration Management | 16 |
| subhramit | 74734844 | Data Structure-Data Sorting | 14 |
| subhramit | 74734844 | Data Structure-Search Algorithms | 2 |
| priyanshu16095 | 147549268 | Data Structure-Data Sorting | 1 |

@abdulrahim2002
Author

Sample input:

{
    "github_token": "#github token",
    "repo_owner": "JabRef",
    "repo_name": "JabRef",
    "openai_key": "#open ai api key",
    "limit": "100"
}

@BenCarter44
Collaborator

BenCarter44 commented Jul 18, 2025

Hello Abdul,

Thank you for submitting the PR. Sorry there were no comments sooner! Got busy with other things :) . Nice feature on adding developer statistics, it will make a great addition to the CoreEngine. I noticed you have overridden the Issue class and redefined the get_issues() function.

I think your changes would work better in the actual Issue class in src/issue_class.py, with dev_name and dev_id added to the original Issue class as optional attributes. The same goes for the get_issues() function. From what I read, it is the same function except that it uses the new Issue class. Therefore, I think your changes would work better in the original get_issues() function, located in src/classifier.py. The function get_developer_stats can remain in the file named correspondingly.

There should not be any negative effects on the rest of the program as these new optional attributes in the Issue class are just for generating statistics from get_developer_stats.
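
A minimal sketch of what adding the optional attributes could look like, assuming the Issue class can be written as a plain dataclass. The fields `number`, `title`, and `body` and the helper `issue_from_json` are hypothetical and only for illustration; the real class in src/issue_class.py may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    # Hypothetical existing fields; the real Issue class may differ.
    number: int
    title: str
    body: str = ""
    # New optional attributes, used only for developer statistics.
    dev_name: Optional[str] = None
    dev_id: Optional[int] = None


def issue_from_json(data: dict) -> Issue:
    """Build an Issue from GitHub-style issue JSON (hypothetical helper)."""
    user = data.get("user") or {}
    return Issue(
        number=data["number"],
        title=data.get("title", ""),
        body=data.get("body") or "",
        dev_name=user.get("login"),
        dev_id=user.get("id"),
    )


# Existing callers that never pass the new attributes keep working.
plain = Issue(number=1, title="Crash on startup")
with_dev = issue_from_json(
    {"number": 2, "title": "Backup fails", "user": {"login": "alice", "id": 7}}
)
```

Because both new attributes default to None, code that constructs an Issue without them is unaffected.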

With these two small changes, it would be ready to merge.

If you need anything you can leave a comment.

-Ben

BenCarter44 self-assigned this Jul 18, 2025