Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions src/main/java/com/whichclasses/statistics/ConcreteAggregator.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
package com.whichclasses.statistics;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import com.google.common.collect.Maps;
import com.google.inject.Inject;
import com.whichclasses.model.Course;
import com.whichclasses.model.DataSource;
import com.whichclasses.model.Department;
import com.whichclasses.model.proto.TceRating.Question;

public class ConcreteAggregator implements DataAggregator {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got a better name than ConcreteAggregator?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal convention recently has been to name classes 'Standard[interface name]" for interfaces which only have one real implementation (not including test classes). Any better alternatives?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For [interface] and single-implementation, I prefer the "[Interface]Impl" convention. But ideally I'd prefer a descriptive name, particularly since it's an "aggregator" and should describe what kind of aggregation it is doing. RankingDataAggregator?

Note that if you don't have a reason to split the interface/implementation out, you can also just combine them into one class. Guice isn't restricted to injecting interfaces, so injecting a concrete class could make sense if you don't expect to have other aggregators going forward.

There are definitely a few cases in the codebase right now of an interface with a single implementation, but most of those have an intention (or at least a plausible possibility) to have multiple future implementations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are definitely a few cases in the codebase right now of an interface with a single implementation, but most of those have an intention (or at least a plausible possibility) to have multiple future implementations.

Also unit tests. :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can change DataAggregator so that it just implements a single "aggregate" method. This way we can have sorting aggregators, grouping aggregators, etc. Maybe genericize DataAggregator with the return type of the aggregate method?

public class RankingDataAggregator implements DataAggregator<List>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable, what would the return type be though? Why not always
a Rateable?

On Tue, May 5, 2015 at 12:09 PM, Sean Topping notifications@github.com
wrote:

In src/main/java/com/whichclasses/statistics/ConcreteAggregator.java
#3 (comment):

+package com.whichclasses.statistics;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+
+import com.google.common.collect.Maps;
+import com.google.inject.Inject;
+import com.whichclasses.model.Course;
+import com.whichclasses.model.DataSource;
+import com.whichclasses.model.Department;
+import com.whichclasses.model.proto.TceRating.Question;
+
+public class ConcreteAggregator implements DataAggregator {

Maybe we can change DataAggregator so that it just implements a single
"aggregate" method. This way we can have sorting aggregators, grouping
aggregators, etc. Maybe genericize DataAggregator with the return type of
the aggregate method?

public class RankingDataAggregator implements DataAggregator>


Reply to this email directly or view it on GitHub
https://github.com/cleichner/whichclasses/pull/3/files#r29702022.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the intent of the class is to answer queries such as:
"Give me a list of rated courses in CSC."
"Give me a list of rated professors in AME."
"Give me a list of rated semesters for ACCT." (how are the classes in ACCT changing over time?)

Come to think of it, maybe the aggregator should always just return a list of Rateable objects. Then you can do sorting and selection as needed. Basically it's just a way to convert a large collection of individual class ratings into chunked rated collections (e.g. courses, departments, instructors, semesters, etc). Basically a GROUP BY implementation.


DataSource dataSource;
Map<String, Map<Question, Double>> courseScoreDataMap = Maps.newHashMap();

@Inject
public ConcreteAggregator(DataSource dataSource) {
this.dataSource = dataSource;
}

@Override
public List<Course> getRankedCourses(String deptId, Question question) {
Department dept = dataSource.getDepartmentList().getChildren().get(deptId);
Collection<Course> courses = dept.getChildren().values();
List<Course> courseList = new ArrayList<Course>(courses);
Collections.sort(courseList, new SimpleCourseComparator(question));
return courseList;
}

}
10 changes: 10 additions & 0 deletions src/main/java/com/whichclasses/statistics/DataAggregator.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
package com.whichclasses.statistics;

import java.util.List;

import com.whichclasses.model.Course;
import com.whichclasses.model.proto.TceRating.Question;

public interface DataAggregator {
public List<Course> getRankedCourses(String deptId, Question question);
}
9 changes: 9 additions & 0 deletions src/main/java/com/whichclasses/statistics/Rateable.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
package com.whichclasses.statistics;

import com.whichclasses.model.proto.TceRating;

public interface Rateable {
public int getRatingCount(TceRating.Question question);
public double getAverageScore(TceRating.Question question);
public double getWilsonScore(TceRating.Question question);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does Wilson score make sense on an individual Rateable item, or only on a RateableCollection?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense in either context, all it requires is a collection of ratings (which can belong to an individual class or a collection of classes). You can rate a course by grabbing all class ratings and then calculating a Wilson score from that set, or you can calculate individual Wilson scores for each class and then aggregate the class results in some other way.

}
65 changes: 65 additions & 0 deletions src/main/java/com/whichclasses/statistics/RateableCollection.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
package com.whichclasses.statistics;

import java.util.Collection;
import java.util.Map;

import com.google.common.collect.Maps;
import com.whichclasses.model.proto.TceRating;
import com.whichclasses.model.proto.TceRating.Question;

public class RateableCollection implements Rateable {

private Collection<? extends Rateable> children;
private Map<Question, Double> questionToScore = Maps.newHashMap();
private Map<Question, Double> questionToWilsonScore = Maps.newHashMap();
private Map<Question, Integer> questionToNumRatings = Maps.newHashMap();

public RateableCollection( Collection<? extends Rateable> children ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: remove spaces after "(" and before ")"

this.children = children;
calculateStatistics();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be done more lazily? (e.g. when the first value is requested)

}

private void calculateStatistics() {
for(Question question : Question.values()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: space between "for" and "(" tokens (here and below, same for "if")

double qualityPoints = 0;
int numRatings = 0;
for(Rateable rateable : children) {
qualityPoints += rateable.getAverageScore(question) * rateable.getRatingCount(question);
numRatings += rateable.getRatingCount(question);
}
double averageScore = numRatings != 0 ? qualityPoints / numRatings : 0;
double wilsonScore = calculateWilsonScore((averageScore - 1) / 4, numRatings) * 4 + 1;
questionToScore.put(question, averageScore);
questionToWilsonScore.put(question, wilsonScore);
questionToNumRatings.put(question, numRatings);
}
}

@Override
public double getAverageScore(TceRating.Question question) {
return questionToScore.get(question);
}

@Override
public int getRatingCount(TceRating.Question question) {
return questionToNumRatings.get(question);
}

@Override
public double getWilsonScore(TceRating.Question question) {
return questionToWilsonScore.get(question);
}

/**
* Calculates the lower bound of the 95% confidence interval of a normalized data set of data
* @param p - Normalized average of sample data set
* @param n - Number of samples
* @return The normalized Wilson score of a normalized data set
*/
private double calculateWilsonScore(double p, int n) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static

if(n == 0) return 0;
double z = 1.96;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this value (1.96) a static final constant naming the confidence interval

return (p + z*z/(2*n) - z*Math.sqrt(p*(1-p)/n + z*z/(4*n*n)))/(1 + z*z/n);
}

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
package com.whichclasses.statistics;

import java.util.Comparator;

import com.whichclasses.model.Course;
import com.whichclasses.model.proto.TceRating.Question;

public class SimpleCourseComparator implements Comparator<Course> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: AverageCourseComparator (to go with using getAverageScore)


private Question question;

public SimpleCourseComparator(Question question) {
this.question = question;
}

@Override
public int compare(Course o1, Course o2) {
double diff = o2.getAverageScore(question) - o1.getAverageScore(question);
return (int) (diff < 0 ? Math.floor(diff) : Math.ceil(diff));
}


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra blank line

}