Open
Description
The function for filtering out non-English apps doesn't handle all cases. In the Apple Store dataset, there are app names that consist of only three characters, all of them non-ASCII. The given function classifies these as English apps because the count of non-ASCII characters is not larger than 3.
Proposed solution:
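def is_english(string):
    # Count the characters outside the ASCII range (code point > 127).
    non_ascii = 0
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    # More than three non-ASCII characters: treat the name as non-English.
    if non_ascii > 3:
        return False
    # Every character is non-ASCII, e.g. a three-character name
    # (an empty string also lands here, since 0 == len("")).
    if non_ascii == len(string):
        return False
    return True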