Sourcery refactored master branch #15
base: master
Conversation
     # Write the word counts to result.txt (iterate over the dictionary)
     for (k, v) in num_dict.items():
-        open('data/result.txt', 'a+').write(str(k) + ' ' + str(v) + '\n')  # convert k and v to str
+        open('data/result.txt', 'a+').write(f'{str(k)} {str(v)}' + '\n')
Function com_tf refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)

This removes the following comments (why?):
# convert k and v to str
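For readers unfamiliar with the use-fstring-for-concatenation rule, here is a minimal sketch of the three equivalent spellings. The names k and v are hypothetical and not tied to num_dict in this repository.

```python
# Illustrative only: hypothetical values, not from this project.
k, v = "python", 42

# String concatenation forces explicit str() calls:
line_concat = str(k) + ' ' + str(v) + '\n'

# "%"-interpolation separates the template from the values:
line_percent = "%s %s\n" % (k, v)

# An f-string keeps the values inline and converts them implicitly:
line_fstring = f"{k} {v}\n"

assert line_concat == line_percent == line_fstring
```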
-    path1 = path + 'data/title_and_abs/'
-    newpath = path + "data/pro_keyword/"
+    path1 = f'{path}data/title_and_abs/'
+    newpath = f"{path}data/pro_keyword/"
Lines 20-21 refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
-    data_source = open(file_import_url, 'r')
-    data = data_source.readline()
-    word_in_afile_stat = {}
-    word_in_allfiles_stat = {}
-    files_num = 0
-    while data != "":  # process the file pro_res.txt
-        data_temp_1 = data.strip("\n").split("\t")  # file name and key words of a file
-        data_temp_2 = data_temp_1[1].split(",")  # key words of a file
-        file_name = data_temp_1[0]
-        data_temp_len = len(data_temp_2)
-        files_num += 1
-        data_dict = {}
-        data_dict.clear()
-        for word in data_temp_2:
-            if word not in word_in_allfiles_stat:
-                word_in_allfiles_stat[word] = 1
-                data_dict[word] = 1
-            else:
-                if word not in data_dict:  # if this word has not appeared in this file before
+    with open(file_import_url, 'r') as data_source:
+        data = data_source.readline()
+        word_in_afile_stat = {}
+        word_in_allfiles_stat = {}
+        files_num = 0
+        while data != "":  # process the file pro_res.txt
+            data_temp_1 = data.strip("\n").split("\t")  # file name and key words of a file
+            data_temp_2 = data_temp_1[1].split(",")  # key words of a file
+            file_name = data_temp_1[0]
+            data_temp_len = len(data_temp_2)
+            files_num += 1
+            data_dict = {}
+            data_dict.clear()
+            for word in data_temp_2:
+                if word not in word_in_allfiles_stat:
+                    word_in_allfiles_stat[word] = 1
+                    data_dict[word] = 1
+                elif word not in data_dict:  # if this word has not appeared in this file before
+                    word_in_allfiles_stat[word] += 1
+                    data_dict[word] = 1

-                    if not word_in_afile_stat.has_key(file_name):
-                        word_in_afile_stat[file_name] = {}
-                    if not word_in_afile_stat[file_name].has_key(word):
-                        word_in_afile_stat[file_name][word] = []
-                        word_in_afile_stat[file_name][word].append(data_temp_2.count(word))
-                        word_in_afile_stat[file_name][word].append(data_temp_len)
-        data = data_source.readline()
-    data_source.close()
+                if not word_in_afile_stat.has_key(file_name):
+                    word_in_afile_stat[file_name] = {}
+                if not word_in_afile_stat[file_name].has_key(word):
+                    word_in_afile_stat[file_name][word] = [data_temp_2.count(word), data_temp_len]
+            data = data_source.readline()
     # filelist = os.listdir(newpath2)  # get all files under the current path
     TF_IDF_last_result = []
     if (word_in_afile_stat) and (word_in_allfiles_stat) and (files_num != 0):
-        for filename in word_in_afile_stat.keys():
+        for filename, value in word_in_afile_stat.items():
             TF_IDF_result = {}
             TF_IDF_result.clear()
-            for word in word_in_afile_stat[filename].keys():
+            for word in value.keys():
Function TF_IDF_Compute refactored with the following changes:
- Use with when opening file to ensure closure [×2] (ensure-file-closed)
- Merge else clause's nested if statement into elif (merge-else-if-into-elif)
- Merge append into list declaration [×2] (merge-list-append)
- Use items() to directly unpack dictionary values (use-dict-items)
- Remove unnecessary call to keys() (remove-dict-keys)
- Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] (remove-redundant-slice-index)
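A minimal sketch of the ensure-file-closed and use-dict-items patterns, with a hypothetical file name (words.txt) and counting logic that are not taken from this repository. Note that the refactored hunk above still calls dict.has_key, which exists only in Python 2; under Python 3 the equivalent membership test is the in operator, shown in the last lines of the sketch.

```python
from collections import defaultdict

counts = defaultdict(int)

# The with-block closes the file even if an exception is raised inside it.
with open("words.txt", "r", encoding="utf-8") as handle:
    for line in handle:
        for word in line.split():
            counts[word] += 1

# items() yields (key, value) pairs directly, avoiding a second
# lookup such as counts[word] inside the loop body.
for word, count in counts.items():
    print(f"{word}\t{count}")

# Python 3 replacement for the Python-2-only dict.has_key(k):
if "example" in counts:
    print("seen it")
```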
-    path = base_path + 'data/computer/'  # raw data
-    path1 = base_path + 'data/title_and_abs/'  # processed titles and abstracts
-    newpath = base_path + 'data/pro_keyword/'
-    newpath2 = base_path + 'data/keyword/'
+    path = f'{base_path}data/computer/'
+    path1 = f'{base_path}data/title_and_abs/'
+    newpath = f'{base_path}data/pro_keyword/'
+    newpath2 = f'{base_path}data/keyword/'
Lines 17-20 refactored with the following changes:
- Use f-string instead of string concatenation [×4] (use-fstring-for-concatenation)

This removes the following comments (why?):
# processed titles and abstracts
# raw data
-            # print b
             if b is None or b.string is None:
                 continue
-            else:
-                abstracts.extend(soup.title.stripped_strings)
-                s = b.string
-                abstracts.extend(s.encode('utf-8'))
-                f = open(path1 + filename + ".txt", "w+")  # write to a txt file
+            abstracts.extend(soup.title.stripped_strings)
+            s = b.string
+            abstracts.extend(s.encode('utf-8'))
+            with open(path1 + filename + ".txt", "w+") as f:
                 for i in abstracts:
                     f.write(i)
-                f.close()
-                abstracts = []
+            abstracts = []
Function get_text refactored with the following changes:
- Remove unnecessary else after guard condition (remove-unnecessary-else)
- Use with when opening file to ensure closure [×2] (ensure-file-closed)

This removes the following comments (why?):
# write to a txt file
# put the resulting unprocessed text into the pro_keyword folder
# print b
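To illustrate remove-unnecessary-else: once a guard condition returns or continues, the else adds nothing but nesting. A minimal sketch with hypothetical helpers (not from this repository):

```python
def process_with_else(record):
    if record is None:
        return None
    else:                      # the else only adds a level of nesting
        cleaned = record.strip()
        return cleaned.lower()

def process_with_guard(record):
    if record is None:         # guard condition: bail out early
        return None
    cleaned = record.strip()   # the happy path stays at the top level
    return cleaned.lower()

assert process_with_else("  ABC  ") == process_with_guard("  ABC  ") == "abc"
```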
-    features = [text_len, isHasSH]
-    return features
+    return [text_len, isHasSH]
Function get_feature refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
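A quick sketch of inline-immediately-returned-variable, mirroring the shape of get_feature above with hypothetical inputs:

```python
def get_feature_verbose(text):
    features = [len(text), "#" in text]   # bound to a name, then returned
    return features

def get_feature_inline(text):
    return [len(text), "#" in text]       # same value, no intermediate name

assert get_feature_verbose("a#b") == get_feature_inline("a#b") == [3, True]
```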
-    print(X[0:10])
-    print(Y[0:10])
+    print(X[:10])
+    print(Y[:10])
Function load_data refactored with the following changes:
- Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] [×2] (remove-redundant-slice-index)
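Slicing defaults make the explicit endpoints redundant; an illustrative list (not project data):

```python
data = list(range(20))

assert data[0:10] == data[:10]            # the leading 0 can be dropped
assert data[10:len(data)] == data[10:]    # the trailing len(data) can be dropped
```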
-if __name__ == '__main__':
-    pass
+pass
Lines 48-49 refactored with the following changes:
- Remove redundant conditional (remove-redundant-if)
     data_size = len(data)
     num_batches_per_epoch = int(len(data)/batch_size) + 1
-    for epoch in range(num_epochs):
+    for _ in range(num_epochs):
Function batch_iter refactored with the following changes:
- Replace unused for index with underscore (for-index-underscore)
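When the loop variable is never read, the underscore signals that intent. A tiny sketch with a hypothetical epoch count:

```python
results = []
for _ in range(3):            # hypothetical num_epochs = 3; the index is unused
    results.append("one pass over the shuffled data")

assert len(results) == 3
```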
-        raise ValueError("Linear is expecting 2D arguments: %s" % str(shape))
+        raise ValueError(f"Linear is expecting 2D arguments: {str(shape)}")
     if not shape[1]:
-        raise ValueError("Linear expects shape[1] of arguments: %s" % str(shape))
+        raise ValueError(f"Linear expects shape[1] of arguments: {str(shape)}")
Function linear refactored with the following changes:
- Replace interpolated string formatting with f-string [×2] (replace-interpolation-with-fstring)
         pooled_outputs = []
         for filter_size, num_filter in zip(filter_sizes, num_filters):
-            with tf.name_scope("conv-maxpool-%s" % filter_size):
+            with tf.name_scope(f"conv-maxpool-{filter_size}"):
Function TextCNN.__init__ refactored with the following changes:
- Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)
| print("\nParameters:") | ||
| for attr, value in sorted(FLAGS.__flags.iteritems()): | ||
| print("{}={}".format(attr.upper(), value)) | ||
| print(f"{attr.upper()}={value}") |
Lines 36-46 refactored with the following changes:
- Replace call to format with f-string (use-fstring-for-formatting)
-    f2 = open('%s.txt' % item, 'a+')
-    for (k, v) in data_dict.items():
-        f2.write(v + ',' + k + ' ' + '\n')
-    f2.close()
+    with open(f'{item}.txt', 'a+') as f2:
+        for (k, v) in data_dict.items():
+            f2.write(v + ',' + k + ' ' + '\n')
Function get_text refactored with the following changes:
- Use with when opening file to ensure closure (ensure-file-closed)
- Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)
-        # print (files)
-        f = open(base_path + files, 'r')
-        text = (f.read().decode('GB2312', 'ignore').encode('utf-8'))
-        salt = ''.join(random.sample(string.ascii_letters + string.digits, 8))  # generate a random file name
-        f2 = open("C:/Users/kaifun/Desktop/ass_TIP/TextInfoExp/Part2_Text_Classify/test3/" + salt + '.txt', 'w')
-        f2.write(text)
-        f3.write(salt + ' ' + 'e' + '\n')
-        f.close()
+        with open(base_path + files, 'r') as f:
+            text = (f.read().decode('GB2312', 'ignore').encode('utf-8'))
+        salt = ''.join(random.sample(string.ascii_letters + string.digits, 8))  # generate a random file name
+        f2 = open(
+            f"C:/Users/kaifun/Desktop/ass_TIP/TextInfoExp/Part2_Text_Classify/test3/{salt}.txt",
+            'w',
+        )
+        f2.write(text)
+        f3.write(f'{salt} e' + '\n')
Function trans_text refactored with the following changes:
- Use with when opening file to ensure closure (ensure-file-closed)
- Use f-string instead of string concatenation [×4] (use-fstring-for-concatenation)

This removes the following comments (why?):
# print (files)
-        f.write(str(test_name[i]) + ' ' + str(result[i]) + '\n')
+        f.write(f'{str(test_name[i])} {str(result[i])}' + '\n')
Function get_classify refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
-        if judgement != "":
-            return 4, judgement
-
-        return 0, ""
+        return (4, judgement) if judgement != "" else (0, "")
Function DictClassifier.__analyse_word refactored with the following changes:
- Lift code into else after jump in control flow (reintroduce-else)
- Replace if statement with if expression (assign-if-exp)
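A small sketch of assign-if-exp, standing in for the return shape of __analyse_word above (the function name and input are hypothetical):

```python
def analyse(judgement):
    # The multi-line if/return pair collapses into one conditional expression.
    return (4, judgement) if judgement != "" else (0, "")

assert analyse("negative") == (4, "negative")
assert analyse("") == (0, "")
```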
-        if match is not None:
-            pattern = {"key": "要的是…给的是…", "value": 1}
-            return pattern
-        return ""
+        return {"key": "要的是…给的是…", "value": 1} if match is not None else ""
Function DictClassifier.__is_clause_pattern1 refactored with the following changes:
- Lift code into else after jump in control flow (reintroduce-else)
- Replace if statement with if expression (assign-if-exp)
- Inline variable that is immediately returned (inline-immediately-returned-variable)
-            conjunction = {"key": the_word, "value": self.__conjunction_dict[the_word]}
-            return conjunction
+            return {"key": the_word, "value": self.__conjunction_dict[the_word]}
Function DictClassifier.__is_word_conjunction refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
-            punctuation = {"key": the_word, "value": self.__punctuation_dict[the_word]}
-            return punctuation
+            return {"key": the_word, "value": self.__punctuation_dict[the_word]}
Function DictClassifier.__is_word_punctuation refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
-            output += "Sub-clause" + str(i) + ": "
-            clause = comment_analysis["su-clause" + str(i)]
+            output += f"Sub-clause{str(i)}: "
+            clause = comment_analysis[f"su-clause{str(i)}"]
Function DictClassifier.__output_analysis refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
-        if match is not None and len(self.__split_sentence(match.group(2))) <= 2:
-            to_delete = []
-            for i in range(len(the_clauses)):
-                if the_clauses[i] in match.group(2):
-                    to_delete.append(i)
-            if len(to_delete) > 0:
-                for i in range(len(to_delete)):
-                    the_clauses.remove(the_clauses[to_delete[0]])
-                the_clauses.insert(to_delete[0], match.group(2))
+        if match is not None and len(self.__split_sentence(match[2])) <= 2:
+            if to_delete := [
+                i for i in range(len(the_clauses)) if the_clauses[i] in match[2]
+            ]:
+                for item in to_delete:
+                    the_clauses.remove(the_clauses[to_delete[0]])
+                the_clauses.insert(to_delete[0], match[2])

         # detect the hypothetical-mood pattern "要是|如果……就好了" ("if only ... it would be fine/perfect")
         pattern = re.compile(r"([,%。、!;??,!~~.… ]*)([\u4e00-\u9fa5]*?(如果|要是|"
                              r"希望).+就[\u4e00-\u9fa5]+(好|完美)了[,。;!%、??,!~~.… ]+)")
         match = re.search(pattern, the_sentence.strip())
-        if match is not None and len(self.__split_sentence(match.group(2))) <= 3:
+        if match is not None and len(self.__split_sentence(match[2])) <= 3:
             to_delete = []
             for i in range(len(the_clauses)):
-                if the_clauses[i] in match.group(2):
+                if the_clauses[i] in match[2]:
                     to_delete.append(i)
-            if len(to_delete) > 0:
-                for i in range(len(to_delete)):
+            if to_delete:
+                for item_ in to_delete:
                     the_clauses.remove(the_clauses[to_delete[0]])
-                the_clauses.insert(to_delete[0], match.group(2))
+                the_clauses.insert(to_delete[0], match[2])
Function DictClassifier.__divide_sentence_into_clauses refactored with the following changes:
- Replace index in for loop with direct reference [×2] (for-index-replacement)
- Replace m.group(x) with m[x] for re.Match objects [×6] (use-getitem-for-re-match-groups)
- Use named expression to simplify assignment and conditional (use-named-expression)
- Convert for loop into list comprehension (list-comprehension)
- Simplify sequence length comparison [×2] (simplify-len-comparison)
- Replace unused for index with underscore [×2] (for-index-underscore)
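A minimal sketch combining two of the rules above: the named expression (walrus operator, Python 3.8+) and match[x] as shorthand for match.group(x) (Python 3.6+). The pattern and sentence are illustrative; the real clause-splitting regex is shown in the diff above.

```python
import re

pattern = re.compile(r"(\w+) likes (\w+)")
sentence = "alice likes pears"

# match[2] is shorthand for match.group(2) on re.Match objects.
if (match := pattern.search(sentence)) is not None:
    # The named expression binds `found` and tests its truthiness in one step,
    # replacing the separate "build a list, then check len() > 0" pattern.
    if found := [i for i, word in enumerate(sentence.split()) if word in match[2]]:
        print(match[2], found)   # prints: pears [2]
```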
-        clauses = [''.join(x) for x in zip(split_clauses, punctuations)]
-
-        return clauses
+        return [''.join(x) for x in zip(split_clauses, punctuations)]
Function DictClassifier.__split_sentence refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
-        with open(self.__root_filepath + "phrase_dict.txt", "r", encoding="utf-8") as f:
+        with open(f"{self.__root_filepath}phrase_dict.txt", "r", encoding="utf-8") as f:
Function DictClassifier.__get_phrase_dict refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)
- Remove unnecessary else after guard condition (remove-unnecessary-else)
| f.write("%s" % info) | ||
| f.write(f"{info}") |
Function DictClassifier.__write_runout_file refactored with the following changes:
- Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)
-        sorted_distances = distances.argsort()
-
-        return sorted_distances
+        return distances.argsort()
Function KNNClassifier.__get_sorted_distances refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
     a = WaimaiCorpus()
     a = Waimai2Corpus()
     a = HotelCorpus()
-    pass
Function test_corpus refactored with the following changes:
- Remove redundant pass statement (remove-redundant-pass)
-    pass


 if __name__ == "__main__":
     pass
Lines 177-177 refactored with the following changes:
- Remove redundant pass statement (remove-redundant-pass)
-            return [word for word in words[:num]]
-        else:
-            return [word[0] for word in words[:num]]
+        return list(words[:num]) if need_score else [word[0] for word in words[:num]]
Function ChiSquare.best_words refactored with the following changes:
- Replace if statement with if expression (assign-if-exp)
- Replace identity comprehension with call to collection constructor (identity-comprehension)
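A comprehension that only copies each element is an identity comprehension; the list() constructor says the same thing more directly. The sample word/score pairs below are hypothetical:

```python
words = [("good", 0.9), ("bad", 0.8), ("ok", 0.1)]
num = 2

assert [word for word in words[:num]] == list(words[:num])
assert [word[0] for word in words[:num]] == ["good", "bad"]
```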
-        if type(self.k) == int:
-            k = "%s" % self.k
-        else:
-            k = "-".join([str(i) for i in self.k])
-
+        k = f"{self.k}" if type(self.k) == int else "-".join([str(i) for i in self.k])
         print("KNNClassifier")
         print("---" * 45)
-        print("Train num = %s" % self.train_num)
-        print("Test num = %s" % self.test_num)
-        print("K = %s" % k)
+        print(f"Train num = {self.train_num}")
+        print(f"Test num = {self.test_num}")
+        print(f"K = {k}")
Function Test.test_knn refactored with the following changes:
- Replace if statement with if expression (assign-if-exp)
- Replace interpolated string formatting with f-string [×4] (replace-interpolation-with-fstring)
- Move assignment closer to its usage within a block (move-assign-in-block)
- Convert for loop into list comprehension (list-comprehension)
| print("BayesClassifier is testing ...") | ||
| for data in self.test_data: | ||
| classify_labels.append(bayes.classify(data)) | ||
| classify_labels = [bayes.classify(data) for data in self.test_data] |
Function Test.test_bayes refactored with the following changes:
- Replace interpolated string formatting with f-string [×2] (replace-interpolation-with-fstring)
- Move assignment closer to its usage within a block (move-assign-in-block)
- Convert for loop into list comprehension (list-comprehension)
This is an automatic holiday reply from QQ Mail.
Hello, your message has been received; I will reply to you as soon as possible.
Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 0.57%.

Here are some functions in these files that still need a tune-up:

Legend and Explanation

The emojis denote the absolute quality of the code. The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request. Please see our documentation here for details on how these metrics are calculated. We are actively working on this report - lots more documentation and extra metrics to come! Help us improve this quality report!
Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy. See our documentation here.

Run Sourcery locally
Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line
To manually merge these changes, make sure you're on the master branch, then run:

Help us improve this pull request!