Staging/hi_itn–v2 - Telephone changes to main #377
* Future implementations to date.py - Hindi ITN (#265) * Addition of whitelist and word classes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkins date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Future implementations for date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * pushing rough date code for ref Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Future implementations date.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkinsfile Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone.py-hindi itn Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Telephone.py - Hindi ITN Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Telephone modified tagger and verbalizer Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * telephone tagger with 3,4,5 digit std codes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Further additions - telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Jenkins update Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci * Updated tagger-telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone and Jenkinsfile cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkins Signed-off-by: Tarushi V <tarushiv@nvidia.com> --------- Signed-off-by: Tarushi V <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: RajanPutty <rputty@nvidia.com> * Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date (#306) * Addition of whitelist and word classes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkins date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Hindi 2.0 Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Tarushi V <tarushiv@nvidia.com> Signed-off-by: tarushi2k2 <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: RajanPutty <rputty@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: RajanPutty <rputty@nvidia.com> * Rebase Hindi ITN update: Fix Jenkinsfile for CI (#325) (#329) * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix 
docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> * Future implementations to date.py - Hindi ITN (#265) * Addition of whitelist and word classes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkins date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Future implementations for date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * pushing rough date code for ref Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Future implementations date.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkinsfile Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone.py-hindi itn Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Telephone.py - Hindi ITN Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Telephone modified tagger and verbalizer Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci * telephone tagger with 3,4,5 digit std codes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Further additions - telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Jenkins update Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updated tagger-telephone.py Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Telephone and Jenkinsfile cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkins Signed-off-by: Tarushi V <tarushiv@nvidia.com> --------- Signed-off-by: Tarushi V <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> * Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date (#306) * Addition of whitelist and word classes Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updation of Jenkins date Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Cleanup Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Updation Signed-off-by: Tarushi V <tarushiv@nvidia.com> * Hindi 2.0 Signed-off-by: Tarushi V <tarushiv@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Tarushi V <tarushiv@nvidia.com> Signed-off-by: tarushi2k2 <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: P V RAJAN <rajanv307@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: Revert date change in Jenkinsfile per review Signed-off-by: P V RAJAN <rajanv307@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> Signed-off-by: Tarushi V <tarushiv@nvidia.com> Signed-off-by: tarushi2k2 <tarushiv@nvidia.com> Signed-off-by: RajanPutty <rputty@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: tarushi2k2 <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: P V RAJAN <rajanv307@gmail.com> Signed-off-by: RajanPutty <rputty@nvidia.com> * Hindi ITN Merge Telephone Semiotic Class (#344) * feat(hi): Add Telephone class and all Hindi ITN updates Signed-off-by: RajanPutty <rputty@nvidia.com> * refactor(hi/telephone): Load digits and context from TSV files Addresses review comments on PR #344 by refactoring hardcoded variables to use data loaded from TSV files. - The `hindi_digits` and `english_digits` variables are no longer hardcoded. They are now populated by loading and creating a pynini union of their respective TSV files (`data/numbers/digit.tsv`, `data/numbers/zero.tsv`, `data/telephone/eng_digit.tsv`, `data/telephone/eng_zero.tsv`). - The hardcoded `context` dictionary has been removed. Its values are now loaded from a new `data/telephone/context_cues.tsv` file, matching the existing pattern used for `cc_cues` in the 'en' implementation. 
Signed-off-by: P V RAJAN <rajanv307@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: P V RAJAN <rajanv307@gmail.com> --------- Signed-off-by: RajanPutty <rputty@nvidia.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> Co-authored-by: P V RAJAN <rajanv307@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: RajanPutty <rputty@nvidia.com> * Update HI_TN_CACHE date to 01-06-26-0 in Jenkinsfile Signed-off-by: RajanPutty <rputty@nvidia.com> --------- Signed-off-by: Tarushi V <tarushiv@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: RajanPutty <rputty@nvidia.com> Signed-off-by: tarushi2k2 <tarushiv@nvidia.com> Signed-off-by: P V RAJAN <rajanv307@gmail.com> Co-authored-by: tarushi2k2 <tarushiv@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: P V RAJAN <rajanv307@gmail.com>
graph_year_range = self.year_range
graph_year_range_century = self.year_range + delete_space + self.century

graph_ordinal_century = self.ordinal_century + self.morpho_graph + delete_extra_space + self.century
Check warning
Code scanning / CodeQL
Variable defined multiple times Warning
redefined
Copilot Autofix
AI 2 days ago
In general, to fix “variable defined multiple times” issues where the first assignment is unused, you remove the redundant assignment while keeping the logically correct one (usually the last assignment before use). You must ensure that you are not removing any right-hand-side expression that has side effects; here the right-hand side is a pure composition of Pynini FST expressions and has no observable side effects, and both assignments are identical.
The best fix here is to delete the first assignment to graph_ordinal_century on line 91 and keep the second assignment on line 93, which is the one whose value is actually used later when graph is built. No other code depends on the earlier assignment, and keeping only the second assignment preserves existing functionality entirely. No new imports, methods, or definitions are required; this is purely a small cleanup in nemo_text_processing/inverse_text_normalization/hi/taggers/date.py.
Specifically, within DateFst.__init__ in nemo_text_processing/inverse_text_normalization/hi/taggers/date.py, remove the line:
graph_ordinal_century = self.ordinal_century + self.morpho_graph + delete_extra_space + self.century
that appears just before the identical line, and leave the later one intact.
@@ -88,7 +88,6 @@
  graph_year_range = self.year_range
  graph_year_range_century = self.year_range + delete_space + self.century

- graph_ordinal_century = self.ordinal_century + self.morpho_graph + delete_extra_space + self.century

  graph_ordinal_century = self.ordinal_century + self.morpho_graph + delete_extra_space + self.century
  graph_date_exceptions = self.month + delete_space + pynutil.delete("की") + delete_space + self.day
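The redundancy flagged here can also be checked mechanically. The following is a minimal sketch, assuming only the standard-library `ast` module; the helper name `find_repeated_assignments` is hypothetical, and this is not how CodeQL implements its rule. It reports a simple name that is assigned the same expression twice in one function body with no read of that name in between.

```python
import ast


def find_repeated_assignments(source: str) -> list:
    """Return (name, first_line, second_line) for identical repeated assignments.

    Hypothetical helper: it only inspects top-level statements of each
    function body and ignores control flow.
    """
    findings = []
    tree = ast.parse(source)
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        last_seen = {}  # name -> (line, dump of the right-hand side)
        for stmt in func.body:
            # Any read of a tracked name means its earlier value was used.
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                    last_seen.pop(node.id, None)
            is_simple = (
                isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
            )
            if is_simple:
                name = stmt.targets[0].id
                rhs = ast.dump(stmt.value)
                if name in last_seen and last_seen[name][1] == rhs:
                    findings.append((name, last_seen[name][0], stmt.lineno))
                last_seen[name] = (stmt.lineno, rhs)
    return findings


# Shortened stand-in for the flagged lines in date.py (not the real file).
SAMPLE = '''
def build(self):
    graph_ordinal_century = self.ordinal_century + self.century

    graph_ordinal_century = self.ordinal_century + self.century
    return graph_ordinal_century
'''

print(find_repeated_assignments(SAMPLE))  # [('graph_ordinal_century', 3, 5)]
```

Because the two right-hand sides are identical and side-effect free, deleting either one leaves the value seen by later code unchanged, which is what the suggested one-line removal relies on.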
- from nemo_text_processing.inverse_text_normalization.hi.utils import get_abs_path
- from nemo_text_processing.text_normalization.en.graph_utils import (
+ from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
Check notice
Code scanning / CodeQL
Unused import Note
Import of 'get_abs_path' is not used.
Copilot Autofix
AI 2 days ago
To fix the problem, remove the unused imports so that every imported symbol is referenced somewhere in the module. This eliminates unnecessary dependencies and satisfies the static analysis rule for unused imports.
In this file, the single problematic line is line 19:
from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
Since neither apply_fst nor get_abs_path appears in the provided code, the best minimal fix is to delete this import line entirely. No other code changes, new methods, or additional imports are required. The rest of the imports (for pynini, pynutil, and the various graph utilities) remain unchanged.
@@ -16,7 +16,6 @@
  import pynini
  from pynini.lib import pynutil

- from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
  from nemo_text_processing.text_normalization.en.utils import load_labels
  from nemo_text_processing.text_normalization.hi.graph_utils import (
  INPUT_CASED,
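Before deleting an import it helps to confirm that none of its names are referenced. Here is a rough sketch of such a check, assuming only the standard-library `ast` module; `unused_imports` is a hypothetical helper, `pkg.utils` is a placeholder module name, and the check ignores `__all__`, string annotations, and star imports, so it is looser than the CodeQL query.

```python
import ast


def unused_imports(source: str) -> list:
    """List imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import statement
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported[alias.asname or alias.name.split(".")[0]] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
    # Every identifier that appears as a bare name anywhere in the module.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(name for name in imported if name not in used)


# The source is only parsed, never imported, so the modules need not exist.
SAMPLE = '''
import pynini
from pynini.lib import pynutil
from pkg.utils import apply_fst, get_abs_path

graph = pynutil.delete("x") + pynini.accep("y")
'''

print(unused_imports(SAMPLE))  # ['apply_fst', 'get_abs_path']
```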
- from nemo_text_processing.inverse_text_normalization.hi.utils import get_abs_path
- from nemo_text_processing.text_normalization.en.graph_utils import (
+ from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
+ from nemo_text_processing.text_normalization.en.utils import load_labels
Check notice
Code scanning / CodeQL
Unused import Note
Copilot Autofix
AI 2 days ago
To fix an unused import, you remove the import statement for the unused symbol, leaving all other functionality intact. Since load_labels is never referenced in the shown code, the best fix is to delete the line that imports it.
Concretely, in nemo_text_processing/inverse_text_normalization/hi/taggers/fraction.py, remove line 20:
from nemo_text_processing.text_normalization.en.utils import load_labels
No additional code changes are needed: there are no references to load_labels that would need to be updated, and removing this import will not affect the construction or behavior of FractionFst. The remaining imports (pynini, pynutil, and symbols from graph_utils, as well as apply_fst and get_abs_path) are untouched.
@@ -17,7 +17,6 @@
  from pynini.lib import pynutil

  from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
- from nemo_text_processing.text_normalization.en.utils import load_labels
  from nemo_text_processing.text_normalization.hi.graph_utils import (
  INPUT_CASED,
  INPUT_LOWER_CASED,
  insert_space,
  )
- from nemo_text_processing.inverse_text_normalization.hi.utils import get_abs_path
+ from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
Check notice
Code scanning / CodeQL
Unused import Note
Copilot Autofix
AI 2 days ago
To fix an unused import, remove the imported symbol that is not referenced anywhere in the file while preserving any still-used imports. Here, apply_fst and get_abs_path are imported together, but only apply_fst is reported as unused. The minimal, non‑functional change is to delete apply_fst from the import list and keep get_abs_path intact.
Concretely, in nemo_text_processing/inverse_text_normalization/hi/taggers/measure.py, update line 26 so that it only imports get_abs_path from nemo_text_processing.inverse_text_normalization.hi.utils. No other code changes or new definitions are required.
@@ -23,7 +23,7 @@
  delete_space,
  insert_space,
  )
- from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path
+ from nemo_text_processing.inverse_text_normalization.hi.utils import get_abs_path


  class MeasureFst(GraphFst):
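When only some names on an import line are unused, the line is narrowed rather than deleted. The sketch below shows one way to do that with the standard-library `ast` module (`ast.unparse` needs Python 3.9 or later); `drop_imported_names` is a hypothetical helper, not part of the repository or of Copilot Autofix.

```python
import ast


def drop_imported_names(import_stmt: str, unused: set) -> str:
    """Rewrite one `from ... import ...` statement without the names in `unused`.

    Returns an empty string when every imported name is unused, meaning the
    whole statement can be deleted.
    """
    node = ast.parse(import_stmt).body[0]
    if not isinstance(node, ast.ImportFrom):
        raise ValueError("expected a single 'from ... import ...' statement")
    node.names = [a for a in node.names if (a.asname or a.name) not in unused]
    return ast.unparse(node) if node.names else ""


line = "from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst, get_abs_path"

# Only apply_fst is unused: keep get_abs_path.
print(drop_imported_names(line, {"apply_fst"}))

# Both names unused: nothing is left, so the statement is dropped.
print(repr(drop_imported_names(line, {"apply_fst", "get_abs_path"})))  # ''
```

The same helper accepts a parenthesized multi-line import list, although `ast.unparse` writes the result back on a single line, so a formatter would need to re-wrap it.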
self.fraction = decimal_graph
self.currency = pynutil.insert("currency: \"") + currency_graph + pynutil.insert("\" ")
aur = pynutil.delete("और")
delete_hundred = pynutil.delete("सौ")
Check notice
Code scanning / CodeQL
Unused local variable Note
Copilot Autofix
AI 2 days ago
In general, to fix an unused local variable flagged by static analysis, either remove the assignment (if it has no required side effects) or, if you keep it for clarity or future use, rename the variable to indicate that it is intentionally unused (for example, _ or a name containing unused).
Here, delete_hundred = pynutil.delete("सौ") is unused. The right-hand side is a pure expression building a transducer; there are no external side effects, so we could safely delete the line. To avoid altering even minor internal state or perceived intent, and to match the recommendation, the best minimal-change fix is to rename the variable to something indicating intentional non-use, such as _unused_delete_hundred. This satisfies CodeQL’s rule while preserving the code for potential future use.
Only one line in nemo_text_processing/inverse_text_normalization/hi/taggers/money.py needs to change: line 55, renaming delete_hundred to _unused_delete_hundred. No extra imports or method definitions are required.
@@ -52,7 +52,7 @@
  self.fraction = decimal_graph
  self.currency = pynutil.insert("currency: \"") + currency_graph + pynutil.insert("\" ")
  aur = pynutil.delete("और")
- delete_hundred = pynutil.delete("सौ")
+ _unused_delete_hundred = pynutil.delete("सौ")
  delete_lakh = pynutil.delete("लाख")
  delete_hazar = pynutil.delete("हजार") | pynutil.delete("हज़ार")
  delete_crore = pynutil.delete("करोड़") | pynutil.delete("करोड़")
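The two remedies for an unused local (delete it, or rename it with an "unused" marker) can be illustrated with a small checker. This is a sketch under stated assumptions: it uses only the standard-library `ast` module, `unused_locals` is a hypothetical helper, and treating a leading underscore as "intentionally unused" is a simplification chosen for this example rather than a description of CodeQL's actual rule.

```python
import ast


def unused_locals(source: str, func_name: str) -> list:
    """Names assigned inside `func_name` that are never read there."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            stored, loaded = set(), set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Name):
                    if isinstance(sub.ctx, ast.Store):
                        stored.add(sub.id)
                    elif isinstance(sub.ctx, ast.Load):
                        loaded.add(sub.id)
            # Assumption: a leading underscore marks a deliberately unused name.
            return sorted(
                n for n in stored if n not in loaded and not n.startswith("_")
            )
    raise ValueError(f"no function named {func_name!r}")


# Shortened stand-in for the flagged lines in money.py (not the real file).
BEFORE = '''
def build(pynutil):
    aur = pynutil.delete("और")
    delete_hundred = pynutil.delete("सौ")
    delete_lakh = pynutil.delete("लाख")
    return aur + delete_lakh
'''
AFTER = BEFORE.replace("delete_hundred", "_unused_delete_hundred")

print(unused_locals(BEFORE, "build"))  # ['delete_hundred']
print(unused_locals(AFTER, "build"))   # []
```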
aur = pynutil.delete("और")
delete_hundred = pynutil.delete("सौ")
delete_lakh = pynutil.delete("लाख")
delete_hazar = pynutil.delete("हजार") | pynutil.delete("हज़ार")
Check notice
Code scanning / CodeQL
Unused local variable Note
Copilot Autofix
AI 2 days ago
To fix an unused local variable, either (a) remove the assignment if the value truly isn’t needed, or (b) rename it to an "unused" style name if it is intentionally unused and retained for documentation. Here, the delete_hazar graph fragment isn’t used anywhere in the shown code, and its creation has no side effects (it just builds a Pynini FST). The safest, behavior-preserving change is to remove the delete_hazar assignment line entirely.
Concretely, in nemo_text_processing/inverse_text_normalization/hi/taggers/money.py, within MoneyFst.__init__, delete line 57 defining delete_hazar. No additional methods, imports, or definitions are required.
@@ -54,7 +54,6 @@
  aur = pynutil.delete("और")
  delete_hundred = pynutil.delete("सौ")
  delete_lakh = pynutil.delete("लाख")
- delete_hazar = pynutil.delete("हजार") | pynutil.delete("हज़ार")
  delete_crore = pynutil.delete("करोड़") | pynutil.delete("करोड़")

  graph_currency_decimal = self.fraction + delete_extra_space + self.currency
from nemo_text_processing.inverse_text_normalization.hi.graph_utils import (
DEVANAGARI_DIGIT,
GraphFst,
delete_extra_space,
delete_space,
insert_space,
integer_to_devanagari,
)
Check notice
Code scanning / CodeQL
Unused import Note
Import of 'insert_space' is not used.
Import of 'delete_extra_space' is not used.
Copilot Autofix
AI 2 days ago
To fix the problem, remove unused symbols from the import list so that only actually used names remain. This avoids unnecessary dependencies and improves readability without changing runtime behavior.
Concretely, in nemo_text_processing/inverse_text_normalization/hi/taggers/time.py, adjust the from ...graph_utils import (...) statement so it no longer imports DEVANAGARI_DIGIT, insert_space, or delete_extra_space, while keeping GraphFst, delete_space, and integer_to_devanagari untouched. No other code changes are required, as we are only simplifying the import list. This change will not affect existing functionality as long as these three symbols truly are not referenced elsewhere in this file.
@@ -16,11 +16,8 @@
  from pynini.lib import pynutil

  from nemo_text_processing.inverse_text_normalization.hi.graph_utils import (
- DEVANAGARI_DIGIT,
  GraphFst,
- delete_extra_space,
  delete_space,
- insert_space,
  integer_to_devanagari,
  )
  from nemo_text_processing.inverse_text_normalization.hi.utils import get_abs_path
import pynini
from pynini.lib import pynutil

from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst
Check notice
Code scanning / CodeQL
Unused import Note
Copilot Autofix
AI 2 days ago
To fix the problem, remove the unused import so that the file only imports what it actually uses. This reduces unnecessary coupling and satisfies the static analysis tool.
Concretely, in nemo_text_processing/inverse_text_normalization/hi/verbalizers/fraction.py, delete the line:
from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst
No other changes are required, since apply_fst is not referenced anywhere in the file. All existing functionality will remain intact because the code only relies on pynini, pynutil, and the symbols imported from text_normalization.en.graph_utils.
@@ -16,7 +16,6 @@
  import pynini
  from pynini.lib import pynutil

- from nemo_text_processing.inverse_text_normalization.hi.utils import apply_fst
  from nemo_text_processing.text_normalization.en.graph_utils import NEMO_NOT_QUOTE, NEMO_SPACE, GraphFst, delete_space
Future implementations to date.py - Hindi ITN (Future implementations to date.py - Hindi ITN #265)
Addition of whitelist and word classes
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkins date
Cleanup
Updation
Updation
Future implementations for date
pushing rough date code for ref
Future implementations date.py
Cleanup
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkinsfile
Telephone.py-hindi itn
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Telephone.py - Hindi ITN
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Telephone modified tagger and verbalizer
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
telephone tagger with 3,4,5 digit std codes
Further additions - telephone.py
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Jenkins update
Telephone.py
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updated tagger-telephone.py
Telephone and Jenkinsfile cleanup
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date (Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date #306)
Addition of whitelist and word classes
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkins date
Cleanup
Updation
Updation
Hindi 2.0
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Rebase Hindi ITN update: Fix Jenkinsfile for CI (Fix Jenkinsfile for CI #325) (Rebase Hindi ITN update: Fix Jenkinsfile for CI (#325) #329)
Fix Jenkinsfile for CI (Fix Jenkinsfile for CI #325)
Fix Jenkinsfile for CI
Fix requirements for test
Update paths and docker
Fix docker name
Fix click version
Change path of grammars for sparrowhawk tests
Update paths in sh_test.sh
Update paths
Revert paths
Future implementations to date.py - Hindi ITN (Future implementations to date.py - Hindi ITN #265)
Addition of whitelist and word classes
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkins date
Cleanup
Updation
Updation
Future implementations for date
pushing rough date code for ref
Future implementations date.py
Cleanup
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkinsfile
Telephone.py-hindi itn
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Telephone.py - Hindi ITN
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Telephone modified tagger and verbalizer
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
telephone tagger with 3,4,5 digit std codes
Further additions - telephone.py
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Jenkins update
Telephone.py
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updated tagger-telephone.py
Telephone and Jenkinsfile cleanup
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date (Hindi 2.0: Quarterly Measures, Fraction Exceptions, Changes to Date #306)
Addition of whitelist and word classes
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Updation of Jenkins date
Cleanup
Updation
Updation
Hindi 2.0
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Hindi ITN Merge Telephone Semiotic Class (Hindi ITN Merge Telephone Semiotic Class #344)
feat(hi): Add Telephone class and all Hindi ITN updates
refactor(hi/telephone): Load digits and context from TSV files
Addresses review comments on PR #344 by refactoring hardcoded variables to use data loaded from TSV files.
The hindi_digits and english_digits variables are no longer hardcoded. They are now populated by loading and creating a pynini union of their respective TSV files (data/numbers/digit.tsv, data/numbers/zero.tsv, data/telephone/eng_digit.tsv, data/telephone/eng_zero.tsv).
The hardcoded context dictionary has been removed. Its values are now loaded from a new data/telephone/context_cues.tsv file, matching the existing pattern used for cc_cues in the 'en' implementation.
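The refactor described in this commit replaces in-code constants with data files. As a rough illustration only, the sketch below uses the standard-library `csv` module and a plain dict as a stand-in for the pynini union mentioned in the commit message; the helper names `load_tsv_pairs` and `union_of_tsv`, the sample rows, and the temporary files are all hypothetical and do not reproduce the repository's code or data.

```python
import csv
import tempfile
from pathlib import Path


def load_tsv_pairs(path: Path) -> list:
    """Read (spoken, written) pairs from a two-column TSV file."""
    with open(path, encoding="utf-8", newline="") as handle:
        return [tuple(row) for row in csv.reader(handle, delimiter="\t") if row]


def union_of_tsv(paths) -> dict:
    """Merge several TSV files into one mapping (a dict standing in for an FST union)."""
    merged = {}
    for path in paths:
        merged.update(load_tsv_pairs(path))
    return merged


# Hypothetical sample rows; the real files' contents are not shown on this page.
with tempfile.TemporaryDirectory() as tmp:
    digit = Path(tmp) / "digit.tsv"
    zero = Path(tmp) / "zero.tsv"
    digit.write_text("एक\t१\nदो\t२\n", encoding="utf-8")
    zero.write_text("शून्य\t०\n", encoding="utf-8")
    hindi_digits = union_of_tsv([digit, zero])

print(hindi_digits)
```

Keeping the vocabulary in TSV files means a new digit spelling or context cue is a data change rather than a code change, which is the motivation the commit message gives for matching the existing 'en' pattern.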
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Before your PR is "Ready for review"
Pre checks:
git commit -s to sign.
pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
pytest and Sparrowhawk here.
__init__.py for every folder and subfolder, including data folder which has .TSV files?
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
Copyright 2015 and onwards Google, Inc.. See an example here.
try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.