Expose configuration for handling coded values like -666666666 by jeancochrane · Pull Request #75 · datamade/census

jeancochrane · 2019-05-07T00:57:14Z

Summary

Adjust the Client.query method to handle coded values, allowing the user to either:

1. Cast them to null
2. Raise an error

Set option 1 as the default. Implement these choices for the value -666666666, the only one I've encountered in the wild so far.

Closes #72.

Testing instructions

Confirm that all tests pass on Travis.

jeancochrane · 2019-05-07T01:00:22Z

census/core.py

+                raise NullValueException('Unhandled coded value: ', str(v))
+        else:
+            return func(v)
+    return null_wrapper


Another (maybe simpler) way to do this would be to avoid the decorator abstraction and use a "checking" function instead. I liked the decorator since it felt to me like a semantic way of indicating that this checking logic is a high-level modification that can be applied somewhat abstractly to the other type casting functions.
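For comparison, here is a minimal sketch of the two shapes discussed above: a decorator that wraps a casting function, and a plain "checking" function that does the same work at the call site. Only NullValueException, null_wrapper, and the value -666666666 come from the diff; the names CODED_VALUES, is_coded, handle_coded, and checked_cast are hypothetical and are not taken from the library.

```python
import functools

# The only coded value named in this PR; others could be added later.
CODED_VALUES = {-666666666}


class NullValueException(Exception):
    """Stand-in for the exception raised in the diff above."""


def is_coded(v):
    """Return True if v parses to one of the known coded values."""
    try:
        return float(v) in CODED_VALUES
    except (TypeError, ValueError):
        return False


def checked_cast(func, v, cast_nulls=True):
    """Checking-function variant: apply func unless v is a coded value."""
    if is_coded(v):
        if cast_nulls:
            return None
        raise NullValueException('Unhandled coded value: ' + str(v))
    return func(v)


def handle_coded(cast_nulls=True):
    """Decorator variant: wrap a casting function with the same check."""
    def decorator(func):
        @functools.wraps(func)
        def null_wrapper(v):
            return checked_cast(func, v, cast_nulls)
        return null_wrapper
    return decorator


# Hypothetical usage: a casting function with coded-value handling attached.
@handle_coded(cast_nulls=True)
def to_int(v):
    return int(v)
```

Both variants share one code path here, so switching between them later would mostly be a matter of where the check is attached.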

jeancochrane · 2019-05-07T01:01:03Z

census/core.py

    def query(self, fields, geo, year=None, **kwargs):
+        cast_nulls = kwargs.get('cast_nulls', True)
+        if cast_nulls not in [True, False]:
+            raise CensusException('cast_nulls argument must be True or False')


I'm not totally confident in cast_nulls as the best name for the configuration argument that adjusts this behavior. In particular, it doesn't communicate that the alternative is to raise an error. Still, it was the best I had.
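One detail of the check in the hunk above: in Python, 1 == True and 0 == False, so a membership test against [True, False] also accepts the integers 1 and 0. The sketch below shows a stricter variant using isinstance. It is an illustration only; validate_cast_nulls is a hypothetical helper, and CensusException is redefined here as a stand-in for the library's class.

```python
class CensusException(Exception):
    """Stand-in for the library's exception class."""


def validate_cast_nulls(kwargs):
    """Return the cast_nulls flag from kwargs, requiring an actual bool."""
    cast_nulls = kwargs.get('cast_nulls', True)
    # isinstance rejects 0 and 1, which `in [True, False]` would accept.
    if not isinstance(cast_nulls, bool):
        raise CensusException('cast_nulls argument must be True or False')
    return cast_nulls
```

Whether accepting 0 and 1 is a problem is a judgment call; the looser check may be fine if callers are expected to pass truthy integers.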

jeancochrane · 2019-05-07T01:07:09Z

census/core.py

+                            # rest of the row values for context, so flag the
+                            # error and continue the iteration
+                            error = True
+                            result = {header: item}


For additional context: we need to finish this inner loop before raising the error because the results generated by for d in data look something like this:

{'var_name': -666666666, 'state': 42, 'county': 11}

The inner loop for header, cast, item in zip(headers, types, d) processes each of these elements in turn, returning a formatted "row" for the final results. The problem with raising an error inside the inner loop is that at that point we only know one of these header: item pairs (like 'var_name': -666666666 or 'state': 42), and unless you know all of the elements it's impossible to determine which API call raised the error.
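The flag-and-continue pattern described above can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: format_row and strict_int are hypothetical names, and the row is passed as a list of raw values alongside its headers.

```python
CODED_VALUE = -666666666


class NullValueException(Exception):
    """Stand-in for the exception raised in the diff above."""


def strict_int(v):
    """Cast to int, raising if the value is the coded value."""
    if int(v) == CODED_VALUE:
        raise NullValueException('Unhandled coded value: ' + str(v))
    return int(v)


def format_row(headers, types, d):
    """Cast every item in the row, deferring any error until the row is complete."""
    result = {}
    error = False
    for header, cast, item in zip(headers, types, d):
        try:
            result[header] = cast(item)
        except NullValueException:
            # Keep the raw value and keep going, so the final message
            # can include the geography fields that identify the call.
            error = True
            result[header] = item
    if error:
        raise NullValueException(
            'Unhandled coded value in row: {}'.format(result))
    return result
```

Raising only after the loop means the message carries the state and county alongside the offending variable, which is what makes the failing API call identifiable.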

fgregg

Thanks, Jean!

fgregg · 2019-05-17T12:01:57Z

census/tests/test_census.py

+        cast them to null.
+        """
+        # This call should return a value of -666666666
+        return_val = self._client.acs5.state_county_tract('B19081_001E',


Do we want this to be the default behavior? I think the default should be the behavior of cast_nulls=True, but I'm persuadable.

fgregg · 2019-05-17T12:03:20Z

census/tests/test_census.py

+        """
+        Test casting -666666666 values to null.
+        """
+        return_val = self._client.acs5.state_county_tract('B19081_001E',


We should also test that a warning is emitted that a value was cast to a null. Particularly if we decide to make this the default behavior.
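A test along these lines could look roughly like the sketch below. It assumes the cast emits a standard Python warning through warnings.warn, which the PR as shown does not yet do; cast_with_warning is a hypothetical stand-in for the real casting code, and no API call is made.

```python
import unittest
import warnings

CODED_VALUE = -666666666


def cast_with_warning(v):
    """Hypothetical cast: return None for the coded value and warn about it."""
    if int(v) == CODED_VALUE:
        warnings.warn('Coded value {} was cast to null'.format(v), UserWarning)
        return None
    return int(v)


class TestCodedValueWarning(unittest.TestCase):

    def test_warning_emitted_on_null_cast(self):
        # Record warnings instead of printing them, and disable the
        # once-per-location filter so the warning is always captured.
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter('always')
            result = cast_with_warning('-666666666')
        self.assertIsNone(result)
        self.assertEqual(len(caught), 1)
        self.assertTrue(issubclass(caught[0].category, UserWarning))

    def test_no_warning_for_ordinary_value(self):
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter('always')
            result = cast_with_warning('42')
        self.assertEqual(result, 42)
        self.assertEqual(len(caught), 0)
```

On Python 3 the same check can be written more briefly with self.assertWarns(UserWarning); catch_warnings is used here because it also works on Python 2.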

fgregg · 2019-05-17T12:05:05Z

census/core.py

+                raise NullValueException('Unhandled coded value: ', str(v))
+        else:
+            return func(v)
+    return null_wrapper


I think I'd be a bit more comfortable with this architecture if we had more cases than just the -666666666. That said I'm comfortable to go with this approach until we accumulate more cases, and then we can potentially refactor if appropriate.

fgregg · 2019-05-17T12:05:44Z

census/tests/test_census.py

+                                                          cast_nulls=True)
+        self.assertEqual(return_val[0]['B19081_001E'], None)
+
+    def test_bad_cast_nulls_argument(self):


great to have these tests!

jeancochrane added 2 commits May 4, 2019 13:47

Add failing tests for coded values handling

2d41acb

Handle coded values by casting to null or raising an error

648f7c8

Fix data formatting for individual API results in .get

e610ae7

jeancochrane requested a review from fgregg May 13, 2019 22:45

fgregg requested changes May 17, 2019

jeancochrane wants to merge 3 commits into master from feature/jfc/convert-coded-values-to-nulls
