36 changes: 23 additions & 13 deletions website/www/site/content/en/documentation/programming-guide.md
@@ -4065,14 +4065,19 @@ and restricting it to a particular type. Beam will automatically infer the
schema for PCollections with `NamedTuple` output types. For example:
{{< /paragraph >}}

{{< highlight py >}}
class Transaction(typing.NamedTuple):
  bank: str
  purchase_amount: float

pc = input | beam.Map(lambda ...).with_output_types(Transaction)

{{< highlight java >}}
purchases.apply(Select.fieldNames("shippingAddress.postCode"));
{{< /highlight >}}

{{< highlight py >}}
import apache_beam as beam

purchases | beam.Select(
    postCode=lambda row: row.shippingAddress.postCode
)
{{< /highlight >}}
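
The `NamedTuple` declaration earlier in this hunk can be checked on its own, without building a pipeline. The sketch below uses only the standard library and makes no claim about how Beam infers schemas internally; it just shows the field names and declared types that a `NamedTuple` class exposes. The sample values are made up for illustration.

```python
import typing


class Transaction(typing.NamedTuple):
    bank: str
    purchase_amount: float


# Field names and their declared types, in declaration order.
schema_fields = list(typing.get_type_hints(Transaction).items())

# Illustrative values only; they do not come from the guide.
tx = Transaction(bank="example_bank", purchase_amount=12.5)

print(schema_fields)
print(tx.bank, tx.purchase_amount)
```

Because the field names and types are declared on the class itself, a typed output such as `.with_output_types(Transaction)` gives downstream code everything it needs to know about the row layout.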

{{< paragraph class="language-py" >}}
**beam.Row and Select**
@@ -4085,10 +4090,15 @@ use a lambda that returns instances of `beam.Row`:

{{< highlight py >}}
input_pc = ... # {"bank": ..., "purchase_amount": ...}
-output_pc = input_pc | beam.Map(lambda item: beam.Row(bank=item["bank"],
-                                                      purchase_amount=item["purchase_amount"])
+output_pc = input_pc | beam.Map(
+    lambda item: beam.Row(
+        bank=item["bank"],
+        purchase_amount=item["purchase_amount"]
+    )
+)
{{< /highlight >}}


{{< paragraph class="language-py" >}}
Sometimes it can be more concise to express the same logic with the
[`Select`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Select) transform:
@@ -4215,14 +4225,14 @@ Individual nested fields can be specified using the dot operator. For example, t
shipping address one would write
{{< /paragraph >}}

-{{< highlight java >}}
-purchases.apply(Select.fieldNames("shippingAddress.postCode"));
+{{< highlight py >}}
+import apache_beam as beam
+
+purchases | beam.Select(
+    postCode=lambda row: row.shippingAddress.postCode
+)
{{< /highlight >}}
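
To make the dot-operator idea concrete, here is a minimal standard-library sketch of resolving a dotted path such as `shippingAddress.postCode` against nested rows. It is not Beam's implementation: `select_field` is a hypothetical helper, and the `Purchase` and `ShippingAddress` types, their extra fields, and the sample values are assumptions made for illustration.

```python
import typing
from functools import reduce


class ShippingAddress(typing.NamedTuple):
    streetAddress: str  # assumed field, for illustration
    postCode: str


class Purchase(typing.NamedTuple):
    userId: str  # assumed field, for illustration
    shippingAddress: ShippingAddress


def select_field(row, path):
    """Hypothetical helper: follow a dotted path one attribute at a time."""
    return reduce(getattr, path.split("."), row)


purchase = Purchase("user-1", ShippingAddress("1 Example St", "94105"))
post_code = select_field(purchase, "shippingAddress.postCode")
print(post_code)
```

The lambda form in the diff, `lambda row: row.shippingAddress.postCode`, performs the same two attribute lookups directly.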

-<!-- {{< highlight py >}}
-input_pc = ... # {"user_id": ..., "shipping_address": "post_code": ..., "bank": ..., "purchase_amount": ...}
-output_pc = input_pc | beam.Select(post_code=lambda item: str(item["shipping_address.post_code"]))
-{{< /highlight >}} -->
##### **Wildcards**

{{< paragraph class="language-py" >}}