
Commit 4c891f4

Refinement of decentralized coordination blog post
1 parent 43f4828 commit 4c891f4

1 file changed

+36
-19
lines changed

@@ -1,36 +1,53 @@
11
---
22
slug: decentralized-coordination
3-
title: "Timing Challenges with Decentralized Coordination"
3+
title: "Consistency and Availability Challenges with Decentralized Coordination"
44
authors: [fra-p, eal, rcakella]
55
tags: [lingua franca, federation, decentralized]
66
---
77

8-
[Distributed applications](/docs/writing-reactors/distributed-execution) may create trouble to meet timing constraints expressed as [deadlines](/docs/writing-reactors/deadlines), especially if the coordination of the federation is [decentralized](/docs/writing-reactors/distributed-execution#decentralized-coordination).
9-
8+
The design of [distributed applications](/docs/writing-reactors/distributed-execution) in Lingua Franca requires care, particularly if the coordination of the federation is [decentralized](/docs/writing-reactors/distributed-execution#decentralized-coordination).
109

1110
Consider the above Lingua Franca implementation of an automatic emergency braking system, one of the most critical ADAS systems which modern cars are equipped with.
12-
The controller system reads data coming from two sensors, a lidar and a radar, and uses both to detect if objects or pedestrian cross the path of the car, thus performing sensor fusion.
11+
The controller system reads data coming from two sensors, a lidar and a radar, and uses both to detect if objects or pedestrians cross the path of the car, thus performing sensor fusion.
1312
When either of the two signals the presence of a close object, the controller triggers the brake to stop the car and avoid crashing into it.
1413

15-
The lidar sensor has a higher sampling frequency, while the radar is slower.
16-
Their deadline is equal to their period and is enforced using dedicated deadline checking reactors, following the guidelines to [work with deadlines](/blog/deadlines).
17-
Meeting deadlines is crucial in this application, as we want to make sure that each single sensor data is processed by the automatic emergency braking system before new inputs from the same sensor arrive.
14+
The lidar sensor has a higher sampling frequency, while the radar is slower, and this is reflected by the timer in the corresponding reactors.
15+
Each sensor's deadline is equal to its period and is enforced using dedicated deadline-checking reactors, following the guidelines on how to [work with deadlines](/blog/deadlines).
16+
Availability is a crucial property of this application, because we want the automatic emergency braking system to brake as fast as possible when a close object is detected. Consistency is also necessary: sensor fusion happens with sensor data produced at the same logical time, so in-order data processing is critical.
1817

1918
The application is implemented as a federated program with decentralized coordination, which means that the advancement of logical time in each single federate is not subject to approval from any centralized entities, but it is done locally based on the input it receives from the other federates.
2019
Consistency problems may arise when a federate receives data from two or more federates, as is the case for the automatic emergency braking reactor.
21-
As an example, the controller expects to receive input from both sensors at times 0ms, 100ms, 200ms, etc. Let's consider the case where the remote connection between the controller and the radar has a slightly larger delay than that between the controller and the lidar. Hence, the lidar input will arrive slightly earlier than the radar one. When the controller receives the lidar input, what should it do? Should it process the data immediately, or should it wait for the radar input to come?
20+
As an example, the controller expects to receive input from both sensors at times 0ms, 100ms, 200ms, etc. Let's consider the case where the remote connection between the controller and the radar has a slightly larger delay than that between the controller and the lidar. The lidar input will arrive slightly earlier than the radar one. When the controller receives the lidar input, should it process the data immediately, or should it wait for the radar input to come? Sensor fusion requires consistency: if the controller processes the input from the lidar and the radar data arrives afterwards, the control action it has already computed did not take both sensors into account, even though it should have.
21+
22+
The desired behavior with simultaneous inputs is highly dependent on the application under analysis, and Lingua Franca lets you customize it. Each federate has a parameter called [STA (safe-to-advance)](/docs/writing-reactors/distributed-execution#safe-to-advance-sta) that controls how long the federate should wait for inputs from other federates before processing an input it has just received.
23+
More precisely, the STA is how much time a federate waits before advancing its tag to that of the just received event, when it is not known if the other input ports will receive data at the same or an earlier tag. At the expiration of the STA, the federate assumes that those unresolved ports will not receive data at earlier tags, and advances its logical time to the tag of the received event.
24+
25+
The maximum consistency guarantee is given by indefinitely waiting for the radar input before processing the lidar input, i.e., STA = forever, but this is viable only if the following two conditions are always satisfied:
26+
* the communication medium between the sensors and the controller is perfectly reliable; and
27+
* none of the three federates is subject to faults.
28+
29+
These conditions guarantee that all expected data will be generated, sent and correctly received by the communicating parties.
30+
31+
However, setting the STA to forever creates problems when only the lidar input is expected (50ms, 150ms, 250ms, etc.): the controller cannot process that input until an input from the radar comes, because the STA will never expire. For example, if the single lidar input comes at 50ms, it has to wait until time 100ms before being processed. If that input was signaling the presence of a close object, the detection would be delayed by 50ms, which may mean crashing into the object. The automatic emergency braking system must be available, otherwise it might not brake in time to avoid collisions.
32+
The ideal STA value for maximum availability in the time instants with only the lidar input is 0, because if a single input is expected, no wait is necessary.
33+
34+
Summing up, consistency for sensor fusion requires STA=forever when inputs from both sensors are expected, while availability calls for STA=0 when only the lidar input is coming. The two values are at odds, and any value in between would mean sacrificing both properties at the same time.
35+
36+
The knowledge of the timing properties of the application under analysis enables the a priori determination of the time instants when both inputs are expected and those when only the lidar has new data available.
37+
Lingua Franca allows the STA to be changed dynamically in a reaction body using the lf_set_maxwait API, which takes the new STA value as its input parameter.
38+
This capability of the language permits the automatic emergency braking federate to:
39+
* start with the STA statically set to forever, because at time 0 (startup) both sensors produce data;
40+
* set the STA to 0 after processing both inputs that arrived at the same logical time, because the next data will be sent by the lidar only;
41+
* set the STA back to forever after processing the lidar input alone, because the next data will be sent by both sensors.
42+
43+
This dynamic solution guarantees both consistency and availability in all input cases.
2244

23-
The desired behavior with simultaneous inputs is highly dependent on the application under analysis, and Lingua Franca lets you customize it. Each federate has a parameter called [STA (safe-to-advance)](http://localhost:3000/docs/writing-reactors/distributed-execution#safe-to-advance-sta) that controls how long the federate should wait for inputs from other federates before processing an input it has just received.
45+
Knowing the LF decentralized coordination:
46+
- consistency = in-order processing of events even with multiple events
47+
- availability = the system is responsive even with a single input
2448

25-
The maximum possible consistency guarantee is given by indefinitely waiting for the radar input before processing the radar, but this is viable only if the following three conditions are always satisfied: (i) the communication medium between the sensors and the controller is perfectly reliable; (ii) none of the three federates is subject to faults; and (iii) the network latency is never greater that 50ms, that is, the smallest period of the two sensors.
26-
(i) and (ii) guarantee that all expected data will be generated, sent and correctly received by the communication parties. (iii) is necessary to meet sensor deadlines: if the controller receives input from the lidar and indefinitely waits for the radar, it the latter does not arrive within 50ms, the lidar data cannot be processed and its deadline is violated.
49+
Oh, maybe mention that the clock of the two sensors is synced because we're resampling the data
2750

28-
lidar data only case
51+
I might also say that forever does not work when one of the sensors is delayed too much or when the medium fails for too much time, in which cases a finite STA is better (like a period or something) (this is gonna be the topic of a new blog post)
2952

30-
-example description
31-
-a little bit on decentralized coordination/maxwait
32-
-deadline issues
33-
-maxwait challenges
34-
-dynamic maxwait for both consistency and availability (=timing)
35-
-maybe highlight that deadlines here are not only affected by the workload on the single processor, but also on the distributed communication
36-
-maybe a little bit of what happens when out-of-order msg.s are received?
53+
-maybe a little bit of what happens when out-of-order msg.s are received? (not sure this is really needed though)
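
The dynamic STA schedule described in the added lines above (forever when both sensors fire, 0 when only the lidar does) can be illustrated with a minimal C sketch. It assumes the periods used in the example, 50ms for the lidar and 100ms for the radar, with both sensors starting at time 0. The names `STA_FOREVER`, `STA_ZERO`, `radar_expected` and `next_sta` are hypothetical and exist only for this illustration; the sketch does not call the Lingua Franca runtime. In an actual reaction, the value returned by `next_sta` is what would be handed to lf_set_maxwait, whose exact signature is not shown in this post.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for STA values, in milliseconds. */
#define STA_FOREVER INT64_MAX
#define STA_ZERO    ((int64_t)0)

/* Assumed sensor periods, taken from the example in the post. */
#define LIDAR_PERIOD_MS 50
#define RADAR_PERIOD_MS 100

/* True when the radar also produces data at logical time t_ms. */
static bool radar_expected(int64_t t_ms) {
    return t_ms % RADAR_PERIOD_MS == 0;
}

/*
 * STA to apply after processing the inputs at logical time t_ms.
 * The next event is the lidar sample at t_ms + LIDAR_PERIOD_MS:
 * wait forever if the radar fires at that time too (consistency),
 * do not wait at all if the lidar is alone (availability).
 */
static int64_t next_sta(int64_t t_ms) {
    int64_t next_ms = t_ms + LIDAR_PERIOD_MS;
    return radar_expected(next_ms) ? STA_FOREVER : STA_ZERO;
}
```

Under these assumptions, `next_sta(0)` yields `STA_ZERO` because only the lidar fires at 50ms, and `next_sta(50)` yields `STA_FOREVER` because both sensors fire at 100ms. The initial STA would still be set statically to forever, since both sensors produce data at time 0.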
