Commit 090f85c
[Event Hubs] Rewrite partition receiver (Azure#24731)
# Re-implementing the Event Receiver
This PR re-implements the event receiver using promises and a single
queue to fix an ordering issue and to correct waiting behavior.
## Problem Statement [Issue Azure#23993]
A customer reported that the list of events passed into the
`processEvents` callback is not always ordered by `sequenceNumber`, which
leads to events being processed in the wrong order. The customer provided a
sample that prints an out-of-order message whenever the `sequenceNumber`
values of received messages are not in order, and I can confirm that the
message is printed intermittently.
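For illustration, the kind of check the customer sample performs can be sketched as below. This is not the customer's code; the `SequencedEvent` shape and the `findFirstOutOfOrder` helper are hypothetical names introduced here, and the sketch assumes sequence numbers within a partition are strictly increasing.

```typescript
// Hypothetical minimal event shape: only the field needed for the check.
interface SequencedEvent {
  sequenceNumber: number;
}

// Returns the index of the first event whose sequenceNumber is not greater
// than its predecessor's, or -1 if the batch is strictly ascending.
function findFirstOutOfOrder(events: SequencedEvent[]): number {
  for (let i = 1; i < events.length; i++) {
    if (events[i].sequenceNumber <= events[i - 1].sequenceNumber) {
      return i;
    }
  }
  return -1;
}
```

A `processEvents` callback could call such a helper on each batch and log a message whenever the result is not `-1`.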
## Analysis
The customer-provided callback, `processEvents`, is called every time
a batch of events is received from the service. Each batch comes
from a single partition. Events are ordered within a partition by their
`sequenceNumber`, and events received by `processEvents` should be in
the same order. Currently, however, the list of events passed to the
`processEvents` callback is not always in order. Further investigation
showed that the library implements complex logic to read events from
the service. It maintains two queues for reading events: one for building
the batch of events that will be sent to the next call of the
`processEvents` callback, and another for when errors occur or there are
no active listeners. The coordination needed to read events from the two
queues is subtle and is the source of the ordering bug.
## Re-design
The most straightforward way to simplify this design and to ensure
ordering is to use a single queue and add incoming events to it in the
order they're received. Reading from this queue is as simple as the
following:
- If the queue contains any events, check whether their count is already
  `maxMessageCount` or more:
  - If yes, remove `maxMessageCount` events and return them immediately
  - If no, wait for a few milliseconds and then remove up to
    `maxMessageCount` events and return them
- If the queue doesn't contain any events, wait until
  `maxWaitTimeInSeconds` has elapsed and then return an empty list, or
  until one or more events arrive and then return those
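The three cases above can be summarized as a small decision function. This is only a sketch of the branching, not the code in this PR; `ReadDecision` and `decideRead` are hypothetical names used for illustration.

```typescript
// Hypothetical names: the possible outcomes of inspecting the queue once.
type ReadDecision =
  | { kind: "returnNow"; count: number }
  | { kind: "waitBrieflyThenReturn" }
  | { kind: "waitForEventsOrTimeout" };

function decideRead(queueLength: number, maxMessageCount: number): ReadDecision {
  if (queueLength >= maxMessageCount) {
    // Enough events are buffered: return a full batch immediately.
    return { kind: "returnNow", count: maxMessageCount };
  }
  if (queueLength > 0) {
    // Some events are buffered: give the service a moment to send more.
    return { kind: "waitBrieflyThenReturn" };
  }
  // Nothing buffered: wait for events to arrive or for the max wait time.
  return { kind: "waitForEventsOrTimeout" };
}
```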
### Abstraction
The idea is concisely captured by `waitForEvents`, a newly introduced
function that races a list of promises, one for each of the scenarios
listed above:
https://github.com/Azure/azure-sdk-for-js/blob/10826927554e7254dce0a4849f1e0c8219373522/sdk/eventhub/event-hubs/src/eventHubReceiver.ts#L733-L739
The first promise resolves right away and is returned if the queue
already has `maxMessageCount` events or more. It corresponds to the
first scenario listed above.
The second promise is created by the `checkOnInterval` function. The
promise resolves only once the queue has events in it; until then, it
keeps checking the queue at a fixed interval. Note that chained to it
is a timer promise that waits a few more milliseconds to give the
service a chance to send more events. This corresponds to the second
scenario listed above.
The third promise is a simple timer promise that is resolved after the
`maxWaitTime` has elapsed. This promise corresponds to the third
scenario.
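To make the race concrete, here is a minimal sketch of the idea under simplifying assumptions. It is not the `waitForEvents` implementation linked above: the name `waitForEventsSketch`, the options object, and the timer-based cleanup are all hypothetical, and abort handling is left out entirely.

```typescript
// Hypothetical options; the names and units are assumptions for this sketch.
interface WaitOptions {
  maxMessageCount: number;
  maxWaitTimeInMs: number;
  pollIntervalMs: number; // how often to check an empty queue
  extraWaitMs: number; // extra wait once the queue is non-empty
}

async function waitForEventsSketch<T>(queue: T[], opts: WaitOptions): Promise<void> {
  let settled = false;
  const timers: Array<ReturnType<typeof setTimeout>> = [];
  // Timer promise that records its handle so it can be cleared later.
  const delay = (ms: number): Promise<void> =>
    new Promise<void>((resolve) => {
      timers.push(setTimeout(resolve, ms));
    });

  // Scenario 1: the queue already holds a full batch, so resolve right away.
  // Otherwise this promise stays pending and the other two decide the race.
  const full = new Promise<void>((resolve) => {
    if (queue.length >= opts.maxMessageCount) resolve();
  });

  // Scenario 2: poll until the queue is non-empty, then wait a little longer
  // to give more events a chance to arrive.
  const poll = async (): Promise<void> => {
    while (!settled && queue.length === 0) {
      await delay(opts.pollIntervalMs);
    }
    await delay(opts.extraWaitMs);
  };

  // Scenario 3: give up once the maximum wait time has elapsed.
  const timeout = delay(opts.maxWaitTimeInMs);

  await Promise.race([full, poll(), timeout]);

  // Stop the polling loop and clear pending timers so nothing keeps running.
  settled = true;
  for (const t of timers) clearTimeout(t);
}
```

Note that the sketch only waits; it never removes anything from the queue. Clearing timers leaves the losing promises pending forever, which is acceptable for a sketch but is the sort of cleanup an abortable promise would handle more cleanly.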
### Rewrite
In addition to some other minor improvements, the `receiveBatch` method
is concisely rewritten using that abstraction as follows:
https://github.com/Azure/azure-sdk-for-js/blob/10826927554e7254dce0a4849f1e0c8219373522/sdk/eventhub/event-hubs/src/eventHubReceiver.ts#L578-L628
Notice that the chain of promises makes the algorithm simple to read: a
link is established first, credits are added to it as needed, and then
the waiting starts.
Also, notice that at this point no events have actually been read from
the queue yet; all this does is wait until one of the promises
resolves. The actual read from the queue is chained onto the end with
`then`, so that it happens only after everything else has completed. For
example, if an error occurs, it should be handled first, and we don't
want to prematurely mutate the queue. Reading from the queue is as
simple as the following:
https://github.com/Azure/azure-sdk-for-js/blob/10826927554e7254dce0a4849f1e0c8219373522/sdk/eventhub/event-hubs/src/eventHubReceiver.ts#L630
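As a rough illustration of such a read step, `Array.prototype.splice` removes and returns up to a given number of items from the front of an array in one call. The `takeEvents` helper below is a hypothetical name for this sketch and is not necessarily how the linked line is written.

```typescript
// Hypothetical helper: removes and returns up to `maxMessageCount` items
// from the front of the queue, preserving arrival order.
function takeEvents<T>(queue: T[], maxMessageCount: number): T[] {
  return queue.splice(0, maxMessageCount);
}
```

Because `splice` mutates the array in place, running it only as the last step of the chain keeps the queue untouched if an earlier step fails.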
## Other changes
### Exporting `core-util`'s `createAbortablePromise`
This function was added in
Azure#24821 and proved to be
useful in this re-write so I am exporting it. I am planning on using it
in core-lro too.
### Updating tests
Two tests are updated: one for authentication and one for
returning events in the presence of retryable and non-retryable errors.
In the former, the receiver is expected to receive events after the auth
token has been invalidated but not yet refreshed. However, I am
observing that a disconnected event has been received at that moment and
the receiver has been deleted. The old receiver kept receiving despite
the deletion, whereas the new one correctly cleans up the receiver. I
deleted this expectation for now.
In the latter, the test forces an error on the receiver after 50
milliseconds but the receiver already finishes around 40 milliseconds,
so I updated the forced error to happen sooner, at 10 milliseconds:
https://github.com/Azure/azure-sdk-for-js/blob/10826927554e7254dce0a4849f1e0c8219373522/sdk/eventhub/event-hubs/test/internal/receiveBatch.spec.ts#L107
Finally, a couple of test suites were added for the `waitForEvents` and
`checkOnInterval` functions.
## Updates in action
Live tests succeed
[[here](https://dev.azure.com/azure-sdk/internal/_build/results?buildId=2201768&view=results)].
Please ignore the timeout in the deployed resources script in canary; it
is an unrelated service issue, see
[[here](https://dev.azure.com/azure-sdk/internal/_build/results?buildId=2198994&view=results)].
A log for how the updated receiver behaves when used by the customer
sample can be found in
[log2.txt](https://github.com/Azure/azure-sdk-for-js/files/10775378/log2.txt).
Notice that the out-of-order message was never printed.
## Reviewing tips
The changes in `eventHubReceiver.ts` are extensive and the diff is not
easily readable. I highly suggest reviewing
Azure@1082692
instead because it is on top of a deleting commit, so there is no diff to
wrestle with. The main changes are in `receiveBatch` but please feel
free to review the rest of the module too.
1 parent 29773b2 commit 090f85c
File tree
18 files changed, +576 -464 lines changed
- sdk
- core/core-util
- review
- src
- test/public
- eventhub/event-hubs
- src
- util
- test
- internal
- node
- public