Commit Graph

11 Commits

Author SHA1 Message Date
Sebastian Markbåge
f1ecf82bfb [Flight] Optimize Async Stack Collection (#33727)
We need to optimize the collection of debug info for dev mode. This is
an incredibly hot path since it instruments all I/O and Promises in the
app.

These optimizations focus primarily on the collection of stack traces.
They are expensive to collect because we need to eagerly collect the
stacks since they can otherwise cause memory leaks. We also need to do
some of the processing of them up front. We also end up only using a few
of them in the end but we don't know which ones we'll use.

The first compromise here is that I now only collect the stacks of
"awaits" if they were in a specific request's render. In some cases it's
useful to collect them even outside of this if they're part of a
sequence that started early. I still collect stacks for the created
Promises outside of this though which can still provide some context.

The other optimization to awaits, is that since we'll only use the inner
most one that had an await directly in userspace, we can stop collecting
stacks on a chain of awaits after we find one. This requires a quick
filter on a single callsite to determine. Since we now only collect
stacks from awaits that belongs to a specific Request we can use that
request's specific filter option. Technically this might not be quite
correct if that same thing ends up deduped across Requests but that's an
edge case.

Additionally, I now stop collecting stack for I/O nodes. They're almost
always superseded by the Promise that wraps them anyway. Even if you
write mostly Promise free code, you'll likely end up with a Promise at
the root of the component eventually anyway and then you end up using
its stack anyway. You have to really contort the code to end up with
zero Promises at which point it's not very useful anyway. At best it's
maybe mostly useful for giving a name to the I/O when the rest is just
stuff like `new Promise`.

However, a possible alternative optimization could be to *only* collect
the stack of spawned I/O and not the stack of Promises. The issue with
Promises (not awaits) is that we never know what will end up resolving
them in the end when they're created so we have to always eagerly
collect stacks. This could be an issue when you have a lot of
abstractions that end up not actually be related to I/O at all. The
issue with collecting stacks only for I/O is that the actual I/O can be
pooled or batched so you end up not having the stack when the conceptual
start of each operation within the batch started. Which is why I decided
to keep the Promise stack.
2025-07-08 10:49:08 -04:00
Sebastian Markbåge
2d7f0c4259 [Flight] Insert an extra await node for awaiting on the promise returned by then callback (#33713)
When a `.then()` callback returns another Promise, there's effectively
another "await" on that Promise that happens in the internals but that
was not modeled. In effect the Promise returned by `.then()` is blocked
on both the original Promise AND the promise returned by the callback.

This models that by cloning the original node and treat that as the
await on the original Promise. Then we use the existing Node to await
the new Promise but its "previous" points to the clone. That way we have
a forked node that awaits both.

---------

Co-authored-by: Sebastian Sebbie Silbermann <sebastian.silbermann@vercel.com>
2025-07-06 15:34:36 -04:00
Sebastian Markbåge
2a911f27dd [Flight] Send the awaited Promise to the client as additional debug information (#33592)
Stacked on #33588, #33589 and #33590.

This lets us automatically show the resolved value in the UI.

<img width="863" alt="Screenshot 2025-06-22 at 12 54 41 AM"
src="https://github.com/user-attachments/assets/a66d1d5e-0513-4767-910c-5c7169fc2df4"
/>

We can also show rejected I/O that may or may not have been handled with
the error message.

<img width="838" alt="Screenshot 2025-06-22 at 12 55 06 AM"
src="https://github.com/user-attachments/assets/e0a8b6ae-08ba-46d8-8cc5-efb60956a1d1"
/>

To get this working we need to keep the Promise around for longer so
that we can access it once we want to emit an async sequence. I do this
by storing the WeakRefs but to ensure that the Promise doesn't get
garbage collected, I keep a WeakMap of Promise to the Promise that it
depended on. This lets the VM still clean up any Promise chains that
have leaves that are cleaned up. So this makes Promises live until the
last Promise downstream is done. At that point we can go back up the
chain to read the values out of them.

Additionally, to get the best possible value we don't want to get a
Promise that's used by internals of a third-party function. We want the
value that the first party gets to observe. To do this I had to change
the logic for which "await" to use, to be the one that is the first
await that happened in user space. It's not enough that the await has
any first party at all on the stack - it has to be the very first frame.
This is a little sketchy because it relies on the `.then()` call or
`await` call not having any third party wrappers. But it gives the best
object since it hides all the internals. For example when you call
`fetch()` we now log that actual `Response` object.
2025-06-23 10:12:45 -04:00
Sebastian Markbåge
d70ee32b88 [Flight] Eagerly parse stack traces in DebugNode (#33589)
There's a memory leak in DebugNode where the `Error` objects that we
instantiate retains their callstacks which can have Promises on them. In
fact, it's very likely since the current callsite has the "resource" on
it which is the Promise itself. If those Promises are retained then
their `destroy` async hook is never fired which doesn't clean up our map
which can contains the `Error` object. Creating a cycle that can't be
cleaned up.

This fix is just eagerly reifying and parsing the stacks.

I totally expect this to be crazy slow since there's so many Promises
that we end up not needing to visit otherwise. We'll need to optimize it
somehow. Perhaps by being smarter about which ones we might need stacks
for. However, at least it doesn't leak indefinitely.
2025-06-22 10:40:33 -04:00
Sebastian Markbåge
ff93c4448c [Flight] Track Debug Info from Synchronously Unwrapped Promises (#33485)
Stacked on #33482.

There's a flaw with getting information from the execution context of
the ping. For the soft-deprecated "throw a promise" technique, this is a
bit unreliable because you could in theory throw the same one multiple
times. Similarly, a more fundamental flaw with that API is that it
doesn't allow for tracking the information of Promises that are already
synchronously able to resolve.

This stops tracking the async debug info in the case of throwing a
Promise and only when you render a Promise. That means some loss of data
but we should just warn for throwing a Promise anyway.

Instead, this also adds support for tracking `use()`d thenables and
forwarding `_debugInfo` from then. This is done by extracting the info
from the Promise after the fact instead of in the resolve so that it
only happens once at the end after the pings are done.

This also supports passing the same Promise in multiple places and
tracking the debug info at each location, even if it was already
instrumented with a synchronous value by the time of the second use.
2025-06-11 12:07:10 -04:00
Sebastian Markbåge
56408a5b12 [Flight] Emit timestamps only in forwards advancing time in debug info (#33482)
Previously you weren't guaranteed to have only advancing time entries,
you could jump back in time, but now it omits unnecessary duplicates and
clamps automatically if you emit a previous time entry to enforce
forwards order only.

The reason I didn't do this originally is because `await` can jump in
the order because we're trying to encode a graph into a flat timeline
for simplicity of the protocol and consumers.

```js
async function a() {
  await fetch1();
  await fetch2();
}

async function b() {
  await fetch3();
}

async function foo() {
  const p = a();
  await b();
  return p;
}
```

This can effectively create two parallel sequences:

```
--1.................----2.......--
------3......---------------------
```

This can now be flattened to either:

```
--1.................3---2.......--
```

Or:

```
------3......1......----2.......--
```

Depending on which one we visit first. Regardless, information is lost.

I'd say that the second one is worse encoding of this scenario because
it pretends that we weren't waiting for part of the timespan that we
were. To solve this I think we should probably make `emitAsyncSequence`
create a temporary flat list and then sort it by start time before
emitting.

Although we weren't actually blocked since there was some CPU time that
was able to proceed to get to 3. So maybe the second one is actually
better. If we wanted that consistently we'd have to figure out what the
intersection was.

---------

Co-authored-by: Hendrik Liebau <mail@hendrik-liebau.de>
2025-06-10 11:03:20 -04:00
Sebastian Markbåge
37054867c1 [Flight] Forward debugInfo from awaited instrumented Promises (#33415)
Stacked on #33403.

When a Promise is coming from React such as when it's passed from
another environment, we should forward the debug information from that
environment. We already do that when rendered as a child.

This makes it possible to also `await promise` and have the information
from that instrumented promise carry through to the next render.

This is a bit tricky because the current protocol is that we have to
read it from the Promise after it resolves so it has time to be assigned
to the promise. `async_hooks` doesn't pass us the instance (even though
it has it) when it gets resolved so we need to keep it around. However,
we have to be very careful because if we get this wrong it'll cause a
memory leak since we retain things by `asyncId` and then manually listen
for `destroy()` which can only be called once a Promise is GC:ed, which
it can't be if we retain it. We have to therefore use a `WeakRef` in
case it never resolves, and then read the `_debugInfo` when it resolves.
We could maybe install a setter or something instead but that's also
heavy.

The other issues is that we don't use native Promises in
ReactFlightClient so our instrumented promises aren't picked up by the
`async_hooks` implementation and so we never get a handle to our
thenable instance. To solve this we can create a native wrapper only in
DEV.
2025-06-04 00:49:03 -04:00
Sebastian Markbåge
9cc74fec74 [Flight] Emit the time we awaited something inside a Server Component (#33402)
Stacked on #33400. 

<img width="1261" alt="Screenshot 2025-06-01 at 10 27 47 PM"
src="https://github.com/user-attachments/assets/a5a73ee2-49e0-4851-84ac-e0df6032efb5"
/>

This is emitted with the start/end time and stack of the "await". Which
may be different than the thing that started the I/O.

These awaits aren't quite as simple as just every await since you can
start a sequence in parallel there can actually be multiple overlapping
awaits and there can be CPU work interleaved with the await on the same
component.

```js
function getData() {
  await fetch(...);
  await fetch(...);
}
const promise = getData();
doWork();
await promise;
```

This has two "I/O" awaits but those are actually happening in parallel
with `doWork()`.

Since these also could have started before we started rendering this
sequence (e.g. a component) we have to clamp it so that we don't
consider awaits that start before the component.

What we're conceptually trying to convey is the time this component was
blocked due to that I/O resource. Whether it's blocked from completing
the last result or if it's blocked from issuing a waterfall request.
2025-06-03 17:29:41 -04:00
Sebastian Markbåge
45da4e055d [Flight] Track Owner on AsyncInfo and IOInfo (#33395)
Stacked on #33394.

This lets us create async stack traces to the owner that was in context
when the I/O was started or awaited.

<img width="615" alt="Screenshot 2025-06-01 at 12 31 52 AM"
src="https://github.com/user-attachments/assets/6ff5a146-33d6-4a4b-84af-1b57e73047d4"
/>

This owner might not be the immediate closest parent where the I/O was
awaited.
2025-06-03 16:12:26 -04:00
Sebastian Markbåge
acee65d6d0 [Flight] Track Awaits on I/O as Debug Info (#33388)
This lets us track what data each Server Component depended on. This
will be used by Performance Track and React DevTools.

We use Node.js `async_hooks`. This has a number of downside. It is
Node.js specific so this feature is not available in other runtimes
until something equivalent becomes available. It's [discouraged by
Node.js docs](https://nodejs.org/api/async_hooks.html#async-hooks). It's
also slow which makes this approach only really viable in development
mode. At least with stack traces. However, it's really the only solution
that gives us the data that we need.

The [Diagnostic
Channel](https://nodejs.org/api/diagnostics_channel.html) API is not
sufficient. Not only is many Node.js built-in APIs missing but all
libraries like databases are also missing. Were as `async_hooks` covers
pretty much anything async in the Node.js ecosystem.

However, even if coverage was wider it's not actually showing the
information we want. It's not enough to show the low level I/O that is
happening because that doesn't provide the context. We need the stack
trace in user space code where it was initiated and where it was
awaited. It's also not each low level socket operation that we want to
surface but some higher level concept which can span a sequence of I/O
operations but as far as user space is concerned.

Therefore this solution is anchored on stack traces and ignore listing
to determine what the interesting span is. It is somewhat
Promise-centric (and in particular async/await) because it allows us to
model an abstract span instead of just random I/O. Async/await points
are also especially useful because this allows Async Stacks to show the
full sequence which is not supported by random callbacks. However, if no
Promises are involved we still to our best to show the stack causing
plain I/O callbacks.

Additionally, we don't want to track all possible I/O. For example,
side-effects like logging that doesn't affect the rendering performance
doesn't need to be included. We only want to include things that
actually block the rendering output. We also need to track which data
blocks each component so that we can track which data caused a
particular subtree to suspend.

We can do this using `async_hooks` because we can track the graph of
what resolved what and then spawned what.

To track what suspended what, something has to resolve. Therefore it
needs to run to completion before we can show what it was suspended on.
So something that never resolves, won't be tracked for example.

We use the `async_hooks` in `ReactFlightServerConfigDebugNode` to build
up an `ReactFlightAsyncSequence` graph that collects the stack traces
for basically all I/O and Promises allocated in the whole app. This is
pretty heavy, especially the stack traces, but it's because we don't
know which ones we'll need until they resolve. We don't materialize the
stacks until we need them though.

Once they end up pinging the Flight runtime, we collect which current
executing task that pinged the runtime and then log the sequence that
led up until that runtime into the RSC protocol. Currently we only
include things that weren't already resolved before we started rendering
this task/component, so that we don't log the entire history each time.

Each operation is split into two parts. First a `ReactIOInfo` which
represents an I/O operation and its start/end time. Basically the start
point where it was start. This is basically represents where you called
`new Promise()` or when entering an `async function` which has an
implied Promise. It can be started in a different component than where
it's awaited and it can be awaited in multiple places. Therefore this is
global information and not associated with a specific Component.

The second part is `ReactAsyncInfo`. This represents where this I/O was
`await`:ed or `.then()` called. This is associated with a point in the
tree (usually the Promise that's a direct child of a Component). Since
you can have multiple different I/O awaited in a sequence technically it
forms a dependency graph but to simplify the model these awaits as
flattened into the `ReactDebugInfo` list. Basically it contains each
await in a sequence that affected this part from unblocking.

This means that the same `ReactAsyncInfo` can appear in mutliple
components if they all await the same `ReactIOInfo` but the same Promise
only appears once.

Promises that are only resolved by other Promises or immediately are not
considered here. Only if they're resolved by an I/O operation. We pick
the Promise basically on the border between user space code and ignored
listed code (`node_modules`) to pick the most specific span but abstract
enough to not give too much detail irrelevant to the current audience.
Similarly, the deepest `await` in user space is marked as the relevant
`await` point.

This feature is only available in the `node` builds of React. Not if you
use the `edge` builds inside of Node.js.

---------

Co-authored-by: Sebastian "Sebbie" Silbermann <silbermann.sebastian@gmail.com>
2025-06-03 14:14:40 -04:00
Sebastian Markbåge
8b8d265bd9 [Flight] Wire up async_hooks in Node.js DEV for inspecting Promises (#27840)
This wires up the use of `async_hooks` in the Node build (as well as the
Edge build when a global is available) in DEV mode only. This will be
used to track debug info about what suspended during an RSC pass.

Enabled behind a flag for now.
2023-12-15 21:38:01 -05:00