The way we collect component stacks right now are pretty fragile.
We expect that we'll call captureBoundaryErrorDetailsDev whenever an
error happens. That resets lastBoundaryErrorComponentStackDev to null
but if we don't, it just lingers and we don't set it to anything new
then which leaks the previous component stack into the next time we have
an error. So we need to reset it in a bunch of places.
This is still broken with erroredReplay because it has the inverse
problem that abortRemainingReplayNodes can call
captureBoundaryErrorDetailsDev more than one time. So the second
boundary won't get a stack.
We probably should try to figure out an alternative way to carry along
the stack. Perhaps WeakMap keyed by the error object.
This also fixes an issue where we weren't invoking the onShellReady
event if we error a replay. That event is a bit weird for resuming
because we're probably really supposed to just invoke it immediately if
we have already flushed the shell in the prerender which is always atm.
Right now, it gets invoked later than necessary because you could have a
resumed hole ready before a sibling in the shell is ready and that's
blocked.
The jsx-runtime uses the ReactCurrentDispatcher from shared internals.
Recently this was moved to ReactServerSharedInternals which broke
jsx-runtime. This change moves it back to ReactSharedInternals until we
can come up with a new forking mechanism.
This lets a registered object or value be "tainted", which we block from
crossing the serialization boundary. It's only allowed to stay
in-memory.
This is an extra layer of protection against mistakes of transferring
data from a data access layer to a client. It doesn't provide perfect
protection, because it doesn't trace through derived values and
substrings. So it shouldn't be used as the only security layer but more
layers are better.
`taintObjectReference` is for specific object instances, not any nested
objects or values inside that object. It's useful to avoid specific
objects from getting passed as is. It ensures that you don't
accidentally leak values in a specific context. It can be for security
reasons like tokens, privacy reasons like personal data or performance
reasons like avoiding passing large objects over the wire.
It might be privacy violation to leak the age of a specific user, but
the number itself isn't blocked in any other context. As soon as the
value is extracted and passed specifically without the object, it can
therefore leak.
`taintUniqueValue` is useful for high entropy values such as hashes,
tokens or crypto keys that are very unique values. In that case it can
be useful to taint the actual primitive values themselves. These can be
encoded as a string, bigint or typed array. We don't currently check for
this value in a substring or inside other typed arrays.
Since values can be created from different sources they don't just
follow garbage collection. In this case an additional object must be
provided that defines the life time of this value for how long it should
be blocked. It can be `globalThis` for essentially forever, but that
risks leaking memory for ever when you're dealing with dynamic values
like reading a token from a database. So in that case the idea is that
you pass the object that might end up in cache.
A request is the only thing that is expected to do any work. The
principle is that you can derive values from out of a tainted
entry during a request. Including stashing it in a per request cache.
What you can't do is store a derived value in a global module level
cache. At least not without also tainting the object.
The `resumeElement` function wasn't actually doing the correct thing
because it was resuming the element itself but what the child slot means
is that we're supposed to resume in the direct child of the element.
This is difficult to check for since it's all the possible branches that
the element can render into, so instead we just check this in
renderNode. It means the hottest path always checks the task which is
unfortunate.
And also, resuming using the correct nextSegmentId.
Fixes two bugs surfaced by this test.
---------
Co-authored-by: Josh Story <josh.c.story@gmail.com>
I do this by simply renaming the secret export name in the "subset"
bundle and this renamed version is what the FlightServer uses.
This requires us to be more diligent about always using the correct
instance of "react" in our tests so there's a bunch of clean up for
that.
stacked on #27314
Turbopack requires a different module loading strategy than Webpack and
as such this PR implements a new package `react-server-dom-turbopack`
which largely follows the `react-server-dom-webpack` but is implemented
for this new bundler
This fixes so that you can postpone in a fallback. This postpones the
parent boundary. I track the fallbacks in a separate replay node so that
when we resume, we can replay the fallback itself and finish the
fallback and then possibly later the content. By doing this we also
ensure we don't complete the parent too early since now it has a render
task on it.
There is one case that this surfaces that isn't limited to
prerender/resume but also render/hydrateRoot. I left todos in the tests
for this.
If you postpone in a fallback, and suspend in the content but eventually
don't postpone in the content then we should be able to just skip
postponing since the content rendered and we no longer need the
fallback. This is a bit of a weird edge case though since fallbacks are
supposed to be very minimal.
This happens because in both cases the fallback starts rendering early
as soon as the content suspends. This also ensures that the parent
doesn't complete early by increasing the blocking tasks. Unfortunately,
the fallback will irreversibly postpone its parent boundary as soon as
it hits a postpone.
When you suspend, the same thing happens but we typically deal with this
by doing a "soft" abort on the fallback since we don't need it anymore
which unblocks the parent boundary. We can't do that with postpone right
now though since it's considered a terminal state.
I think I'll just leave this as is for now since it's an edge case but
it's an annoying exception in the model. Makes me feel I haven't quite
nailed it just yet.
If we track a postponed SuspenseBoundary parent that we have to replay
through before it has postponed and it postpones itself later, we need
to upgrade it to a postponed replay boundary instead of adding two.
We currently abort a stream either it's explicitly told to abort (e.g.
by an abortsignal). In this case we still finish writing what we have as
well as instructions for the client about what happened so it can
trigger fallback cases and log appropriately.
We also abort a request if the stream itself cancels. E.g. if you can't
write anymore. In this case we should not write anything to the outgoing
stream since it's supposed to be closed already now. However, we should
still abort the request so that more work isn't performed and so that we
can log the reason for it to the onError callback.
We should also not do any work after aborting.
There we need to stop the "flow" of bytes - so I call stopFlowing in the
cancel case before aborting.
The tests were testing this case but we had changed the implementation
to only start flowing at initial read (pull) instead of start like we
used to. As a result, it was no longer covering this case. We have to
call reader.read() in the tests to start the flow so that we need to
cancel it.
We also were missing a final assertion on the error logs and since we
were tracking them explicitly the extra error was silenced.
The key is that instead of storing different tags of resumable points,
we just store if a replay node has any resumable slots and if that's at
the root `number` or if it has resumable slots by index.
This is a simpler and more compact format because we don't have to
separate the three Resume forms.
This helps deal with Postpone in fallbacks because it doesn't just
double all the cases.
To support MPA-style form submissions, useFormState sends down a key
that represents the identity of the hook on the page. It's based on the
key path of the component within the React tree; for deeply nested
hooks, this keypath can become very long. We can hash the key to make it
shorter.
Adds a method called createFastHash to the Stream Config interface.
We're not using this for security or obfuscation, only to generate a
more compact key without sacrificing too much collision resistance.
- In Node.js builds, createFastHash uses the built-in crypto module.
- In Bun builds, createFastHash uses Bun.hash. See:
https://bun.sh/docs/api/hashing#bun-hash
I have not yet implemented createFastHash in the Edge, Browser, or FB
(Hermes) stream configs because those environments do not have a
built-in hashing function that meets our requirements. (We can't use the
web standard `crypto` API because those methods are async, and yielding
to the main thread is too costly to be worth it for this particular use
case.) We'll likely use a pure JS implementation in those environments;
for now, they just return the original string without hashing it. I'll
address this in separate PRs.
Moves writing queues to renderState.
We shouldn't need the resource tracking's value. We just need to know if
that resource has already been emitted. We can use a Set for this. To
ensure that set is directly serializable we can just use a
dictionary-like object with no value.
See individual commits for special cases.
Originally the intension was to have React assign an ID to a user
rendered DOM node inside a `fallback` while it was loading. If there
already were an explicit `id` defined on the DOM element we would reuse
that one instead. That's why this was a DOM Config option and not just
built in to Fizz.
This became tricky since it can load late and so we'd have to transfer
it down and detect it only once it finished rendering and if there is no
DOM element it doesn't work anyway. So instead, what we do in practice
is to always use a `<template>` tag with the ID. This has the downside
of an extra useless node and shifting child CSS selectors.
Maybe we'll get around to fixing this properly but it might not be worth
it.
This PR just gets rid of the SuspenseBoundaryID concept and instead we
just use the same ID number as the root segment ID of the boundary to
refer to the boundary to simplify the implementation.
This also solves the problem that SuspenseBoundaryID isn't currently
serializable (although that's easily fixable by itself if necessary).
Based on #27385.
When we error or abort during replay, that doesn't actually error the
component that errored because that has already rendered. The error only
affects any child that is not yet completed. Therefore the error kind of
gets thrown at the resumable point.
The resumable point might be a hole in the replay path, in which case
throwing there errors the parent boundary just the same as if the replay
component errored. If the hole is inside a deeper Suspense boundary
though, then it's that Suspense boundary that gets client rendered. I.e.
the child boundary. We can still finish any siblings.
In the shell all resumable points are inside a boundary since we must
have finished the shell. Therefore if you error in the root, we just
simply just turn all incomplete boundaries into client renders.
The idea for this check is that we shouldn't flush anything before we
flush the shell. That may or may not hold true in future formats like
RN.
It is a problem for resuming because with resuming it's possible to have
root tasks that are used for resuming but the shell was already flushed
so we can have completed boundaries before the shell has fully resumed.
What matters is whether the parent has already flushed or not.
It's not technically necessary to bail early because there won't be
anything in partialBoundaries or completedBoundaries because nothing
gets added there unless the parent has already flushed.
It's not exactly slow to have to check the length of three arrays so
it's probably not a big deal.
Flush partials in an early preamble needs further consideration
regardless.
This forks Task into ReplayTask and RenderTask.
A RenderTask is the normal mode and it has a segment to write into.
A ReplayTask doesn't have a segment to write into because that has
already been written but instead it has a ReplayState which keeps track
of the next set of paths to follow. Once we hit a "Resume" node we
convert it into a RenderTask and continue rendering from there.
We can resume at either an Element position or a Slot position. An
Element pointing to a component doesn't mean we resume that component,
it means we resume in the child position directly below that component.
Slots are slots inside arrays.
Instead of statically forking most paths, I kept using the same path and
checked for the existence of a segment or replay state dynamically at
runtime.
However, there's still quite a bit of forking here like retryRenderTask
and retryReplayTask. Even in the new forks there's a lot of duplication
like resumeSuspenseBoundary, replaySuspenseBoundary and
renderSuspenseBoundary. There's opportunity to simplify this a bit.
This is an optimization where useFormState will only emit extra comment
markers if a form state is passed at the root. If no state is passed, we
don't need to emit anything because none of the hooks will match.
The permalink option of useFormState controls which page the form is
submitted to during an MPA form submission (i.e. a submission that
happens before hydration, or when JS is disabled). If the same
useFormState appears on the resulting page, and the permalink option
matches, it should receive the form state from the submission despite
the fact that the keypaths do not match.
So the logic for whether a form state instance is considered a match is:
- Both instances must be passed the same action signature
- If a permalink is provided, the permalinks must match.
- If a permalink is not provided, the keypaths must match.
Currently, if there are multiple matching useFormStates, they will all
match and receive the form state. We should probably only match the
first one, and/or warn when this happens. I've left this as a TODO for
now, pending further discussion.
Typically we assign IDs lazily when flushing to minimize the ids we have
to assign and we try to maximize inlining.
When we prerender we could always flush into a buffer before returning
but that's not actually what we do right now. We complete rendering
before returning but we don't actually run the flush path until someone
reads the resulting stream. We assign IDs eagerly when something
postpone so that we can refer to those ids in the holes without flushing
first.
This leads to some interesting conditions that needs to consider this.
This PR also deals with rootSegmentId which is the ID of the segment
that contains the body of a Suspense boundary. The boundary needs to
know this in addition to the Suspense Boundary's ID to know how to
inject the root segment into the boundary.
Why is the suspense boundary ID and the rootSegmentID just not the same
ID? Why don't segments and suspense boundary not share ID namespace?
That's a good question and I don't really remember. It might not be a
good reason and maybe they should just be the same.
During an MPA form submission, useFormState should only reuse the form
state if same action is passed both times. (We also compare the key
paths.)
We compare the identity of the inner closure function, disregarding the
value of the bound arguments. That way you can pass an inline Server
Action closure:
```js
function FormContainer({maxLength}) {
function submitAction(prevState, formData) {
'use server'
if (formData.get('field').length > maxLength) {
return { errorMsg: 'Too many characters' };
}
// ...
}
return <Form submitAction={submitAction} />
}
```
If a Server Action is passed to useFormState, the action may be
submitted before it has hydrated. This will trigger a full page
(MPA-style) navigation. We can transfer the form state to the next page
by comparing the key path of the hook instance.
`ReactServerDOMServer.decodeFormState` is used by the server to extract
the form state from the submitted action. This value can then be passed
as an option when rendering the new page. It must be passed during both
SSR and hydration.
```js
const boundAction = await decodeAction(formData, serverManifest);
const result = await boundAction();
const formState = decodeFormState(result, formData, serverManifest);
// SSR
const response = createFromReadableStream(<App />);
const ssrStream = await renderToReadableStream(response, { formState })
// Hydration
hydrateRoot(container, <App />, { formState });
```
If the `formState` option is omitted, then the state won't be
transferred to the next page. However, it must be passed in both places,
or in neither; misconfiguring will result in a hydration mismatch.
(The `formState` option is currently prefixed with `experimental_`)
This field was not being initialized. Although the property is part of
the Flow type, the type error wasn't caught because the constructor
itself is not covered by Flow, which is unfortunate. (I assume this is
related to the dev-only componentStack property.)
Currently, if a component suspends, the keyPath has already been
modified to include the identity of the component itself; the path is
set before the component body is called (akin to the begin phase in
Fiber). An accidental consequence is that when the promise resolves and
component is retried, the identity gets appended to the keyPath again,
leading to a duplicate node in the path.
To address this, we should only modify contexts after any code that may
suspend. For maximum safety, this should occur as late as possible:
right before the recursive renderNode call, before the children are
rendered.
I did not add a test yet because there's no feature that currently
observes it, but I do have tests in my other WIP PR for useFormState:
#27321
There's a subtle difference if you suspend before the first array or
after. In Fiber, we don't deal with this because we just suspend the
parent and replay it if lazy() or Usable are used in its child slots. In
Fizz we try to optimize this a bit more and enable resuming inside the
component.
Semantically, it's different if you suspend/postpone before the first
child array or inside that child array. Because when you resume the
inner result might be another array and either that's part of the parent
path or part of the inner slot.
There might be more clever way of structuring this but I just use -1 to
indicate that we're not yet inside the array and is in the root child
position. If that renders an element, then that's just the same as the 0
slot.
We need to also encode this in the resuming. I called that resuming the
element or resuming the slot.
When Float was first developed the internal implementation and external
interface were the same. This is problematic for a few reasons. One, the
public interface is typed but it is also untrusted and we should not
assume that it is actually respected. Two, the internal implementations
can get called from places other than the the public interface and
having to construct an options argument that ends up being destructured
to process the request is computationally wasteful and may limit JIT
optimizations to some degree. Lastly, the wire format was not as
compressed as it could be and it was untyped.
This refactor aims to address that by separating the public interface
from the internal implementations so we can solve these challenges and
also make it easier to change Float in the future
* The internal dispatcher method preinit is now preinitStyle and
preinitScript.
* The internal dispatcher method preinitModule is now
preinitModuleScript in anticipation of different implementations for
other module types in the future.
* The wire format is explicitly typed and only includes options if they
are actually used omitting undefined and nulls.
* Some function arguments are not options even if they are optional. For
instance precedence can be null/undefined because we deafult it to
'default' however we don't cosnider this an option because it is not
something we transparently apply as props to the underlying instance.
* Fixes a problem with keying images in flight where srcset and sizes
were not being taken into account.
* Moves argument validation into the ReactDOMFloat file where it is
shared with all runtimes that expose these methods
* Fixes crossOrigin serialization to use empty string except when
'use-credentials'
It's possible to postpone a specific node and not using a wrapper
component. Therefore we encode the resumable slot as the index slot.
When it's a plain client component that postpones, it's encoded as the
child slot inside that component which is the one that's postponed
rather than the component itself.
Since it's possible for a child slot to suspend (e.g. React.lazy's
microtask in this case) retryTask might need to keep its index around
when it resolves.
A planned feature of useFormState is that if the page load is the result
of an MPA-style form submission — i.e. a form was submitted before it
was hydrated, using Server Actions — the state of the hook should
transfer to the next page.
I haven't implemented that part yet, but as a prerequisite, we need some
way for Fizz to indicate whether a useFormState hook was rendered using
the "postback" state. That way we can do all state matching logic on the
server without having to replicate it on the client, too.
The approach here is to emit a comment node for each useFormState hook.
We use one of two comment types: `<!--F-->` for a normal useFormState
hook, and `<!--F!-->` for a hook that was rendered using the postback
state. React will read these markers during hydration. This is similar
to how we encode Suspense boundaries.
Again, the actual matching algorithm is not yet implemented — for now,
the "not matching" marker is always emitted.
We can optimize this further by not emitting any markers for a render
that is not the result of a form postback, which I'll do in subsequent
PRs.
Just moving some internal code around again.
I originally encoded what type of work using startRender vs
startPrerender. I had intended to do more forking of the work loop but
we've decided not to go with that strategy. It also turns out that
forking when we start working is actually too late because of a subtle
thing where you can call abort before work begins. Therefore it's
important that starting the work comes later.
To generate IDs for useId, we modify a context variable whenever
multiple siblings are rendered, or when a component includes a useId
hook.
When this happens, we must ensure that the context is reset properly on
unwind if something errors or suspends. When I originally implemented
this, I did this by wrapping the child's rendering with a try/finally
block. But a better way to do this is by using the non-destructive
renderNode path instead of renderNodeDestructive.
In https://github.com/facebook/react/pull/21113 I moved this over to the
segment from the task. This partially reverts this two use two fields
instead. I was just trying to micro-optimize by reusing a single field.
This is really conceptually two different values. Task is keeping track
of the working state of the currently executing context.
The segment just needs to keep track of which parent context it was
created in so that it can be wrapped correctly when a segment is
written. We just happened to rely on the working state returning to the
top before completing.
The main motivation is that there is no `segment` for replaying.
This is basically the implementation for the prerender pass.
Instead of forking basically the whole implementation for prerender, I
just add a conditional field on the request. If it's `null` it behaves
like before. If it's non-`null` then instead of triggering client
rendered boundaries it triggers those into a "postponed" state which is
basically just a variant of "pending". It's supposed to be filled in
later.
It also builds up a serializable tree of which path can be followed to
find the holes. This is basically a reverse `KeyPath` tree.
It is unfortunate that this approach adds more code to the regular Fizz
builds but in practice. It seems like this side is not going to add much
code and we might instead just want to merge the builds so that it's
smaller when you have `prerender` and `resume` in the same bundle -
which I think will be common in practice.
This just implements the prerender side, and not the resume side, which
is why the tests have a TODO. That's in a follow up PR.
`dom-legacy` does not make sense for Flight. we could still type check
the files but it adds maintenance burden in the inlinedHostConfigs
whenever things change there. Going to make these configs opaque mixed
types to quiet flow since no entrypoints use the flight code
When the `permalink` option is passed to `useFormState`, and the form is
submitted before it has hydrated, the permalink will be used as the
target of the form action, enabling MPA-style form submissions.
(Note that submitting a form without hydration is a feature of Server
Actions; it doesn't work with regular client actions.)
It does not have any effect after the form has hydrated.
This implements useFormState in Fiber. (It does not include any
progressive enhancement features; those will be added later.)
useFormState is a hook for tracking state produced by async actions. It
has a signature similar to useReducer, but instead of a reducer, it
accepts an async action function.
```js
async function action(prevState, payload) {
// ..
}
const [state, dispatch] = useFormState(action, initialState)
```
Calling dispatch runs the async action and updates the state to the
returned value.
Async actions run before React's render cycle, so unlike reducers, they
can contain arbitrary side effects.
This exposes, but does not yet implement, a new experimental API called
useFormState. It's gated behind the enableAsyncActions flag.
useFormState has a similar signature to useReducer, except instead of a
reducer it accepts an (async) action function. React will wait until the
promise resolves before updating the state:
```js
async function action(prevState, payload) {
// ..
}
const [state, dispatch] = useFormState(action, initialState)
```
When used in combination with Server Actions, it will also support
progressive enhancement — a form that is submitted before it has
hydrated will have its state transferred to the next page. However, like
the other action-related hooks, it works with fully client-driven
actions, too.
This exposes a `resume()` API to go with the `prerender()` (only in
experimental). It doesn't work yet since we don't yet emit the postponed
state so not yet tested.
The main thing this does is rename ResponseState->RenderState and
Resources->ResumableState. We separated out resources into a separate
concept preemptively since it seemed like separate enough but probably
doesn't warrant being a separate concept. The result is that we have a
per RenderState in the Config which is really just temporary state and
things that must be flushed completely in the prerender. Most things
should be ResumableState.
Most options are specified in the `prerender()` and transferred into the
`resume()` but certain options that are unique per request can't be.
Notably `nonce` is special. This means that bootstrap scripts and
external runtime can't use `nonce` in this mode. They need to have a CSP
configured to deal with external scripts, but not inline.
We need to be able to restore state of things that we've already emitted
in the prerender. We could have separate snapshot/restore methods that
does this work when it happens but that means we have to explicitly do
that work. This design is trying to keep to the principle that we just
work with resumable data structures instead so that we're designing for
it with every feature. It also makes restoring faster since it's just
straight into the data structure.
This is not yet a serializable format. That can be done in a follow up.
We also need to vet that each step makes sense. Notably stylesToHoist is
a bit unclear how it'll work.
Tracks the currently executing parent path of a task, using the name of
the component and the key or index.
This can be used to uniquely identify an instance of a component between
requests - assuming nothing in the parents has changed. Even if it has
changed, if things are properly keyed, it should still line up.
It's not used yet but we'll need this for two separate features so
should land this so we can stack on top.
Can be passed to `JSON.stringify(...)` to generate a unique key.
Search for more generic fork files if an exact match does not exist. If
`forks/MyFile.dom.js` exists but `forks/MyFile.dom-node.js` does not
then use it when trying to resolve forks for the `"dom-node"` renderer
in flow, tests, and build
consolidate certain fork files that were identical and make semantic
sense to be generalized
add `dom-browser-esm` bundle and use it for
`react-server-dom-esm/client.browser` build
This adds an experimental `unstable_postpone(reason)` API.
Currently we don't have a way to model effectively an Infinite Promise.
I.e. something that suspends but never resolves. The reason this is
useful is because you might have something else that unblocks it later.
E.g. by updating in place later, or by client rendering.
On the client this works to model as an Infinite Promise (in fact,
that's what this implementation does). However, in Fizz and Flight that
doesn't work because the stream needs to end at some point. We don't
have any way of knowing that we're suspended on infinite promises. It's
not enough to tag the promises because you could await those and thus
creating new promises. The only way we really have to signal this
through a series of indirections like async functions, is by throwing.
It's not 100% safe because these values can be caught but it's the best
we can do.
Effectively `postpone(reason)` behaves like a built-in [Catch
Boundary](https://github.com/facebook/react/pull/26854). It's like
`raise(Postpone, reason)` except it's built-in so it needs to be able to
be encoded and caught by Suspense boundaries.
In Flight and Fizz these behave pretty much the same as errors. Flight
just forwards it to retrigger on the client. In Fizz they just trigger
client rendering which itself might just postpone again or fill in the
value. The difference is how they get logged.
In Flight and Fizz they log to `onPostpone(reason)` instead of
`onError(error)`. This log is meant to help find deopts on the server
like finding places where you fall back to client rendering. The reason
that you pass in is for that purpose to help the reason for any deopts.
I do track the stack trace in DEV but I don't currently expose it to
`onPostpone`. This seems like a limitation. It might be better to expose
the Postpone object which is an Error object but that's more of an
implementation detail. I could also pass it as a second argument.
On the client after hydration they don't get passed to
`onRecoverableError`. There's no global `onPostpone` API to capture
postponed things on the client just like there's no `onError`. At that
point it's just assumed to be intentional. It doesn't have any `digest`
or reason passed to the client since it's not logged.
There are some hacky solutions that currently just tries to reuse as
much of the existing code as possible but should be more properly
implemented.
- Fiber is currently just converting it to a fake Promise object so that
it behaves like an infinite Promise.
- Fizz is encoding the magic digest string `"POSTPONE"` in the HTML so
we know to ignore it but it should probably just be something neater
that doesn't share namespace with digests.
Next I plan on using this in the `/static` entry points for additional
features.
Why "postpone"? It's basically a synonym to "defer" but we plan on using
"defer" for other purposes and it's overloaded anyway.
Since we no longer have externally configured "process" methods, I just
inlined all of those.
The main thing in this refactor is that I just inlined all the error
branches into just `emitErrorChunk`. I'm not sure why it was split up an
repeated before but this seems simpler. I need it since I'm going to be
doing similar copies of this.
This uses the same mechanism as [large
strings](https://github.com/facebook/react/pull/26932) to encode chunks
of length based binary data in the RSC payload behind a flag.
I introduce a new BinaryChunk type that's specific to each stream and
ways to convert into it. That's because we sometimes need all chunks to
be Uint8Array for the output, even if the source is another array buffer
view, and sometimes we need to clone it before transferring.
Each type of typed array is its own row tag. This lets us ensure that
the instance is directly in the right format in the cached entry instead
of creating a wrapper at each reference. Ideally this is also how
Map/Set should work but those are lazy which complicates that approach a
bit.
We assume both server and client use little-endian for now. If we want
to support other modes, we'd convert it to/from little-endian so that
the transfer protocol is always little-endian. That way the common
clients can be the fastest possible.
So far this only implements Server to Client. Still need to implement
Client to Server for parity.
NOTE: This is the first time we make RSC effectively a binary format.
This is not compatible with existing SSR techniques which serialize the
stream as unicode in the HTML. To be compatible, those implementations
would have to use base64 or something like that. Which is what we'll do
when we move this technique to be built-in to Fizz.
We already support these in the sense that they're Iterable so they just
get serialized as arrays. However, these are part of the Structured
Clone algorithm [and should be
supported](https://github.com/facebook/react/issues/25687).
The encoding is simply the same form as the Iterable, which is
conveniently the same as the constructor argument. The difference is
that now there's a separate reference to it.
It's a bit awkward because for multiple reference to the same value,
it'd be a new Map/Set instance for each reference. So to encode sharing,
it needs one level of indirection with its own ID. That's not really a
big deal for other types since they're inline anyway - but since this
needs to be outlined it creates possibly two ids where there only needs
to be one or zero.
One variant would be to encode this in the row type. Another variant
would be something like what we do for React Elements where they're
arrays but tagged with a symbol. For simplicity I stick with the simple
outlining for now.
## Summary
This PR cleans up `useMutableSource`. This has been blocked by a
remaining dependency internally at Meta, but that has now been deleted.
<!--
Explain the **motivation** for making this change. What existing problem
does the pull request solve?
-->
## How did you test this change?
```
yarn flow
yarn lint
yarn test --prod
```
<!--
Demonstrate the code is solid. Example: The exact commands you ran and
their output, screenshots / videos if the pull request changes the user
interface.
How exactly did you verify that your PR solves the issue you wanted to
solve?
If you leave this empty, your PR will very likely be closed.
-->
This introduces a Text row (T) which is essentially a string blob and
refactors the parsing to now happen at the binary level.
```
RowID + ":" + "T" + ByteLengthInHex + "," + Text
```
Today, we encode all row data in JSON, which conveniently never has
newline characters and so we use newline as the line terminator. We
can't do that if we pass arbitrary unicode without escaping it. Instead,
we pass the byte length (in hexadecimal) in the leading header for this
row tag followed by a comma.
We could be clever and use fixed or variable-length binary integers for
the row id and length but it's not worth the more difficult
debuggability so we keep these human readable in text.
Before this PR, we used to decode the binary stream into UTF-8 strings
before parsing them. This is inefficient because sometimes the slices
end up having to be copied so it's better to decode it directly into the
format. The follow up to this is also to add support for binary data and
then we can't assume the entire payload is UTF-8 anyway. So this
refactors the parser to parse the rows in binary and then decode the
result into UTF-8. It does add some overhead to decoding on a per row
basis though.
Since we do this, we need to encode the byte length that we want decode
- not the string length. Therefore, this requires clients to receive
binary data and why I had to delete the string option.
It also means that I had to add a way to get the byteLength from a chunk
since they're not always binary. For Web streams it's easy since they're
always typed arrays. For Node streams it's trickier so we use the
byteLength helper which may not be very efficient. Might be worth
eagerly encoding them to UTF8 - perhaps only for this case.
This isn't really meant to be actually used, there are many issues with
this approach, but it shows the capabilities as a proof-of-concept.
It's a new reference implementation package `react-server-dom-esm` as
well as a fixture in `fixtures/flight-esm` (fork of `fixtures/flight`).
This works pretty much the same as pieces we already have in the Webpack
implementation but instead of loading modules using Webpack on the
client it uses native browser ESM.
To really show it off, I don't use any JSX in the fixture and so it also
doesn't use Babel or any compilation of the files.
This works because we don't actually bundle the server in the reference
implementation in the first place. We instead use [Node.js
Loaders](https://nodejs.org/api/esm.html#loaders) to intercept files
that contain `"use client"` and `"use server"` and replace them. There's
a simple check for those exact bytes, and no parsing, so this is very
fast.
Since the client isn't actually bundled, there's no module map needed.
We can just send the file path to the file we want to load in the RSC
payload for client references.
Since the existing reference implementation for Node.js already used ESM
to load modules on the server, that all works the same, including Server
Actions. No bundling.
There is one case that isn't implemented here. Importing a `"use
server"` file from a Client Component. We don't have that implemented in
the Webpack reference implementation neither - only in Next.js atm. In
Webpack it would be implemented as a Webpack loader.
There are a few ways this can be implemented without a bundler:
- We can intercept the request from the browser importing this file in
the HTTP server, and do a quick scan for `"use server"` in the file and
replace it just like we do with loaders in Node.js. This is effectively
how Vite works and likely how anyone using this technique would have to
support JSX anyway.
- We can use native browser "loaders" once that's eventually available
in the same way as in Node.js.
- We can generate import maps for each file and replace it with a
pointer to a placeholder file. This requires scanning these ahead of
time which defeats the purposes.
Another case that's not implemented is the inline `"use server"` closure
in a Server Component. That would require the existing loader to be a
bit smarter but would still only "compile" files that contains those
bytes in the fast path check. This would also happen in the loader that
already exists so wouldn't do anything substantially different than what
we currently have here.
This PR adds a preload for bootstrapScripts. preloads are captured
synchronously when you create a new Request and as such the normal logic
to check if a preload already exists is skipped.