It can be efficient to accept raw string chunks to pass through a stream
instead of encoding them into a binary copy first.
Previously our Flight parsers didn't accept receiving string chunks.
That's partly because we sometimes need to encode binary chunks anyway
so string only transport isn't enough but some chunks can be strings.
This adds a partial ability for chunks to be received as strings.
However, accepting strings comes with some downsides. E.g. if the
strings are split up we need to buffer it which compromises the perf for
the common case. If the chunk represents binary data, then we'd need to
encode it back into a typed array which would require a TextEncoder
dependency in the parser. If the string chunk represents a byte length
encoded string we don't know how many unicode characters to read without
measuring them in terms of binary - also requiring a TextEncoder.
This PR is mainly intended for use for pass-through within the same
memory. We can simplify the implementation by assuming that any string
chunk is passed as the original chunk. This requires that the server
stream config doesn't arbitrarily concatenate strings (e.g. large
strings should not be concatenated which is probably a good heuristic
anyway). It also means that this is not suitable to be used with for
example receiving string chunks on the client by passing them through
SSR hydration data - except if the encoding that way was only used with
chunks that were already encoded as strings by Flight.
Web streams mostly just work on binary data anyway so they can't use
this.
In Node.js streams we concatenate precomputed and small strings into
larger buffers. It might make sense to do that using string ropes
instead. However, in the meantime we can at least pass large strings
that are outside our buffer view size as raw strings. There's no benefit
to us eagerly encoding those.
Also, let Node accept string chunks as long as they're following our
expected constraints. This lets us test the mixed protocol using
pass-throughs. This can also be useful when the RSC server is in the
same environment as the SSR server as they don't have to go from strings
to typed arrays back to strings.
Now we can also use this in the pass-through used in renderToMarkup.
This lets us avoid the dependency on TextDecoder/TextEncoder in that
package.
Stacked on #29551
Flight pings much more often than Fizz because async function components
will always take at least a microtask to resolve . Rather than
scheduling this work as a new macrotask Flight now schedules pings in a
microtask. This allows more microtasks to ping before actually doing a
work flush but doesn't force the vm to spin up a new task which is quite
common give n the nature of Server Components
A while back we implemented a heuristic that if a chunk was large it was
assumed to be produced by the render and thus was safe to stream which
results in transferring the underlying object memory. Later we ran into
an issue where a precomputed chunk grew large enough to trigger this
hueristic and it started causing renders to fail because once a second
render had occurred the precomputed chunk would not have an underlying
buffer of bytes to send and these bytes would be omitted from the
stream. We implemented a technique to detect large precomputed chunks
and we enforced that these always be cloned before writing.
Unfortunately our test coverage was not perfect and there has been for a
very long time now a usage pattern where if you complete a boundary in
one flush and then complete a boundary that has stylehsheet dependencies
in another flush you can get a large precomputed chunk that was not
being cloned to be sent twice causing streaming errors.
I've thought about why we even went with this solution in the first
place and I think it was a mistake. It relies on a dev only check to
catch paired with potentially version specific order of operations on
the streaming side. This is too unreliable. Additionally the low limit
of view size for Edge is not used in Node.js but there is not real
justification for this.
In this change I updated the view size for edge streaming to match Node
at 2048 bytes which is still relatively small and we have no data one
way or another to preference 512 over this. Then I updated the assertion
logic to error anytime a precomputed chunk exceeds the size. This
eliminates the need to clone these chunks by just making sure our view
size is always larger than the largest precomputed chunk we can possibly
write. I'm generally in favor of this for a few reasons.
First, we'll always know during testing whether we've violated the limit
as long as we exercise each stream config because the precomputed chunks
are created in module scope. Second, we can always split up large chunks
so making sure the precomptued chunk is smaller than whatever view size
we actually desire is relatively trivial.
To support MPA-style form submissions, useFormState sends down a key
that represents the identity of the hook on the page. It's based on the
key path of the component within the React tree; for deeply nested
hooks, this keypath can become very long. We can hash the key to make it
shorter.
Adds a method called createFastHash to the Stream Config interface.
We're not using this for security or obfuscation, only to generate a
more compact key without sacrificing too much collision resistance.
- In Node.js builds, createFastHash uses the built-in crypto module.
- In Bun builds, createFastHash uses Bun.hash. See:
https://bun.sh/docs/api/hashing#bun-hash
I have not yet implemented createFastHash in the Edge, Browser, or FB
(Hermes) stream configs because those environments do not have a
built-in hashing function that meets our requirements. (We can't use the
web standard `crypto` API because those methods are async, and yielding
to the main thread is too costly to be worth it for this particular use
case.) We'll likely use a pure JS implementation in those environments;
for now, they just return the original string without hashing it. I'll
address this in separate PRs.
This uses the same mechanism as [large
strings](https://github.com/facebook/react/pull/26932) to encode chunks
of length based binary data in the RSC payload behind a flag.
I introduce a new BinaryChunk type that's specific to each stream and
ways to convert into it. That's because we sometimes need all chunks to
be Uint8Array for the output, even if the source is another array buffer
view, and sometimes we need to clone it before transferring.
Each type of typed array is its own row tag. This lets us ensure that
the instance is directly in the right format in the cached entry instead
of creating a wrapper at each reference. Ideally this is also how
Map/Set should work but those are lazy which complicates that approach a
bit.
We assume both server and client use little-endian for now. If we want
to support other modes, we'd convert it to/from little-endian so that
the transfer protocol is always little-endian. That way the common
clients can be the fastest possible.
So far this only implements Server to Client. Still need to implement
Client to Server for parity.
NOTE: This is the first time we make RSC effectively a binary format.
This is not compatible with existing SSR techniques which serialize the
stream as unicode in the HTML. To be compatible, those implementations
would have to use base64 or something like that. Which is what we'll do
when we move this technique to be built-in to Fizz.
This introduces a Text row (T) which is essentially a string blob and
refactors the parsing to now happen at the binary level.
```
RowID + ":" + "T" + ByteLengthInHex + "," + Text
```
Today, we encode all row data in JSON, which conveniently never has
newline characters and so we use newline as the line terminator. We
can't do that if we pass arbitrary unicode without escaping it. Instead,
we pass the byte length (in hexadecimal) in the leading header for this
row tag followed by a comma.
We could be clever and use fixed or variable-length binary integers for
the row id and length but it's not worth the more difficult
debuggability so we keep these human readable in text.
Before this PR, we used to decode the binary stream into UTF-8 strings
before parsing them. This is inefficient because sometimes the slices
end up having to be copied so it's better to decode it directly into the
format. The follow up to this is also to add support for binary data and
then we can't assume the entire payload is UTF-8 anyway. So this
refactors the parser to parse the rows in binary and then decode the
result into UTF-8. It does add some overhead to decoding on a per row
basis though.
Since we do this, we need to encode the byte length that we want decode
- not the string length. Therefore, this requires clients to receive
binary data and why I had to delete the string option.
It also means that I had to add a way to get the byteLength from a chunk
since they're not always binary. For Web streams it's easy since they're
always typed arrays. For Node streams it's trickier so we use the
byteLength helper which may not be very efficient. Might be worth
eagerly encoding them to UTF8 - perhaps only for this case.
Stacked on #26557
Supporting Float methods such as ReactDOM.preload() are challenging for
flight because it does not have an easy means to convey direct
executions in other environments. Because the flight wire format is a
JSON-like serialization that is expected to be rendered it currently
only describes renderable elements. We need a way to convey a function
invocation that gets run in the context of the client environment
whether that is Fizz or Fiber.
Fiber is somewhat straightforward because the HostDispatcher is always
active and we can just have the FlightClient dispatch the serialized
directive.
Fizz is much more challenging becaue the dispatcher is always scoped but
the specific request the dispatch belongs to is not readily available.
Environments that support AsyncLocalStorage (or in the future
AsyncContext) we will use this to be able to resolve directives in Fizz
to the appropriate Request. For other environments directives will be
elided. Right now this is pragmatic and non-breaking because all
directives are opportunistic and non-critical. If this changes in the
future we will need to reconsider how widespread support for async
context tracking is.
For Flight, if AsyncLocalStorage is available Float methods can be
called before and after await points and be expected to work. If
AsyncLocalStorage is not available float methods called in the sync
phase of a component render will be captured but anything after an await
point will be a noop. If a float call is dropped in this manner a DEV
warning should help you realize your code may need to be modified.
This PR also introduces a way for resources (Fizz) and hints (Flight) to
flush even if there is not active task being worked on. This will help
when Float methods are called in between async points within a function
execution but the task is blocked on the entire function finishing.
This PR also introduces deduping of Hints in Flight using the same
resource keys used in Fizz. This will help shrink payload sizes when the
same hint is attempted to emit over and over again
Added an explicit type to all $FlowFixMe suppressions to reduce
over-suppressions of new errors that might be caused on the same lines.
Also removes suppressions that aren't used (e.g. in a `@noflow` file as
they're purely misleading)
Test Plan:
yarn flow-ci
We encode strings 2048 UTF-8 bytes at a time. If the string we are
encoding crosses to the next chunk but the current chunk doesn't fit an
integral number of characters, we need to make sure not to send the
whole buffer, only the bytes that are actually meaningful.
Fixes#24985. I was able to verify that this fixes the repro shared in
the issue (be careful when testing because the null bytes do not show
when printed to my terminal, at least). However, I don't see a clear way
to add a test for this that will be resilient to small changes in how we
encode the markup (since it depends on where specific multibyte
characters fall against the 2048-byte boundaries).
The old version of prettier we were using didn't support the Flow syntax
to access properties in a type using `SomeType['prop']`. This updates
`prettier` and `rollup-plugin-prettier` to the latest versions.
I added the prettier config `arrowParens: "avoid"` to reduce the diff
size as the default has changed in Prettier 2.0. The largest amount of
changes comes from function expressions now having a space. This doesn't
have an option to preserve the old behavior, so we have to update this.
## Edit
Went for another approach after talking with @gnoff. The approach is
now:
- add a dev-only error when a precomputed chunk is too big to be written
- suggest to copy it before passing it to `writeChunk`
This PR also includes porting the React Float tests to use the browser
build of Fizz so that we can test it out on that environment (which is
the one used by next).
<!--
Thanks for submitting a pull request!
We appreciate you spending the time to work on these changes. Please
provide enough information so that others can review your pull request.
The three fields below are mandatory.
Before submitting a pull request, please make sure the following is
done:
1. Fork [the repository](https://github.com/facebook/react) and create
your branch from `main`.
2. Run `yarn` in the repository root.
3. If you've fixed a bug or added code that should be tested, add tests!
4. Ensure the test suite passes (`yarn test`). Tip: `yarn test --watch
TestName` is helpful in development.
5. Run `yarn test --prod` to test in the production environment. It
supports the same options as `yarn test`.
6. If you need a debugger, run `yarn debug-test --watch TestName`, open
`chrome://inspect`, and press "Inspect".
7. Format your code with
[prettier](https://github.com/prettier/prettier) (`yarn prettier`).
8. Make sure your code lints (`yarn lint`). Tip: `yarn linc` to only
check changed files.
9. Run the [Flow](https://flowtype.org/) type checks (`yarn flow`).
10. If you haven't already, complete the CLA.
Learn more about contributing:
https://reactjs.org/docs/how-to-contribute.html
-->
## Summary
Someone reported [a bug](https://github.com/vercel/next.js/issues/42466)
in Next.js that pointed to an issue with Node 18 in the streaming
renderer when using importing a CSS module where it only returned a
malformed bootstraping script only after loading the page once.
After investigating a bit, here's what I found:
- when using a CSS module in Next, we go into this code path, which
writes the aforementioned bootstrapping script
5f7ef8c4cb/packages/react-dom-bindings/src/server/ReactDOMServerFormatConfig.js (L2443-L2447)
- the reason for the malformed script is that
`completeBoundaryWithStylesScript1FullBoth` is emptied after the call to
`writeChunk`
- it gets emptied in `writeChunk` because we stream the chunk directly
without copying it in this codepath
a438590144/packages/react-server/src/ReactServerStreamConfigBrowser.js (L63)
- the reason why it only happens from Node 18 is because the Webstreams
APIs are available natively from that version and in their
implementation, [`enqueue` transfers the array buffer
ownership](9454ba6138/lib/internal/webstreams/readablestream.js (L2641)),
thus making it unavailable/empty for subsequent calls. In older Node
versions, we don't encounter the bug because we are using a polyfill in
Next.js, [which does not implement properly the array buffer transfer
behaviour](d354a7457c/src/lib/abstract-ops/ecmascript.ts (L16)).
I think the proper fix for this is to clone the array buffer before
enqueuing it. (we do this in the other code paths in the function later
on, see ```((currentView: any): Uint8Array).set(bytesToWrite,
writtenBytes);```
## How did you test this change?
Manually tested by applying the change in the compiled Next.js version.
<!--
Demonstrate the code is solid. Example: The exact commands you ran and
their output, screenshots / videos if the pull request changes the user
interface.
How exactly did you verify that your PR solves the issue you wanted to
solve?
If you leave this empty, your PR will very likely be closed.
-->
Co-authored-by: Sebastian Markbage <sebastian@calyptus.eu>
This extends the scope of the cache and fetch instrumentation using
AsyncLocalStorage for microtasks. This is an intermediate step. It sets
up the dispatcher only once. This is unique to RSC because it uses the
react.shared-subset module for its shared state.
Ideally we should support multiple renderers. We should also have this
take over from an outer SSR's instrumented fetch. We should also be able
to have a fallback to global state per request where AsyncLocalStorage
doesn't exist and then the whole client-side solutions. I'm still
figuring out the right wiring for that so this is a temporary hack.
escapeTextForBrowser accepts any type so flow did not identify that we
were escaping a Chunk rather than a string. It's tricky because we
sometimes want to be able to escape non strings.
I've also updated the types for `Chunk` and `escapeTextForBrowser`
so that we should be able to catch this statically in the future.
The reason this did not show up in tests is almost all of our tests of
float (the areas affected by transpositions) are tested using the Node
runtime where a chunk type is a string. It may be wise to run these
tests in every runtime in the future or at least make sure there is
broad representation of resources in each specific runtime test suite.
* Facebook -> Meta in copyright
rg --files | xargs sed -i 's#Copyright (c) Facebook, Inc. and its affiliates.#Copyright (c) Meta Platforms, Inc. and affiliates.#g'
* Manual tweaks
- method unbinding is no longer supported in Flow for soundness, this added a bunch of suppressions
- Flow now prevents objects to be supertypes of interfaces/classes
ghstack-source-id: d7749cbad8
Pull Request resolved: https://github.com/facebook/react/pull/25412
This upgrade made more expressions invalidate refinements. In some
places this lead to a large number of suppressions that I automatically
suppressed and should be followed up on when the code is touched.
I think most of them might require either manual annotations or moving
a value into a const to allow refinement.
ghstack-source-id: a45b40abf0
Pull Request resolved: https://github.com/facebook/react/pull/25410
* Add fixture for comparing baseline render perf for renderToString and renderToPipeableStream
Modified from ssr2 and https://github.com/SuperOleg39/react-ssr-perf-test
* Implement buffering in pipeable streams
The previous implementation of pipeable streaming (Node) suffered some performance issues brought about by the high chunk counts and innefficiencies with how node streams handle this situation. In particular the use of cork/uncork was meant to alleviate this but these methods do not do anything unless the receiving Writable Stream implements _writev which many won't.
This change adopts the view based buffering techniques previously implemented for the Browser execution context. The main difference is the use of backpressure provided by the writable stream which is not implementable in the other context. Another change to note is the use of standards constructs like TextEncoder and TypedArrays.
* Implement encodeInto during flushCompletedQueues
encodeInto allows us to write directly to the view buffer that will end up getting streamed instead of encoding into an intermediate buffer and then copying that data.
This function was modeled after Node streams where write returns a boolean
whether to keep writing or not. I think we should probably switch this
up and read desired size explicitly in appropriate places.
However, in the meantime, we don't have to return a value where we're
not going to use it. So I split this so that we call writeChunkAndReturn
if we're going to return the boolean.
This should help with the compilation so that they can be inlined.
* Wire up DOM legacy build
* Hack to filter extra comments for testing purposes
* Use string concat in renderToString
I think this might be faster. We could probably use a combination of this
technique in the stream too to lower the overhead.
* Error if we can't complete the root synchronously
Maybe this should always error but in the async forms we can just delay
the stream until it resolves so it does have some useful semantics.
In the synchronous form it's never useful though. I'm mostly adding the
error because we're testing this behavior for renderToString specifically.
* Gate memory leak tests of internals
These tests don't translate as is to the new implementation and have been
ported to the Fizz tests separately.
* Enable Fizz legacy mode in stable
* Add wrapper around the ServerFormatConfig for legacy mode
This ensures that we can inject custom overrides without negatively
affecting the new implementation.
This adds another field for static mark up for example.
* Wrap pushTextInstance to avoid emitting comments for text in static markup
* Don't emit static mark up for completed suspense boundaries
Completed and client rendered boundaries are only marked for the client
to take over.
Pending boundaries are still supported in case you stream non-hydratable
mark up.
* Wire up generateStaticMarkup to static API entry points
* Mark as renderer for stable
This shouldn't affect the FB one ideally but it's done with the same build
so let's hope this works.
Some legacy environments can not encode non-strings. Those would specify
both as strings. They'll throw for binary data.
Some environments have to encode strings (like web streams). Those would
encode both as uint8array.
Some environments (like Node) can do either. It can be beneficial to leave
things as strings in case the native stream can do something smart with it.
* Destroy the stream with an error if the root throws
But not if the error happens inside a suspense boundary.
* Try rewriting the test to see if it works in other Node envs
* Enable prefer-const rule
Stylistically I don't like this but Closure Compiler takes advantage of
this information.
* Auto-fix lints
* Manually fix the remaining callsites
* Rename to clarify that it's client-only
* Rename FizzStreamer to FizzServer for consistency
* Rename react-flight to react-client/flight
For consistency with react-server. Currently this just includes flight
but it could be expanded to include the whole reconciler.
* Add Relay Flight Build
* Rename ReactServerHostConfig to ReactServerStreamConfig
This will be the config specifically for streaming purposes.
There will be other configs for other purposes.