Virtual File System for Node.js #61478
The `notable-change` label has been added. Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section.
Nice! This is a great addition. Since it's such a large PR, this will take me some time to review. Will try to tackle it over the next week.
   */
  existsSync(path) {
    // Prepend prefix to path for VFS lookup
    const fullPath = this.#prefix + (StringPrototypeStartsWith(path, '/') ? path : '/' + path);
Can we use path.join?
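For illustration, a minimal sketch of what the suggestion amounts to, assuming VFS paths are always POSIX-style. The helper name `toVfsPath` and the `/sea` prefix value are hypothetical and not taken from the PR's code.

```javascript
const path = require('node:path');

// Hypothetical helper: combine a mount prefix with a user-supplied path.
// path.posix.join inserts the separator and collapses duplicate slashes.
function toVfsPath(prefix, p) {
  return path.posix.join(prefix, p);
}

console.log(toVfsPath('/sea', 'config.json'));    // '/sea/config.json'
console.log(toVfsPath('/sea', '/config.json'));   // '/sea/config.json'
console.log(toVfsPath('/sea/', '//config.json')); // '/sea/config.json'
```

One design caveat: `join` also resolves `..` segments, so `toVfsPath('/sea', '../etc')` yields `/etc`. Code that must stay under the prefix would need an extra check.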
  validateObject(files, 'options.files');
}

const { VirtualFileSystem } = require('internal/vfs/virtual_fs');
Shouldn't we import this at the top level / lazy load it at the top level?
ArrayPrototypePush(this.#mocks, {
  __proto__: null,
  ctx,
  restore: restoreFS,
Suggested change:
- restore: restoreFS,
+ restore: ctx.restore,
nit
lib/internal/vfs/entries.js
Outdated
   * @param {object} [options] Optional configuration
   */
  addFile(name, content, options) {
    const path = this._directory.path + '/' + name;
Can we use path.join?
lib/internal/vfs/virtual_fs.js
Outdated
let entry = current.getEntry(segment);
if (!entry) {
  // Auto-create parent directory
  const dirPath = '/' + segments.slice(0, i + 1).join('/');
Let's use path.join
lib/internal/vfs/virtual_fs.js
Outdated
let entry = current.getEntry(segment);
if (!entry) {
  // Auto-create parent directory
  const parentPath = '/' + segments.slice(0, i + 1).join('/');
path.join?
lib/internal/vfs/virtual_fs.js
Outdated
    }
  }
  callback(null, content);
}).catch((err) => {
Suggested change:
- }).catch((err) => {
+ }, (err) => {
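A small sketch of why this suggestion matters: with `.then(ok).catch(fail)`, an exception thrown inside the user's callback is routed into `fail`, so the callback can be invoked a second time. The helper `countCalls` and the throwing callback are hypothetical and exist only to make the difference observable.

```javascript
// Counts how often a callback that throws on success gets invoked
// under each style of error handling.
function countCalls(attach) {
  let calls = 0;
  const callback = (err) => {
    calls++;
    if (!err) throw new Error('bug in user callback');
  };
  const settled = attach(Promise.resolve('data'), callback);
  // Swallow the final rejection (if any) and report the call count.
  return settled.then(() => calls, () => calls);
}

// Style 1: .then(...).catch(...) also catches errors thrown by the callback.
const withCatch = countCalls((p, cb) =>
  p.then((v) => cb(null, v)).catch((err) => cb(err)));

// Style 2: .then(onFulfilled, onRejected) only handles rejections of `p`.
const withTwoArgs = countCalls((p, cb) =>
  p.then((v) => cb(null, v), (err) => cb(err)));

Promise.all([withCatch, withTwoArgs]).then(([a, b]) => {
  console.log(a, b); // 2 1
});
```

In the two-argument form the callback's own exception still surfaces as a rejection of the returned promise; it just is not fed back into the callback.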
lib/internal/vfs/virtual_fs.js
Outdated
const bytesToRead = Math.min(length, available);
content.copy(buffer, offset, readPos, readPos + bytesToRead);
Primordials?
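For readers unfamiliar with the term: inside Node.js core, primordials are references to built-ins captured at startup (names such as `MathMin` or `StringPrototypeStartsWith`), so that later monkey-patching by user code cannot affect core behavior. The `primordials` object is not available outside core, so the sketch below only emulates the idea; `uncurryThis` here is a local helper written for this example.

```javascript
// Emulation of the primordials idea: capture built-ins before user code
// has a chance to patch them.
const MathMin = Math.min;
const uncurryThis = (fn) => Function.prototype.call.bind(fn);
const StringPrototypeStartsWith = uncurryThis(String.prototype.startsWith);

// Simulate user code monkey-patching a global later on.
const originalMin = Math.min;
Math.min = () => 42;

const patched = Math.min(3, 10); // affected by the patch
const safe = MathMin(3, 10);     // uses the captured reference
console.log(patched, safe); // 42 3
console.log(StringPrototypeStartsWith('/sea/app.js', '/sea')); // true

Math.min = originalMin; // restore the global for the rest of the program
```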
lib/internal/vfs/virtual_fs.js
Outdated
}

callback(null, bytesToRead, buffer);
}).catch((err) => {
Suggested change:
- }).catch((err) => {
+ }, (err) => {
Left an initial review, but like @Ethan-Arrowood said, it'll take time for a more in-depth look.
It's nice to see some momentum in this area, though from a first glance it seems the design has largely overlooked the feedback from real world use cases collected 4 years ago: https://github.com/nodejs/single-executable/blob/main/docs/virtual-file-system-requirements.md - I think it's worth checking that the API satisfies the constraints that users of this feature have provided, so as not to waste the work that has been done by prior contributors to gather them, or have to reinvent it later (possibly in a breaking manner) to satisfy these requirements from real world use cases.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61478      +/-   ##
==========================================
- Coverage   89.74%   89.67%   -0.08%
==========================================
  Files         675      690      +15
  Lines      204601   210640    +6039
  Branches    39325    40148     +823
==========================================
+ Hits       183616   188883    +5267
- Misses      13273    14043     +770
- Partials     7712     7714       +2
And why not something like OPFS aka whatwg/fs?

const rootHandle = await navigator.storage.getDirectory()
await rootHandle.getFileHandle('config.json', { create: true })
fs.mount('/app', rootHandle) // to make it work with fs
fs.readFileSync('/app/config.json')

OR

const rootHandle = await navigator.storage.getDirectory()
await rootHandle.getFileHandle('config.json', { create: true })
fs.readFileSync('sandbox:/config.json')

fs.createVirtual seems like a competing specification.
Force-pushed from 5e317de to 977cc3d.
I generally prefer not to interleave with WHATWG specs as much as possible for core functionality (e.g., SEA). In my experience, they tend to perform poorly on our codebase and remove a few degrees of flexibility. (I also don't find much fun in working on them, and I'm way less interested in contributing to that.) On the implementation side, the core functionality of this feature will be identical (technically, it's missing writes that OPFS supports), as we would need to impact all our internal fs methods anyway. If this lands, we can certainly iterate on a WHATWG-compatible API for this, but I would not add this to this PR.
Small prior art: https://github.com/juliangruber/subfs
Force-pushed from 8d711c1 to 73c18cd.
I also worked on this a bit on the side recently: Qard@73b8fc6

That is very much in chaotic ideation stage with a bunch of LLM assistance to try some different ideas, but the broader concept I was aiming for was to have a

module.exports = new VirtualFileSystem(new LocalProvider())

I intended for it to be extensible for a bunch of different interesting scenarios, so there's also an S3 provider and a zip file provider there, mainly just to validate that the model can be applied to other varieties of storage systems effectively. Keep in mind, like I said, the current state is very much just ideation in a branch I pushed up just now to share, but I think there are concepts for extensibility in there that we could consider to enable a whole ecosystem of flexible storage providers. 🙂

Personally, I would hope for something which could provide both read and write access through an abstraction with swappable backends of some variety, this way we could pass around these virtualized file systems like objects and let an ecosystem grow around accepting any generalized virtual file system for its storage backing. I think it'd be very nice for a lot of use cases like file uploads or archive management to be able to just treat them like any other readable and writable file system.
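To make the swappable-backend idea concrete, here is a rough sketch under stated assumptions: the provider contract (two synchronous methods) and the class names `InMemoryProvider` and `SketchFileSystem` are invented for this example and do not come from the linked branch or from this PR.

```javascript
// Hypothetical provider contract: any object with these two methods.
class InMemoryProvider {
  #files = new Map();

  readFileSync(p) {
    if (!this.#files.has(p)) {
      const err = new Error(`ENOENT: no such file, open '${p}'`);
      err.code = 'ENOENT';
      throw err;
    }
    return this.#files.get(p);
  }

  writeFileSync(p, data) {
    this.#files.set(p, Buffer.from(data));
  }
}

// The facade delegates to whichever backend it was constructed with.
class SketchFileSystem {
  #provider;

  constructor(provider) {
    this.#provider = provider;
  }

  readFileSync(p, encoding) {
    const buf = this.#provider.readFileSync(p);
    return encoding ? buf.toString(encoding) : buf;
  }

  writeFileSync(p, data) {
    this.#provider.writeFileSync(p, data);
  }
}

const sketchFs = new SketchFileSystem(new InMemoryProvider());
sketchFs.writeFileSync('/upload.txt', 'Hello');
console.log(sketchFs.readFileSync('/upload.txt', 'utf8')); // 'Hello'
```

Because callers only see the facade, an archive-backed or remote provider could be substituted without changing code that consumes the file system object.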
just a bit off topic... but this reminds me of why i created this feature request:

Would not lie, it would be cool if NodeJS also provided some type of static example that would only work in NodeJS (based on how it works internally)

const size = 26
const blobPart = BlobFrom({
size,
stream (start, end) {
// can either be sync or async (that resolves to a ReadableStream)
// return new Response('abcdefghijklmnopqrstuvwxyz'.slice(start, end)).body
// return new Blob(['abcdefghijklmnopqrstuvwxyz'.slice(start, end)]).stream()
return fetch('https://httpbin.dev/range/' + size, {
headers: {
range: `bytes=${start}-${end - 1}`
}
}).then(r => r.body)
}
})
blobPart.text().then(text => {
console.log('a-z', text)
})
blobPart.slice(-3).text().then(text => {
console.log('x-z', text)
})
const a = blobPart.slice(0, 6)
a.text().then(text => {
console.log('a-f', text)
})
const b = a.slice(2, 4)
b.text().then(text => {
console.log('c-d', text)
})

An actual working PoC (I would not rely on this unless it became officially supported by nodejs core - this is a hack)

const blob = new Blob()
const symbols = Object.getOwnPropertySymbols(blob)
const blobSymbol = symbols.map(s => [s.description, s])
const symbolMap = Object.fromEntries(blobSymbol)
const {
kHandle,
kLength,
} = symbolMap
function BlobFrom ({ size, stream }) {
const blob = new Blob()
if (size === 0) return blob
blob[kLength] = size
blob[kHandle] = {
span: [0, size],
getReader () {
const [start, end] = this.span
if (start === end) {
return { pull: cb => cb(0) }
}
let reader
return {
async pull (cb) {
reader ??= (await stream(start, end)).getReader()
const {done, value} = await reader.read()
cb(done ^ 1, value)
}
}
},
slice (start, end) {
const [baseStart] = this.span
return {
span: [baseStart + start, baseStart + end],
getReader: this.getReader,
slice: this.slice,
}
}
}
return blob
}

Currently problematic to do: this also needs to properly handle clone, serialize & deserialize. If this were to be sent off to another worker, then I would transfer a MessageChannel where the worker thread asks the main frame to hand back a transferable ReadableStream when it needs to read something. But there are probably better ways to handle this internally in core, piping data directly to and from different destinations without having to touch the JS runtime? If only getReader could return the reader directly instead of needing to read from the ReadableStream using JS?
- Separate readFileSync calls in SEA VFS example for clarity
- Make getSeaVfs() throw ERR_INVALID_ARG_VALUE if called with a different prefix than the first call, preventing subtle bugs

PR-URL: nodejs#61478
Separate initialization from retrieval for cleaner API:
- initSeaVfs(options): Initialize with custom options, throws if called twice (ERR_INVALID_STATE)
- getSeaVfs(): Get VFS instance, auto-initializes with defaults if not yet initialized

This is an internal-only API change; the public behavior remains the same (SEA VFS auto-initializes at startup).

PR-URL: nodejs#61478
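The init-once, get-lazily split described in this commit message can be sketched as follows. This is only an illustration of the pattern: the function names come from the commit message, but the bodies, the placeholder object, and the way the error code is attached are assumptions. The default `/sea` prefix is the mount point named in the PR description.

```javascript
let seaVfs = null;

// Explicit initialization: allowed exactly once.
function initSeaVfs(options = {}) {
  if (seaVfs !== null) {
    const err = new Error('SEA VFS is already initialized');
    err.code = 'ERR_INVALID_STATE'; // plain Error with a code, for the sketch
    throw err;
  }
  // Placeholder object standing in for a real VFS instance.
  seaVfs = { prefix: options.prefix ?? '/sea' };
  return seaVfs;
}

// Retrieval: auto-initializes with defaults on first use.
function getSeaVfs() {
  return seaVfs ?? initSeaVfs();
}

console.log(getSeaVfs().prefix); // '/sea'
try {
  initSeaVfs({ prefix: '/other' });
} catch (err) {
  console.log(err.code); // 'ERR_INVALID_STATE'
}
```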
- Add VFS docs link in test.md mock-fs section
- Update TODO precision in test-runner-mock-fs.js
- Move async truncate test to test-vfs-promises.js
- Expand promise tests with full method coverage
- Remove trivial assertion and clean up comments in test-vfs-chdir.js
- Add ESM cache explanation in test-vfs-import.mjs
- Split overlay worker code into test/fixtures/vfs-overlay-worker.js
- Reorganize test-vfs-provider-memory.js to provider-specific tests
- Add ESM require tests to test-vfs-require.js
Remove normalizePath and joinMountPath from router.js in favor of path.normalize, path.resolve, and path.join. Remove unused VFS_FD_BASE export from fd.js.
Inline VFS_FD_BASE constant since it was only used once. Replace custom getFormatFromExtension with extensionFormatMap from internal/modules/esm/formats.
- Expand test-vfs-provider-memory.js with comprehensive tests:
  - Provider instantiation and VFS creation
  - File operations (read, write, append, copy, rename, unlink)
  - Directory operations (mkdir, rmdir, readdir, recursive mkdir)
  - Stat/lstat operations with error handling
  - Symlink operations (symlink, readlink, lstat)
  - File handle operations (sync and async)
  - Mounting tests
  - Readonly mode tests (sync and async)
- Add promises.symlink and promises.readlink tests to test-vfs-promises.js for complete promises API coverage
added: REPLACEME
-->

* `provider` {VirtualProvider} The provider to use. **Default:** `MemoryProvider`.
ergonomically it might make sense to allow this to be string | VirtualProvider where string is a constrained set of default values like 'memory'.
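A minimal sketch of how such an option could be resolved, assuming a small registry of named built-ins. The classes below are empty stand-ins for the PR's provider classes, and `resolveProvider` and `builtinProviders` are hypothetical names.

```javascript
// Empty stand-ins for the PR's provider classes.
class VirtualProvider {}
class MemoryProvider extends VirtualProvider {}

// Constrained set of names that map to built-in providers.
const builtinProviders = new Map([
  ['memory', () => new MemoryProvider()],
]);

// Accepts either a provider instance or one of the known names.
function resolveProvider(provider = 'memory') {
  if (provider instanceof VirtualProvider) return provider;
  if (typeof provider === 'string' && builtinProviders.has(provider)) {
    return builtinProviders.get(provider)();
  }
  throw new TypeError(`Unknown provider: ${String(provider)}`);
}

console.log(resolveProvider('memory') instanceof MemoryProvider); // true
console.log(resolveProvider() instanceof MemoryProvider);         // true
```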
* `moduleHooks` {boolean} Enable module loading hooks. **Default:** `true`.
* `virtualCwd` {boolean} Enable virtual working directory. **Default:** `false`.

Creates a new `VirtualFileSystem` instance.
This should explain whether multiple `VirtualFileSystem` instances can be created and how they potentially interact, if at all.
const myVfs = vfs.create();
myVfs.writeFileSync('/data.txt', 'Hello');
myVfs.mount('/virtual');
From an ergonomics perspective, it might be worth exploring whether create should have an auto mount option where mounting happens on creation...
const myVfs = vfs.create('memory', {
  mount: '/virtual',
});

Unmounts the virtual file system. After unmounting, virtual files are no longer
accessible through the `fs` module. The VFS can be remounted at the same or a
different path by calling `mount()` again. Unmounting also resets the virtual
working directory if one was set.
I assume unmount() is idempotent.
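Idempotency here would mean that a second `unmount()` neither throws nor changes state. A tiny sketch of that property, using a hypothetical `MountState` class that only models the bookkeeping and is not the PR's `VirtualFileSystem`:

```javascript
// Hypothetical mount bookkeeping, for illustration only.
class MountState {
  #mountPoint = null;

  get mounted() {
    return this.#mountPoint !== null;
  }

  mount(prefix) {
    this.#mountPoint = prefix;
  }

  // Idempotent: clearing an already-cleared mount point is a no-op.
  unmount() {
    this.#mountPoint = null;
  }
}

const state = new MountState();
state.mount('/virtual');
state.unmount();
state.unmount(); // second call neither throws nor changes state
console.log(state.mounted); // false
```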
const myVfs = vfs.create();
myVfs.writeFileSync('/data.txt', 'Hello');
myVfs.mount('/virtual');
Also, this should probably show an example of mounting on a Windows system. Does it use drive letters?
| realVfs.chdir('/project/src'); | ||
| console.log(realVfs.cwd()); // '/project/src' | ||
| } | ||
| ``` |
Mentioning worker threads here raises a question: should VirtualFileSystem be cloneable/transferable such that we can do..

parentPort.postMessage(vfs);

So that the VirtualFileSystem instance can be shared/transferred across thread boundaries? Doesn't need to be done now but it's worth considering for later.
Returns `true` if overlay mode is enabled. In overlay mode, the VFS only
intercepts paths that exist in the VFS, allowing other paths to fall through
to the real file system.
This should probably discuss what happens when you're overlaying on a file system that is case-sensitive vs. one that is case-insensitive.
There's also a question about encodings of file/directory names. The clarification would be: in overlay mode, does handling of the file names inherit the behavior of the overlaid file system or overwrite it?
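The case-sensitivity concern can be illustrated with a toy model, assuming virtual entries are matched by exact string comparison (an assumption for this sketch, not a statement about the PR's lookup). `virtualFiles`, `readFromDisk`, and `overlayRead` are hypothetical names, and the "disk" is a stub so the output is deterministic.

```javascript
// Hypothetical model: virtual entries keyed by exact (case-sensitive) path.
const virtualFiles = new Map([
  ['/data/Config.json', '{"source":"vfs"}'],
]);

// Stand-in for the real file system so the sketch stays deterministic.
const readFromDisk = (p) => `disk:${p}`;

// Overlay mode: intercept only paths that exist in the VFS.
function overlayRead(p) {
  return virtualFiles.has(p) ? virtualFiles.get(p) : readFromDisk(p);
}

console.log(overlayRead('/data/Config.json')); // served from the VFS
console.log(overlayRead('/data/config.json')); // falls through to "disk"
```

On a case-insensitive real file system both spellings would name the same on-disk file, so in this model one spelling is shadowed and the other is not.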
myVfs.writeFileSync('/app/greet.js', 'module.exports = (name) => "Hello, " + name + "!";');

// Mount the VFS at a path prefix
myVfs.mount('/virtual');
As a security backup, in case a malicious module decided to create a virtual file system secretly to try to intercept writes to critical paths, it might make sense to support a process.on('vfs-mount') and process.on('vfs-unmount') set of events. An application can be notified when a path is overlaid.
doc/api/vfs.md
Outdated
const myVfs = vfs.create();
myVfs.mkdirSync('/data');
myVfs.writeFileSync('/data/config.json', '{}');
Suggested change:
- myVfs.writeFileSync('/data/config.json', '{}');
+ myVfs.writeFileSync('/data/config.json', JSON.stringify({}));
doc/api/vfs.md
Outdated
const myVfs = vfs.create({ overlay: true });
myVfs.mkdirSync('/data');
myVfs.writeFileSync('/data/config.json', '{"source": "vfs"}');
Suggested change:
- myVfs.writeFileSync('/data/config.json', '{"source": "vfs"}');
+ myVfs.writeFileSync('/data/config.json', JSON.stringify({source: 'vfs'}));
this.#autoClose = options.autoClose !== false;

// Open the file on next tick so listeners can be attached
process.nextTick(() => this.#openFile());
It's worth adding a short comment here that #openFile will not throw and if it fails the stream will be destroyed.
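The pattern under discussion, sketched with a plain `Readable`: opening is deferred to the next tick so that listeners attached right after construction still observe the outcome, and a failed open destroys the stream instead of throwing. `LazyOpenStream` and its `open` argument are hypothetical and not the PR's stream class.

```javascript
const { Readable } = require('node:stream');

class LazyOpenStream extends Readable {
  constructor(open) {
    super();
    this.opened = false;
    // Defer opening so callers can attach 'error'/'open' listeners first.
    process.nextTick(() => this.#openFile(open));
  }

  // Never throws: a failure destroys the stream, which emits 'error'.
  #openFile(open) {
    try {
      open();
      this.opened = true;
      this.emit('open');
    } catch (err) {
      this.destroy(err);
    }
  }

  _read() {
    this.push(null); // no data in this sketch; end immediately
  }
}

const stream = new LazyOpenStream(() => {
  throw new Error('ENOENT (simulated)');
});
// Attached after construction, yet still sees the failure.
stream.on('error', (err) => console.log('caught:', err.message));
```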
  return new Glob(pattern, options).globSync();
}

Unnecessary whitespace change.
- Use JSON.stringify() instead of string literal in overlay example
- Document multiple VirtualFileSystem instances interaction
- Clarify mountPoint returns absolute path
- Document unmount() is idempotent
- Add Windows mount example with drive letters
Add test-vfs-windows.js to verify VFS mounting with Windows drive letter paths. Tests include:
- Mounting at paths with drive letters (e.g., C:\temp\vfs-test)
- Mounting at drive root (e.g., C:\vfs-test-root)
- Verifying mountPoint returns Windows-style absolute path
- Require from Windows VFS paths
- Revert unintended whitespace change in lib/fs.js
- Use JSON.stringify() in documentation examples for consistency
- Add comment in streams.js explaining #openFile error handling
- Document path encoding behavior in overlay mode
the native file system APIs which handle encoding according to platform
conventions (UTF-8 on most Unix systems, UTF-16 on Windows). This means the
VFS inherits the underlying file system's encoding behavior for paths that
fall through, while VFS-internal paths always use UTF-8.
I'd extend this with a warning, plus a simple example, that some paths on disk might not be shadowed while others might be shadowed unexpectedly. I just know someone is going to hit this case and it's going to come up as a bug report.
Add security monitoring events that are emitted when a VFS is mounted or unmounted. Applications can listen to these events to detect unauthorized VFS usage or enforce security policies. Events include:
- mountPoint: The path where the VFS is mounted
- overlay: Whether overlay mode is enabled
- readonly: Whether the VFS is read-only
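A sketch of how an application might consume such events, under stated assumptions: the event name `vfs-mount` is the one proposed in the review comment, the payload is assumed to be a single object carrying the three fields listed above, and the `process.emit` calls merely simulate what core would emit. `allowedMounts` and `rejected` are hypothetical names.

```javascript
// Hypothetical policy: only these mount points are expected.
const allowedMounts = new Set(['/virtual']);
const rejected = [];

process.on('vfs-mount', (info) => {
  if (!allowedMounts.has(info.mountPoint)) {
    rejected.push(info.mountPoint);
  }
});

// Simulated emissions, standing in for events core would emit on mount().
process.emit('vfs-mount', { mountPoint: '/virtual', overlay: false, readonly: false });
process.emit('vfs-mount', { mountPoint: '/etc', overlay: true, readonly: false });

console.log(rejected); // [ '/etc' ]
```

A real policy would likely do more than record the path, for example logging or terminating the process, but that choice is application-specific.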
A first-class virtual file system module (`node:vfs`) with a provider-based architecture that integrates with Node.js's fs module and module loader.

Key Features

- Provider Architecture - Extensible design with pluggable providers:
  - `MemoryProvider` - In-memory file system with full read/write support
  - `SEAProvider` - Read-only access to Single Executable Application assets
  - `VirtualProvider` - Base class for creating custom providers
- Standard fs API - Uses familiar `writeFileSync`, `readFileSync`, `mkdirSync` instead of custom methods
- Mount Mode - VFS mounts at a specific path prefix (e.g., `/virtual`), clear separation from real filesystem
- Module Loading - `require()` and `import` work seamlessly from virtual files
- SEA Integration - Assets automatically mounted at `/sea` when running as a Single Executable Application
- Full fs Support - readFile, stat, readdir, exists, streams, promises, glob, symlinks

Example
SEA Usage
When running as a Single Executable Application, bundled assets are automatically available:
Public API
Disclaimer: I've used a significant amount of Claude Code tokens to create this PR. I've reviewed all changes myself.