-
Notifications
You must be signed in to change notification settings - Fork 249
reboot wit-bindgen-go
#1447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
reboot wit-bindgen-go
#1447
Conversation
|
I just realized I need to make CI run the go tests. I'll push an update. |
This re-introduces Go support, with a new implementation written from scratch. While the previous incarnation targeted TinyGo and reused the C generator for lifting and lowering, this one targets "big" Go and handles lifting and lowering itself. In addition, it supports idiomatic, goroutine-based concurrency on top of the component model async ABI, including streams and futures. This implementation is also distinct from the [go-modules](https://github.com/bytecodealliance/go-modules) project, which has a number of limitations (e.g. no async support, not safe for use with "big" Go) and is no longer being actively maintained. Note that the async support currently requires [a small patch](dicej/go@a1c8322) to the Go `runtime` library. I plan to work with the upstream project to make that patch unecessary in the future. Components that don't use async features should work with stock, unpatched Go. One of the tricky parts about lowering values and passing pointers to e.g. `{stream,future}.{read,write}` is that we must tell the Go garbage collector to pin the pointers (i.e. mark the pointee as both immovable and uncollectable) for as long as the host has access to them, but then unpin them promptly afterward to avoid leaks. We do this using `runtime.Pinner` instances, most of which are locally scoped and easy to reason about. However, we must use a couple of globally-scoped pinners as well: one for `cabi_realloc` allocations by the host when it lowers parameters (which we unpin after lifting those parameters) and another for returning values from sync-lifted exports (which we unpin in post-return functions). There are a couple of known missing features which I'll open GitHub issues for: 1. Resource handles are not being restored for unwritten items in the case of an incomplete stream or future write. 2. Importing and/or exporting multiple versions of the same package will cause name clashes. In addition, I plan to expand the test coverage beyond the basics covered in this commit. Signed-off-by: Joel Dice <joel.dice@fermyon.com>
alexcrichton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed we'll want to ge this running on CI, but overall looking great! I only skimmed codegen though and would mostly lean on tests to exercise bits and bobs
| func (f *FutureReader[T]) Drop() { | ||
| handle := f.handle | ||
| if handle != 0 { | ||
| f.handle = 0 | ||
| f.vtable.DropReadable(handle) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend pairing this with a finalizer installed via the GC on f to auto-call this if it's not otherwise called. Calling this explicitly would still be supported but it'd optionally be cleaned up if forgotten or otherwise not called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just link to my previous rant on this topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, to clarify, I'm not saying this method should be removed. Can you explain why a finalizer specifically should not be present?
My impression is that a huge benefit of GC languages is if you don't think about resources everything works out and there are no leaks. Without a finalizer then programs will leak unless Drop is called, which to me I feel like will trip up a lot of folks using this. I understand that conventional usage will probably call Drop, but inevitably that will be forgotten and/or some error path will be encountered that doesn't call Drop, and to me it seems like it would be unfortunate if that results in a leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My position is that, because the GC is only concerned about memory pressure and is completely oblivious to non-memory resources such as resource/future/stream handles, it should not be responsible for disposing of those handles. As an application programmer, I'd prefer to leak them entirely rather than have them disposed at some unpredictable future time determined by an opaque algorithm whose job is completely unrelated to such things.
It feels akin to saying "if I forget to explicitly dispose this handle, please dispose of it for me the next time it rains in New York City". It makes reproducing and debugging resource management issues harder, not easier. I spent years dealing with this in the Java world before wising up and changing all the finalizers to log big, noisy error messages instead of trying to silently compensate for inadequate resource management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with Joel about this and the conclusion we had was "do what os.File does" which looks to have close + a finalizer, so current conclusion is to add finalizers to these streams + keep the close method.
| @@ -0,0 +1,41 @@ | |||
| package wit_types | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question on these packages: how are you envisioning that they're going to be distributed? For example should these be published as a standalone package that wit-bindgen refers to? Do these otherwise play well if there's multiple versions of the runtime linked into one binary?
Or is this basically "works at the app level nowhere else" for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is this basically "works at the app level nowhere else" for now?
Correct, and I think it's a reasonable place to start, at least. I do think we'll want to revisit that as we start publishing SDKs and/or an official WASI package containing pre-generated bindings, but I don't feel we need to address that right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, and I understand how this is a starting point, but would it be worthwhile to try to chart a course ahead? For example is there a possible viable path to using bindgen in libraries one day? (e.g. like wit-bindgen in Rust)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's definitely a path forward: wit_runtime.go, wit_async.go, wit_option.go, wit_stream.go, etc. could all be published as part of a reusable package which such libraries could depend on. I just don't feel we need that right away.
| } | ||
|
|
||
| func offset(ptr, align uintptr) uintptr { | ||
| newptr := (ptr + align - 1) &^ (align - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is &^ a typo? If so I'm curious, what does it do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stole that code from @tschneidereit's PR 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&^ is AND NOT, i.e. this aligns the pointer, and the function returns offset between the aligned and unaligned addresses. As the name "offset" clearly says, obviously. (Sorry for the poor naming and lack of documentation!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL!
| } | ||
| } | ||
|
|
||
| func (s *StreamReader[T]) ReadInto(dst []T) uint32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be the only interface that this module supports? And depending on Lift it either reads directly into the provided slice or otherwise allocates temporary storage and then deserializes from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can use this exact signature for an arbitrary T, since it would require that dst be already sized to the maximum desired item count and thus filled with already-initialized values. That wouldn't be possible if T is e.g. an imported resource type with no constructor, and even if it was constructable it wouldn't make sense to force the caller to fill the slice with dummy data just to have it overwritten.
We could however do something like ReadInto(dst []T) []T such that the caller passes a slice with zero size and nonzero capacity and returns the same slice with some number of items added. I'm still getting used to how Go slices work, but I think that's the idiom.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection of Go is that any type can be allocated with a zero pattern at any time, so it's not possible to have a public type with representation requirements. For example if T were an imported resource with no constructor then I believe you could still make a slice of T and the expectation is that you should initialize it before use. Given that I think it would be possible to still make this API I'm thinking.
Now whether it's idiomatic, that I don't know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection of Go is that any type can be allocated with a zero pattern at any time, so it's not possible to have a public type with representation requirements.
Yeah, if that's true, and the caller can do e.g. make([]T, 1024) for arbitrary T without futher initialization, then you could be right. Runs completely counter to my intuition for type safety, but if that's how Go works then we can go roll with it.
| //go:wasmimport [export]wasi:cli/run@0.3.0-rc-2025-09-16 [task-return]run | ||
| func taskReturn(result int32) | ||
|
|
||
| //go:wasmexport [async-lift]wasi:cli/run@0.3.0-rc-2025-09-16#run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit surprised by these directives, is this not something we'd expect the bindings generator to take care of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is awkward, and it's a result of a combination of factors:
wit-bindgen-testexpects all therunnercomponents to exportwasi:cli/run, and yet none of therunnerworlds in thetest.witfiles actually export that interface, so the bindings generator never sees it.- As of this writing, Go-generated modules are incompatible with
wasi_snapshot_preview1.command.wasm(there's an unbounded recursion during runtime initialization), so we have to usewasi_snapshot_preview1.reactor.wasminstead, and that won't automatically exportwasi:cli/run. - For the async tests, we must export
wasi:cli/run@0.3.0-rc-2025-09-16becausewasi:cli/run@0.2.xis a sync function, and we can't return e.g.CALLBACK_CODE_WAITif we implement the latter. The Rust tests currently usewit_bindgen::block_on, but that's no longer viable given the new trapping behavior.
I'm open to alternative approaches to addressing these challenges; what I have now works, but it might not be ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with Joel -- conclusion here is that this is because the WIT export isn't actually part of the WIT used to generate this test case, so it's expected.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
I'm assuming the default version of Go on the GitHub worker image will be sufficient; if not, we can install a different one explicitly. I'm not enabling the async tests for Go yet, since that will require publishing and installing a build of Go with [this patch](dicej/go@a1c8322); I'll need to do some homework to find out the best way to do that. Signed-off-by: Joel Dice <joel.dice@fermyon.com>
|
I was so focused on the runtime tests that I never ran the codegen ones. Looks like I have more work to do. I'll switch this to draft mode while I do that. |
This commit switches the way tests are built such that all binaries produced are "reactors" or those without a `main` function. This ensures that all imports/exports are fully described with WIT in this repository as opposed to implicitly relying on `wasi:cli/run`. This additionally means that testing in this repository no longer relies on toolchain conventions for the `wasi:cli/run` world making testing a bit more self-contained here, especially in the context of async. This notably came up on bytecodealliance#1447 which should make some of the tests there easier to write and more idiomatic.
This re-introduces Go support, with a new implementation written from scratch. While the previous incarnation targeted TinyGo and reused the C generator for lifting and lowering, this one targets "big" Go and handles lifting and lowering itself. In addition, it supports idiomatic, goroutine-based concurrency on top of the component model async ABI, including streams and futures.
This implementation is also distinct from the
go-modules project, which has a number of limitations (e.g. no async support, not safe for use with "big" Go) and is no longer being actively maintained.
Note that the async support currently requires a small patch to the Go
runtimelibrary. I plan to work with the upstream project to make that patch unecessary in the future. Components that don't use async features should work with stock, unpatched Go.One of the tricky parts about lowering values and passing pointers to e.g.
{stream,future}.{read,write}is that we must tell the Go garbage collector to pin the pointers (i.e. mark the pointee as both immovable and uncollectable) for as long as the host has access to them, but then unpin them promptly afterward to avoid leaks. We do this usingruntime.Pinnerinstances, most of which are locally scoped and easy to reason about. However, we must use a couple of globally-scoped pinners as well: one forcabi_reallocallocations by the host when it lowers parameters (which we unpin after lifting those parameters) and another for returning values from sync-lifted exports (which we unpin in post-return functions).There are a couple of known missing features which I'll open GitHub issues for:
In addition, I plan to expand the test coverage beyond the basics covered in this commit.