Skip to content

Conversation

blblack
Copy link
Contributor

@blblack blblack commented Aug 30, 2025

Some of these may require some debate or correction, so I split this up into several distinct commits and will mark as draft initially.

@blblack blblack marked this pull request as draft August 30, 2025 14:58
@blblack
Copy link
Contributor Author

blblack commented Aug 31, 2025

CC @alexrp and @rootbeer - Thoughts on whether this is a reasonable approach? My aim here is Zig software that wants to use these kinds of system interfaces shouldn't have to directly rely on std.c and be forced to link libc on Linux or other targets that don't otherwise require libc.

There are other alternative pathways to consider here, perhaps.

We could make some minor breaking changes to the current set and get interfaces to unlock at least some of these edge cases. For example, the get call could return a slice of the optval buffer instead of void and assert a returned length less than or equal to the input optval len, for the case that the returned length is less than the buffer input length and the caller needs to know it. And the set call could accept an optional slice as input to handle a case like SO_ACCEPTFILTER where one needs to be able to set null.

Even if we did both of those things, Linux SO_PEERSEC would still be unusable. It needs the caller to be able to both dynamically adapt the buffer size upwards and receive the final data length which may be smaller than the provided buffer. To abstract that, at bare minimum you need a getsockopt that takes an allocator argument so that it could do the resizing loop internally, which is kinda ewww. I assume SO_PEERSEC is probably not the lone outlier like this in the universe of platforms, socket types, and levels (esp if we want the interface to be stable into the future, where more options may appear).

We could also try to do something way-fancier with comptime and/or runtime switching on the optname itself, for varying the internal behavior per-option for all the known edge cases, and return some complex struct or tuple that can represent all the odd cases, and even specialize the error return outcomes to the optname as well? I suspect if we tried to go down this path, this would (a) get very unwieldy over time with a ton of nested switches and a ton of @hasDecl() or target-condition lists for the SO_FOO that exist on various subsets of platforms, and (b) end up being an un-ergonomic interface for the majority cases where the option is just a simple boolean or integer setting.

@alexrp alexrp self-assigned this Aug 31, 2025
Copy link
Contributor

@rootbeer rootbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes seem reasonable to me. I'm just a drive-by contributor, so take my directional comments here with a large grain of salt: I think the goal of the std.posix layer is to provide posix-y APIs that are familiar, with a bit of Zig safety if it can be squeezed in. If you want to fully Ziggify the socket option APIs with robust safety for all of the options, that probably makes more sense to try in Zig's std.net layer (vs. deviating from POSIX APIs here in the std.posix layer).

The EINVAL handling in this diff is better, too.

I recommend adding some test cases if you can. Even simple cases that will get an error at runtime are fine, as otherwise the unreachable code won't even get compiled.

} || UnexpectedError;

/// Get a socket's options.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe document that getsockopt is only for socket options that are a well-know size, and getsockoptRaw should be used otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in new push (but with new naming)

/// optlen, which will usually be shorter than the buffer length, and thus
/// would fail the otherwise-useful optlen assertion check in the the regular
/// getsockopt() interface.
pub fn getsockoptRaw(fd: socket_t, level: i32, optname: u32, optval: [*]u8, optlen: *socklen_t) GetSockOptError!void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this getsockoptVariable? Or call this getsockopt (since its closer to the semantics of the underlying call) and rename the Ziggified wrapper function to getsockoptFixed? (getsockoptSimple?). My names aren't obviously better, so feel free to leave as is if you like the Raw suffix. (I don't see an existing naming convention to follow for simplifying wrapper functions like this.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this getsockoptVariable?

Yes, this name is the right answer here I think! 🥇

@blblack
Copy link
Contributor Author

blblack commented Sep 2, 2025

To codify one of the alternative paths - if we're willing to change/break the existing interfaces a bit, and lose a little bit of ergonomics in the get-case, it could look something like this:

diff --git lib/std/posix.zig lib/std/posix.zig
index 1bce41c2e9..127d2caaaa 100644
--- lib/std/posix.zig
+++ lib/std/posix.zig
@@ -4353,22 +4353,33 @@ pub const GetSockOptError = error{
 
     /// Insufficient resources are available in the system to complete the call.
     SystemResources,
+
+    /// The supplied opt buffer was too small for the data to be returned.
+    BufferTooSmall,
+
+    /// This can mean different things depending on the platform, the option
+    /// name, and conditions.  One known oddball example is SO_ACCEPTFILTER on
+    /// BSDs, which returns this error to indicate that no filter is currently
+    /// installed.
+    InvalidOption,
 } || UnexpectedError;
 
-pub fn getsockopt(fd: socket_t, level: i32, optname: u32, opt: []u8) GetSockOptError!void {
+pub fn getsockopt(fd: socket_t, level: i32, optname: u32, opt: []u8) GetSockOptError![]const u8 {
     var len: socklen_t = @intCast(opt.len);
     switch (errno(system.getsockopt(fd, level, optname, opt.ptr, &len))) {
         .SUCCESS => {
-            std.debug.assert(len == opt.len);
+            std.debug.assert(len <= opt.len);
+            return opt[0..len];
         },
         .BADF => unreachable,
         .NOTSOCK => unreachable,
-        .INVAL => unreachable,
         .FAULT => unreachable,
+        .INVAL => return error.InvalidOption,
         .NOPROTOOPT => return error.InvalidProtocolOption,
         .NOMEM => return error.SystemResources,
         .NOBUFS => return error.SystemResources,
         .ACCES => return error.AccessDenied,
+        .RANGE => return error.BufferTooSmall,
         else => |err| return unexpectedErrno(err),
     }
 }
@@ -6721,6 +6733,14 @@ pub const SetSockOptError = error{
     /// Setting the socket option requires more elevated permissions.
     PermissionDenied,
 
+    /// This can mean many different things depending on the platform, the
+    /// option name, and conditions.  For example:
+    /// - That the socket was not listen()ing when setting SO_ACCEPTFILTER on BSDs.
+    /// - That an non-multicast IP address was set for IP_ADD_MEMBERSHIP on Linux.
+    /// - That the socket is shut down and the option requires otherwise
+    /// - etc...
+    InvalidOption,
+
     OperationNotSupported,
     NetworkSubsystemFailed,
     FileDescriptorNotASocket,
@@ -6729,9 +6749,11 @@ pub const SetSockOptError = error{
 } || UnexpectedError;
 
 /// Set a socket's options.
-pub fn setsockopt(fd: socket_t, level: i32, optname: u32, opt: []const u8) SetSockOptError!void {
+pub fn setsockopt(fd: socket_t, level: i32, optname: u32, opt: ?[]const u8) SetSockOptError!void {
+    const opt_ptr: ?[*]const u8 = if (opt) |o| o.ptr else null;
+    const opt_len: usize = if (opt) |o| o.len else 0;
     if (native_os == .windows) {
-        const rc = windows.ws2_32.setsockopt(fd, level, @intCast(optname), opt.ptr, @intCast(opt.len));
+        const rc = windows.ws2_32.setsockopt(fd, level, @intCast(optname), @ptrCast(opt_ptr), @intCast(opt_len));
         if (rc == windows.ws2_32.SOCKET_ERROR) {
             switch (windows.ws2_32.WSAGetLastError()) {
                 .WSANOTINITIALISED => unreachable,
@@ -6744,12 +6766,12 @@ pub fn setsockopt(fd: socket_t, level: i32, optname: u32, opt: []const u8) SetSo
         }
         return;
     } else {
-        switch (errno(system.setsockopt(fd, level, optname, opt.ptr, @intCast(opt.len)))) {
+        switch (errno(system.setsockopt(fd, level, optname, @ptrCast(opt_ptr), @intCast(opt_len)))) {
             .SUCCESS => {},
             .BADF => unreachable, // always a race condition
             .NOTSOCK => unreachable, // always a race condition
-            .INVAL => unreachable,
             .FAULT => unreachable,
+            .INVAL => return error.InvalidOption,
             .DOM => return error.TimeoutTooBig,
             .ISCONN => return error.AlreadyConnected,
             .NOPROTOOPT => return error.InvalidProtocolOption,

The tradeoffs with this approach:

  • Slight negatives:
    • posix.getsockopt() callers would need to update their simple-fixed-case calls to throw away the result, e.g. _ = try posix.getsockopt(sock, posix.SOL.FOO, posix.SO.BAR, std.mem.asBytes(&some_i32));
    • For the common fixed-length cases, the assert on the returned length in getsockopt() is weakened to match the general case (<=). For example, in the point above, nothing would catch that getsockopt() claimed to only write 2/4 of the i32's bytes.
    • (or, I guess, they could not throw away the result and assert on its length for themselves, but that looks pretty messy at the call site)
    • setsockopt()'s change to an optional opt would lose some ability to catch cases where sending a null was not intended (e.g. because the caller forgot to unwrap an optional earlier on, and thus sent an optional when they didn't mean to).
  • Upsides:
    • Correctly handles almost all the odd cases I've found so far with reasonable ergonomics
    • Doesn't require new separate xRaw() interfaces.

Linux SO_PEERSEC is the only case (that I personally have found so far, anyways!) that doesn't quite work out optimally even with this variant, but at least it is usable for this case at all. The caller would have to possibly loop multiple times on error.BufferTooSmall while growing their buffer size until they got a success with a buffer that was >= the required len. With a fully-raw interface, they would call at most twice, as they'd know an exact size that will definitely-work after the first failure. If that turns out to be the only such case in practice, then anyone that cares enough about this one suboptimal pattern could always drop to std.os.linux.getsockopt() to handle it optimally.

This could also be blended with @rootbeer 's Fixed/Simple suffix as well in the get case, just to clean up some of the negatives above. Basically, on top of the changed get interface above, have a posix.getsockoptFixed() variant that returns void and asserts equal lengths. That could perhaps be the best of all worlds here? The only real downside I see on this third path is that it assumes we won't find (or OSes won't in the future add) enough other odd cases that we have to go back and revisit all this again later.

@blblack
Copy link
Contributor Author

blblack commented Sep 2, 2025

I recommend adding some test cases if you can. Even simple cases that will get an error at runtime are fine, as otherwise the unreachable code won't even get compiled.

Yeah, I tend to think I want this as well. It is a little tricky, as any decent test will inevitably have consequences. But we could, perhaps, find very-trivial cases that are fairly portable? Maybe create a temporary tcp socket (just socket creation, no binding), get SO_REUSEADDR on it, then set it to the same value that was returned from get?

@blblack
Copy link
Contributor Author

blblack commented Sep 3, 2025

Well, I was going to add a test, but then I realized that even the most-basic test of the existing getsockopt() on the master branch fails on Windows targets because it's already broken there. At least, it fails to compile for me when testing from Linux with -fwine.

New force-push first adds a basic test of set+get for socket options (which fails on Windows), then fixes up the Windows calls, then applies the various fixes I had here before, and then has a final commit tacked on which represents the alternative "best of all worlds" mentioned earlier. I could go either way on that alternative outcome, personally. I think it's a little nicer than the leaving in the Raw() interfaces, but it is a minor breaking change for posix.getsockopt() users, and it is possible we'll find counter-examples in the future that still won't quite work right with these somewhat-less-limited interfaces.

Copy link
Contributor

@rootbeer rootbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a solid improvement to me.

If the test on Windows (or other platforms) has more problems, its generally acceptable to file a bug and return errors.SkipZigTest on the problematic platforms.

/// options with a known, fixed size, e.g. simple integer values.
pub fn getsockoptFixed(fd: socket_t, level: i32, optname: u32, opt: []u8) GetSockOptError!void {
const returned = try getsockopt(fd, level, optname, opt);
std.debug.assert(returned.len == opt.len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L4427 below is a bare assert (without the std.debug prefix), so I think you can drop that here too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in upcoming push

/// optlen, which will usually be shorter than the buffer length, and thus
/// would fail the otherwise-useful optlen assertion check in the
/// getsockoptFixed() interface.
pub fn getsockopt(fd: socket_t, level: i32, optname: u32, opt: []u8) GetSockOptError![]u8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a return value here is a bit funky (as it diverges from the POSIX API a bit), but does communicate the result length in a relatively safe way. Given most callers should be able to use getsockoptFixed() this seems worth the tradeoff to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in some abstract sense, returning a sub-slice of opt is the meaning of the full POSIX API. But I agree it's funky for the common case. Flipping around from having a Fixed() variant to having a Variable() variant in next push instead!

@rootbeer
Copy link
Contributor

rootbeer commented Sep 3, 2025

Also, probably fine to un-draft this now?

@blblack
Copy link
Contributor Author

blblack commented Sep 4, 2025

After another night to sleep on this: I think since the getsockoptFixed() case is going to be the most-common one by far, it's silly to break the existing callers and make them use some alternate function name they have to discover. I'm gonna clean up the patch chain a bit, and flip it so getsockoptVariable() is the one that returns a slice.

[also cleaned up what I could from earlier comments, and fixed failing test on macos]

Note this test fails for me on Windows targets when testing from
Linux using -fwine, with errors like:

    compile test Debug thumb-windows-msvc 1 errors
    lib/std/c.zig:10717:12: error: dependency on libc must be explicitly specified in the build command
    pub extern "c" fn getsockopt(sockfd: fd_t, level: i32, optname: u32, noalias optval: ?*anyopaque, noalias optlen: *socklen_t) c_int;

I suspect that posix.getsockopt() needs a specific variant using
ws2_32 like setsockopt() already has.
getsockopt() needed new code to use ws2_32 like setsockopt()
already did. Since I had to research the error values for
getsockopt(), I went ahead and did the same for setsockopt() to
clean up the error set there as well.

Works for me with emulation via -fwine anyways.  Now I get a clean
run for test-std with:
-fqemu -fwine -fwasmtime -Dtest-filter='posix.test.test.sockopt'
In both the set and get cases, EINVAL can happen for reasons other
than code-level programmer error for *nix targets as well, and
should be a condition that users can catch and handle. Examples
are given in the doc comments.

getsockopt() ERANGE known case: Linux SO_PEERSEC.  There is no
documented buffer size that will always be large-enough.  If the
buffer is too small, getsockopt returns ERANGE and returns an
appropriate buffer size via optlen.  We can't return this size to
the posix.getsockopt() caller without a breaking interface change,
but at least with this, they could do some kind of
grow-by-X-and-retry loop.

While this is better than nothing for such a case, there's still
the problem that the returned value's actual length cannot be
known without reading the return optlen value, and need not be
sentinel-terminated within the opt buffer, either, in the general
case.  We need some kind of new or changed interface here to
support exposing the returned optlen in some way.
getsockoptVariable() is for the edge cases where the caller needs
to know the actual data length returned by the system, which may
be smaller than the supplied buffer size.  Known example cases are
Linux's SO_PEERSEC and SO_BINDTODEVICE.
By making setsockopt()'s opt argument optional, we can support the
edge case of needing to send null as the optval.  Known example is
clearing an existing filter via SO_ACCEPTFILTER on supporting
BSDs.
@blblack blblack marked this pull request as ready for review September 4, 2025 13:15
@rootbeer
Copy link
Contributor

rootbeer commented Sep 4, 2025

LGTM! Hopefully the Zig team likes it too. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants