Program Dream•15mo ago

Zarg

The minimalistic, Zig command line argument parser. Following (or making it easier to follow) the CLIG guidelines

28 Replies

Dumb BirdOP•15mo ago

Here is the banner that will be going up on the repo

Dumb BirdOP•15mo ago

Here are the xcf files for these

zarglogo.xcf

zarg.xcf

Dumb BirdOP•15mo ago

Both of these can be cleaned up. @earth's penguin mentioned using something like Inkscape to convert the logo into a vector, which would clean up all the finer details. I also think the text could be cleaned up, maybe with a different font or rounded edges. I have messed with this project quite a lot: about 8 different tests and such. I think this will be the last time I have to rewrite it. I will be planning things out very extensively in this channel. If not here, then likely on a Notion page (which I would send here for your viewing).

Dumb BirdOP•15mo ago

The goal is to follow (or make it easier to follow) the CLIG guidelines. I won't bend over backwards to make every single thing adhere. I plan on making a list of things that will be supported and (for the time being) what won't.

Command Line Interface Guidelines

An open-source guide to help you write better command-line programs, taking traditional UNIX principles and updating them for the modern day.

keith•15mo ago

I plan on using this library for all argument parsing done with #Hexdump

Dumb BirdOP•15mo ago

This list is going to be quite verbose, listing out every single feature mentioned in CLIG. If you want more detail on a feature, refer back to CLIG. I have marked some things with (optional), meaning they can be done but aren't required. Anything marked as optional is likely something in the API that can be set, not something Zarg is expected to have or do. This list is just stuff from CLIG; Zarg will have other features too that aren't mentioned in this message!

SUPPORTED

HELP
- Return zero exit code on success, non-zero on failure
- Send output to stdout.
- Send messaging to stderr.
- Display help text when passed no options.
- Display a concise help text by default.
- A description of what your program does. (optional)
- One or two example invocations. (optional)
- Show full help when -h and --help is passed.
- Provide a support path for feedback and issues. (optional)
- In help text, link to the web version of the documentation. (optional)
- Use formatting in your help text.
- If the user did something wrong and you can guess what they meant, suggest it.
- If your command is expecting to have something piped to it and stdin is an interactive terminal, display help immediately and quit.

DOCUMENTATION
- Consider auto-generating man pages.
This isn't something in CLIG, just rather my own idea. I'm on the fence about it just because of simplicity.

OUTPUT
This entire section will be skipped for now. Output is done by the developer, not Zarg. Do follow CLIG's guidelines for output though...

ERRORS
- Catch errors and rewrite them for humans.
- If there is an unexpected or unexplainable error, provide debug and traceback information, and instructions on how to submit a bug.
- Make it effortless to submit bug reports.
All of these are done for Zarg's own errors, but the developer is expected to do this themselves for their program.

FLAGS AND ARGUMENTS
- If a flag can accept an optional value, allow a special word like “none.”
- If possible, make arguments, flags and subcommands order-independent.

ROBUSTNESS
- Responsive is more important than fast.
- Do stuff in parallel where you can, but be thoughtful about it.

UNSUPPORTED
- Display the most common flags and commands at the start of the help text.
This should just be done by the developer, not Zarg, to keep it minimal.

anic17•15mo ago

What language are you going to write this in? I assume it's Zig judging by the name

keith•15mo ago

That's correct

Dumb BirdOP•15mo ago

Yes, I will be using Zig

Unlike the SUPPORTED and UNSUPPORTED sections, here is a hard list of features Zarg plans to implement or has implemented. This list differs from the prior one, as we're no longer in the realm of philosophy (which is up to different people's interpretation) but in the realm of cold hard facts about features. This list will be pinned and updated as time goes on, and (at some point) features will be tagged with emojis for their status:
✅ : implemented & tested
☑️ : implemented but not yet tested
⚠️ : implemented but failed testing
🧪 : working on implementation
❌ : not implemented

PLANNED FEATURES
Not a complete list, due to my laziness.
- ❌ | Flags
- 🧪 | Short flags -s
- 🧪 | Long flags --long
- ❌ | Chaining flags like -abc (where a and b do not take values)
- ❌ | Passing values using spaces or = (-a 100, -a=100)

You could also infer this from Zarg's logo and/or banner, where the Ziglang logo is used

Created a GitHub repo, nothing on it yet. Just a novelty for right now

Dumb BirdOP•15mo ago

https://github.com/ZackeryRSmith/zarg

Dumb BirdOP•15mo ago

For speed reasons I'll be dropping some features. Namely:
1. Auto-generating help menu
2. Tab completion of flags (maybe?)
3. Validating flags at comptime or runtime (maybe?)
4. Abstracting the underlying cast, meaning the developer gets positionals and flag values as []const u8

Why?
1. Although this could all be done at comptime, it would require a very verbose "magic" API. It wouldn't keep much of its "low-levelness".
2. Same reason as #1
3. Requires allocating space for all args, or searching for them at runtime
4. Makes Zarg seem more like magic, not a low-level argument parser

TL;DR
Although this message was not long... like at all. Here are my final words and the goal of Zarg as of now. Zarg shouldn't look like magic. Following Zig's lead, nothing is magic and everything is written out explicitly, making the control flow of the code much more obvious.

Example API Going Forward
Here is the pseudo code idea for the API I strive for:

pub fn main() void {
    var iter = zarg.iterator();

    for (iter.next()) |arg| {
        switch (arg) {
            zarg.short('h') or zarg.long("help") => {
                // print help menu here :)
            },
            zarg.short('v') or zarg.long("version") => {
                std.debug.print("{}: 1.0.0", .{arg.name});
            },
            zarg.short('f') or zarg.long("file") => {
                std.debug.print("{}: {}", .{ arg.name, arg.value });
            },
        }
    }

    // for all other positional values
    for (iter.positionals()) |positional| {
        std.debug.print("positional: {}", .{positional});
    }
}
// 22 lines of code
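
The pseudocode above switches directly on an argument, which real Zig cannot do for strings: `switch` does not match on slices. A minimal runnable sketch of the same dispatch idea, using only the standard library, might look like the following. The names `isFlag`, `Action` and `classify` are made up for illustration and are not part of Zarg.

```zig
const std = @import("std");

/// Hypothetical helper (not part of Zarg): true if `arg` is either spelling of a flag.
fn isFlag(arg: []const u8, short: []const u8, long: []const u8) bool {
    return std.mem.eql(u8, arg, short) or std.mem.eql(u8, arg, long);
}

/// Hypothetical result type so that `main` can use an ordinary enum switch.
const Action = enum { help, version, positional };

/// String comparison happens here, as an if chain, because `switch`
/// cannot take string slices as cases.
fn classify(arg: []const u8) Action {
    if (isFlag(arg, "-h", "--help")) return .help;
    if (isFlag(arg, "-v", "--version")) return .version;
    return .positional;
}

pub fn main() void {
    // Fixed input so the sketch does not depend on the real command line.
    const args = [_][]const u8{ "-h", "--version", "input.txt" };
    for (args) |arg| {
        switch (classify(arg)) {
            .help => std.debug.print("help requested\n", .{}),
            .version => std.debug.print("version: 1.0.0\n", .{}),
            .positional => std.debug.print("positional: {s}\n", .{arg}),
        }
    }
}
```

Note the `{s}` specifier: printing a `[]const u8` as text needs it, whereas the pseudocode uses a bare `{}`.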

The Old API
Here is the old API, just for fun. This was my original concept and testing API:

pub fn main() !void {
    // NOTE: Both args and flags given MUST be known at comptime
    comptime var app: App = .{
        .flags = &[_]Flag{
            .{ .id = "help", .short = "-h", .long = "--help" },
            .{ .id = "version", .short = "-v", .long = "--version" },
            .{ .id = "file", .short = "-f", .long = "--file", .type = .str },
        },
    };
    const res = app.parse();

    if (res.help) {
        // print help menu here :)
    } else if (res.version) {
        std.debug.print("version: 1.0.0", .{});
    } else if (res.file) {
        std.debug.print("file: {}", .{res.file.value});
    }

    // for all other positional values
    for (res.positionals) |positional| {
        std.debug.print("positional: {}", .{positional});
    }
}
// 24 lines of code

In my opinion the new API is just better. It also solves many issues and allows Zarg to parse arguments in just a few microseconds, while the old API takes around 5-7 milliseconds and is just a bit messier in my opinion. Although not shown in the example, the old API could also have an auto-generated help menu, along with flag autocomplete (thanks to having to explicitly define the flags taken).

I would like to hear what you think about this @earth's penguin, as you're the one who is going to be using it.

Zarg will be using a state machine to parse the command line arguments

keith•15mo ago

@earth's bird how would I get certain arguments later in the code. is it using the switch statement with just one case?

Dumb BirdOP•15mo ago

I'd say you'd just set up a switch like so

pub fn main() void {
    var iter = zarg.iterator();
    var file: []const u8 = undefined;
    
    for (iter.next()) |arg| {
        switch (arg) {
            zarg.short('f') or zarg.long("file") => file = arg.value,
        }
    }
}

This is somewhat of a downside, as unlike the old API, not everything is provided under one struct and accessible at any point.
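
One way to soften that downside is to collect values into a small struct while looping, so they are available later in the program. The sketch below is only an illustration: `Options` and `collect` are hypothetical names, and only the `-f=value` / `--file=value` spelling is handled to keep it short. It also uses an optional field instead of `undefined`, since reading an `undefined` value when the flag was never passed is not safe.

```zig
const std = @import("std");

/// Hypothetical options struct; the field names are illustrative only.
const Options = struct {
    file: ?[]const u8 = null, // null means the flag was not given
    help: bool = false,
};

/// Walks the arguments once and records what it saw.
fn collect(args: []const []const u8) Options {
    var opts = Options{};
    for (args) |arg| {
        if (std.mem.startsWith(u8, arg, "--file=")) {
            opts.file = arg["--file=".len..];
        } else if (std.mem.startsWith(u8, arg, "-f=")) {
            opts.file = arg["-f=".len..];
        } else if (std.mem.eql(u8, arg, "-h") or std.mem.eql(u8, arg, "--help")) {
            opts.help = true;
        }
    }
    return opts;
}

pub fn main() void {
    // Fixed input so the sketch does not depend on the real command line.
    const args = [_][]const u8{ "--help", "--file=dump.bin" };
    const opts = collect(&args);

    // The values can now be used anywhere after parsing.
    if (opts.help) std.debug.print("help requested\n", .{});
    if (opts.file) |path| {
        std.debug.print("file: {s}\n", .{path});
    } else {
        std.debug.print("no file given\n", .{});
    }
}
```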

is it using the switch statement with just one case?

I'm pretty sure you meant something like this https://discord.com/channels/728958932210679869/1199401594630963221/1200888484949397684. If that was the case, then yeah

Be sure to let me know how you feel about it, after all Zarg is really just designed for Hexdump

Seeing as you wanted Zarg to be fast fast, I've been messing around with the API to get it to be fast. This was my solution, but I'm open to rethinking it

Working just on Linux support for the time being

Why is it not just cross-platform already, you say? Well, Windows needs allocations, Linux doesn't. Allocating uses a syscall (likely many), and syscalls are slow. Hopefully you see my dilemma, so I must break Zarg into a Windows section and a Unix section. Thankfully all that really changes is the init function, but now I must manage an allocator safely.

Here is the init function currently
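
The init function itself is not shown in the thread text. As a rough idea of what a compile-time switch on the OS tag can look like, here is a sketch built on Zig's standard library rather than on Zarg's code. It assumes the `std.process.args` / `std.process.argsWithAllocator` API as found in roughly Zig 0.11 to 0.14; `initArgs` is a hypothetical name.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical init (not Zarg's actual code): picks the argument source per target.
/// `builtin.os.tag` is known at compile time, so only the matching prong is compiled.
fn initArgs(allocator: std.mem.Allocator) !std.process.ArgIterator {
    return switch (builtin.os.tag) {
        // Windows and WASI need an allocator to build the argument list.
        .windows, .wasi => try std.process.argsWithAllocator(allocator),
        // Other targets can iterate over argv without allocating.
        else => std.process.args(),
    };
}

pub fn main() !void {
    var it = try initArgs(std.heap.page_allocator);
    defer it.deinit(); // frees the buffer where one was allocated

    _ = it.skip(); // skip the program name
    while (it.next()) |arg| {
        std.debug.print("arg: {s}\n", .{arg});
    }
}
```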

Dumb BirdOP•15mo ago

As you can see, I do a quick switch on the os.tag

This makes Zarg very fast on Linux, but slower on Windows thanks to the pesky unavoidable allocations

I'm trying to think of how I could write my own allocator that would be smarter about allocating memory for command line arguments

Hence this is the reason I am "Working just on Linux support for the time being"

The other option is just to allocate on both platforms, but no, no. Not happening

I have created the tokenizer, which was the most annoying task, jeez. Here is the test I ran if anyone would like to know

test "tokenizer" {
    const args = &.{
        "-f",
        "-f=val",
        "-f=",
        "-fgh",
        "-fgh=value",
        "-fgh=",
        "",
        "",
        "--flag",
        "--flag=value",
        "--flag=",
        "positional",
        "",
    };

    var tokenizer = Tokenizer.init(args);

    try expectToken(tokenizer.nextToken().?, .short_flag);
    try expectToken(tokenizer.nextToken().?, .short_flag_with_value);
    try expectToken(tokenizer.nextToken().?, .short_flag_with_empty_value);
    try expectToken(tokenizer.nextToken().?, .short_flags_with_tail);
    try expectToken(tokenizer.nextToken().?, .short_flags_with_value);
    try expectToken(tokenizer.nextToken().?, .short_flags_with_empty_value);

    try expectToken(tokenizer.nextToken().?, .long_flag);
    try expectToken(tokenizer.nextToken().?, .long_flag_with_value);
    try expectToken(tokenizer.nextToken().?, .long_flag_with_empty_value);

    try expectToken(tokenizer.nextToken().?, .positional);
}
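
The tokenizer source is not posted in the thread, but the token names in the test suggest how each argument shape is classified. The sketch below is a guess at that classification, not Zarg's implementation: the `Kind` tags are copied from the test, while `classify` and its rules (including treating a lone "-" as a positional) are assumptions. Empty arguments are skipped by the caller, which is what the test appears to expect.

```zig
const std = @import("std");

/// Tag names taken from the test above.
const Kind = enum {
    short_flag,
    short_flag_with_value,
    short_flag_with_empty_value,
    short_flags_with_tail,
    short_flags_with_value,
    short_flags_with_empty_value,
    long_flag,
    long_flag_with_value,
    long_flag_with_empty_value,
    positional,
};

/// Hypothetical classifier for a single non-empty argument.
/// Malformed input such as "--=x" is not validated in this sketch.
fn classify(arg: []const u8) Kind {
    // Anything too short or not starting with '-' is a positional (a lone "-" included).
    if (arg.len < 2 or arg[0] != '-') return .positional;

    const eq = std.mem.indexOfScalar(u8, arg, '=');
    // True only when at least one character follows the '='.
    const has_value = if (eq) |i| i + 1 < arg.len else false;

    if (arg[1] == '-') {
        if (eq == null) return .long_flag;
        if (has_value) return .long_flag_with_value;
        return .long_flag_with_empty_value;
    }

    // Number of characters between the leading '-' and the '=' (or the end).
    const name_len = (eq orelse arg.len) - 1;
    if (name_len == 0) return .positional; // e.g. "-=x"
    if (name_len == 1) {
        if (eq == null) return .short_flag;
        if (has_value) return .short_flag_with_value;
        return .short_flag_with_empty_value;
    }
    if (eq == null) return .short_flags_with_tail;
    if (has_value) return .short_flags_with_value;
    return .short_flags_with_empty_value;
}

pub fn main() void {
    const args = [_][]const u8{ "-f", "-fgh=value", "", "--flag=", "positional" };
    for (args) |arg| {
        if (arg.len == 0) continue; // skip empty arguments
        std.debug.print("{s} -> {s}\n", .{ arg, @tagName(classify(arg)) });
    }
}
```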

I'll run a quick test on speed here

I'm a bit concerned as I want to majorly optimise every little thing. Why? Just because

If the tokenizing takes over a millisecond I will be sad

keith•15mo ago

@earth's bird how long does it take

Dumb BirdOP•15mo ago

Too long :(

keith•15mo ago

ouch

Dumb BirdOP•15mo ago

Way more than I want

Yeah well, I ditched the system I told you about last night

I keep changing everything because I find issues with it

Tokenizing like this fixes a lot, but as it's no longer abstract it still takes a hot minute. This is the longest task Zarg will have to do though.

Throwing my code onto my Linux machine to do some profiling

anic17•15mo ago

How would an auto generated help menu require a "magic API"

keith•15mo ago

because auto generating the help menu requires it to know what options there are

if the options are handled by a switch statement, as done without the magic api, the program doesn't know what arguments there are

that's a shitty way to explain it tho, @earth's bird can probably explain better

Dumb BirdOP•4mo ago

Magic just meaning you don't understand what it's doing. It seems like magic because you have no clue what is actually happening under the hood. Maybe for a high-level language this isn't an issue, but for Zig it is.

Regardless, as @earth's penguin mentioned, it would require me to know beforehand what arguments and flags the program wants to accept, along with their descriptions. That adds tons of overhead that was never required before.

Currently Zarg doesn't care what you want or what you don't want, or which flags take values and which don't. It tokenizes the command line arguments and then gives a low-level API for working with the tokens

That's what this is

Example API Going Forward Here is the pseudo code idea for the API I strive for:

pub fn main() void {
    var iter = zarg.iterator();

    for (iter.next()) |arg| {
        switch (arg) {
            zarg.short('h') or zarg.long("help") => {
                // print help menu here :)
            },
            zarg.short('v') or zarg.long("version") => {
                std.debug.print("{}: 1.0.0", .{arg.name});
            },
            zarg.short('f') or zarg.long("file") => {
                std.debug.print("{}: {}", .{ arg.name, arg.value });
            },
        }
    }

    // for all other positional values
    for (iter.positionals()) |positional| {
        std.debug.print("positional: {}", .{positional});
    }
}
// 22 lines of code

Wow, this is like my millionth time on this project 😅 However, with some help from the Zig community I have gotten from my ideas to actual code

Thanks to the help of one of my Zig heroes, I was pointed in a much better direction. Taking my hacky ideas and laying them out idiomatically has undoubtedly been a blessing.

I'm here less to tell you about the progress of Zarg as, honestly, I'm just trying to code without losing motivation... However, I am here to actually show you some cool "ziggy-ness"

For those unaware, Zig has a cool feature called comptime. Marking a function parameter as comptime means that:
- At the call site, the value must be known at compile-time, or it is a compile error.
- In the function definition, the value is known at compile-time.

This allows the compiler to take lines of code, or even entire functions, and do the work at compile time, so it doesn't need to be done when running your code. This allows for crazy speedups in places where you know the input and what the output should be. In my case, it's being used to make a very flexible API, so let me show you that:

const Input = struct {
    version: []const u8,
};

fn populateStruct(comptime T: type, cli: anytype) !T {
    var t: T = undefined;
    inline for (@typeInfo(T).Struct.fields) |f| {
        @field(t, f.name) = try cli.getFlag(f.type, f.name);
    }
    return t;
}

const CLI = struct {
    pub fn getFlag(self: CLI, comptime T: type, comptime name: []const u8) !T {
        if (self.flags.get(name)) |v| {
            if (@TypeOf(v) == T)
                return v;
        }
        return error.UnknownParameter;
    }
};

Now, this code is something to ignore. Although it is missing some required code to run, it has all the information I need to show what comptime can do for us. The compiled executable is actually running code akin to:

const Input = struct {
    version: []const u8,
};

fn populateStruct(cli: anytype) !Input {
    var input: Input = undefined;
    input.version = try cli.getFlag();

    return input;
}

const CLI = struct {
    pub fn getFlag(self: CLI) ![]const u8 {
        if (self.flags.get("version")) |v| {
            if (@TypeOf(v) == []const u8)
                return v;
        }
        return error.UnknownParameter;
    }
};

This is quite cool as it allows a developer to express what kinds of inputs they want their app to take intuitively without sacrificing performance. It's the opposite; doing this is relatively fast! Get more while doing less, that's something I can get behind. Leveraging this compile-time duck typing allows for some crazy cool things. As a final message from me, try Zig.
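
Since the snippet above is missing the pieces needed to run, here is a self-contained sketch of the same comptime idea. It is an illustration under assumptions, not Zarg's code: `Cli` is a stand-in that stores flag values as raw strings in a `std.StringHashMap`, the `port` field is added to show a second type, and any non-string field is assumed to be an integer. It uses `std.meta.fields` in place of `@typeInfo(T).Struct.fields`, because the spelling of the latter differs between Zig versions.

```zig
const std = @import("std");

const Input = struct {
    version: []const u8,
    port: u16, // hypothetical extra field, added to show a non-string type
};

/// Hypothetical stand-in for the `CLI` type above: flags are stored as raw strings.
const Cli = struct {
    flags: std.StringHashMap([]const u8),

    /// Strings are returned as-is; any other `T` is assumed to be an integer type.
    fn getFlag(self: Cli, comptime T: type, comptime name: []const u8) !T {
        const raw = self.flags.get(name) orelse return error.UnknownParameter;
        // `T` is comptime-known, so only one branch is compiled per instantiation.
        if (T == []const u8) {
            return raw;
        } else {
            return std.fmt.parseInt(T, raw, 10);
        }
    }
};

fn populateStruct(comptime T: type, cli: anytype) !T {
    var t: T = undefined;
    // Unrolled at compile time: one assignment per field of `T`.
    inline for (std.meta.fields(T)) |f| {
        @field(t, f.name) = try cli.getFlag(f.type, f.name);
    }
    return t;
}

pub fn main() !void {
    var flags = std.StringHashMap([]const u8).init(std.heap.page_allocator);
    defer flags.deinit();
    try flags.put("version", "1.0.0");
    try flags.put("port", "8080");

    const input = try populateStruct(Input, Cli{ .flags = flags });
    std.debug.print("version={s} port={d}\n", .{ input.version, input.port });
}
```

Adding a field to `Input` is then enough to make `populateStruct` look up and convert one more flag, with no change to the loop.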

anic17•3mo ago

If that's what you're aiming to do, it seems pretty straightforward

So basically it's precomputing everything it can to increase performance

Dumb BirdOP•3mo ago

pretty much, yes

Basic parsing is implemented. I'm working towards following the CLIG guidelines, as I have before, along with my own strict rules on what will be parsed. The only reason these rules are enforced is to keep things simple. For example, duplicate flags are not allowed, period. There is no use case where a duplicate flag is practical. Simple things like these conform to CLIG, along with my ideas of a command line parser. As to what is currently supported, I'll quote a section from CLIG:

- Arguments, or args, are positional parameters to a command. For example, the file paths you provide to cp are args. The order of args is often important: cp foo bar means something different from cp bar foo.
- Flags are named parameters, denoted with either a hyphen and a single-letter name (-r) or a double hyphen and a multiple-letter name (--recursive). They may or may not also include a user-specified value (--file foo.txt, or --file=foo.txt). The order of flags, generally speaking, does not affect program semantics.

Both of the features outlined here are supported

Due to this section, I'm hesitant to add support for --file = foo.txt

As for the error this would return? Who knows...

keith•3mo ago

make sure you actually implement the order not mattering because I know some libraries where it does

Dumb BirdOP•3mo ago

Arguments matter; flags won't.

Well, actually, it's really up to the programmer

All the flags are parsed, but you choose how and when to handle them
