Hexdump

Repo: https://github.com/KeithBrown39423/Hexdump
My hexdump program that I started like 2 years ago. I'm now starting to add a lot of new features with some help from @earth's bird
GitHub
KeithBrown39423/Hexdump: The alternative cross platform hex dumping utility
190 Replies
Dumb Bird
Dumb Bird12mo ago
Currently making a PR adding more testing to optimize_test.py and making it a CLI
Dumb Bird
Dumb Bird12mo ago
Example of verbose logging
Dumb Bird
Dumb Bird12mo ago
GitHub
Release Version 1.2.0 · KeithBrown39423/Hexdump
What's Changed Speed Optimization Drastically improved the speed of Hexdump. See the graph attached for a comparison. Logging (#20) Hexdump now has built-in logging and will log the following i...
KeithBrown7526
KeithBrown752612mo ago
Also, just a little comparison of the execution times
Dumb Bird
Dumb Bird12mo ago
Yep, looks very good, I can actually read it now 😲
<Tim>
<Tim>12mo ago
Wow that's cool!
KeithBrown7526
KeithBrown752612mo ago
The craziest part was it was such small changes that made such a big difference
<Tim>
<Tim>12mo ago
What changes for example?
KeithBrown7526
KeithBrown752612mo ago
I'll list them out in the morning, it's 1am rn and I'm tired as hell
<Tim>
<Tim>12mo ago
haha good night 😂
Dumb Bird
Dumb Bird12mo ago
Colored output is one thing. Converting something to hex too? I forget :)
KeithBrown7526
KeithBrown752612mo ago
One of the biggest ones was that I was reading the entire file into a buffer, then writing the output to a stringstream, and then writing that stringstream to either stdout or a file stream. Now it reads the file 16 bytes at a time, converts it to the proper display format, and then outputs it directly to the output stream, whether that be a file stream or stdout.
There was also a function that took every byte as an integer, created a stringstream, passed the byte through std::hex into that stream, and then returned the string from the stream. This was done for every single byte, so if a file was 1 MB (1048576 bytes) the function was run over a million times.
I was also adding the ANSI color escape sequence after every single ASCII character as opposed to just once per line, and that also slowed it down quite a bit.
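A minimal sketch of the per-byte conversion described above, done with a lookup table instead of a stream per byte. It is written in Zig to match the later snippets in this thread, not the C++ of v1.2; `formatLine` and `hex_digits` are hypothetical names, not code from the Hexdump repo. `std.debug.print` (which writes to stderr) is used only to keep the sketch independent of the std.io version.

```zig
const std = @import("std");

// Lookup table: index by nibble value to get its uppercase hex digit.
const hex_digits = "0123456789ABCDEF";

/// Formats up to 16 bytes as "XX XX ..." into `out` and returns the used slice.
/// Hypothetical helper for illustration only.
fn formatLine(bytes: []const u8, out: *[48]u8) []const u8 {
    std.debug.assert(bytes.len <= 16); // 16 bytes * 3 characters = 48
    var len: usize = 0;
    for (bytes) |b| {
        out[len] = hex_digits[b >> 4]; // high nibble
        out[len + 1] = hex_digits[b & 0x0F]; // low nibble
        out[len + 2] = ' ';
        len += 3;
    }
    return out[0..len];
}

pub fn main() void {
    var line: [48]u8 = undefined;
    const chunk = [_]u8{ 0x48, 0x65, 0x78, 0x0A };
    // One formatting pass and one print per line, not one stream per byte.
    std.debug.print("{s}\n", .{formatLine(&chunk, &line)});
}
```

The point of the design is that no allocation or stream object is created per byte; the whole line is built in a fixed buffer and written once.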
<Tim>
<Tim>12mo ago
Ah nice, very interesting. And as expected, it's always IO that is the most expensive
KeithBrown7526
KeithBrown752612mo ago
Yeah, very true. Mainly it's the terminal displaying it all that takes a while; on average, file output takes around 1/3 the time of terminal output.
I've added the different display types (octal, decimal, etc.). I'll add a screenshot once I get back to my computer.
KeithBrown7526
KeithBrown752612mo ago
Here's a video with the features
Dumb Bird
Dumb Bird12mo ago
I'm not sure if cxx-ops allows this, but separate the commands in the help menu into groups
Dumb Bird
Dumb Bird12mo ago
Like for example what I did with optimize_test.py
KeithBrown7526
KeithBrown752612mo ago
1: I assume you mean cxxopts, 2: I can, give me a sec and I'll show you what that looks like
KeithBrown7526
KeithBrown752612mo ago
offsets have been added
KeithBrown7526
KeithBrown752612mo ago
the header also adapts based on the offset
KeithBrown7526
KeithBrown752612mo ago
same for ascii
KeithBrown7526
KeithBrown752612mo ago
done
Dumb Bird
Dumb Bird12mo ago
Very nice 👏 Much more readable now in my opinion
KeithBrown7526
KeithBrown75267mo ago
Yeah, I had to mess with the cxxopts code a tad bit.
Coming back to hexdump after taking a break for a couple of days, I've now forgotten what was ever wrong and why I hadn't made a release yet.
An update: release v2.0 will be remade and written in Zig as opposed to C or C++. The main reason behind this is that Zig has amazing stdout write speed.
const std = @import("std");
const stdout = std.io.getStdOut().writer();

pub fn main() !void {
    var file = try std.fs.cwd().openFile("hexdump.zig", .{});
    defer file.close();

    var buf_reader = std.io.bufferedReader(file.reader());
    var in_stream = buf_reader.reader();

    // Read and print the file one 16-byte row at a time.
    var buf: [16]u8 = undefined;
    var cnt: u64 = 16;
    while (cnt == 16) {
        cnt = try in_stream.readAtLeast(&buf, 16);
        // Note: this is also reached at EOF when the size is a multiple of 16.
        if (cnt == 0) {
            std.debug.print("Empty File", .{});
            break;
        }

        for (0..cnt) |i| {
            const byte: [1]u8 = .{ buf[i] };
            const hexbyte = std.fmt.fmtSliceHexUpper(&byte);
            std.debug.print("{s} ", .{hexbyte});
        }
        std.debug.print("\n", .{});
    }
}
KeithBrown7526
KeithBrown75267mo ago
i have a working hexdump now in 22 sloc
const std = @import("std");
var std_writer = std.io.bufferedWriter(std.io.getStdOut().writer());
var stdout = std_writer.writer();

pub fn main() !void {
    var file = try std.fs.cwd().openFile("random.bin", .{});
    defer file.close();

    var buf_reader = std.io.bufferedReader(file.reader());
    var in_stream = buf_reader.reader();

    var buf: [16]u8 = undefined;
    var cnt: u64 = 16;
    while (cnt == 16) {
        cnt = try in_stream.readAtLeast(&buf, 16);
        if (cnt == 0) {
            _ = try stdout.write("Empty File");
            break;
        }

        for (0..cnt) |i| {
            const byte: [1]u8 = .{ buf[i] };
            _ = try std.fmt.fmtSliceHexUpper(&byte).format("${s} ", .{}, stdout);
        }
        _ = try stdout.write("\n");
    }
    // Note: std_writer is never flushed here, so buffered output can be lost.
}
slightly longer, but now over 6 times faster than the original hexdump (still needs some modification)
Dumb Bird
Dumb Bird7mo ago
* whereas the old hexdump would take 5 minutes on about a 512 MB file, the Zig version takes under a minute
KeithBrown7526
KeithBrown75267mo ago
1,048,576 bytes in 58 seconds as opposed to around 400 seconds (about 6.5 minutes)
Dumb Bird
Dumb Bird7mo ago
Note that write should be replaced with writeAll, as the returned usize is being discarded anyway. The exception is this line, _ = try stdout.write("\n");, where write should be writeByte.
The dev branch of Hexdump is now a clean slate with a more up-to-date readme.md.
Dumb Bird
Dumb Bird7mo ago
GitHub
GitHub - KeithBrown39423/Hexdump at dev
The alternative cross platform hex dumping utility - GitHub - KeithBrown39423/Hexdump at dev
Dumb Bird
Dumb Bird7mo ago
Here are those changes
const std = @import("std");
var std_writer = std.io.bufferedWriter(std.io.getStdOut().writer());
var stdout = std_writer.writer();

pub fn main() !void {
    var file = try std.fs.cwd().openFile("test.zig", .{});
    defer file.close();

    var buf_reader = std.io.bufferedReader(file.reader());
    var in_stream = buf_reader.reader();

    var buf: [16]u8 = undefined;
    var cnt: u64 = 16;
    while (cnt == 16) {
        cnt = try in_stream.readAtLeast(&buf, 16);
        if (cnt == 0) {
            try stdout.writeAll("Empty File");
            break;
        }

        for (0..cnt) |i| {
            const byte: [1]u8 = .{buf[i]};
            _ = try std.fmt.fmtSliceHexUpper(&byte).format("${s} ", .{}, stdout);
        }
        try stdout.writeByte('\n');
    }

    try std_writer.flush();
}
Also make sure to capture and display the error in a more useful way on this line: var file = try std.fs.cwd().openFile("test.zig", .{}); Maybe use a catch |err| and not a try
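A rough sketch of what the catch |err| suggestion could look like. `describeOpenError` and the message wording are assumptions made up for illustration; only `std.fs.cwd().openFile` comes from the snippet above. Output goes through `std.debug.print`, which writes to stderr.

```zig
const std = @import("std");

/// Maps a few common open errors to readable text (hypothetical helper).
/// Anything not listed falls back to the raw error name.
fn describeOpenError(err: anyerror) []const u8 {
    return switch (err) {
        error.FileNotFound => "no such file",
        error.AccessDenied => "permission denied",
        error.IsDir => "is a directory",
        else => @errorName(err),
    };
}

pub fn main() void {
    const path = "test.zig"; // same example path as in the snippet above
    const file = std.fs.cwd().openFile(path, .{}) catch |err| {
        // Explain the failure instead of bubbling the raw error out of main.
        std.debug.print("hexdump: {s}: {s}\n", .{ path, describeOpenError(err) });
        std.process.exit(1);
    };
    defer file.close();
    std.debug.print("opened {s}\n", .{path});
}
```

With try, a missing file ends the program with an error return trace; with catch |err| the user gets a one-line message and a non-zero exit code.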
KeithBrown7526
KeithBrown75267mo ago
@earth's bird should I set a max file size (i.e. return if the file is too big in order to not spend three hours)? V1.2 had this feature with any file larger than 4 GiB
Dumb Bird
Dumb Bird7mo ago
Sure, but send a warning out, something like: "This file would take approx. ... hours to complete. If you'd like to run anyway, use --run-anyway". Crappy example, but you get the point
KeithBrown7526
KeithBrown75267mo ago
Yeah, that actually makes sense. What should I set the max to?
Dumb Bird
Dumb Bird7mo ago
Well, wait till you can run tests and see the projected time for a file of n size. I would say for anything that takes over 10 minutes, hexdump should warn that the projected time is over 10 minutes or something like that. But the max, unless --run-anyway or something is passed, should be an hour. Those aren't concrete numbers but should work to get you started
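One way the thresholds above could be sketched, assuming a throughput figure has already been measured. `decide`, `Decision`, and the 1 MiB/s rate in `main` are illustrative assumptions, and the 10 minute and 1 hour limits are the non-concrete numbers from the message above.

```zig
const std = @import("std");

const Decision = enum { run, warn, refuse };

/// Hypothetical policy: warn above 10 minutes, refuse above 1 hour unless
/// the user passed an override flag such as --run-anyway.
fn decide(file_size: u64, bytes_per_sec: u64, run_anyway: bool) Decision {
    std.debug.assert(bytes_per_sec > 0); // throughput must be measured first
    const projected_secs = file_size / bytes_per_sec;
    if (projected_secs > 60 * 60 and !run_anyway) return .refuse;
    if (projected_secs > 10 * 60) return .warn;
    return .run;
}

pub fn main() void {
    const mib: u64 = 1024 * 1024;
    // Assumed throughput of 1 MiB/s, purely for illustration.
    const verdict = decide(8 * 1024 * mib, mib, false);
    std.debug.print("{s}\n", .{@tagName(verdict)});
}
```

Passing the override turns a refusal into a warning instead of silencing it, so the user still sees the projected time.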
KeithBrown7526
KeithBrown75267mo ago
Working on hexdump on the road :)
KeithBrown7526
KeithBrown75267mo ago
I'm more doing logical work than writing actual code
Dumb Bird
Dumb Bird7mo ago
Oh ok good, that would be hell
KeithBrown7526
KeithBrown75267mo ago
@earth's bird I have the following options, but don't know what short opts to assign to them. Here is what I currently have. (Ignore the numbers)
Dumb Bird
Dumb Bird7mo ago
Well short flags aren't required
KeithBrown7526
KeithBrown75267mo ago
Did you mean arent?
Dumb Bird
Dumb Bird7mo ago
yes
KeithBrown7526
KeithBrown75267mo ago
I want to have a short opt for all of them though, I just don't know what to assign them to
Dumb Bird
Dumb Bird7mo ago
Uh idk but what you're doing now is ugly
KeithBrown7526
KeithBrown75267mo ago
I know. The one byte octal and one byte decimal, idk what to do with
Dumb Bird
Dumb Bird7mo ago
maybe just
-d --one-byte-decimal
-o --one-byte-octal
-c --one-byte-char
-h --one-byte-hex

-D --two-byte-decimal
-O --two-byte-octal
-C --two-byte-char
-H --two-byte-hex
KeithBrown7526
KeithBrown75267mo ago
The problem there is -o and -h are already in use. Shit, I have to change two byte octal now. Let me do long opts first and I'll come back
Dumb Bird
Dumb Bird7mo ago
oh true. x for hex, and for octal I have no clue
KeithBrown7526
KeithBrown75267mo ago
Here is my list of options I have right now. a for ascii, s for skip, n for length, h for help, and v for version are the only ones I can think of that are for sure. Maybe l or L for disable-color
Dumb Bird
Dumb Bird7mo ago
Or maybe just a long option
KeithBrown7526
KeithBrown75267mo ago
Maybe I could just have the format options long only
Dumb Bird
Dumb Bird7mo ago
You don't need a short for everything
KeithBrown7526
KeithBrown75267mo ago
True
Dumb Bird
Dumb Bird7mo ago
Well short flags aren't required
KeithBrown7526
KeithBrown75267mo ago
I think I'll just have to go with that
Dumb Bird
Dumb Bird7mo ago
Also note you can do something like --color=true or --color=false
KeithBrown7526
KeithBrown75267mo ago
That seems a bit redundant because color is true by default
Dumb Bird
Dumb Bird7mo ago
Yeah I was just using it as an example
KeithBrown7526
KeithBrown75267mo ago
Although it would allow for enabling outputting VT100 codes to a file
Dumb Bird
Dumb Bird7mo ago
Then disable color when outputting
KeithBrown7526
KeithBrown75267mo ago
Well yeah, if outputting to a file, VT100 is disabled by default
Dumb Bird
Dumb Bird7mo ago
Although I doubt you need an output flag. It seems useless
KeithBrown7526
KeithBrown75267mo ago
Output for outputting to file
Dumb Bird
Dumb Bird7mo ago
>
KeithBrown7526
KeithBrown75267mo ago
?
Dumb Bird
Dumb Bird7mo ago
hexdump somefile.txt > file.txt
vs
hexdump somefile.txt -o file.txt
It just creates more work for you and removes the letter o for octal
KeithBrown7526
KeithBrown75267mo ago
Wasn't it your suggestion to add -o in the first place?
Dumb Bird
Dumb Bird7mo ago
Not as an output flag
KeithBrown7526
KeithBrown75267mo ago
Yea, when I first started creating it
Dumb Bird
Dumb Bird7mo ago
Oh, then whoops. Now I've changed my mind: it's less work for you to just let the user use a pipe to direct stdout to a file. Having an output flag in this case is just weird, as the content they want is in stdout anyway
KeithBrown7526
KeithBrown75267mo ago
Not exactly, if they pipe it, they have to remember to do --color=false
Dumb Bird
Dumb Bird7mo ago
Then detect if stdout is being piped to a file. Or rather, maybe have something like
--color=true
--color=false
--color=auto (default)
True is always color, false is never color, and auto decides based on whether stdout is going to a TTY or a file
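A small sketch of the three-way --color setting described above. `ColorMode`, `parseColorMode`, and `useColor` are hypothetical names, and the TTY check is passed in as a plain bool so the sketch does not depend on any particular std.io API.

```zig
const std = @import("std");

const ColorMode = enum { always, never, auto };

/// Parses the value part of a hypothetical --color=<value> option.
/// Returns null for anything other than true, false, or auto.
fn parseColorMode(value: []const u8) ?ColorMode {
    if (std.mem.eql(u8, value, "true")) return .always;
    if (std.mem.eql(u8, value, "false")) return .never;
    if (std.mem.eql(u8, value, "auto")) return .auto;
    return null;
}

/// true: always color, false: never color, auto: only when stdout is a TTY.
fn useColor(mode: ColorMode, stdout_is_tty: bool) bool {
    return switch (mode) {
        .always => true,
        .never => false,
        .auto => stdout_is_tty,
    };
}

pub fn main() void {
    // Fall back to auto (the proposed default) when the value is unknown.
    const mode: ColorMode = parseColorMode("auto") orelse .auto;
    // Here stdout is assumed to be redirected to a file, so not a TTY.
    std.debug.print("color enabled: {}\n", .{useColor(mode, false)});
}
```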
KeithBrown7526
KeithBrown75267mo ago
For security purposes, can you even detect if it's been piped?
Dumb Bird
Dumb Bird7mo ago
Yes. I mean, things like tee can do it, so I'm sure you can. Nothing insecure about knowing where stdout is going to
KeithBrown7526
KeithBrown75267mo ago
Unix & Linux Stack Exchange
How does a program know if stdout is connected to a terminal or a p...
I'm having trouble debugging a segfaulting program because the ouput right before the segfault is what I need, but this is lost if I'm piping the output to a file. According to this answer: https:/...
Dumb Bird
Dumb Bird7mo ago
Yeah, I knew you could. It's not anything about security, because you only need to know what type stdout is connected to: in your case a TTY or a file
KeithBrown7526
KeithBrown75267mo ago
Hexdump shows up on Google Images
Dumb Bird
Dumb Bird7mo ago
Yeah it's what shows up for the topic hexdump
KeithBrown7526
KeithBrown75267mo ago
A versatile cross-platform hex-dumping alternative @earth's bird
Dumb Bird
Dumb Bird7mo ago
Vs
The alternative cross-platform hex dumping utility
KeithBrown7526
KeithBrown75267mo ago
The alternative cross-platform hex dumping utility.
Dumb Bird
Dumb Bird7mo ago
The alternative cross-platform hexdump utility
KeithBrown7526
KeithBrown75266mo ago
Can now check if stdout is tty or not
KeithBrown7526
KeithBrown75266mo ago
Hexdump v2.0.0 ~ The alternative cross-platform hexdump utility

Usage:
    hexdump [options...] <file>

Options:
    -a, --ascii              Show ASCII representation in column on the side
    -s, --skip <offset>      Skip reading the first <offset> bytes
    -n, --length <length>    Only read <length> bytes
    -h, --help               Display this message
    -v, --version            Display the current version
        --color=[bool]       Enable or disable color output. Default: true
        --squeeze            Show identical lines as *

Formatting Specifiers:
    --one-byte-char          Display as ASCII character, escape code string
                             ('\n', '\t', etc.), or as DEC
    --one-byte-decimal
    --one-byte-octal
    --two-byte-decimal
    --two-byte-octal
    --two-byte-hex

Arguments:
    <length> and <offset> arguments can be followed by xxx suffixes.
    Lowercase suffixes (k, m, g, ...) indicate a base of 1000, while
    uppercase suffixes (K, M, G, ...) represent a base of 1024.
Also, here is the list of options.
If output is not a tty, color is disabled and only text is written to stdout. This is per @earth's bird's suggestion: rather than having a --output flag, just detect whether stdout is piped.
There are three possible conditions that disable color: stdout is not a tty, the tty does not support ANSI escape codes, or --color=false is supplied.
Which actually brings up the question, @earth's bird: should I replace --color with --disable-color? You can't override the first two. If output isn't a tty, doing --color=true will not override it, and since --color is true by default, you would never need to call it.
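The suffix rule from the Arguments section of the help text (lowercase means base 1000, uppercase means base 1024) could be parsed along these lines. This is only a sketch: `parseSize` is a hypothetical helper, and it handles just k/m/g and K/M/G even though the help text hints at more suffixes.

```zig
const std = @import("std");

/// Parses values like "512", "4k" (4 * 1000) or "4K" (4 * 1024).
/// Hypothetical helper; not taken from the Hexdump source.
fn parseSize(text: []const u8) !u64 {
    if (text.len == 0) return error.InvalidArgument;
    const multiplier: u64 = switch (text[text.len - 1]) {
        'k' => 1000,
        'm' => 1000 * 1000,
        'g' => 1000 * 1000 * 1000,
        'K' => 1024,
        'M' => 1024 * 1024,
        'G' => 1024 * 1024 * 1024,
        else => 1, // no recognized suffix: plain byte count
    };
    // Strip the suffix character only when one was recognized.
    const digits = if (multiplier == 1) text else text[0 .. text.len - 1];
    const value = try std.fmt.parseInt(u64, digits, 10);
    // Checked multiplication so huge values report an error, not wrap.
    return std.math.mul(u64, value, multiplier);
}

pub fn main() void {
    const skip = parseSize("4K") catch 0;
    std.debug.print("skip {d} bytes\n", .{skip});
}
```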
Dumb Bird
Dumb Bird6mo ago
Yes, but I also suggest maybe having a --force-color too, so someone can get the ANSI codes in the final output no matter what, even if it's not a tty and even if it doesn't support the codes. For example, you can actually display cat output with color by doing echo -e $(cat test.txt).
Nice to see you implemented it. How is this done in Zig? Just curious; I know how it's done in C, so I doubt it's much different
KeithBrown7526
KeithBrown75266mo ago
its not much different although i think its a tad bit more verbose in zig
std.io.getStdOut().isTty()
Dumb Bird
Dumb Bird6mo ago
Oh, well, no different than using termios by the looks of it. It's just that Zig doesn't make your code platform-dependent for using it this way. I wouldn't even really call this verbose, it just makes sense
KeithBrown7526
KeithBrown75266mo ago
i think in c you have to do isatty with a file descriptor which would be something like isatty(x)
Dumb Bird
Dumb Bird6mo ago
Yeah
KeithBrown7526
KeithBrown75266mo ago
which zig is more verbose than the c version
Dumb Bird
Dumb Bird6mo ago
well getStdOut().isTty() is better than isatty(1) imo
KeithBrown7526
KeithBrown75266mo ago
or this: ISATTY(FILENO(stdout))
it just looks so much better and more readable to someone who might not be familiar with zig
Dumb Bird
Dumb Bird6mo ago
Yeah
KeithBrown7526
KeithBrown75266mo ago
or low level in general.
So should I have --disable-color and --force-color, or just --color=true (i.e. --force*) and --color=false (i.e. --disable*)? The second would be easier because I don't have to figure out whether to prioritize disable or force if both are supplied
Dumb Bird
Dumb Bird6mo ago
I prefer --disable-color and --force-color, as to me it's more descriptive than --color=true and --color=false
KeithBrown7526
KeithBrown75266mo ago
okay, which one should take priority
Dumb Bird
Dumb Bird6mo ago
force color
KeithBrown7526
KeithBrown75266mo ago
okay
if ((!isTty
    or !stdout_handle.supportsAnsiEscapeCodes()
    or disable_color
) and !force_color) {
    writeAnsi = false;
}
I love ugly if statements
KeithBrown7526
KeithBrown75266mo ago
I now have a working hexdump; I just need to make it pretty and add ASCII support, as well as clean up the other formats. So far they all work except for one-byte char and the two-byte ones (those have a unique case for handling). I also need to work on speed
Dumb Bird
Dumb Bird6mo ago
whats slow? and how slow compared to the original hexdump
KeithBrown7526
KeithBrown75266mo ago
1.5x slower than the linux hexdump Approximately
Dumb Bird
Dumb Bird6mo ago
Eek What happened there? ...
KeithBrown7526
KeithBrown75266mo ago
Idk. Outputting a 1MB file takes 1.56 seconds, whereas coreutils hexdump takes 1.03. And that's with optimize ReleaseFast, I think. Yeah, that's with ReleaseFast. I think that's because I'm running like 2 million switch statements though
Dumb Bird
Dumb Bird6mo ago
Woah!? Why? I mean faster than using an if statement at least
KeithBrown7526
KeithBrown75266mo ago
Formatting each and every byte
Dumb Bird
Dumb Bird6mo ago
I wonder how you'd go about optimizing that
KeithBrown7526
KeithBrown75266mo ago
I have to handle each byte separately
Dumb Bird
Dumb Bird6mo ago
Why is the C impl. faster? Doesn't it also format every byte
KeithBrown7526
KeithBrown75266mo ago
Doing the format switch before printing everything, and then just formatting each byte by a variable instead of a switch for each byte. Because it doesn't have support for multiple formats
Dumb Bird
Dumb Bird6mo ago
Ah I'm not too sure I'm following, but I think I have an idea of what you mean
KeithBrown7526
KeithBrown75266mo ago
When I get home from school, I can show you
Dumb Bird
Dumb Bird6mo ago
You just mean you're doing the same computation over and over for every byte when you really only need to do it once?
KeithBrown7526
KeithBrown75266mo ago
Kind of. I couldn't figure out how to pass a format into stdout print via a variable and then pass an argument into that format
Dumb Bird
Dumb Bird6mo ago
I'll need code to understand this lol what time do you get home from school?
KeithBrown7526
KeithBrown75266mo ago
Probably around 4:30 (ur time) I have to head to town to see about them fixing my laptop and how much it will cost (I won't be dropping it off today though)
KeithBrown7526
KeithBrown75266mo ago
@earth's bird
KeithBrown7526
KeithBrown75266mo ago
I'm not home but I got the chance to pull out my laptop
KeithBrown7526
KeithBrown75266mo ago
putting this here for how to handle odd numbered two byte counts
KeithBrown7526
KeithBrown75266mo ago
To be fair, my hexdump does have a couple of extra characters too (extra spaces, an extra offset half-byte). I'm currently running into issues with handling the offset for the maximum file size
KeithBrown7526
KeithBrown75266mo ago
I'm now debating doing it the way core-util hexdump does
KeithBrown7526
KeithBrown75266mo ago
if you didnt see the jump, all it does is add another digit and shift everything over one character
Dumb Bird
Dumb Bird6mo ago
I'm now debating doing it the way core-util hexdump does
Seems like the best idea
KeithBrown7526
KeithBrown75266mo ago
@earth's bird removing the catch and doing try does not make it quicker. I'm going to rework my algorithm for how I read and print data
Dumb Bird
Dumb Bird6mo ago
I told you I assumed it wouldn't
KeithBrown7526
KeithBrown75266mo ago
I'm going to look into having a strict algorithm and order to the entire program to clean up some minor inconsistencies
Dumb Bird
Dumb Bird6mo ago
What's a "strict algorithm"? Didn't you mention the speed is about the same and on average slightly lower than the original hexdump? Maybe that's for a reason? Maybe it doesn't get much faster?
KeithBrown7526
KeithBrown75266mo ago
Well, it averages 10 milliseconds faster, but I'm going to be adding some things that might slow it down, so I'm trying to speed it up now while I'm still in the early dev stages of this version.
I have decided to modularize handling different formats. I have also made the decision to postpone two-byte formats until I have a stable release, and then add them on later. The way I am doing this will also allow for custom formatting add-ons (the only cost being you have to recompile it yourself).
So after my and @earth's bird's argument last night, he mentioned an amazing way of doing the formats. The formats are handled by passing the byte to a function, and the function is picked beforehand based on the format.
I have also completed handling the offset and length options, so you can skip the first x bytes and read only y bytes.
As a result of rewriting it, and doing it in Zig, the program is now 33% faster than the current release and around 250-500 ms faster than core-utils hexdump
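The "pick the function beforehand" idea above could look roughly like this. It is a sketch under assumptions, not the repo's code: `FormatFn`, `pickFormatter`, the three formatters, and the zero-padded widths are all made up for illustration.

```zig
const std = @import("std");

// Each formatter writes one byte's text into `out` and returns the used slice.
const FormatFn = *const fn (byte: u8, out: *[4]u8) []const u8;

fn formatHex(byte: u8, out: *[4]u8) []const u8 {
    // At most 2 characters, so the 4-byte scratch buffer cannot overflow.
    return std.fmt.bufPrint(out, "{X:0>2}", .{byte}) catch unreachable;
}

fn formatOctal(byte: u8, out: *[4]u8) []const u8 {
    return std.fmt.bufPrint(out, "{o:0>3}", .{byte}) catch unreachable;
}

fn formatDecimal(byte: u8, out: *[4]u8) []const u8 {
    return std.fmt.bufPrint(out, "{d:0>3}", .{byte}) catch unreachable;
}

const Format = enum { hex, octal, decimal };

/// The only switch in the program: it runs once, before any byte is read.
fn pickFormatter(format: Format) FormatFn {
    return switch (format) {
        .hex => &formatHex,
        .octal => &formatOctal,
        .decimal => &formatDecimal,
    };
}

pub fn main() void {
    const formatter = pickFormatter(.octal);
    var scratch: [4]u8 = undefined;
    // The per-byte loop calls through the pointer; no switch per byte.
    for ("Hi\n") |byte| {
        std.debug.print("{s} ", .{formatter(byte, &scratch)});
    }
    std.debug.print("\n", .{});
}
```

Adding a custom format then means writing one more function with the same signature and one more switch arm, which matches the recompile-it-yourself add-on idea.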
Dumb Bird
Dumb Bird6mo ago
Are you going to push anything to dev? @earth's penguin I would like to do some profiling on the code to see where most of the code's time is spent
KeithBrown7526
KeithBrown75266mo ago
I can in about 10 minutes
Dumb Bird
Dumb Bird6mo ago
Alright
Dumb Bird
Dumb Bird6mo ago
Panic on file not found?
KeithBrown7526
KeithBrown75266mo ago
temporary
Dumb Bird
Dumb Bird6mo ago
You can still easily print out to stderr. Ok
KeithBrown7526
KeithBrown75266mo ago
I haven't reworked the whole error handling yet, I just panic for any error
Dumb Bird
Dumb Bird6mo ago
?
KeithBrown7526
KeithBrown75266mo ago
i dont handle options yet...
Dumb Bird
Dumb Bird6mo ago
Oh right, whoops
KeithBrown7526
KeithBrown75266mo ago
You need to change the filename variable in the main function. For some reason I decided it was smart to do an absolute path
Dumb Bird
Dumb Bird6mo ago
Does it work with relative paths?
KeithBrown7526
KeithBrown75266mo ago
i dont know it should
Dumb Bird
Dumb Bird6mo ago
I'll find out then. It does
Dumb Bird
Dumb Bird6mo ago
GitHub
GitHub - sharkdp/hyperfine: A command-line benchmarking tool
A command-line benchmarking tool. Contribute to sharkdp/hyperfine development by creating an account on GitHub.
Dumb Bird
Dumb Bird6mo ago
Something you might want to keep note of
KeithBrown7526
KeithBrown75266mo ago
Using hyperfine, I have done some benchmarking and found a slightly quicker (albeit more verbose) way of formatting text
bin/new ran
1.01 ± 0.04 times faster than bin/old
When benchmarking, I found what has the biggest impact on speed, and that is formatting. When adding formatting, the execution time increases by around 170%. After analyzing the Zig source code, the format functions are quite optimized, just intensive, especially when run several million times. (These benchmarks were done on an 8 MB file with a complete drop of all file caches.)
KeithBrown7526
KeithBrown75266mo ago
Here is an analysis of the speed
Dumb Bird
Dumb Bird6mo ago
Eek, can you send the graphs separately too? I'm having trouble seeing the data. Or export it some other way
Dumb Bird
Dumb Bird6mo ago
It seems like you may have just taken a screenshot?
KeithBrown7526
KeithBrown75266mo ago
that was a pdf to image
Dumb Bird
Dumb Bird6mo ago
Ah. Could you add a comparison for the C++ hexdump too?
KeithBrown7526
KeithBrown75266mo ago
i meant to do that and forgot
Dumb Bird
Dumb Bird6mo ago
Silly bird
KeithBrown7526
KeithBrown75266mo ago
@earth's bird
KeithBrown7526
KeithBrown75266mo ago
zig is all time superior
Dumb Bird
Dumb Bird6mo ago
I knew that was the case, I just wanted to have them graphed out
KeithBrown7526
KeithBrown75262mo ago
Per @earth's bird's suggestion, I will be deploying hexdump to package managers upon release of v2. I don't quite know how exactly I will do this on Linux just yet, but I will figure that out
Dumb Bird
Dumb Bird2mo ago
I have opened 3 new issues. 2 feature requests and 1 bug
KeithBrown7526
KeithBrown75262mo ago
I know :(
Dumb Bird
Dumb Bird2mo ago
#30, #29, #28. Yes, I saw you responded to them, although trying to place blame on cxxopts was silly
KeithBrown7526
KeithBrown75262mo ago
I didn't look at it too closely, I was looking at it on my phone and I saw a lot of cxxopts.h
Dumb Bird
Dumb Bird2mo ago
Ah
KeithBrown7526
GitHub
Release Release v2.0.0 · KeithBrown39423/Hexdump
What's Changed Just some small README changes by @ZackeryRSmith in #24 Minor README changes by @ZackeryRSmith in #25 change the note formatting by @ZackeryRSmith in #26 Fix build href by @Zack...
KeithBrown7526
@earth's bird here is the graph
Dumb Bird
Dumb Bird4w ago
Looks good. @earth's penguin did you also make sure to compile the binary with optimization flags?
Dumb Bird
Dumb Bird4w ago
GitHub
[Bug] Silent Fail Windows · Issue #35 · KeithBrown39423/Hexdump
Describe the bug When running the Windows binary, if you specify a filename, it silently fails with the error code -1073741819. It also takes around 15-25 seconds before it actually fails. When run...
Dumb Bird
Dumb Bird4w ago
Thankfully your issue seems to be right in front of your eyes after you fix the silent failing problem. At least your issue is pretty traceable and should lead to the solution. I will look into it more later if you feel too lazy to do so (which tbh I feel).
Also, the graph you sent me would be nice to see in the README or release page. I may also get around to improving that README, as it's missing a lot of the selling points of this Hexdump clone and it's a bit messy rn.
Also, your graph is lacking units of time, which is quite annoying as someone trying to look at the graph. I like this graph quite a bit better
KeithBrown7526
yes, it's compiled with ReleaseFast
KeithBrown7526
GitHub
Release Release v2.1.0 · KeithBrown39423/Hexdump
What's Changed stop treating u64 as usize by @ZackeryRSmith in #40 🐛 Fix ascii print when last line is less than 8 bytes by @PauloCampana in #41 Add --squeeze option Speed optimization to null...
KeithBrown7526
I love making progress
Dumb Bird
Dumb Bird4w ago
I love when the issues I made don't end up on the todo list :(
KeithBrown7526
This to-do list is older than the repo itself. I haven't even added to it in a while. I probably should add those on here though
Dumb Bird
Dumb Bird4w ago
You mean you will add those on there Well you don't have to
KeithBrown7526
no, i might
Dumb Bird
Dumb Bird4w ago
Honestly little things Gotta love a maintainer who doesn't care about issues made on the project :(
KeithBrown7526
:)
Dumb Bird
Dumb Bird4w ago
I'll just sob I want to install hexdump on all my devices, so get some stuff working!
KeithBrown7526
you can install it you just have to do so manually
Dumb Bird
Dumb Bird4w ago
Well, I do have it installed on my Windows, macOS, and Linux machines. But it would be nice to have it on some package managers
KeithBrown7526
yea, ill add it to package managers, i just dk when
Dumb Bird
Dumb Bird4w ago
It shouldn't be all that hard really. At least for brew I know it's not