❔ Understanding reading values from binary
I'm reading a version number from the binary of a client via hex. However, I get different results with different client versions.
Using IDA, I have found the correct offsets for each client, so that is not the issue. I have reached my intended goal of getting the value for each client version, but I am trying to understand why the two clients needed different approaches.
I am new to this so there must be something I do not comprehend.
-------------------------
Using BinaryPrimitives.ReadUInt16LittleEndian
For Client 1, I get the correct value:
Version: 7.0.98.16
For Client 2, I get:
Version: 29554.28521.8302.29477
-------------------------
Using Encoding.ASCII.GetString
For Client 2, I get the intended value:
Version: 1.25.35.00
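Stripped down, this is roughly what I'm doing in each case (the byte values in this sketch are placeholders I picked to match the shapes above, not the real client data):
```cs
using System;
using System.Buffers.Binary;
using System.Text;

// Client 1 style: four little-endian ushorts (placeholder bytes for 7.0.98.16).
byte[] numeric = { 0x07, 0x00, 0x00, 0x00, 0x62, 0x00, 0x10, 0x00 };
ushort major = BinaryPrimitives.ReadUInt16LittleEndian(numeric.AsSpan(0));
ushort minor = BinaryPrimitives.ReadUInt16LittleEndian(numeric.AsSpan(2));
ushort build = BinaryPrimitives.ReadUInt16LittleEndian(numeric.AsSpan(4));
ushort priv  = BinaryPrimitives.ReadUInt16LittleEndian(numeric.AsSpan(6));
Console.WriteLine($"Version: {major}.{minor}.{build}.{priv}");   // Version: 7.0.98.16

// Client 2 style: the same field is stored as ASCII text instead of integers.
byte[] text = Encoding.ASCII.GetBytes("1.25.35.00");             // placeholder bytes
Console.WriteLine("Version: " + Encoding.ASCII.GetString(text)); // Version: 1.25.35.00
```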
------------------------------------------
47 Replies
If I need to clarify or provide more info let me know
i'm just wondering what in the world you chose as the syntax highlighting in these blocks
lol, I used css. My bad.
Changed it
So you are trying to turn 8 bytes into a version string?
Yup, 8 bytes for each part
I see 2 bytes for each part.
if I am right, for every two hexadecimal digits, it is 8 bytes?
No, that's 2 bytes
Er, one byte
4 bits for each character
And you're reading as 16-bit integers which are two bytes at a time.
I feel like you have dumped a lot of extraneous code into this question which might eventually be needed for your program, but is just complicating and confusing the issue here.
Hrmm, so each hexadecimal digit represents 4 bits or half a byte?
Yes
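For example (made-up bytes, just to illustrate the sizes):
```cs
using System;
using System.Buffers.Binary;

// Each hex digit is 4 bits, so two hex digits describe one byte,
// and ReadUInt16LittleEndian consumes two bytes (four hex digits) per call.
byte b = 0x2E;                                              // one byte, two hex digits
Console.WriteLine(Convert.ToString(b, 2).PadLeft(8, '0')); // 00101110

byte[] two = { 0x31, 0x2E };                                // ASCII bytes for "1."
ushort u = BinaryPrimitives.ReadUInt16LittleEndian(two);
Console.WriteLine($"0x{u:X4} = {u}");                       // 0x2E31 = 11825
```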
I am pretty confused tbh, but I'm trying. Guess my question reflects that
You're using IDA?
yup
No offense, but...did you pay for it?
Because IDA is pretty expensive.
ida has a free version
It's the free version
yup
Ok, I only ever had the commercial version.
So you should probably simplify the problem.
Take the exact bytes you see in IDA and write them to disk in the exact same order to a separate file.
why is there 1 short skipped though
or actually
3 bytes are skipped
how random
Now read that file starting at offset zero and make your code work.
I have a feeling the computed offsets are not right.
no chance they are
the ushorts wouldn't be aligned
When your code works and you understand it, go back to starting with the offset inside the actual file.
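Something along these lines (file name and bytes here are placeholders):
```cs
using System;
using System.Buffers.Binary;
using System.IO;

// 1. Write the exact bytes copied from IDA to a standalone file.
byte[] fromIda = { 0x07, 0x00, 0x00, 0x00, 0x62, 0x00, 0x10, 0x00 };
File.WriteAllBytes("version.bin", fromIda);

// 2. Read it back starting at offset zero - no offset math to get wrong.
byte[] data = File.ReadAllBytes("version.bin");
for (int i = 0; i + 1 < data.Length; i += 2)
    Console.WriteLine(BinaryPrimitives.ReadUInt16LittleEndian(data.AsSpan(i)));
```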
i guess you can pack(1) but that's unlikely right
unless the game is ultra old
If you are referring to privatePart, that is something I need to handle because the client versions do not always have a privatePart. In this case it did not.
Well, the bytes for that probably still occupy memory - usually these kinds of structures are fixed size.
done
The bytes might be zero.
So you created a binary file?
And not a text file?
Ah, I just copy-pasted the bytes in the correct order into a text file
Do I save directly from IDA to a .bin?
I have three blog posts in my rarely updated blog that I think you should read:
https://treit.github.io/programming,/interviewing/2019/03/12/BitAndBytesPart1.html
https://treit.github.io/programming,/interviewing/2019/03/17/BitAndBytesPart2.html
https://treit.github.io/powershell,/programming/2019/05/05/GettingComfortableWithBinary.html
The last one shows how to easily create binary files using PowerShell
Very nice, want me to check it out then check back in?
I think that would probably really help you
Working with binary data isn't actually all that hard, but if you have never done it before there are some fundamental things to understand (like, the computer doesn't store things in hexadecimal) that are helpful.
Trying to understand the way the binary is displayed in hexadecimal when choosing a bit mode.
When using IDA, does the bit mode you choose determine how the hex is formatted? If the binary was compiled for a 16-bit OS, should I choose 16-bit mode for readability?
Any links that may help me out with this would also help.
It doesn't change the actual values
Presumably it changes how it interprets the program as an executable
got yuh
This is throwing me off looking at two different versions of the same file, one from 1998 and one from 2023. Definitely a lot of changes have taken place since '98, but that brings up a question: was it common not to use multi-byte data back then, or data in an endian-sensitive format? Maybe I'm missing a setting in IDA, but I'm pretty sure the settings are the exact same.
1998:
55 4F 20 56 65 72 73 69 6F 6E (UO Version)
31 2E 32 35 2E 33 35 (1.25.35)
2023:
37 00 2C 00 20 00 30 00 2C 00 20 00 39 00 38 00 2C 00 20 00 31 00 00 00 36 (7, 0, 98, 16)
Like one is using ASCII and the other UTF-16
The second one is a UTF16 string. The first one seems like the data doesn't remotely match what you think the value should be
Looks like an ascii string
Starts with "UO V" and I'm too lazy to decode the rest on my phone at the brewery after work. You can do it easily with an ascii table or CyberChef
Or just any hex viewer which normally shows the ascii string on the right
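Or a couple of lines of C#, using the bytes posted above:
```cs
using System;
using System.Text;

// 1998 client: plain ASCII.
byte[] ascii = { 0x31, 0x2E, 0x32, 0x35, 0x2E, 0x33, 0x35 };
Console.WriteLine(Encoding.ASCII.GetString(ascii));   // 1.25.35

// 2023 client: UTF-16 little-endian ("Unicode" in .NET), first few bytes.
byte[] utf16 = { 0x37, 0x00, 0x2C, 0x00, 0x20, 0x00, 0x30, 0x00 };
Console.WriteLine(Encoding.Unicode.GetString(utf16)); // 7, 0
```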
My mistake, I copied the wrong values, but yes it is "UO Version." I corrected it above, and that answers my question.
31 2E 32 35 2E 33 35
ascii was pretty normal in the 90s
Except on Windows NT where everything was Unicode
Windows NT didn't take over for consumer operating systems until 2001, when Windows XP was released
I've been watching Computer Chronicles, Windows NT is very interesting.
I worked on Windows NT5 which was rebranded Windows 2000. Windows XP followed.
Knowing that the first 128 characters in Unicode are directly mapped to ASCII helps me plan how I will do things when I rewrite my methods to compensate for different versions of the client.
You should use System.Text.Encoding and not try to do it all yourself
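For example, because the first 128 code points line up, ASCII-range bytes decode identically under ASCII and UTF-8:
```cs
using System;
using System.Text;

byte[] bytes = { 0x31, 0x2E, 0x32, 0x35, 0x2E, 0x33, 0x35 };
Console.WriteLine(Encoding.ASCII.GetString(bytes)); // 1.25.35
Console.WriteLine(Encoding.UTF8.GetString(bytes));  // 1.25.35 (identical for ASCII-range bytes)
```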
That's awesome. It must have changed a lot during its lifetime, considering the UI was based on Windows 95.
Yeah it's a very different landscape now 🙂
Will do
Is the string length encoded in the file?
Or it's null terminated?
Looking at the values, I want to say the 2023 version is using null-terminated UTF-16 (breakdown below, with a sketch after the list):
37 00 2C 00 20 00 30 00 2C 00 20 00 39 00 38 00 2C 00 20 00 31 00 00 00 36
37 00 -> "7"
2C 00 -> ","
20 00 -> " "
30 00 -> "0"
2C 00 -> ","
20 00 -> " "
39 00 -> "9"
38 00 -> "8"
2C 00 -> ","
20 00 -> " "
31 00 -> "1"
00 00 -> null character
36 -> "6"
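A rough sketch of reading it that way - scan for the 00 00 pair, then decode what came before it (note the stray 36 lands after the terminator, matching the breakdown above):
```cs
using System;
using System.Text;

byte[] data =
{
    0x37, 0x00, 0x2C, 0x00, 0x20, 0x00, 0x30, 0x00, 0x2C, 0x00,
    0x20, 0x00, 0x39, 0x00, 0x38, 0x00, 0x2C, 0x00, 0x20, 0x00,
    0x31, 0x00, 0x00, 0x00, 0x36
};

// Walk two bytes (one UTF-16 code unit) at a time until the 00 00 terminator.
int end = 0;
while (end + 1 < data.Length && (data[end] != 0 || data[end + 1] != 0))
    end += 2;

Console.WriteLine(Encoding.Unicode.GetString(data, 0, end)); // 7, 0, 98, 1
```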
However, I am not sure how to find out whether the string length is encoded in the file.
Was this issue resolved? If so, run
/close
- otherwise I will mark this as stale and this post will be archived until there is new activity.