Interacting with Desktop Files such as .pdf documents
Hello all, first-time poster here. I've been learning Mojo for about a month, and I'm having trouble opening a .pdf file using just Mojo, with no Python (likely not supported at this time?). I've been following this guide: https://docs.modular.com/mojo/stdlib/builtin/file/
Am I missing something here, or is reading content from a .pdf file not possible at this time?
4 Replies
Hello Andrey. I would think it is possible. I've had a Bitwig preset parser up and running for a few months. Have a look at https://github.com/carlca/ca_mojo.git and in particular ./bitwig/preset_parser for some ideas.
Essentially you just open a file using var f = open(file_name, "r") and then use methods like var data: List[UInt8] = f.read_bytes(size) to read the data.
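As a rough illustration of those two calls, here is a minimal sketch that reads the first few bytes of a file and checks for the "%PDF-" signature that PDF files normally start with. The path example.pdf is a placeholder, not a file from this thread, and the code assumes a Mojo release where open and read_bytes behave as described above; the standard library changes between releases, so treat it as a starting point rather than a definitive implementation.

```mojo
fn main() raises:
    # "example.pdf" is a hypothetical path used only for illustration.
    var f = open("example.pdf", "r")
    # Read the first 8 bytes; a PDF normally begins with "%PDF-" plus a version.
    var header: List[UInt8] = f.read_bytes(8)
    f.close()

    # Compare against the ASCII codes for '%', 'P', 'D', 'F', '-'.
    var looks_like_pdf = (
        len(header) >= 5
        and header[0] == 0x25
        and header[1] == 0x50
        and header[2] == 0x44
        and header[3] == 0x46
        and header[4] == 0x2D
    )
    print("bytes read:", len(header))
    print("looks like a PDF:", looks_like_pdf)
```

Checking the signature first is a cheap way to confirm the file was opened and read correctly before attempting any format-specific parsing.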
Or did you mean more specialised code geared towards PDF files in particular? I fear that for the moment, it's a case of working from first principles with a handy reference to the file format in question.
Thank you so much for the response I really appreciate it! I've been scratching my head with this one and couldn't manage to get it going. I'll give this a shot once I'm home!
Just got home and gave your code a shot. Just to clarify, I'm looking to extract text from .pdf files. The parsing and formatting isn't my issue, it's the actual text extraction. The output that I'm getting is a list of SIMD[DType.uint8, 1] values. I'm not entirely sure how to proceed from here as I'm now uncertain as to how I can convert the bytes back to a string.
Hi Andrey! In my parser code, where I needed to convert from the bytes to a string, I used this method...
Something similar should do the trick 😉
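The snippet referred to above did not survive in this copy of the thread. As a stand-in, and not the code from the parser mentioned here, one simple way to turn a List[UInt8] into a String is to append one character per byte. The helper name bytes_to_string is hypothetical, and the sketch assumes a Mojo release that provides the builtin chr and the int(...) cast; newer releases may spell the cast Int(...). Bear in mind that in many PDFs the page content streams are compressed, so this yields readable text only for the uncompressed parts of the file.

```mojo
# Hypothetical helper: builds a String by treating each byte as a character code.
fn bytes_to_string(data: List[UInt8]) -> String:
    var result = String("")
    for i in range(len(data)):
        # Newer Mojo releases may require Int(data[i]) instead of int(data[i]).
        result += chr(int(data[i]))
    return result


fn main():
    # Two sample bytes: the ASCII codes for 'H' and 'i'.
    var sample = List[UInt8]()
    sample.append(72)
    sample.append(105)
    print(bytes_to_string(sample))
```

Appending character by character is slow for large files, but it avoids relying on String constructors whose signatures have changed between Mojo releases.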
I see that @maxim has a new string handling package out https://discord.com/channels/1087530497313357884/1151418092052815884/1273544370578133046
I am certain that his code will be much more efficient than mine. I haven't looked at it yet, though. In my case it's a case of "if it ain't broke, don't fix it" 😉
You guys are amazing. I'll give this a shot!