❔ Alternatives to Antlr for C#
Antlr needs a runtime library, and the parsing is veeery slow.
Any good alternative?
50 Replies
language parser
A parser generator
@T = (Q, Σ, Γ, q₀, *, δ) you have any suggestions?
Instead I'd suggest pasting into google, takes less time than asking in a question thread
I'd honestly make a suggestion based on the exact use-case
Is it really slow? On what grammar? Do you have additional non-functional requirements? (good error-recovery, incrementality, ...)
If you just need speed, then I'd look around in the world of LR parsers, they are generally linear-time, if you can find a sensible upper bound for the lookahead (sensible sadly meaning 0 or 1 in the vast majority of cases)
If you are dealing with a truly nasty grammar you can't hammer into something an LR generator can eat, then you might also want to look at the handwriting option
consider writing a simple recursive descent parser manually. that's how most mainstream programming language compilers parse code nowadays
It is a file format parser, I have the ANTLR grammar I've made already
Maybe generating a railroad diagram from it would help me reimplement it from scratch
I generally create the parsers from scratch, but this one is "strange", so I was trying to auto-generate it
The question was likely for people that actually know a thing about parsing and the standard ecosystem around it. Maybe the question simply wasn't for you
3D FBX files ^^^
The structure is kinda simple, but with some strange rules
Lemme see if I can generate a railroad diagram from that
That'd help a lot in reading it, thanks
Tbh, I have to test this grammar again, one sec
I'm not in the mood to rewrite it from scratch, I've spent months in this project alone, so if I have to redo the parser, it must be the last time...lol
The grammar doesn't look complex, just a bit weirdly structured
It is, the file format is strange
But if it's for gigantic 3D mesh/skeletal/animation/whatever data, then I can imagine speed being concerning, LL(*) is fast but not "file-format-parser fast"
I just need a diagram, I guess, so I can rewrite it from scratch doing it the right way
I'll look through it more in depth to see if this is something remotely parsable with LR(0). If it is, then you are in luck, that's linear time
Tks
Don't take me out of context and read on
This is just a chunk of an actual file:
Yep, this should be dead simple to parse with basically anything
There's no ambiguity, at least not at first glance. Most of your concern would be the lexer then tbh
There are many ambiguities
Trust me...lol
Where? I don't see any honestly
What I see here is a format where most of your concern would be fast lexing to eat the file as fast as you can
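To make the "fast lexing" point concrete, here is a minimal single-pass lexer sketch. It is written in Python for brevity, but the same structure carries over to a hand-written C# lexer. The token kinds (KEY, STAR, and so on), the `;` comment rule, and the sample input are assumptions made for illustration; they are not taken from the FBX specification or from the grammar discussed in this thread.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set for an FBX-like ASCII format (an assumption, not the spec).
# Order matters: NUMBER and KEY must be tried before the more general IDENT.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>;[^\n]*)
  | (?P<STRING>"[^"]*")
  | (?P<NUMBER>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
  | (?P<KEY>[A-Za-z_][A-Za-z0-9_]*:)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<STAR>\*)
  | (?P<COMMA>,)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text: str) -> Iterator[Token]:
    """Scan the input once, left to right, yielding tokens lazily."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        # Whitespace (including newlines) and comments are dropped here.
        if match.lastgroup not in ("WS", "COMMENT"):
            yield Token(match.lastgroup, match.group())


if __name__ == "__main__":
    for token in tokenize('P: "Name", 1.5 ; trailing comment'):
        print(token)
```

Each character is visited once and no backtracking across tokens happens, which is what keeps this stage linear in the file size.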
SceneInfo: "SceneInfo::GlobalInfo", "UserData"
this is a node with metadata
Properties70: this is a node with a single sub-node
P: this is a node with multiple properties
This is an array:
The a: there acts as a P: from the properties, basically
But you see there is the * followed by the array length, which indicates it is an array
There is a catch, and I can't remember exactly where, which breaks the whole parser if not parsed correctly
It's been a long time since I wrote the parser tbh
"Properties70: this is a node with a single sub-node" / "P: this is a node with multiple properties"
What's the difference between these 2? I don't immediately see any notational diff
P and a are ambiguous
a is sort of ambiguous but the * a few tokens back can disambiguate it
That shouldn't be a problem
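A rough sketch of how a hand-written recursive descent parser could use the * as the disambiguator, again in Python for brevity. The node shape (a key, optional values, an optional braced block), the token kinds, the `Node` and `Parser` names, and the length check against the a: child are all assumptions made for illustration; the real format may differ.

```python
from dataclasses import dataclass, field
from typing import Optional

VALUE_KINDS = {"STRING", "NUMBER", "IDENT"}  # hypothetical token kinds


@dataclass
class Node:
    name: str
    values: list = field(default_factory=list)
    children: list = field(default_factory=list)
    array_len: Optional[int] = None  # set only when the header carries "*N"


class Parser:
    def __init__(self, tokens):
        self.toks = list(tokens)  # (kind, text) pairs
        self.i = 0

    def peek(self):
        return self.toks[self.i][0] if self.i < len(self.toks) else "EOF"

    def take(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        text = self.toks[self.i][1]
        self.i += 1
        return text

    def parse_document(self):
        nodes = []
        while self.peek() != "EOF":
            nodes.append(self.parse_node())
        return nodes

    def parse_node(self):
        node = Node(self.take("KEY").rstrip(":"))
        if self.peek() == "STAR":
            # "*N" right after the key marks an array of declared length N.
            self.take("STAR")
            node.array_len = int(self.take("NUMBER"))
        elif self.peek() in VALUE_KINDS:
            node.values.append(self.parse_value())
            while self.peek() == "COMMA":
                self.take("COMMA")
                node.values.append(self.parse_value())
        if self.peek() == "LBRACE":
            self.take("LBRACE")
            while self.peek() != "RBRACE":
                node.children.append(self.parse_node())
            self.take("RBRACE")
        if node.array_len is not None:
            # Assumption: the payload lives in a child named "a".
            data = next((c for c in node.children if c.name == "a"), None)
            if data is None or len(data.values) != node.array_len:
                raise SyntaxError(f"array {node.name} does not match its declared length")
        return node

    def parse_value(self):
        kind = self.peek()
        if kind not in VALUE_KINDS:
            raise SyntaxError(f"expected a value, got {kind}")
        text = self.take(kind)
        if kind == "NUMBER":
            return float(text) if any(c in text for c in ".eE") else int(text)
        return text[1:-1] if kind == "STRING" else text


if __name__ == "__main__":
    # Tokens for the made-up input:  Vertices: *3 { a: 1,2,3 }
    tokens = [("KEY", "Vertices:"), ("STAR", "*"), ("NUMBER", "3"), ("LBRACE", "{"),
              ("KEY", "a:"), ("NUMBER", "1"), ("COMMA", ","), ("NUMBER", "2"),
              ("COMMA", ","), ("NUMBER", "3"), ("RBRACE", "}")]
    print(Parser(tokens).parse_document())
```

With one token of lookahead the parser never has to guess: the a: node is parsed exactly like a P: node, and only the enclosing header decides whether it is array data.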
Oooh wait, is the space significant there?
Not sure, haven't tested without it
I can see dataValue can derive a space, so P: and P: would be different?
But there is something with the newline char, that I can't recall exactly
One issue
I don't think so
Yeah, it def needs a char before that's alphabetical
Then I don't see a problem. Is this grammar written by you, and you are not sure if it's actually 100% correct, or this is given as the oracle for the format?
Oh, I remember the issue
I remember now
This can happen and is perfectly accepted by the parsers
Like, breaking an array with a newline
Yeah, I'd expect a newline to be fine there
I remember it caused me issues before, but I've fixed it by checking the comma first
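One possible reading of "checking the comma first", sketched in Python: if the lexer keeps newlines as tokens, a value list ends at a newline unless the previous token was a comma. The token kinds (VALUE, COMMA, NEWLINE) and the `read_values` helper are hypothetical, and a list whose continuation line starts with the comma is not handled here.

```python
def read_values(tokens, i=0):
    """Collect a comma-separated value list starting at index i.

    A newline ends the list unless it directly follows a comma, in which
    case the list continues on the next line. Returns the values and the
    index of the first token that was not consumed.
    """
    values = []
    after_comma = False
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "NEWLINE":
            if not after_comma:
                break  # bare newline: the list is finished
            i += 1  # newline after a comma: keep reading
        elif kind == "COMMA":
            after_comma = True
            i += 1
        elif kind == "VALUE":
            if values and not after_comma:
                break  # two values with no comma between them
            values.append(text)
            after_comma = False
            i += 1
        else:
            break  # any other token ends the list
    return values, i


if __name__ == "__main__":
    # Made-up input:  1,\n2\n3   (the third value belongs to something else)
    tokens = [("VALUE", "1"), ("COMMA", ","), ("NEWLINE", "\n"),
              ("VALUE", "2"), ("NEWLINE", "\n"), ("VALUE", "3")]
    print(read_values(tokens))
```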
I guess the main issue is that my parser is prehistoric and I need to rewrite it from scratch
It uses basically the same kind of lexer/parser I use here:
https://github.com/rickomax/JsonParser/blob/main/JsonParser.cs
You already throw away newlines, that should be fine tho
This is a weird-ass parser
Why?
It's properly structured but totally weird
It avoids defining types, value-checks leak into parse logic (what should be lex logic), ...
It's not horrible, but it's another new one I've seen
Heck if I've finished my episode, I'll write a lexer and parser for this format, it shouldn't be too bad honestly
why the constants?
(yeah, why not enums)
The hash system I use
if you're going to define the constants, name them by the way they're used
I use it everywhere (yes, I know there are collisions), but I had no issues so far with the keys I use
so squarebracket should be arraystartchar
When I don't have much info to store, I prefer to use some vars instead of structs/etc
The only thing I'm really not happy with in my JsonParser above is the fact that I'm storing the BinaryReader instance at every node...
I've made this lib so I don't have to create strings everywhere
I'd be better off storing these guys elsewhere, but that is an optimization I can do later:
wait you do what
well if it's a proxy sort of object then it's fine
It is, but I could move at least the BinaryReader outside the struct, and pass it to the methods that will consume the struct
I might do that later
The BinaryReader won't differ between the JsonParser nodes
Well, time to rewrite the FBX thing from scratch