ArkType mutates original input?
I was always under the assumption that ArkType leaves the original input intact. However:
Outputs:
My assumption was that value would remain as string, and the result returned by
Parser
would be this value. But based on this, it seems like I was under the wrong assumption.46 Replies
Yeah there are quite a few discussions on this. If you want to avoid it, you can pipe to the clone operator of your choice. At some point I'll add a built-in config option to clone before morphing by default.
To me feels like a separation of concerns thing. Can be much faster without copying every object you validate, plus cloning can not always be done safely, so if it's going to happen best make it opt-in/semi-explicit.
Open to feedback on this though, potentially making the default to clone before morphing
as a new user, my expectation coming from other validation libraries in the ecosystem would be that a validation process would leave the input unmodified. Arktype's examples show parsing of unknown input data from plain objects - is it intended to be used with e.g. complex classes that can't be cloned safely? My assumption would be that if you're dealing with a complex constructed class, you already know what type it is, since you have some sort of interface that generated it - as opposed to plain objects that might be deserialized json, for example
I struggle to think of a situation in which you have an object that you can't safely clone, but that you also need arktype in order to get the type information in order to interact with it
Yes this is often the case, but ArkType still supports parsing cyclic data which makes cloning much more complex and requires more assumptions to implement on our end.
It's also just much slower in many cases where most often you don't need the input once you've parsed it to the correct shape anyways
Hm. My only qualm is a) this isn't documented anywhere that I can find (which it really should be, this is not obvious behavior) and b) it ends up breaking the type system, since a value of type T being passed into an arktype type may no longer be of type T after the call is made
Documentation definitely needs to be better- working on morphs now!
if an arktype parse call mutates the original input that needs to be in 10 mile tall capital letters, seriously - this is a huge caveat!
That seems more like a weakness of TS in general to me. Within ArkType, the outputs are precisely inferred
It is. Unfortunately there is no way in typescript to assert the type of an input and return a value at the same time
Well I'll continue to listen to feedback on that. I'm sure it is unintuitive for people coming from most other validation libraries, but I want to make sure I that's not my primary motivation for the behavior I end up finalizing
what would be lovely is e.g.
parse(x: unknown): T & asserts x is never
so that it narrows the type such that you cannot use the input anymore, since it's not safe
but it's not possible to narrow an input variable type and return a value at the same time
I think the behavior is fine, if it is documented, and not in a "you didn't read the fine print" way but a "you need to know this for basic usage" way. I'd advoate to put that in the very first example
(I really appreciate this library! please don't take this as too harsh criticism 🙂 )Once there is a global config available for this I think it will mitigate it quite a lot. I am curious to hear more of the concrete cases where things have actually broken as a result of mutating the input-not just "I didn't expect this".
I've heard a few cases, but I'd guess the vast majority of the time once you parse the raw input, you're done with it
Haha I don't take it as harsh criticism there are a lot of trade offs here and it's good to hear peoples' intuitions about them
the only case I can think of is when your function is handed something that you shouldn't modify
i.e. someone else, for efficiency, handed you a reference to some internal data structure marked as readonly rather than cloning to return it, and then you pass it to arktype and break stuff
AHHHH this actually explains so much!
But something very similar applies here to your comment about prototypes etc.- that if it's internal data it's probably already been valdiated
I was just writing tests for a utility function I have that uses arktype and my test cases defined as
[ some array] as const
wouldn't work, because arktype doesn't accept readonly
arrays or const objects as an argument. This is why! now it makes sense!
I do wish there were explicit typescript syntax for 'this function marks its input as unusable'I'm not sure about this, it accepts
data
of type unknown
- there's no way for now in TS to say "mutable or primitive"one sec - let me change my test case again
ah, never mind. it's the arktype output that is not marked as readonly.
my mistake
I was assigning a test case to
typeof myType.infer
and that was the issue.
anyway, I understand why you implemented the way you did, I think we need to document this somewhere where people are guaranteed to read this before they run into this problem and have to go diggingThere are three distinct things people may want from a type definition library
- Parsing. Popularized by "Parse, don't validate" paper, it's the primary thing people want to have for external IO. It's generally expected that parser doesn't modify the input value. It also has exact output shape which may be useful for js compiler optimisations
- Validation. This may be useful for branching behavior depending on the input
- Morphing. This operates on the input, mutating it to the expected shape. Useful for processing data in functional way(?), migrations, you name it.
Basically this seem to need three different compile techniques even if it has the same definition
I think we need an RFC to point peoples to
I would argue parsing is inseparable from validation, since validation is effectively validating that the input value conforms to some type specification (which parsing also does, no? it's just that the type specification may be "this is an email" instead of "this is an object of type string")
actually - this is sort of a philosophy question, related to the github issue i opened a couple days ago 😉 which is, should arktype commit to making sure the typescript type system remains valid on the arguments to arktype calls?
in that case, it's not about input data, and more about other arktype types. I am not sure if I am using it wrong, but it looks like
myType.pipe(otherType)
mutates otherType
such that the type of otherType
is no longer accurate in the scope after the call is madeIf it comes down to philosophy, I think mutation by default wins. It's a separation of concerns problem, and if your stated goal is to always guarantee the types from arbitrary inputs are correct, you're just trading for the scenarios where the output is a type that can't be accurately cloned
Certainly that should not happen
The reason I'd consider cloning by default is that I'm willing to be somewhat pragmatic if it's a problem too often
🤔
@ssalbdivad will there be a keyword for Parse? (as .allows for pure validate and .() for morph)
@ssalbdivad it 100% does occur, I am trying to duplicate without my business logic
but I have the call
const acuityVariables = acuityWebhookSchema.assert(dataObj);
IfI call type("parse.formData).pipe(acuityWebhookSchema)
anywhere in scope before making that unchanged former call, it no longer has the same runtime behaviorSo what is it then, just check the input and return a narrowed type?
ah yup, here we go, I made it happen
If the output is a union there was a bug that will be fixed in this release related to that (it would just fail all the time though)
but it doesn't!
if I remove that second line, it works, meaning that calling
.pipe(myType)
somehow mutated myType
such that it is now invalidDeconstruct-reconstruct clone
that's the concern, not the form data parsing not working - it's that calling
.pipe(myType)
mutates myType
which I'm not sure is intentional or expectedAs you need to deconstruct anyways and shape instatiation should be way faster then cloning I expect it to work quite well
It is definitely not expected, that is the union bug I was referring to
I already have 4 validation implementations for allows/apply JIT/dynamic
I'm not looking to add that in the immediate term
Hmm
Do you have perf-tests for those?
@ssalbdivad ah, I misunderstood. I thought you meant that the union wasn't expected to work even in the absence of
pipe
Yeah some. Yes obviously if I write an optimized parser that creates the shape of the object it will be faster in scenarios where the person needs to avoid mutating anything
That is so low ROI compared to a ton of other things I could do though
follow up: I'm a dummy, ignore me. I didn't realize that the error in my case was coming from the
.pipe
call itself.That's good, I knew that case was a problem but I wouldn't have expected caching issues (the only thing that would cause that kind of non-determinism)
it's an interesting problem! I like thinking about this sort of stuff. I am probably describing it poorly but I am intrigued by your approach - it looks like you are, for lack of a better term, distributing the morph operation over the union members? i.e. rather than a morph operation with a union output
(in: InType) => Out<{ foo: 'bar' | foo 'baz' }>
it is modeled internally as a union of morphs, (In: inType) => Out<{ foo: "bar" }> | (In: inType) => Out<{ foo: "baz" }>
Yes the structure is always Union=>Morph=>intersection=>props=>...
But really it should only be distributed based on the input
What you're seeing there is a bug with the last release
Hmm
Do you deprecate bugged releases?
Not explicitly and this isn't easy enough to repro that it's a catastrophe or anything, especially since it's a compile time failure
(or arktype compile time)
I mean generally, do you deprecate ones with obvious bugs?
No? I don't really know of that process for most libraries
You just publish a new release and put bug fix in the changelog
U go 2 npm and hit deprecate
Now all ppls with that version in lockfile will get screamed on by npm install so they get updated
The last version is still the best version there has ever been so I don't think it makes sense
Well this blew up. I understand the reasoning for mutation and I'm in agreement. It just needs to be documented. I've been using ArkType for a long time and I never considered it mutated the original object. If it was made clear, there wouldn't be an issue
just had a thought - what if there were another method on the returned type object, similar to
.assert()
, such as parseIdempotent
(or a better name, I'm bad at naming things)
instead of constructing a new type to pipe through a clone, this would clone, then parse. Plus, having this method exist in the docs would let people know that the default behavior without it mutates inputYeah this is a good idea I will add this as a method once the option is supported
Second this, "If you want 'Parse, don't validate' use T.parse"