A
arktype•5mo ago
SynthLuvr

ArkType mutates original input?

I was always under the assumption that ArkType leaves the original input intact. However:
import { type } from "arktype";

const original = { value: "100" };
const Parser = type({ value: type("string").pipe(BigInt) });
Parser.assert(original);
console.log(original);
import { type } from "arktype";

const original = { value: "100" };
const Parser = type({ value: type("string").pipe(BigInt) });
Parser.assert(original);
console.log(original);
Outputs:
{ value: 100n }
{ value: 100n }
My assumption was that value would remain as string, and the result returned by Parser would be this value. But based on this, it seems like I was under the wrong assumption.
46 Replies
ssalbdivad
ssalbdivad•5mo ago
Yeah there are quite a few discussions on this. If you want to avoid it, you can pipe to the clone operator of your choice. At some point I'll add a built-in config option to clone before morphing by default. To me feels like a separation of concerns thing. Can be much faster without copying every object you validate, plus cloning can not always be done safely, so if it's going to happen best make it opt-in/semi-explicit. Open to feedback on this though, potentially making the default to clone before morphing
francis
francis•5mo ago
as a new user, my expectation coming from other validation libraries in the ecosystem would be that a validation process would leave the input unmodified. Arktype's examples show parsing of unknown input data from plain objects - is it intended to be used with e.g. complex classes that can't be cloned safely? My assumption would be that if you're dealing with a complex constructed class, you already know what type it is, since you have some sort of interface that generated it - as opposed to plain objects that might be deserialized json, for example I struggle to think of a situation in which you have an object that you can't safely clone, but that you also need arktype in order to get the type information in order to interact with it
ssalbdivad
ssalbdivad•5mo ago
Yes this is often the case, but ArkType still supports parsing cyclic data which makes cloning much more complex and requires more assumptions to implement on our end. It's also just much slower in many cases where most often you don't need the input once you've parsed it to the correct shape anyways
francis
francis•5mo ago
Hm. My only qualm is a) this isn't documented anywhere that I can find (which it really should be, this is not obvious behavior) and b) it ends up breaking the type system, since a value of type T being passed into an arktype type may no longer be of type T after the call is made
ssalbdivad
ssalbdivad•5mo ago
Documentation definitely needs to be better- working on morphs now!
francis
francis•5mo ago
if an arktype parse call mutates the original input that needs to be in 10 mile tall capital letters, seriously - this is a huge caveat!
ssalbdivad
ssalbdivad•5mo ago
That seems more like a weakness of TS in general to me. Within ArkType, the outputs are precisely inferred
francis
francis•5mo ago
It is. Unfortunately there is no way in typescript to assert the type of an input and return a value at the same time
ssalbdivad
ssalbdivad•5mo ago
Well I'll continue to listen to feedback on that. I'm sure it is unintuitive for people coming from most other validation libraries, but I want to make sure I that's not my primary motivation for the behavior I end up finalizing
francis
francis•5mo ago
what would be lovely is e.g. parse(x: unknown): T & asserts x is never so that it narrows the type such that you cannot use the input anymore, since it's not safe but it's not possible to narrow an input variable type and return a value at the same time I think the behavior is fine, if it is documented, and not in a "you didn't read the fine print" way but a "you need to know this for basic usage" way. I'd advoate to put that in the very first example (I really appreciate this library! please don't take this as too harsh criticism 🙂 )
ssalbdivad
ssalbdivad•5mo ago
Once there is a global config available for this I think it will mitigate it quite a lot. I am curious to hear more of the concrete cases where things have actually broken as a result of mutating the input-not just "I didn't expect this". I've heard a few cases, but I'd guess the vast majority of the time once you parse the raw input, you're done with it Haha I don't take it as harsh criticism there are a lot of trade offs here and it's good to hear peoples' intuitions about them
francis
francis•5mo ago
the only case I can think of is when your function is handed something that you shouldn't modify i.e. someone else, for efficiency, handed you a reference to some internal data structure marked as readonly rather than cloning to return it, and then you pass it to arktype and break stuff AHHHH this actually explains so much!
ssalbdivad
ssalbdivad•5mo ago
But something very similar applies here to your comment about prototypes etc.- that if it's internal data it's probably already been valdiated
francis
francis•5mo ago
I was just writing tests for a utility function I have that uses arktype and my test cases defined as [ some array] as const wouldn't work, because arktype doesn't accept readonly arrays or const objects as an argument. This is why! now it makes sense! I do wish there were explicit typescript syntax for 'this function marks its input as unusable'
ssalbdivad
ssalbdivad•5mo ago
I'm not sure about this, it accepts data of type unknown- there's no way for now in TS to say "mutable or primitive"
francis
francis•5mo ago
one sec - let me change my test case again ah, never mind. it's the arktype output that is not marked as readonly. my mistake I was assigning a test case to typeof myType.infer and that was the issue. anyway, I understand why you implemented the way you did, I think we need to document this somewhere where people are guaranteed to read this before they run into this problem and have to go digging
Dimava
Dimava•5mo ago
There are three distinct things people may want from a type definition library - Parsing. Popularized by "Parse, don't validate" paper, it's the primary thing people want to have for external IO. It's generally expected that parser doesn't modify the input value. It also has exact output shape which may be useful for js compiler optimisations
function parse(o) {
let { s, n, d, rest } = o;
if (typeof a !== 'string' || ...) throw 'bad data';
let d1 = Number(n); if (d1 !== d1) throw 'd bad';
(Object.values(rest).length !== 0) throw 'bad rest';
return { s, n, d: d1 }
}
function parse(o) {
let { s, n, d, rest } = o;
if (typeof a !== 'string' || ...) throw 'bad data';
let d1 = Number(n); if (d1 !== d1) throw 'd bad';
(Object.values(rest).length !== 0) throw 'bad rest';
return { s, n, d: d1 }
}
- Validation. This may be useful for branching behavior depending on the input - Morphing. This operates on the input, mutating it to the expected shape. Useful for processing data in functional way(?), migrations, you name it.
function morph(o) {
}
function morph(o) {
}
Basically this seem to need three different compile techniques even if it has the same definition I think we need an RFC to point peoples to
francis
francis•5mo ago
I would argue parsing is inseparable from validation, since validation is effectively validating that the input value conforms to some type specification (which parsing also does, no? it's just that the type specification may be "this is an email" instead of "this is an object of type string") actually - this is sort of a philosophy question, related to the github issue i opened a couple days ago 😉 which is, should arktype commit to making sure the typescript type system remains valid on the arguments to arktype calls? in that case, it's not about input data, and more about other arktype types. I am not sure if I am using it wrong, but it looks like myType.pipe(otherType) mutates otherType such that the type of otherType is no longer accurate in the scope after the call is made
ssalbdivad
ssalbdivad•5mo ago
If it comes down to philosophy, I think mutation by default wins. It's a separation of concerns problem, and if your stated goal is to always guarantee the types from arbitrary inputs are correct, you're just trading for the scenarios where the output is a type that can't be accurately cloned Certainly that should not happen The reason I'd consider cloning by default is that I'm willing to be somewhat pragmatic if it's a problem too often
Dimava
Dimava•5mo ago
🤔 @ssalbdivad will there be a keyword for Parse? (as .allows for pure validate and .() for morph)
francis
francis•5mo ago
@ssalbdivad it 100% does occur, I am trying to duplicate without my business logic but I have the call const acuityVariables = acuityWebhookSchema.assert(dataObj); IfI call type("parse.formData).pipe(acuityWebhookSchema) anywhere in scope before making that unchanged former call, it no longer has the same runtime behavior
ssalbdivad
ssalbdivad•5mo ago
So what is it then, just check the input and return a narrowed type?
francis
francis•5mo ago
ah yup, here we go, I made it happen
ssalbdivad
ssalbdivad•5mo ago
If the output is a union there was a bug that will be fixed in this release related to that (it would just fail all the time though)
francis
francis•5mo ago
but it doesn't!
const myType = type({ foo: "'bar'" }).or({ foo: "'baz'" });
type("parse.formData").pipe(myType);
myType({ foo: 'bar' })
const myType = type({ foo: "'bar'" }).or({ foo: "'baz'" });
type("parse.formData").pipe(myType);
myType({ foo: 'bar' })
if I remove that second line, it works, meaning that calling .pipe(myType) somehow mutated myType such that it is now invalid
Dimava
Dimava•5mo ago
Deconstruct-reconstruct clone
francis
francis•5mo ago
that's the concern, not the form data parsing not working - it's that calling .pipe(myType) mutates myType which I'm not sure is intentional or expected
Dimava
Dimava•5mo ago
As you need to deconstruct anyways and shape instatiation should be way faster then cloning I expect it to work quite well
ssalbdivad
ssalbdivad•5mo ago
It is definitely not expected, that is the union bug I was referring to I already have 4 validation implementations for allows/apply JIT/dynamic I'm not looking to add that in the immediate term
Dimava
Dimava•5mo ago
Hmm Do you have perf-tests for those?
francis
francis•5mo ago
@ssalbdivad ah, I misunderstood. I thought you meant that the union wasn't expected to work even in the absence of pipe
ssalbdivad
ssalbdivad•5mo ago
Yeah some. Yes obviously if I write an optimized parser that creates the shape of the object it will be faster in scenarios where the person needs to avoid mutating anything That is so low ROI compared to a ton of other things I could do though
francis
francis•5mo ago
follow up: I'm a dummy, ignore me. I didn't realize that the error in my case was coming from the .pipe call itself.
ssalbdivad
ssalbdivad•5mo ago
That's good, I knew that case was a problem but I wouldn't have expected caching issues (the only thing that would cause that kind of non-determinism)
francis
francis•5mo ago
it's an interesting problem! I like thinking about this sort of stuff. I am probably describing it poorly but I am intrigued by your approach - it looks like you are, for lack of a better term, distributing the morph operation over the union members? i.e. rather than a morph operation with a union output (in: InType) => Out<{ foo: 'bar' | foo 'baz' }> it is modeled internally as a union of morphs, (In: inType) => Out<{ foo: "bar" }> | (In: inType) => Out<{ foo: "baz" }>
ssalbdivad
ssalbdivad•5mo ago
Yes the structure is always Union=>Morph=>intersection=>props=>... But really it should only be distributed based on the input What you're seeing there is a bug with the last release
Dimava
Dimava•5mo ago
Hmm Do you deprecate bugged releases?
ssalbdivad
ssalbdivad•5mo ago
Not explicitly and this isn't easy enough to repro that it's a catastrophe or anything, especially since it's a compile time failure (or arktype compile time)
Dimava
Dimava•5mo ago
I mean generally, do you deprecate ones with obvious bugs?
ssalbdivad
ssalbdivad•5mo ago
No? I don't really know of that process for most libraries You just publish a new release and put bug fix in the changelog
Dimava
Dimava•5mo ago
U go 2 npm and hit deprecate Now all ppls with that version in lockfile will get screamed on by npm install so they get updated
ssalbdivad
ssalbdivad•5mo ago
The last version is still the best version there has ever been so I don't think it makes sense
SynthLuvr
SynthLuvrOP•5mo ago
Well this blew up. I understand the reasoning for mutation and I'm in agreement. It just needs to be documented. I've been using ArkType for a long time and I never considered it mutated the original object. If it was made clear, there wouldn't be an issue
francis
francis•5mo ago
just had a thought - what if there were another method on the returned type object, similar to .assert(), such as parseIdempotent (or a better name, I'm bad at naming things) instead of constructing a new type to pipe through a clone, this would clone, then parse. Plus, having this method exist in the docs would let people know that the default behavior without it mutates input
ssalbdivad
ssalbdivad•5mo ago
Yeah this is a good idea I will add this as a method once the option is supported
Dimava
Dimava•5mo ago
Second this, "If you want 'Parse, don't validate' use T.parse"
Want results from more Discord servers?
Add your server