How to check if createMemo should be used?
I'm trying to figure out if using createMemo is or isn't worth it.
I know that for super basic equality checks, there is probably no point.
But what about something slow, like code highlighting?
What I don't understand is what happens when my content changes. I think the whole function needs to be re-run, so the memo doesn't help.
Or does it still help?
Also, what about the case when this whole component is just an item in a <For> loop? I'm parsing the markdown into a lot of small tokens and rendering them in a <For> component, roughly like this:
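(A rough reconstruction of that kind of setup, not the original snippet; it uses marked's lexer and just renders each token's raw text.)

```tsx
import { For } from "solid-js";
import { lexer } from "marked";

// lexer() returns a brand new token array every time the markdown text changes.
function MarkdownView(props: { source: string }) {
  const tokens = () => lexer(props.source);
  return (
    <For each={tokens()}>
      {(token) => <pre>{token.raw}</pre>}
    </For>
  );
}
```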
So what is happening here exactly with Solid? I have a changing Markdown input -> a changing array of parsed tokens -> potentially a lot of possible reuse of the syntax highlighting work. Is createMemo helping here or not?
How can I check if it helps? By putting a console.log inside createMemo that should not trigger?
What I'm suspecting is that the new array coming out of my source function each time "throws out" all of Solid's optimisations in the For loop, including the memo.
In the markdown case, I am guessing that every time you change the input and the markdown produces the tokens, your <For> will re-render all the tokens, because there is no referential identity, so I don't think it will be an optimal update.
My second guess is that the memo will not help in this case. The only thing that could have helped is if you can somehow update your list of tokens in a way that you don't completely replace the whole token list.
Even then I am not sure how optimal that will be, but probably better than what you have now.
createMemo holds the computed value until a dependency changes.
So it only makes sense when you access the value (call the function) multiple times.
In your example you call it only once in the For component. So createMemo does not optimise there.
BTW, For does memo each item, so it supports granular updates.
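For example (a minimal sketch; `highlight` here is just a stand-in for any expensive computation):

```tsx
import { createSignal, createMemo } from "solid-js";

declare function highlight(code: string): string; // stand-in for expensive work

const [code] = createSignal("let x = 1");

// Plain derived function: highlight() runs again on every single read.
const plain = () => highlight(code());

// Memo: highlight() runs once per change of code(),
// no matter how many times memoed() is read.
const memoed = createMemo(() => highlight(code()));
```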
So createMemo doesn't use something like a global cache of input -> output, it uses a locally scoped cache, right? So basically, because my loop generates a new array each time, there is no way createMemo would remember what happened before, right?
Basically the solution is to create my own cache, which I manually reset, right?
Hm. I don't know exactly what you mean. But you can see if createMemo reruns if you place a simple console.log inside, so you can see if there's a difference between a regular function call and createMemo. Maybe you can move the logic of the tokens.map inside the For, so you don't recreate a new array on every change. Something like:
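(A guess at what was meant, since the snippet itself is missing; the per-token work is passed in as a hypothetical `render` prop here.)

```tsx
import { For } from "solid-js";
import type { Token } from "marked";

// Pass the token array to <For> as-is and do the per-token work inside the
// callback, instead of mapping the tokens into a new array beforehand.
function TokenList(props: { tokens: Token[]; render: (t: Token) => string }) {
  return (
    <For each={props.tokens}>
      {(token) => <div innerHTML={props.render(token)} />}
    </For>
  );
}
```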
I mean, I think what I need to do is make my own memo, independent of the component.
yes, a memo is not a cache. it's simply a computation that does (by default) a shallow equal check with the previous computation.
your cache idea will work, but it will still be slow with big text sizes
vscode-textmate has a tokenizeLine which allows you to pass a previous context to it, so that all the lines before that line do not have to be recalculated.
I was playing around with syntax highlighting a while back too; here you can see that tokenizeLine in action.
Thanks! I'm thinking that in a chat response there are, for example, multiple code snippets of at most 200 lines. The last one is constantly being rewritten / re-colorized, but all the ones before it should be cacheable.
yes, but you can't just highlight each line individually
you will need to pass the context of the previous lines to it
As for how to speed up the code block that is currently being extended, I have no idea. I think it needs a full refresh, or possibly some token-level optimisation in the highlighting library.
you are right, probably IDEs have to figure it out in a super optimised way
Ah, I misread this sentence. You mean caching the whole code block.
Yes, by saving the previous tokenized lines as a stack of tokens; that's how vscode-textmate does it.
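Very roughly, the pattern looks like this (a sketch, not the actual code from the example; `grammar` is assumed to be an IGrammar loaded through a vscode-textmate Registry):

```ts
import { INITIAL, type IGrammar } from "vscode-textmate";

// Tokenize all lines, threading each line's resulting ruleStack into the next
// call. If you keep `stacks` around, a later edit only has to re-tokenize from
// the first changed line onward, starting from stacks[changedLine - 1].
function tokenizeAll(grammar: IGrammar, lines: string[]) {
  let ruleStack = INITIAL;
  const stacks: (typeof INITIAL)[] = [];
  const tokensPerLine = lines.map((line) => {
    const result = grammar.tokenizeLine(line, ruleStack);
    ruleStack = result.ruleStack;
    stacks.push(ruleStack);
    return result.tokens;
  });
  return { tokensPerLine, stacks };
}
```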
You can get really optimized with that: https://pota.quack.uy/Reactivity/mutable-tests. This is using my tm-textarea under the hood (oof, editing of that file is not so great lol).
but you are right, for max 200 loc snippets it's a bit unnecessary
If there are no signals updated in the memo, it should not recalculate the memo either.
1. That type of caching assumes that your tokens are value objects, i.e. token equality is entirely based on the prop values of the token, not on its identity. For certain types of analysis, where a token appears in the list in relation to the tokens before or after it helps to establish its identity.
Now if you can guarantee that every token in your list will always be unique (i.e. tokens with the same properties cannot appear in multiple places within the same list) you should be OK.
However, if separate tokens with identical properties can appear in multiple places you'll likely confuse the hell out of the For, because you will replace the separate occurrences with the same identical reference. For doesn't expect references to appear more than once in the list; it needs them to be unique; that's how it tracks a DOM fragment as belonging to an item reference, so that it can move the fragment around when the item's position moves.
2. Whenever you generate a new token list, transfer the tokens you keep from the old Map to a new Map. That way you can discard removed nodes immediately and not have them hanging around, unnecessarily growing the cache.
3. createMemo can help you manage the “cache”.
I'll measure how much it takes to render and see if I need more optimisations.
So the marked markdown parser gives me a list of objects:
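(Approximate shape for illustration, reconstructed from how marked's lexer behaves, not the actual dump:)

```ts
import { lexer } from "marked";

const tokens = lexer("# Title\n\n---\n\nsome text\n\n---\n");
// Roughly:
// [
//   { type: "heading", raw: "# Title\n\n", depth: 1, text: "Title", tokens: [...] },
//   { type: "hr", raw: "---\n\n" },
//   { type: "paragraph", raw: "some text\n\n", text: "some text", tokens: [...] },
//   { type: "hr", raw: "---\n" },
// ]
// Note: the two "hr" tokens have identical properties but are distinct objects.
```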
Many of these will be identical, for example all the <hr> lines will be exactly the same.
So you mean that the Solid <For> loop tracking these items will break?
I mean it displays correctly, but I don't know anything about how optimal it is.
I mean, I suspect that on every single update the whole For output is being recreated, so maybe all the optimisations are just not doing anything?
So you mean that the Solid <For> loop tracking these items will break?
Right now each <hr> will have its own referential identity. Your proposed caching scheme would collapse that to one single referential identity. My prediction is that the For will have a pretty good chance of glitching out.
But the whole caching would be hidden inside the component. The For wouldn't see any difference, or would it?
For the optimization to work, you have to manage the referential identity of the items passed to the For. Right now each new list will simply drop all previous fragments and create fresh ones, as there is no overlap in referential identity of the items.
Yes, it'd be great to keep the old items. But what can I do? My input is an array from a 3rd party library. I can calculate a hash on each item, for example, but how do I tell Solid "not to destroy and rerender everything"?
How can I measure how much time it takes for Solid to render a component, though?
I can calculate a hash on each item.
Well, imagine this:
- Process the previous token list into a Map; key: the hash, value: an array of the tokens with that hash (in reverse order of appearance in the list).
- Once the new token list comes in, for each token:
  - generate the hash; if the hash is NOT in the Map, continue;
  - if the hash is in the Map, remove a token from the end of the array and replace the token in the new token list with that old token.
That way you stabilize the reference identity of items to reuse the DOM fragments.
And of course you can use a createMemo to transfer that “previous token list map” from one update to the next.
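A rough sketch of that whole scheme (not code from the thread; `hashToken` is a hypothetical function that hashes a token's properties):

```ts
import { createMemo, type Accessor } from "solid-js";

function stabilize<T>(tokens: Accessor<T[]>, hashToken: (t: T) => string) {
  // Map from hash -> tokens from the previous update that carried that hash.
  let prev = new Map<string, T[]>();

  return createMemo(() => {
    const next = new Map<string, T[]>();
    const stabilized = tokens().map((token) => {
      const hash = hashToken(token);
      // Reuse an old token with the same hash if one is still available,
      // so its referential identity (and its DOM fragment) survives the update.
      const kept = prev.get(hash)?.pop() ?? token;
      // Build the map for the *next* update while consuming the old one,
      // so each hash is computed only once per update.
      const pool = next.get(hash) ?? [];
      pool.push(kept);
      next.set(hash, pool);
      return kept;
    });
    prev = next; // whatever is left in the old map is dropped here
    return stabilized;
  });
}
```

A <For each={stabilized()}> can then reuse the DOM fragments of every token that kept its reference.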
So basically I should try to have a single reference array, which I modify with .push() and similar, instead of always taking a new array from my function?
you are saying that I don't even need to use anything more advanced, like solid-primitives/keyed?
If you are managing referential identity yourself, Solid will just do the rest. Also note that you can build the next Map for the next update as you consume the old Map, to minimize the number of times you run the hash.
instead of always taking a new array from my function
You are still getting that new array from your function. You are just running a rudimentary diff to determine which items to (referentially) keep from the last update. So sure, you are creating a new array of the "old items" where possible, only using a "new item" when there is no match in the "old items" (and in the process dropping old items that are no longer relevant).
I see. So basically I'm building my own reference tracking function, taking an immutable array and diffing it into a mutable array. But at this point I might as well write that hash into an id and use Keyed, shouldn't I?
https://primitives.solidjs.community/package/keyed
A hash isn't an ID.
Only because of the duplicates, right?
By definition an ID is expected to be unique.
then I can append an index to the id
The idea is that IDs are consistent between updates.
That would be the case, as long as only the end of the array changes, so I think it could work.
What I mean is that you would have to track the used index to create an ID for each hash individually to be "somewhat" consistent. Otherwise you would rarely have matches between updates, rendering the effort moot.
keyed applies in situations where model1 !== model2 but model1.id === model2.id, telling us that the model2 from the current update slots into where model1 was used on the last update.
For stores, reconcile accomplishes that.
I haven't looked deeply into it, but I thought I'd have to do the same anyway.
Also, wouldn't keyed work with duplicate IDs in the array? I mean there is no reason why it couldn't render the same element twice.
The ID establishes the strict relationship between the data and the DOM fragment that was rendered based on it (and more importantly the connecting, reactive props). There can't be duplicate IDs, otherwise the relationship is ambiguous.
That's why in React you have to provide a key prop. Solid uses the reference identity of the data instead; something that React cannot do because its data is immutable, so correlation of data over time is accomplished with key (a unique ID).
Keyed is closer to what React does and is useful with data coming from the server, as it isn't possible to have referential identity between updates from the server; so an ID of your choice is used to correlate the data instead.
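A minimal sketch of how that looks with the Key component from @solid-primitives/keyed (the `id` and `raw` fields are assumptions about the data shape):

```tsx
import { Key } from "@solid-primitives/keyed";

// Items with the same `id` keep their DOM fragment across updates even though
// the objects themselves are new references; `token` is an accessor, so the
// fragment's content can still update reactively.
function KeyedTokenList(props: { tokens: { id: string; raw: string }[] }) {
  return (
    <Key each={props.tokens} by={(t) => t.id}>
      {(token) => <pre>{token().raw}</pre>}
    </Key>
  );
}
```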
reconcile, on the other hand, is used to maintain referential identity inside a store while using the configured key to orchestrate the necessary updates to match the supplied data.
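And a minimal sketch of reconcile doing that inside a store (again, the "id" key is an assumption about the data):

```ts
import { createStore, reconcile } from "solid-js/store";

type Token = { id: string; raw: string };

const [state, setState] = createStore({ tokens: [] as Token[] });

function update(next: Token[]) {
  // Diffs `next` against the current tokens by "id" and patches the store in
  // place, so unchanged items keep their referential identity.
  setState("tokens", reconcile(next, { key: "id" }));
}
```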
Thank you for the explanation!