X
Xata7mo ago
AD Amorim

what is the ideal column type to store the content of a Tiptap editor?

I’m building page the saves the whole document provided by the rich text editor using tiptap. I was wondering if the ideal column type would be text or json. Not sure about the the pros and cons of both when considered things like performance, amount of data it can store, compatibility with other databases like SQLite etc. My current requirement is just to query the whole document at page load and update it when the user changes the content with some debouncing. Appreciate any suggestions!
8 Replies
Oreki
Oreki7mo ago
my idea would be store the AST from tiptap directly into JSON column and load that into tiptap upon startup, xata has limits on how much can you store in text and string columns so you're more likely to hit them on long documents, plus tiptap has this in beta: https://tiptap.dev/docs/collaboration/documents/conversion convert to/from tiptap's AST to markdown and other formats
Document conversion API | Tiptap Collaboration
Use Tiptap to convert documents from docx, odt or markdown to Tiptap
50BytesOfJohn
50BytesOfJohn7mo ago
Hey, I'm actually using xata and tiptap in one project. I was also wondering about the approach and reading about this for a while, as it's kinda not obvious and it also depends on use case and what features do you need. I'm using basic editor stuff, like simple formatting and code blocks. After user edits etc. I'm sending html, and processing it with rehype pipeline, some sanitization, code highlighting etc. and then saving the produced html to text column. When the user wants to edit document, i'm loading this html to tiptap. For now I haven't seen any issues, tiptap understands the formatted html without issues, performance is okey and everything is cool. I belive that the problem may be when you need to start using some very specific or complicated stuff, like some custom nodes, comments, collaboration. Then I think it's better to keep both HTML and JSON, where html can be used to speed up the rendering (no need to always convert JSON to HTML), and JSON can be used when user want's to edit the document. I found this approach to be used by many people while researching this topic. When it comes to the column type for JSON, I'm not so experienced, but if you won't be filtering db by the fields from that JSON, I think you can use text column for stringified JSON as well. Some better postgres expert can say if there's any difference :D. Also, as @Oreki mentioned, keep in mind field limits if a document is huge.
AD Amorim
AD AmorimOP7mo ago
Hi @Oreki & @50BytesOfJohn thank you so much for your inputs. Really appreciate! Based on that and my initial tests, I'll start using a json column. The page component in this project has a preset that takes 120KB - which is too close to the limit xata imposed to text columns (200KB). For now, I might be fine with a json column without worrying too much about it. However, I also see that xata has a Record size limit of 600 KB. So, in the near future I might need to break the page content into smaller chunks to access the whole content.
50BytesOfJohn
50BytesOfJohn7mo ago
@AD Amorim not sure about the performance of such approach but in case of bigger documents maybe you could store it as files in some storage like AWS S3, or maybe even xata file storage and just keep the reference to the file in database.
Oreki
Oreki7mo ago
Page component? Why'd you store Page component in the database? Idk if I got this but unless it's actually very big document that you're putting in database I don't think you'll even be closer to reaching the 600kb limit, I currently have a 500~ lines post using about 25~ kbs
AD Amorim
AD AmorimOP7mo ago
Hi guys thanks again for these additional input. Appreciate! Yeah… I’m building a kind of google doc with one single editor to render the whole page content. And we have a couple of custom nodes that generate more data than the basic rich text thing. But I did the math again and my initial template for each page in JSON is actually 95KB… I’ve decided for avoiding over engineering the solution for know. Since the record size limit is 600kb, I’ll have just another column inside the table to track the json size and see how close the pages are getting to 600kb. If they became near 500kb I plan to split the content in another linked table since each item inside tiptap content is also a json object that can be considered as a section of the page. I had initially started doing this but it is a bit tricky to map the sections inside tiptap content to their xata id if my users have the ability of reordering the sections as they please. It seems I would need to create a plugin to ensure tiptap keeps the xata id of each section available for me to use… And also deal with deleting operations for a particular sections. Lots of fun but I don’t have the time to invest in it now. If we get near 500kb I’ll have data to show my boss we need to invest in it. Once again thank you all for sharing your thoughts and suggestions!
kostas
kostas7mo ago
Just to clarify, these documented record size limitations apply to the REST API and to columns that are created over the UI/CLI. Columns created with DDL statements via wire protocol are not subject to these limitations, instead there is a catch-all limit behind the scenes of 1MB total (column or document size) for the Free and Pro plan, mostly to keep resource utilization in check on the Search components (free text search on large documents can become resource intensive) and to make scaling predictable on our Shared infrastructure. If you are concerned about future growth, with Dedicated Instances (which start from roughly 50usd/month) there are no limits imposed over wire protocol for the SQL database part.
AD Amorim
AD AmorimOP7mo ago
That’s great to know we can grow beyond 600kb in that way 🙏

Did you find this page helpful?