so, somehow i ended up reading a long back-and-forth from december in this channel about client/server io protocols, which reminded me: i've been bouncing around the idea for some time that it might be beneficial from a compatibility and design perspective to implement (or modify) the kitty keyboard protocol and/or the (in-progress) ghostty terminal api for stardust. in addition to the fact that these protocols seem likely to become de facto standards, it might be both a good way for people to feel at ease in the stardust environment and a launchpad for extending the functionality of the client landscape as-is.
Also, in case it's not obvious, i'm not suggesting a literal keyboard layout but putting some creativity behind how to implement these protocols in order to interact with existing software more thoroughly.
What do y'all think? I think it would def require some careful thought to get right (and would likely require some flexibility regarding implementation for different contexts/clients), but i like the constraints/direction it would put on what we're trying to do here. Thoughts? (btw Nova i just caught your talk from a few years back in the pinned posts... was excellent to hear the vision laid out so clearly)
PS i like the fact that the kitty protocol is progressive, so we can start with baby steps if it's a good idea. also, i think i read that you already have some of those lower-level keybindings implemented?
i don't quite understand, what is the kitty protocol... for?
also stardust doesn't work like a terminal
it's a protocol for an input device to talk to a server using established escape codes/patterns that have established semantics
for example, in that long back-and-forth i think there was some debate on how we might know a client is 'listening' for input; the protocol aims to take care of that. it should also standardize some "keybindings" that folks are used to, like ctrl + some key mapping to some action
i'm curious to hear what you mean by "stardust doesn't work like a terminal". this is why i bring it up, cuz it's entirely possible this abstraction completely breaks down at a lower level and we'd just end up square-peg-round-holing it
> Nova — 12/9/24, 3:16 PM: anyway, i'm debating what to make the interface... do i make it have a separate set_keymap method from key that tells a key press/release

this is the type of problem these protocols are designed to address, with the bonus of being battle tested and compatible with some low level processes widely in use. I know one of the stated goals of ghostty is to have a "headless" terminal that you can slap a UI on as you please, and it would handle most of the communication headaches
but that's 2D input in a CLI
stardust is 2D input given to specific clients
also i am overwhelmed with stuff to do today so will have to think about this more later
plz dump your info here and i will look at it when i can
as far as i can tell, the kitty protocol and the protocol for stardust have completely different goals. the kitty one is for terminal apps and communicates using stdin/stdout; stardust doesn't use stdin/stdout at all. the keyboard (and in the future mouse) stuff in stardust needs to be spatial, and stardust also doesn't have a terminal built into the server. i don't see a reason why stardust should use this instead of letting a 2d terminal emulator implement that on top of wayland, which in stardust is implemented on top of the stardust keyboard protocol
what is the stardust keyboard protocol?
a dbus interface for sending keypresses to other clients
yes, that was what i was reading about. Are we confident in that implementation? Does it map well to existing UIs?
genuine questions
it literally forwards raw key events over
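(for anyone following along, a rough sketch of what a raw key-forwarding dbus object could look like, using zbus 4.x with the tokio feature; the interface name, object path, and method shape here are all invented for illustration, not the actual stardust interface:)

```rust
// Hypothetical sketch only: a dbus object that receives raw key events,
// in the spirit of the key-forwarding described above.
use zbus::{connection, interface};

struct KeyboardHandler;

#[interface(name = "org.example.KeyboardHandler")]
impl KeyboardHandler {
    /// One raw key event: evdev keycode plus press/release state.
    fn key(&self, keycode: u32, pressed: bool) {
        println!(
            "keycode {keycode} {}",
            if pressed { "pressed" } else { "released" }
        );
    }
}

#[tokio::main]
async fn main() -> zbus::Result<()> {
    // Expose the handler on the session bus; a sender client (the one that
    // owns the hardware keyboard) would call `key` once per event.
    let _conn = connection::Builder::session()?
        .name("org.example.Keyboard")?
        .serve_at("/org/example/keyboard", KeyboardHandler)?
        .build()
        .await?;
    std::future::pending::<()>().await;
    Ok(())
}
```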
implementing the kitty protocol wouldn't gain us any client support, because you can't directly run cli apps on top of stardust; you would run them inside something like kitty or alacritty, which implement the protocol on top of wayland
the alternative would be sending keycodes and keysyms separately but the STUPID XKB SYSTEM does not like that
i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb i hate xkb
they forced you to use it in wayland to get any keysym data out and it's a hot garbage mess
well, again, ghostty is aiming to be a headless API
but afaik i cannot plug that into wayland
that requires raw keycodes and a keymap to send to apps
then the apps use that keycode and parse it with the keymap to get out the keysyms
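(concretely, that flow looks something like this with the xkbcommon crate (~0.8); a sketch, assuming the keymap string is whatever the compositor handed over:)

```rust
// Sketch of the flow described above: the app receives a keymap plus raw
// keycodes, and runs each keycode through xkb to get keysyms out.
use xkbcommon::xkb;

fn keysym_for(keymap_str: &str, keycode: u32) -> Option<xkb::Keysym> {
    let context = xkb::Context::new(xkb::CONTEXT_NO_FLAGS);
    // The keymap arrives as one big string (e.g. over the wayland keymap fd).
    let keymap = xkb::Keymap::new_from_string(
        &context,
        keymap_str.to_owned(),
        xkb::KEYMAP_FORMAT_TEXT_V1,
        xkb::KEYMAP_COMPILE_NO_FLAGS,
    )?;
    let state = xkb::State::new(&keymap);
    // Wayland keycodes are evdev keycodes offset by 8.
    Some(state.key_get_one_sym((keycode + 8).into()))
}
```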
why would we "have" to run them inside a terminal emulator? can't that just be the job of stardust clients?
what does that even mean? stdin/stdout (which as far as i can tell is what they use) is already headless. or do you mean background processes like what zellij/tmux/screen do?
stdin isn't the thing that worries me
it's hardcoding common cli key shortcuts into 3D clients
and also wayland clients too
well, if we had a client that runs cli stuff using those new apis and displays the text output, that would be a terminal emulator, just not one on top of wayland but on top of the stardust protocol
i'm honestly not really thinking about this as a text interface. the whole point of escape codes is to communicate non-textual keypresses between client and server, and ultimately i'm interested in things such as keybindings that can aid in making the new frontier of XR UI a little more intuitive/friendly/familiar
most people don't use cli tho
ok, but everyone knows what ctrl+c does
no
(ironically... unless you're in a terminal)

as an optional thing, sure
lol my point is that the utility of keybindings is not restricted to text interfaces
also those escape codes are used to escape from textual representation, i just don't see any point in using the protocol instead of doing it custom
sure but they should be implemented on the interface
not on the protocol
can you explain that?
or, i know you said you were busy lol so maybe that's something i can chew on
well think about it, if I have a 3D git client how do half those commands make sense
can you really extend those keybindings to every 3D object ever and make them compatible with wayland?
like i said totally willing to concede this is not the right tool for the job, I just think it could provide some battle tested solutions to old problems
it just doesn't make sense
i mean if you really want global shortcuts (and i and nova both can't really think of good uses) you would implement https://github.com/flatpak/xdg-desktop-portal/pull/711
i dont understand what git has to do with keyboard protocols
a git GUI
in 3D
i think schmarni gets it so i'll let them explain
like again, this is a good idea for stuff that knows how to use it but it's a choice on the client dev
okay, think about it this way: what would a 3d client gain from this protocol being implemented at the core stardust level? how would you generically map specific key combinations in 3d space? remember, stardust doesn't (and shouldn't) need a physical keyboard. if it's about having some generic actions that can be mapped to key inputs, sure, but i feel like they should be mapped by the client that brings the keyboard input into 3d, or a client that provides some virtual keyboard
yeah, i mean that's more or less what i'm trying to suss out here, I think there is a benefit to thinking about the way that the effective aspects of 2D ui translate into 3D problems. not saying "lets implement this protocol exactly" more like "let's use these solutions to help us think about established solutions to common problems when we interact with software."
attached are a few examples from the ghostty protocol, and it's worth going over the whole list if you wanna go through the thought experiment. say we have an arrangement of objects in 3D space, like a git tree in this hypothetical git UI. do we have a simple, universally understood way to iterate between objects? to scroll through them? to multiselect? have we even thought about the concept of a cursor that isn't just wherever one of your fingers/controllers is pointing?
Some of these actions are a stretch, for sure, but i think we could get a lot of mileage just going through stuff and seeing what sticks, what can be thrown out.

that i will agree to
but can't just directly apply it
gotta reinterpret it and adapt the intent to the new medium
100
the key aspect here is that a client should be able to know "these bytes" iterate the cursor to the "next" object. and yeah, what that means totally depends on the client, but that's the way it already is
no
disagree
fun fact: the server -> client input relation breaks in the case of stardust. only input methods (things like xr hands and xr controllers) are handled through the server; non-spatial input like keyboard and mouse is sent in an inter-client way. one client like non-spatial-input/simular gets the input from the hardware and then sends it to other clients using a dbus based protocol. each object that wants to receive keyboard input is exposed through dbus, together with a spatial and some field, and the input spatializer (in this case simular) gets to select the objects it sends input to in any way it wants

that's too close to implementation in the old medium, we need to keep the intent and remake the interface to what makes sense
because that way the server only worries about spatial stuff
separation of concerns
that's good info and gives me a bit to chew on
can someone link to the dbus implementation?
uhhh we haven't exactly centralized it yet
:blobcatgoogly:
actually @Schmarni i want to fix the keyboard impl and just YEEET keymaps to the curb
one thing i definitely agree on is i never want to touch a keyboard in vr lol
so like, i want to develop a keymap that maps keycodes to keysyms with the same exact code
for compatibility
then copy windows and send keycode and keysym over the protocol
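(i.e. an event shaped roughly like this on the wire; just a sketch, field names and serialization invented for illustration:)

```rust
// Sketch of the "send both" idea: every event carries the raw keycode AND
// the keysym already resolved through the sender's keymap, so receiving
// clients never have to touch xkb themselves.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct KeyEvent {
    /// Raw evdev keycode, for clients that care about physical key position.
    keycode: u32,
    /// Pre-resolved keysym, for clients that just want "which key/character".
    keysym: u32,
    pressed: bool,
}
```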
molecules no?
oh right yea
:blobcatgoogly:
what confuses me about this response is that i'm 100% suggesting using xr hands and controllers to implement this hypothetical protocol
ultimately the keyboard protocol is not about keyboards but about sending bytes that signify common actions
ESCAPE is just a label
this is even more confusing
wait a moment you're just talking about the openxr input aren't you
@Schmarni does this sound like input actions to you?
the problem with those is that they flatten intent, they do not make it contextual
ok yeah this is pretty close to what i have in mind https://docs.unity3d.com/Packages/[email protected]/manual/ActionBindings.html
kinda? this would have a tiny spatial aspect tho because of keyboard handlers, so about as spatial as the current one ig? also it maps closer to concepts than buttons? ig think the datamap for input methods?
at least i am working under the expectation that we would keep the keyboard handlers
this sounds good in theory, but like, every time you try to do it in practice you end up restricting the types of apps that can be built
oh no no no no no no, like i don't hate the OpenXR actions system, but for stardust? NO
it globalizes the set of actions
whereas stardust, given there are tons of devs independently making stuff, localizes it
this is why i like protocols
implement all the protocols you want, bind them to interfaces on either end
(to add onto this) like how do you select an object that receives input? would it just be global?
provide as much context as you can
in stardust you just switch which handler you send input to, because the keyboard (or in non-spatial-input, the spatializer) chooses
yeah, but not if you do OpenXR style global actions
yeah
if you want a standard set of actions, add it to the toolkit and make it trivial to add where relevant
just like many UX conventions are
what do you mean by openXR style global actions, and how is what i'm suggesting restrictive?
give me a practical example of this please
but in current stardust stuff yeah, tho the only global things in stardust i am aware of are a SkyBox and SkyLighting, both of which i want to remove soon and replace with spatial alternatives
from the high level action the user wants to do
and then how you see it implemented
it'll help me understand and critique better
ok, let's stick with the git UI example** and say our main goal is just to navigate and inspect the DAG, maybe perform a merge by dragging and dropping a node onto another branch, etc. and let's say the DAG is represented with a simple ball and stick model. we want the designer of the client to be able to map user actions to actions on the git tree. having a protocol that implements established byte sequences can make it easier for the designer to implement common functionality, like "jumping to the beginning of the tree" (such as what the "home" button might do) or "skipping 5 nodes at a time" (such as what tab might do). IMO these aren't implementation specific or restrictive (the actual actions can be whatever we want, but are represented consistently on the wire!), just a client-friendly way to not need to reinvent the wheel on some things, some of the most common things that people will want to do, at that.
**this example might very well break down with the way your client/server model is implemented, which i'm very much naive about and trying to learn more about. the focus here is demonstrating what an established, published so-called "VT" protocol can bring to the table, rather than the specifics of client/server interaction and the textual representations of the bytes
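(to make the git example concrete, this is the kind of mapping i mean; everything here is invented for illustration, not part of any stardust or kitty API:)

```rust
// Hypothetical only: well-known key semantics resolve to whatever the client
// decides they mean for its git DAG. The protocol fixes the byte sequences;
// the client fixes the meaning.
enum DagAction {
    JumpToRoot,     // "home"-style: back to the first commit
    SkipNodes(u32), // "tab"-style: hop several nodes at once
    Select,         // "enter"-style: pick the focused node
}

fn action_for(key_name: &str) -> Option<DagAction> {
    match key_name {
        "Home" => Some(DagAction::JumpToRoot),
        "Tab" => Some(DagAction::SkipNodes(5)),
        "Enter" => Some(DagAction::Select),
        _ => None, // anything else stays fully client-specific
    }
}
```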
hmmm, so the first thing i see is, why wouldn't you just do direct interaction with the DAG? also how would you map that onto something like hands? ideally without having to use a big keyboard style object
like this DAG would be represented as multiple connected nodes, why treat it as one object/concept?
just make a scrollbar and make the root commit the bottom
then make it have momentum
user just grabs scrollbar and flicks up fast and voom, they're there
in a way that's what i want to figure out! like, how can we take principles that we've learned about interacting with our computers and apply them in XR in a way that actually makes intuitive sense. there are 10 buttons at our disposal, 10 fingers if we're going controllerless. our arms have about 6 additional degrees of freedom we can tap into as well. can we design a system that provides some coherence to the way our input components interact to do the things that we all know how to do in 2D, or at least provide some way for client designers to make these customizable?
that's not really how that works in xr
fingers aren't float values in practice to a designer
not really my point
it is infinitely customizable
you get raw input in clients for hands and controllers and so on relative to the input handler's field
> how can we take principles that we've learned about interacting with our computers and apply them in XR

i think that this is completely the wrong question to ask; rather we should ask how we can apply what we have learned irl to how we interact with computers using xr
yes and i'm saying "infinitely customizable" is not inherently a good thing especially when it comes to design and adoption
but it's infinitely customizable while local
as the base layer "infinitely customizable" is the correct amount of customizability as long as you can avoid conflicts, which we can (using the SUIS)
the protocol determines the most customizable it can be
barring straight-up exploits
i feel like this glosses over the fact that we are inherently working with software. if i'm designing a new Photoshop or whatever in stardust, it would be helpful both to me as a designer and to my potential customers that they have at least some idea how to perform certain actions
as a more high-level designer you would probably use some premade elements and glue them together; the elements are the correct level to "restrict" things
i wanna learn more about the SUIS too, on my to-do list
i really need a good explainer on this stuff
but i suck at writing things well-formatted
tho i can do cool graphs
but i'd need to show graphs over time and blrgh
i feel like using better abstractions like a pen or brush would make more sense than trying to do things like a keyboard would, having to learn esoteric gestures is not more intuitive than picking up a brush, or a select tool where you grab 2 corners and place and scale it that way
so yeah this isn't far from my perspective i think, i think we might just differ in providing a more opinionated framework vs a more open field approach
we can do soft standards by choosing what's in the toolkit
yes that's exactly what i'm saying and have tried to repeat several times, this isn't about keyboards or text
make that the standard by convenience
text is just bytes
yes but like, to a designer that doesn't matter
this is a UX design thing, not a technical one
xr is a completely new medium, it being software is just an implementation detail
trust me, you are preaching to the choir. but i think it being a new medium all the more necessitates providing a framework that brings out intuitive flow instead of impeding it
i dont want to be too restrictive either, i'm honestly trying to open things up because vr is so, so restrictive right now
my question is, why the kitty keyboard protocol
lol
And i think these common actions/shortcuts can provide a bridge for people
yes but (as far as i can tell) you are saying we should use this kitty protocol at the low stardust protocol level, implementing something like it on a framework level makes sense, even exposing it in a more generic way through dbus might make sense, but the base protocol is the wrong layer for this
for the record, i support making d-bus interfaces for every widget
like, dials should expose programmatic control from d-bus and all
where i see the kitty protocol potentially helping is with things like having a systematic way to delineate from different types of input
what do you mean by this, give a concrete example?
like they make a distinction between normal input, escape input, event input, etc in a way that has become industry standard for good reasons
only some tho probably (oh no, do we need to implement components in asteroids through dbus?)
no i mean just hooking up dbus into asteroids elements
like, make the elements expose a d-bus object to control
anyway this isn't relevant
could you give me even more examples of what you want to do @schlich?
from a user perspective
an industry standard for text input based cli apps, sure, but that is not even close to what stardust is, or how pretty much anything using stardust works
that's not really a helpful distinction for me, sorry
here's an excerpt from the kitty protocol:
> If you are an application or library developer just interested in using this protocol to make keyboard handling simpler and more robust in your application, without too many changes, do the following: Emit the escape code CSI > 1 u at application startup if using the main screen, or when entering alternate screen mode if using the alternate screen. All key events will now be sent in only a few forms to your application, that are easy to parse unambiguously. Emit the escape sequence CSI < u at application exit if using the main screen, or just before leaving alternate screen mode if using the alternate screen, to restore whatever the keyboard mode was before step 1. Key events will all be delivered to your application either as plain UTF-8 text, or using the following escape codes, for those keys that do not produce text (CSI is the bytes 0x1b 0x5b).

try to forget everything having to do with UTF and keyboards, the protocol is about input processing in a way that disambiguates different implementations from clients
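(the opt-in is literally just two byte sequences; a minimal sketch of exactly what the excerpt describes:)

```rust
// Minimal sketch of the opt-in from the kitty docs quoted above.
// CSI is the bytes 0x1b 0x5b; "CSI > 1 u" pushes the "disambiguate escape
// codes" enhancement, "CSI < u" pops back to whatever mode was active before.
use std::io::{self, Write};

fn main() -> io::Result<()> {
    let mut out = io::stdout();
    out.write_all(b"\x1b[>1u")?; // enter the enhanced keyboard mode
    out.flush()?;

    // ... key events now arrive either as plain UTF-8 text or as
    // unambiguous "CSI <code> u" escape sequences on stdin ...

    out.write_all(b"\x1b[<u")?; // restore the previous keyboard mode on exit
    out.flush()
}
```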
just make more protocols over d-bus or swap out UI controls
does the same thing
if i get you correctly
i don't think it makes sense to use one "event stream" for things like selecting stuff in stardust. instead you would want to tag spatial things as selectable; then other clients (or your own one of course) can select things, and your client can decide how to handle that select request: should it select that object alongside others, or should it select the object and unselect the other ones?
i don't think it's necessarily a good idea to treat one client as one continuous object/thing that you control. trying to push everything through one api surface doesn't make sense and adds unneeded complexity
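(roughly something like this instead of an event stream; trait and method names invented for illustration:)

```rust
// Sketch of the "tag things selectable" alternative: each selectable object
// handles select *requests* itself instead of consuming one global stream.
trait Selectable {
    /// Another client asked to select this object; the owning client decides
    /// what that means (exclusive select, additive multi-select, ignore, ...).
    fn on_select_request(&mut self, additive: bool);
}
```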
yeah, sure, i just want to like, actually do the thing in a way that will (hopefully) free us up to make more interesting apps
it's closer to compare stardust hands to touchscreen input and xr controller to wiimote
even though it can do both seamlessly and do more
seriously tho, this would help a ton
i need end user cases
then i can figure out the best way to do that in an xr friendly manner
like this, and don't include specifics of keyboards or such
word, i mean i brought up the photoshop thing but for real the whole reason i'm here is to try and develop some kick ass creativity suites, and i just don't think that's possible without some kind of systematic way to interpret user input
if i wanted to make Ableton in XR how would we replicate that "pro user" flow
in a way thats accessible and intuitive to noobs
ableton live?
yeah
ok idk that interface well
can you explain it a bit more or give me another one that i might know?
i mean, substitute your favorite creative suite. How do we bridge power and functionality with control, simplification and ease of use
is blender a good comparison?
obviously we want to make 3D skeuomorphisms or whatever, but we still need an intricate way to, like, access and switch between objects and actions
yes
ohhhh yes ok so
switching tools
yeah, and patterned output
[YouTube embed: "Introducing Google Blocks—Available Now on Rift!" (Google Blocks lets you build 3D models in virtual space; gallery at http://vr.google.com/objects)]
how do i ctrl+c, ctrl+d
etc
what do you need ctrl+c for?
there's no modes to exit from
ctrl+c is just a stand in for a pattern of input that reliably maps to an action
[YouTube embed: "Open Blocks Trailer" (Open Blocks Early Access for Quest: free, open source 3D modelling. Store: https://www.meta.com/en-gb/experiences/open-blocks/8043509915705378/ Steam: https://store.steampowered.com/app/3077230/Open_Blocks/)]
yes but can you put that in a practical example of what it's used for in a bunch of applications?
i think this overall pattern is generally just... not how XR works at all
it's a very 2D computer centric method entirely
so you can do the end goal without using it
like in here, you switch tools by snapping them into your controller, and you don't select, you just... move things directly
for blender, in XR i'd use a similar thing
i mean, blender is a particularly spatial app. that video does actually portray some of the functionality i'd like to see in stardust, but it kinda breaks down when you need to introduce abstractions, like you would when you're making music instead of 3D visual art. Regardless of what you are doing, it will be massively helpful to have a defined way to iterate through items, group items, duplicate items, advance through layers of hierarchy, etc
duplicating is pulling the items apart when they're unresizeable
advance through layers... what do you mean?
yes and what i'm interested in is all the details around making that happen
and providing some infrastructure for people who want to do this across domains
also btw the action system you're proposing is not possible in xr
and ideally in a way that's customizable
because you have 50 types of controllers with varying controls
not like a keyboard with a mostly common key layout and every key being the same as others
doesn't openXR constrain that some? is that what the rant was about earlier?
yes and in practice that is difficult for everyone involved
some games just do not support your controller or do not have all actions mapped
or have them in weird combos
that's what the protocol would help with imo
idk how it'd help
it's a classic problem where when you have a common set of actions and a fixed interface you can't bind them always
that's EXACTLY what the kitty protocol was designed to address
it has a keyboard to work with
key combos work on keyboards, they don't work on controllers
button combos suck
i get that
it's not about the keyboard
what? afaict it literally forwards specific buttons, it's less generic than even openxr?
i'm getting to the point where i feel like i'm being intentionally misunderstood so i'm gonna put a bow on this for today... genuinely enjoyed the conversation and the critical thought though, and i'm excited to be here. i'm gonna go touch some grass haha
definitely not intentional
i think we're getting wayyyy too caught up in specifics
also that wasnt meant to be a direct reply
you just wanna know how to do these actions with audio, right?
or programming, or anything that involves complex creativity
given they use keyboard shortcuts?
yeah def not intentional
you have to reinterpret the interface before it gets to that point
like, don't do selection
totally agree
as for the actual app case, it's per app how that interface is best
i already made a little sketch of a DJ board made for XR
but this is a reinterpreted one:
[YouTube embed: "SoundStage: VR Music Maker" (room-scale VR music sandbox; Steam Early Access: http://store.steampowered.com/app/485780)]
just expand outward
don't bother with jumping to things
just make the window encompass it all and move around
then have a way to unfold complexity contextually
we're on the same page on the interface side; i agree the interface is best decided per app. but ultimately my point is that it would probably be beneficial to stardust and the ecosystem as a whole if we put some thought into what kind of commonalities we can provide so people don't need to worry about the plumbing so much
i think that's just something that will take time to develop in practical usage
pulling things apart to reveal their contents is one i'm fond of for hierarchy
absolutely. and my open question is "can we borrow useful concepts from well-established solutions and reenvision them for the new medium". i think the answer is not a clear yes or no, but i'm interested in pursuing it as a sort of socratic method
[YouTube embed: "Experience Aurora | Ultraleap" (hand-interaction VR experience across three islands)]
like, see the exploded view headset?
idk i think the answer is no, every time it's been tried it falls flat
but i'm open to being wrong there
honestly yeah, as far as i can tell the answer is probably somewhere around "some tiny parts maybe", also open to being wrong tho
and on my end i fully admit to being green about implementation details in the code and challenges unique to XR... just trying to think about how to carve a way forward for what i think we can all agree are pretty exciting and appealing end goals. i believe in your vision Nova and thank you for providing this space!
no problem
hope i didn't shut down any of your stuff btw
not at all
i want everything to be possible in stardust, but how that's done is what i disagree on
unfortunately i'm a perennial ideas guy so it's honestly nice to get critical pushback. just wish y'all could see what i see 😉
it's important we get it right!
perennial ideas?
read: i'm always the ideas guy
ahh yea
i blame the adhd
ideas are good but you gotta know the medium to know how they're implemented
fo sho. but i'm also a scientist, taking abstractions and applying them to new mediums is the name of the game there
i'm a systems and UX designer/dev
i do the same haha
but i know wayy faster if something doesn't work in a new medium
because i can identify its architecture and what fits in well faster
valuable skill, i wasted 10 years on a doctorate sooooo
🙂
haha i'm sure you got something out of it
i didn't go to college tho
my mom is proud of me at least haha
that's good
mine just doesn't get what i'm doing