Smart Objects: Building a Plug-and-Play Standard for Interactive XR
- Andrew Moist

- Sep 14
At FruitXR, we want to make XR authoring as modular and reusable as building on the web. This post is the first in a series where we’ll share some of the deeper technical challenges we’re tackling, and why solving them matters.
Today’s topic: Smart Objects - reusable, interactive 3D assets that “just work” when you drop them into a scene.
Why We Need Smart Objects
If you’ve worked with Unity, Unreal Engine, Blender, or any other 3D toolchain, you’ve probably felt the pain. You find a model on a site such as Sketchfab or TurboSquid, and it arrives with no physics, no interactivity, no behaviours. You then spend hours rigging, scripting, and wiring up events just to make the object usable in the way you want.
Game engines do have their own notions of reusable, interactive components. Unity has Prefabs. Unreal Engine has Blueprints. They’re powerful, but they’re engine-specific and proprietary: you can’t easily export them or share them across platforms.
There has been progress toward open standards - most notably, the Khronos Group released the glTF Interactivity Specification for public comment in 2024. But so far it hasn’t been ratified, let alone embraced by the wider 3D market.
The result: there is no open format for smart, reusable, interactive objects. The software world takes open repositories for granted - npm for Node.js/JavaScript, PyPI for Python - but 3D developers have never had the equivalent: a shared source of high-quality, interactive assets that work anywhere.
That’s the gap Smart Objects aim to fill.
Our Solution
We’re working towards a Smart Object format: an extension of existing 3D standards that lets an object carry with it:
Affordances: what the object can do (e.g., a door can swing open, a button can be pressed).
Behaviours: logic that defines how it reacts to interaction.
State and outputs: changes to itself or its environment.

Imagine pulling in a “Smart” laptop model. Without writing a line of code, you can:
Pick it up and drop it with physics.
Open and close the lid.
See the screen power on when the lid is opened.
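To make this concrete, here’s a rough sketch of the kind of descriptor a Smart Object could carry alongside its mesh. The schema is our own illustration for this post, not a published standard, but it shows the three ingredients above travelling with the asset:

```typescript
// Illustrative only: a hypothetical Smart Object descriptor for the laptop.
// The property names are our own sketch, not part of glTF or any ratified extension.
interface SmartObjectDescriptor {
  affordances: Affordance[];                          // what the object can do
  behaviours: Behaviour[];                            // how it reacts to interaction
  state: Record<string, string | number | boolean>;   // mutable runtime state
}

interface Affordance {
  id: string;
  type: "grabbable" | "hinge" | "pressable";
  node: string;                                       // the scene node it attaches to
  limits?: { min: number; max: number };              // e.g. hinge angle range (degrees)
}

interface Behaviour {
  trigger: string;                                    // an affordance event
  action: string;                                     // a resulting state change
}

const smartLaptop: SmartObjectDescriptor = {
  affordances: [
    { id: "body", type: "grabbable", node: "LaptopBody" },
    { id: "lid", type: "hinge", node: "LaptopLid", limits: { min: 0, max: 120 } },
  ],
  behaviours: [
    { trigger: "lid.opened", action: "screen.power = on" },
    { trigger: "lid.closed", action: "screen.power = off" },
  ],
  state: { "screen.power": "off" },
};
```

A runtime that understands the format could wire up the physics, the hinge, and the screen behaviour from this data alone - no per-object scripting required.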
We see the emerging glTF Interactivity Spec as a good starting point. Our team is reviewing how it could be extended and built upon as part of the foundation for an open, portable format that we can use on our platform — and hopefully contribute back to the ecosystem.
Let’s go through some of the technical challenges we’re facing along this journey.
Building on Draft Standards

A core problem is 3D engine diversity. Unity handles events differently than Unreal. Godot has yet another model. Prefabs, Blueprints, and other proprietary formats don’t interoperate.
The glTF Interactivity Specification is an important step toward that interoperability. It introduces behaviour graphs, serialised as JSON, as the way to represent interactivity. Compared to simple scripts, these graphs are not very human-readable. That may not matter once good authoring tools exist, but it does make usability a real concern.
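To give a flavour of the readability problem, here’s a toy behaviour graph in the node-and-wire style the draft spec uses. The node type names below are illustrative approximations, not quoted from the spec:

```typescript
// A toy behaviour graph: "when the lid rotates past 10 degrees, power the screen on".
// Node types are illustrative stand-ins, not the draft spec's normative names.
const lidGraph = {
  nodes: [
    { id: 0, type: "event/onHingeRotate", configuration: { node: "LaptopLid" } },
    { id: 1, type: "math/gt", values: { a: { node: 0, socket: "angle" }, b: 10 } },
    { id: 2, type: "flow/branch", values: { condition: { node: 1, socket: "result" } } },
    { id: 3, type: "variable/set", configuration: { variable: "screenPower" }, values: { value: true } },
  ],
  // Execution wires: the branch's "true" output drives the variable write.
  flows: [{ from: { node: 2, socket: "true" }, to: { node: 3 } }],
};
```

Even this trivial rule expands into four nodes and a wire; authoring graphs like this by hand doesn’t scale.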
One promising idea is to allow compilation from text-based languages into behaviour graphs. That could make Smart Object authoring more approachable, though it adds another layer to the toolchain. We see this as an area where AI or domain-specific languages could play a big role in bridging usability and standardisation.
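As a minimal sketch of that idea - assuming a hypothetical one-line rule syntax we’ve invented for this post - a compiler could be as simple as:

```typescript
// Hypothetical: compile a "when <event> then set <variable> = <value>" rule
// into the toy graph shape above. The rule syntax is invented for illustration.
function compileRule(rule: string) {
  const match = rule.match(/^when (\S+) then set (\S+) = (\S+)$/);
  if (!match) throw new Error(`Unrecognised rule: ${rule}`);
  const [, event, variable, value] = match;
  return {
    nodes: [
      { id: 0, type: `event/${event}` },
      { id: 1, type: "variable/set", configuration: { variable }, values: { value } },
    ],
    flows: [{ from: { node: 0, socket: "out" }, to: { node: 1 } }],
  };
}

// One readable line in, two nodes and a wire out:
console.log(JSON.stringify(compileRule("when lid.opened then set screenPower = on"), null, 2));
```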
So our first challenge is to work with the community to push these standards forward, and to work out how we’ll use them internally.
Affordance Detection + Quality Assurance Using AI
Even with a standard for Smart Objects, creating them by hand is a lot of work. A raw 3D mesh doesn’t tell you where the interactive parts are - is that surface a button, a handle, or just decoration? This is where affordance detection comes in.
Affordance detection means identifying the possible interactions that an object affords in the real world - for example, that a protruding cylinder is a knob that can turn, or that a flat rectangular surface is a button that can be pressed. Getting these affordances right is critical: they define how users experience and learn from Smart Objects.

Here’s how we’re looking to leverage AI for affordances:
Affordance detection: Computer vision and geometric analysis can suggest likely interactions automatically (“this looks like a hinge → add rotation behaviour,” “this looks like a button → add press behaviour”). This turns static meshes into interactive starting points with much less manual effort.
Automated QA: Once behaviours are defined, AI can run tests in a sandboxed environment, checking for performance bottlenecks, logic errors, or unsafe behaviour. This ensures consistency and reliability before Smart Objects are widely shared.
This reduces friction for creators and improves trust in the Smart Object ecosystem. Tooling for this is still very immature, so there’s a lot of room for innovation in making these AI-assisted workflows smooth and dependable.
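As a rough illustration of the geometric side, here’s a toy heuristic that suggests affordances from a part’s bounding-box dimensions. The thresholds are invented, and real detection would combine learned models with geometric analysis, but it shows the shape of the idea:

```typescript
// Toy affordance heuristic: a very thin, wide slab is a plausible hinged panel
// (think: a laptop lid), while a tiny protrusion is a plausible button.
// All thresholds are invented for illustration.
interface MeshPart {
  name: string;
  size: [number, number, number]; // bounding box: width, height, depth (metres)
}

function suggestAffordance(part: MeshPart): string | null {
  const [w, h, d] = part.size;
  const thin = Math.min(w, h, d);
  const long = Math.max(w, h, d);
  if (thin / long < 0.05) return "hinge: add rotation behaviour";
  if (long < 0.02) return "button: add press behaviour";
  return null;
}

console.log(suggestAffordance({ name: "LaptopLid", size: [0.34, 0.005, 0.24] }));
// -> "hinge: add rotation behaviour"
```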
Scaling Up the Library with AI-Powered Publishing
A single Smart Object is useful. A library of thousands is transformative. Just as npm or PyPI unlocked massive developer ecosystems by offering reusable modules to programmers, XR needs a comparable open directory of Smart Objects.
Seeding such a library is a huge undertaking, which is why we’re turning to AI:
AI-assisted creation - converting existing CAD models, 3D scans, or even AI-generated imagery into interactive Smart Objects.
Automated metadata and documentation - generating consistent descriptions, tags, and usage notes so objects are easy to search, discover, and combine. Once tagged, objects become discoverable both through a standard API and through an LLM-facing API (an MCP server).
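For example, each published object might carry a metadata record along these lines (the fields are our own sketch, not a fixed schema):

```typescript
// Hypothetical library index entry. Field names are illustrative.
interface SmartObjectMetadata {
  id: string;
  title: string;
  description: string;   // AI-generated, human-reviewed summary
  tags: string[];        // for search and discovery
  affordances: string[]; // what the object can do, for filtering
  license: string;
}

const entry: SmartObjectMetadata = {
  id: "laptop-basic-01",
  title: "Smart Laptop",
  description: "Grabbable laptop with an openable lid; the screen powers on when opened.",
  tags: ["electronics", "office", "interactive"],
  affordances: ["grabbable", "hinge"],
  license: "CC-BY-4.0",
};
```

The same record can back both the human-facing search API and the MCP server, so an LLM queries the library with the identical metadata a person would browse.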

By publishing our own growing library of high-quality Smart Objects, we can kickstart the ecosystem and later open the door for community contributions. The true power of Smart Objects comes when the library reaches critical mass - when creators can build complex XR experiences by simply assembling reusable pieces.
Summary
Today, interactive objects in 3D engines are locked to proprietary formats and don’t carry across ecosystems. The glTF Interactivity Spec is a promising step, but not yet widely adopted. That leaves XR creators stuck with static meshes and a lot of manual scripting.
At FruitXR, we’re working towards a Smart Object standard that:
Works across engines instead of being locked to one.
Uses AI for affordance detection and QA.
Scales up through AI-powered publishing of reusable assets.
Our goal is to create the missing foundation for XR: a global library of plug-and-play interactive objects that work anywhere.
Interested in joining the team?
We’re hiring! If working on Smart Objects - at the intersection of graphics, XR, and AI - sounds like the kind of challenge you’d enjoy, get in touch at hello@fruitxr.com.
Come help us make XR creation as reusable and modular as the modern web.



