Project AIXTRA - Adding AI-Empowered Speech Capabilities to VR Builder

Published on August 2, 2025

Speech can be both a barrier to and a driver of VR training applications. Together with our partner LEFX, we adopted VOXReality's AI technology to give you instant access to powerful new speech capabilities!

About VOXReality

VOXReality is an EU-funded initiative that aims to facilitate the convergence of Natural Language Processing (NLP) and Computer Vision (CV) technologies in the Extended Reality (XR) field. Its goal is to develop innovative Artificial Intelligence (AI) models that use language as a core interaction medium, supported by visual understanding, to deliver next-generation applications that comprehend users' goals, their surrounding environment, and the context.

Motivation

Over our years of shared experience, LEFX and MindPort have uncovered two major, recurring challenges in enterprise projects that VOXReality is well suited to solve. In our project AIXTRA, we address these real needs of many enterprise clients with a three-component system that delivers two new capabilities:

  • Overcoming the Language Barrier: In VR-based training, language differences can hinder communication. AIXTRA tackles this by integrating AI-based in-VR translation, allowing participants to communicate seamlessly in their native languages. The demo shows participants completing complex tasks together, with real-time translation enhancing understanding and collaboration. This improves communication and makes the learning experience more inclusive and accessible.
  • AI-Based Virtual Training Partner: Multi-user training can be very impactful, but requires multiple participants to be available at the same time. If replaced by single-user approaches, the quality of the training suffers because processes requiring cooperation are replaced by predefined interaction steps or entirely omitted. AIXTRA seeks to address this by integrating AI to simulate human-to-human interactions. This AI-based virtual training partner allows trainees to request assistance and communicate about their current status, enabling rich training interactions without the need for a physical training companion.

About AIXTRA

By focusing on the two objectives mentioned above, AIXTRA seeks to create a more effective and accessible VR training environment, ultimately contributing to the advancement of VR training methodologies and technologies and to the VOXReality project as a whole. The three components of the system are:

  • The VOXReality System, which unlocks the VOXReality capabilities for the two use cases described above.
  • The Authoring Tool based on VR Builder which integrates the new capabilities from the VOXReality System and makes them easily accessible.
  • A Demo Application, showcasing those new capabilities in real-world scenarios taken from actual challenges that occurred when working on enterprise projects.

With the project nearing its end, you can get access to the VOXReality System and the Authoring Tool by contacting us.

Check out the demo app on LEFX's YouTube channel!

New Capabilities in Detail

Let's take a closer look at the functional and non-functional benefits that our new speech capabilities offer you.

Made in Europe, Safe and Secure

For all their promise, AI systems also pose risks, particularly around data privacy and security. Together with VOXReality, we made sure that user data is not used to train the models. To provide an additional level of security, our VOXReality System has been dockerized so that you can deploy it on your own infrastructure. You can choose a managed cloud, a private cloud, local servers, or even the VR headsets themselves to run the AI system on! Keep in mind, though, that different AI systems have varying technical requirements, and not all capabilities can be used to their full extent on all hardware.

Real-Time Translation with Voice Retention

We made it easy for you to add real-time translation to multi-user scenarios. This allows each participant to speak in their preferred language and to hear their colleagues in it. Unlike standard real-time translation services, the VOXReality System retains the speaker's voice, which preserves not only the speaker's character but also their emotions, volume, and pace, all of which improve collaboration in the virtual world.

Intent Recognition

Our new intent recognition feature detects spoken commands. It can be used to identify questions from users and assist them directly in the virtual world. It can also be used to verify understanding or to give commands to virtual training partners in your VR training environment. The newly developed intent recognition condition embeds this functionality into VR Builder's Process Editor.

Future-Proof Thanks to a Modular System Architecture

True to our approach, we kept the architecture flexible so that AI models can be replaced. This not only gives you the flexibility to choose the system that best addresses your needs, it also ensures that you can keep up with the latest developments in the fast-paced world of AI systems, especially LLMs.

How to Get Started

Reach out to us to get access to the AIXTRA demo applications and the code packages, which you can simply add to your new and existing VR Builder projects!

This project has been funded by VOXReality and the European Union.