Gen AI & ML
This is an exploration of a conversational AI prototype built with OpenAI APIs and SwiftUI on Apple Vision Pro. The goal is to understand the potential of spatial computing to create truly immersive AI interactions. 🗣️💬🤖
The experiment focused on defining different AI agent roles (e.g., Philosopher, Therapist, Interior Designer) and integrating user voice input with AI agent voice output. The goal? To create an intuitive, natural dialogue experience within the immersive environment of Apple Vision Pro.
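The agent roles described above could be modeled as personas that map to system prompts in the OpenAI chat message format. This is a minimal sketch under my own assumptions: the role names come from the prototype, but the prompt wording, type names, and helper function here are illustrative, not the actual implementation.

```swift
import Foundation

// Hypothetical model of the prototype's agent roles. The prompt text
// is illustrative; only the role names appear in the original post.
enum AgentRole: String, CaseIterable {
    case philosopher = "Philosopher"
    case therapist = "Therapist"
    case interiorDesigner = "Interior Designer"

    // System prompt that would seed an OpenAI chat request.
    var systemPrompt: String {
        switch self {
        case .philosopher:
            return "You are a philosophy professor who enjoys thought-provoking dialogue."
        case .therapist:
            return "You are a calm, supportive therapist. Offer gentle guidance."
        case .interiorDesigner:
            return "You are an interior designer. Give concrete, practical design ideas."
        }
    }
}

// Builds messages in the OpenAI chat format: a system message sets the
// persona, then the transcribed user speech follows as a user message.
func makeMessages(role: AgentRole, userSpeech: String) -> [[String: String]] {
    [
        ["role": "system", "content": role.systemPrompt],
        ["role": "user", "content": userSpeech],
    ]
}

let messages = makeMessages(role: .therapist, userSpeech: "I feel overwhelmed today.")
print(messages.count)        // 2
print(messages[0]["role"]!)  // system
```

Keeping each persona as a single system message makes switching agents a matter of swapping the first element of the conversation, with the rest of the chat history unchanged.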
Imagine This:
You're in the middle of a design project, seeking inspiration. With a simple voice command, you invite a virtual interior designer into your workspace.
You find yourself in a thought-provoking conversation with a philosophy professor, walking alongside you in the digital world.
Or, feeling overwhelmed, you call upon a virtual therapist. Their calming voice accompanies you as you walk through a peaceful forest setting, offering support and guidance.
Key Features
Expert Friends: Need design help? Chat with the Interior Designer. Feeling overwhelmed? Talk to the Therapist. Want to ponder life? Have a conversation with the Philosopher Professor.
Your Voice is Key: Simply speak your thoughts – no typing needed! The AI listens naturally.
AI that Gets You: You can control how much content the AI agent generates, tailoring each response to your needs.
Immersive Voice: The AI's text comes to life with natural-sounding voice output.
Your Chat Window: You can easily see chat history, double-check your input, and refer back to any useful information later.
You're in Control: A simple stop button lets you pause or end the AI's voice output.
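The response-length control in the feature list could be implemented with the `max_tokens` field of OpenAI's Chat Completions API, which caps how much the model generates. The sketch below is an assumption about how the prototype's request body might look; the struct name, model string, and values are illustrative.

```swift
import Foundation

// Hypothetical request body for OpenAI's Chat Completions API.
// `max_tokens` is the API's real field for capping response length;
// everything else here is an illustrative choice.
struct ChatRequest: Codable {
    let model: String
    let messages: [Message]
    let maxTokens: Int       // caps the length of the agent's reply
    let temperature: Double

    struct Message: Codable {
        let role: String
        let content: String
    }

    enum CodingKeys: String, CodingKey {
        case model, messages, temperature
        case maxTokens = "max_tokens"  // OpenAI uses snake_case field names
    }
}

// A low cap keeps spoken replies short; raising it allows longer,
// more detailed answers when the user wants depth.
let request = ChatRequest(
    model: "gpt-4",
    messages: [.init(role: "user", content: "Help me arrange a small studio.")],
    maxTokens: 150,
    temperature: 0.7
)

let data = try! JSONEncoder().encode(request)
print(String(data: data, encoding: .utf8)!.contains("max_tokens"))  // true
```

Exposing `maxTokens` as a user-facing setting is one simple way a "control the amount of content" feature could work: short caps suit quick voice exchanges, larger ones suit reflective conversations.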
This is just the beginning of the possibilities spatial computing offers for conversational AI. I'm eager to continue exploring multi-modal input, more natural voice models, and the full potential of spatial computing for collaborative AI.