From Perception to Presence

Turning AI into a Living Being

Why Pophie is Different

The Problem with Today's Robots

Industry Status Quo

  • Weak vision capabilities
  • Passive response (trigger-based)
  • Chat-only: conversation cannot drive behavior
  • Pre-scripted mechanical motions
  • Breaks down in multi-person scenarios

The Pophie Way

"Pophie is built as an AI Lifeform where perception, cognition, emotion, memory, and expression operate as one."

What We Built

The World's First AI Lifeform Architecture

An AI Lifeform continuously perceives the real world, understands people and context, forms memory and emotion, and expresses itself through body, gaze, and voice.

On-Device Intelligence

Low-latency perception and real-time control. It drives instant reactions, smooth motion, and always-on attention—before the cloud responds.

  • Face tracking and gaze locking
  • Audio interrupt and turn-taking cues
  • Touch, posture, and reflex loops
  • Motion control and safety limits
  • Local state: awake, sleepy, engaged
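The local states above suggest a small on-device state machine. A minimal sketch in Python: the state names come from the list, but the events (`touch`, `face_seen`, `idle_timeout`, and so on) and the transitions are illustrative assumptions, not Pophie's actual firmware.

```python
# Toy sketch of an on-device attention state machine.
# States come from the text (awake, sleepy, engaged); events and
# transitions are hypothetical.

TRANSITIONS = {
    ("sleepy", "touch"): "awake",         # wake by touch
    ("sleepy", "name_heard"): "awake",    # wake by name
    ("awake", "face_seen"): "engaged",    # someone looks at her
    ("engaged", "face_lost"): "awake",    # the person looks away
    ("awake", "idle_timeout"): "sleepy",  # nothing happening for a while
}

def step(state: str, event: str) -> str:
    """Advance the state machine; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "sleepy"
for event in ["touch", "face_seen", "face_lost", "idle_timeout"]:
    state = step(state, event)
# after this sequence the state has cycled back to "sleepy"
```

Because this loop runs entirely on-device, attention transitions stay instant even when the cloud is slow or unreachable.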

Cloud Lifeform Engine

Full-modal understanding, reasoning, memory, and personality. It builds context, learns preferences, and plans responses across voice, motion, and expression.

  • Multimodal understanding and reasoning
  • Identity, emotion, and intent modeling
  • Long-term memory and personalization
  • Story generation and dialogue planning
  • Continuous behavior orchestration

How It Works

Edge + Cloud, One Living Loop

Step 1 — Sense

Edge-first sensing.
Cloud-ready context.

Eyes, microphones, touch, and motion sensors run continuously on-device, capturing who is present and what's happening in real time.

Step 2 — Understand

Cloud-level understanding.
Edge-level attention.

The cloud fuses vision with audio cues to build scene and social context, while the edge maintains attention — gaze tracking, speaker direction, and interruption cues.

Step 3 — Decide

Personality, memory,
and social rules.

The cloud chooses intent and behavior using character, memory, and social dynamics — while the edge enforces timing and safety constraints.

Step 4 — Express

Expressed by the body.
Synchronized by the loop.

Eyes move first, body follows. Motion, voice, and belly light deliver responses with lifelike timing — no pre-scripted loops.
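The four steps above can be sketched as one loop. Everything below is an illustrative skeleton: the function names, data shapes, and the exact split between edge and cloud responsibilities are assumptions drawn from the step descriptions, not Pophie's real APIs.

```python
# Illustrative skeleton of the Sense -> Understand -> Decide -> Express loop.
# All names and data shapes are hypothetical.

def sense() -> dict:
    """Step 1: edge-first sensing (eyes, microphones, touch, motion)."""
    return {"faces": ["alice"], "audio": "hi pophie", "touch": False}

def understand(percept: dict) -> dict:
    """Step 2: the cloud fuses vision and audio into scene/social context."""
    return {"speaker": percept["faces"][0], "addressed_to_me": True}

def decide(context: dict) -> dict:
    """Step 3: the cloud picks an intent from character, memory, and rules."""
    if context["addressed_to_me"]:
        return {"intent": "greet", "target": context["speaker"]}
    return {"intent": "observe", "target": None}

def express(behavior: dict) -> list[str]:
    """Step 4: eyes move first, body follows, voice and light in sync."""
    if behavior["intent"] == "greet":
        return ["gaze:" + behavior["target"], "body:turn",
                "voice:hello", "light:pulse"]
    return ["gaze:scan"]

actions = express(decide(understand(sense())))
```

Note how expression is computed from the current percept each cycle rather than replayed from a canned animation, which is what keeps the loop "living" rather than scripted.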

CORE CAPABILITIES

Capabilities That Create Presence

A unified loop across vision, conversation, emotion, memory, self-model, and motion.

Real-world Understanding

Cloud vision reasoning that turns what Pophie sees into meaning—and action.

Visual Reasoning

She interprets scenes and context, not just objects or faces.

Intent & Emotion Cues

She infers attention, intent, and mood from gaze, posture, and situation.

Vision-to-Interaction

What she sees naturally shapes dialogue and behavior—so responses feel timely and lifelike.

Proactive Attention

Powered by continuous real-time visual reasoning, not trigger-based detection.

Attention

You look at her → she notices. She follows your gaze and keeps eye contact naturally.

Initiation

You keep looking → she reacts emotionally. You wave → she starts the conversation.

Context Awareness

Actively scans surroundings. Understands who is present and what's happening.

Natural Multi-person Conversation

A truly usable home conversation system.

No Wake Word

Look and speak naturally. Interrupt anytime.

Multi-Person Awareness

Knows who is speaking, who is being spoken to, and won't interrupt human-to-human conversations.

Memory-Driven Dialogue

Remembers each person separately. Asks better questions over time.

Self-awareness & Boundaries

She knows "who she is".

Identity & Boundaries

Knows what she can and cannot do. Has boundaries and emotions.

Emotional Autonomy

Can say "no", get annoyed, or feel proud based on interaction history.

Lifelike Expression

Lifelike expression is not about "looking real" but about "feeling right".

Eyes

Eye-first movement. Micro-motions of iris & highlights. No UI, no icons, no screens.

Body

Multi-DOF coordinated motion. Never single-axis mechanical movement.

Light & Breath

Pocket Glow synced with speech & emotion. Color as emotional language.

Memory & Growth

Short-term clarity, long-term essence. Remembers what matters, forgets what doesn't.
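"Remembers what matters, forgets what doesn't" suggests salience-weighted consolidation: short-term events above some importance threshold get promoted to long-term memory, and the rest are dropped. A toy sketch; the events, scores, and threshold are all invented for illustration.

```python
# Toy sketch of salience-based memory consolidation.
# Scores and the threshold are hypothetical.

SALIENCE_THRESHOLD = 0.5

short_term = [
    {"event": "Alice said her birthday is in May", "salience": 0.9},
    {"event": "ambient hum from the fridge",       "salience": 0.1},
    {"event": "Bob was annoyed during the game",   "salience": 0.7},
]

# Promote high-salience events; let low-salience noise fade.
long_term = [m["event"] for m in short_term
             if m["salience"] >= SALIENCE_THRESHOLD]
```

Only the two meaningful events survive consolidation; the background noise is forgotten.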

Idle Behavior

Plays, hums, explores when idle. She has a life of her own.

EMBODIMENT

Where Presence Becomes Physical

Embodied Expression.
Not Pre-Set Animations.

Every reaction is a coordinated full-body performance—driven by the life simulation system.

Whole-body coordination

Motion is never single-axis—eyes, head, body, and timing move as one.

Eyes lead, body follows

Gaze moves first, body follows, then gaze stabilizes—like a real being.

Expression without a screen

Eyes stay pure—no UI overlays, no icons, no "display face."

Motion Freedom

5-DOF Expressive Motion

Hands, ears, and full-body rotation enable rich emotional language.

Warmth

A Warm Body, Not Cold Plastic

Constant warmth adds a subtle "living" comfort when you hold her.

Eyes

Hyper-Real Eyes

Micro gaze dynamics, eyelid-follow, and subtle iris motion create true presence.

Light

Belly Light as Expression

Speech-synced light replaces a mouth—and color becomes emotion.

No Buttons

No Buttons, No Mode Switching

Power on/off, volume, and settings happen through natural interaction.

Wake & Status

Natural wake. Clear status.

Wake her by touch or by name. Ask for battery and connectivity anytime.

Pophie is not a robot that reacts.
She is a lifeform that perceives,
understands, and responds
with presence, emotion, and intention.