# Expression System Documentation

## Overview

The new expression system provides **full control over facial expressions** by generating separate sprite assets for each expression type. This allows for smooth, dynamic expression switching based on face tracking data.

## Key Changes

### 1. **Blank Base Character**

The AI now generates a character with a **blank face** (no eyes, no mouth, no eyebrows). This allows us to overlay expression assets without visual conflicts.

### 2. **Expression Grid Layout**

The generated sprite sheet uses a **3-row grid format**:

```
┌─────────────────────────────────────────────────────────┐
│ ROW 1: BASE CHARACTER (full width)                      │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Blank face - no features (hair, head, body only)    │ │
│ └─────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────┤
│ ROW 2: EYE EXPRESSIONS (6 variants, equal spacing)      │
│ ┌─────┬─────┬─────┬─────┬─────┬─────┐                   │
│ │NTRL │HPPY │SRPR │ANGRY│ SAD │BLINK│                   │
│ └─────┴─────┴─────┴─────┴─────┴─────┘                   │
├─────────────────────────────────────────────────────────┤
│ ROW 3: MOUTH EXPRESSIONS (6 variants, equal spacing)    │
│ ┌─────┬─────┬─────┬─────┬─────┬─────┐                   │
│ │NTRL │SMILE│TALK │WIDE │FROWN│O-SHP│                   │
│ └─────┴─────┴─────┴─────┴─────┴─────┘                   │
└─────────────────────────────────────────────────────────┘
```

### 3. **Expression Types**

#### Eye Expressions (6 types)

| Type | Description | Trigger |
|------|-------------|---------|
| `NEUTRAL` | Normal open eyes, relaxed | Default state |
| `HAPPY` | Eyes curved upward, slightly closed | Smile detection (future) |
| `SURPRISED` | Wide open, circular eyes | Mouth open > 70% |
| `ANGRY` | Eyebrows angled down, narrowed | Emotion detection (future) |
| `SAD` | Eyebrows up, eyes droopy downward | Emotion detection (future) |
| `BLINK` | Both eyes fully closed (curves) | Blink detection (active) |

#### Mouth Expressions (6 types)

| Type | Description | Trigger |
|------|-------------|---------|
| `NEUTRAL` | Small closed mouth, straight line | Mouth open < 10% |
| `HAPPY` | Closed mouth curved upward | Smile detection (future) |
| `OPEN_TALK` | Medium open mouth for vowels | Mouth open 10-30% |
| `WIDE_OPEN` | Large open mouth for shouting | Mouth open ≥ 30% |
| `FROWN` | Mouth curved downward | Emotion detection (future) |
| `O_SHAPE` | Small circular open mouth | Phoneme detection (future) |

## File Changes

### `src/shared/types.ts`

```typescript
export enum ExpressionType {
  // Used by both eyes and mouth
  NEUTRAL = 'NEUTRAL',
  HAPPY = 'HAPPY',
  // Eye expressions
  SURPRISED = 'SURPRISED',
  ANGRY = 'ANGRY',
  SAD = 'SAD',
  BLINK = 'BLINK',
  // Mouth expressions
  OPEN_TALK = 'OPEN_TALK',
  WIDE_OPEN = 'WIDE_OPEN',
  FROWN = 'FROWN',
  O_SHAPE = 'O_SHAPE',
}

export interface AvatarConfig {
  imageUrl: string;
  baseFace?: Rect; // Blank face area
  eyes?: { // Eye expression rects
    [ExpressionType.NEUTRAL]?: Rect;
    [ExpressionType.HAPPY]?: Rect;
    // ... etc
  };
  mouth?: { // Mouth expression rects
    [ExpressionType.NEUTRAL]?: Rect;
    [ExpressionType.HAPPY]?: Rect;
    // ... etc
  };
  riggingReference?: { ...
  };
  activeEyeExpression?: ExpressionType;
  activeMouthExpression?: ExpressionType;
}
```

### `src/renderer/services/geminiService.ts`

Updated prompt to generate:
- Blank base character (no facial features)
- 6 eye expressions in row 2
- 6 mouth expressions in row 3
- Consistent sizing for easy extraction

### `src/renderer/components/RiggingEditor.tsx`

Complete redesign:
- **Tab system**: Switch between Eyes and Mouth rigging
- **Expression selector**: Preview individual expressions
- **Color-coded boxes**: Each expression has a unique color
- **Base Face box**: Define the blank character area
- **12 expression boxes total**: 6 eyes + 6 mouths

### `src/renderer/components/Studio.tsx`

Dynamic expression rendering:
- `getCurrentEyeExpression()`: Maps tracking data to eye expression
- `getCurrentMouthExpression()`: Maps mouth openness to mouth expression
- Automatic expression switching based on:
  - Blink detection → `BLINK`
  - Mouth openness → `NEUTRAL` / `OPEN_TALK` / `WIDE_OPEN`
  - Surprise detection → `SURPRISED` (when `mouthOpen` > 0.7)

## Expression Switching Logic

### Current Implementation

```typescript
// Eye expression selection
const getCurrentEyeExpression = (): ExpressionType => {
  // A blink on either eye takes priority over every other eye expression
  if (trackingData.isBlinkingLeft || trackingData.isBlinkingRight) {
    return ExpressionType.BLINK;
  }
  // A very open mouth (> 70%) is treated as surprise
  if (trackingData.mouthOpen > 0.7) {
    return ExpressionType.SURPRISED;
  }
  return ExpressionType.NEUTRAL; // Default
};

// Mouth expression selection
const getCurrentMouthExpression = (): ExpressionType => {
  const mouthOpen = trackingData.mouthOpen;
  if (mouthOpen < 0.1) return ExpressionType.NEUTRAL;   // < 10%
  if (mouthOpen < 0.3) return ExpressionType.OPEN_TALK; // 10-30%
  return ExpressionType.WIDE_OPEN;                      // >= 30%
};
```

### Expression Flow

```
┌──────────────────┐
│  Face Tracking   │
│    Data Input    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ mouthOpen: 0.05  │─────────┐
│ isBlinking: true │         │
└────────┬─────────┘         │
         │                   │
         ▼                   ▼
┌──────────────────┐  ┌──────────────┐
│ Eye Expression   │  │ Mouth        │
│ Selector         │  │ Expression   │
│                  │  │ Selector     │
│ BLINK (priority) │  │ NEUTRAL      │
└────────┬─────────┘  └──────┬───────┘
         │                   │
         └────────┬──────────┘
                  │
                  ▼
          ┌────────────────┐
          │ Render Avatar  │
          │ with selected  │
          │  expressions   │
          └────────────────┘
```

## Rigging Workflow

### Step 1: Generate Avatar

```
User enters prompt → AI generates sprite sheet with:
- Row 1: Blank character
- Row 2: 6 eye expressions
- Row 3: 6 mouth expressions
```

### Step 2: Rig Expressions

```
1. Adjust Base Face box (yellow) around blank character
2. Switch to "Eyes" tab
3. For each eye expression:
   - Click expression name to highlight
   - Drag/resize box to match asset
4. Switch to "Mouth" tab
5. For each mouth expression:
   - Click expression name to highlight
   - Drag/resize box to match asset
6. Click "Finish Rigging"
```

### Step 3: Live Animation

```
System automatically switches expressions based on:
- Your blinks → Eye BLINK
- Your mouth opening → Mouth OPEN_TALK / WIDE_OPEN
- Wide mouth → Eye SURPRISED
```

## Future Enhancements

### Planned Features

1. **Manual Expression Override**
   - Hotkeys to force specific expressions
   - Emotion wheel UI for manual selection

2. **Advanced Triggers**
   ```typescript
   // Future: Audio-based phoneme detection
   if (phoneme === 'AH') return ExpressionType.OPEN_TALK;
   if (phoneme === 'OO') return ExpressionType.O_SHAPE;

   // Future: Eyebrow tracking
   if (eyebrowsRaised) return ExpressionType.SURPRISED;
   if (eyebrowsFurrowed) return ExpressionType.ANGRY;
   ```

3. **Expression Blending**
   - Smooth transitions between expressions
   - Intensity-based blending (e.g., 50% happy + 50% neutral)

4. **Preset Management**
   - Save expression configurations
   - Share rigging presets between avatars

5. **More Expressions**
   - Additional eye variants (wink, heart eyes, etc.)
   - Mouth shapes for specific phonemes
   - Eyebrow-only expressions layer

## Testing Tips

### During Rigging

1. **Zoom in** on the sprite sheet for precise box placement
2. **Use consistent sizes** for similar expression types
3. **Test all expressions** by clicking through them before finishing
4. **Check the cyan face reference guide** - it should encompass the face area

### During Studio Use

1. **Wait for calibration** (1 second after camera starts)
2. **Good lighting** improves expression detection
3. **Center your face** in the camera frame for best results
4. **Exaggerate expressions** initially to test range

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Expressions don't align | Re-rig with more precise box placement |
| Blinking not detected | Increase camera lighting, face camera directly |
| Mouth stuck open | Check the `mouthOpen` thresholds in `Studio.tsx` |
| Wrong expression showing | Verify the `riggingReference` calculation in `RiggingEditor.tsx` |
| Expressions too small/large | Ensure all expression assets are the same size in the sprite sheet |

## Code Architecture

```
src/
├── shared/types.ts              # ExpressionType enum, AvatarConfig interface
├── renderer/
│   ├── services/
│   │   └── geminiService.ts     # AI prompt for expression generation
│   ├── components/
│   │   ├── AvatarCreator.tsx    # Generate/upload avatar
│   │   ├── RiggingEditor.tsx    # Rig all expressions
│   │   └── Studio.tsx           # Dynamic expression switching
│   └── hooks/
│       └── useFaceTracking.ts   # Provides trackingData for triggers
```

## Summary

The new expression system provides:
- ✅ **Full control** over all facial features
- ✅ **Dynamic switching** based on face tracking
- ✅ **Modular design** - easy to add new expressions
- ✅ **Clean separation** - blank base + overlay expressions
- ✅ **Future-proof** - ready for audio/emotion integration

This is a **major improvement** over the previous 2-expression system (blink and talk only): avatars now have 12 independently rigged expressions (6 eyes + 6 mouths).
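
## Appendix: Default Box Placement (Sketch)

The 3-row grid described under "Expression Grid Layout" suggests a way to pre-place the 12 expression boxes before manual rigging. The following is a minimal sketch, not the project's actual code. It assumes that the three rows have equal height, that each expression row spans the full sheet width, and that `Rect` has the shape `{ x, y, width, height }`. None of these are confirmed by this document, and `defaultExpressionRects` and `splitRow` are hypothetical helper names. Column order follows the grid diagram, with the mouth labels mapped to enum names in table order.

```typescript
// Hypothetical Rect shape; the real one is defined in src/shared/types.ts and may differ.
interface Rect { x: number; y: number; width: number; height: number; }

// Left-to-right column order, taken from the grid diagram and expression tables.
const EYE_ORDER = ['NEUTRAL', 'HAPPY', 'SURPRISED', 'ANGRY', 'SAD', 'BLINK'] as const;
const MOUTH_ORDER = ['NEUTRAL', 'HAPPY', 'OPEN_TALK', 'WIDE_OPEN', 'FROWN', 'O_SHAPE'] as const;

type EyeName = typeof EYE_ORDER[number];
type MouthName = typeof MOUTH_ORDER[number];

interface DefaultRects {
  baseFace: Rect;
  eyes: Record<EyeName, Rect>;
  mouth: Record<MouthName, Rect>;
}

// Split one grid row into equal-width cells, one cell per expression name.
function splitRow<K extends string>(
  names: readonly K[],
  y: number,
  sheetWidth: number,
  rowHeight: number,
): Record<K, Rect> {
  const cellWidth = sheetWidth / names.length;
  const row = {} as Record<K, Rect>;
  names.forEach((name, i) => {
    row[name] = { x: i * cellWidth, y, width: cellWidth, height: rowHeight };
  });
  return row;
}

// ASSUMPTION: three equal-height rows, expression rows spanning the full width.
function defaultExpressionRects(sheetWidth: number, sheetHeight: number): DefaultRects {
  const rowHeight = sheetHeight / 3;
  return {
    baseFace: { x: 0, y: 0, width: sheetWidth, height: rowHeight },
    eyes: splitRow(EYE_ORDER, rowHeight, sheetWidth, rowHeight),
    mouth: splitRow(MOUTH_ORDER, 2 * rowHeight, sheetWidth, rowHeight),
  };
}
```

For a 600×300 sheet, `defaultExpressionRects(600, 300).eyes.BLINK` evaluates to `{ x: 500, y: 100, width: 100, height: 100 }`. Generated sheets rarely match an ideal grid exactly, so boxes like these would only serve as starting positions to adjust in the rigging editor.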