# Expression System Documentation

## Overview

The new expression system gives **full control over facial expressions** by generating a separate sprite asset for each expression type. Expressions can then be switched dynamically based on face tracking data.

## Key Changes

### 1. **Blank Base Character**
The AI now generates a character with a **blank face** (no eyes, no mouth, no eyebrows). This allows us to overlay expression assets without visual conflicts.

### 2. **Expression Grid Layout**

The generated sprite sheet uses a **3-row grid format**:

```
┌─────────────────────────────────────────────────────────┐
│  ROW 1: BASE CHARACTER (full width)                     │
│ ┌─────────────────────────────────────────────────────┐ │
│ │  Blank face - no features (hair, head, body only)   │ │
│ └─────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────┤
│  ROW 2: EYE EXPRESSIONS (6 variants, equal spacing)     │
│ ┌─────┬─────┬─────┬─────┬─────┬─────┐                   │
│ │NTRL │HPPY │SRPR │ANGRY│ SAD │BLINK│                   │
│ └─────┴─────┴─────┴─────┴─────┴─────┘                   │
├─────────────────────────────────────────────────────────┤
│  ROW 3: MOUTH EXPRESSIONS (6 variants, equal spacing)   │
│ ┌─────┬─────┬─────┬─────┬─────┬─────┐                   │
│ │NTRL │SMILE│TALK │WIDE │FROWN│O-SHP│                   │
│ └─────┴─────┴─────┴─────┴─────┴─────┘                   │
└─────────────────────────────────────────────────────────┘
```

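
As a rough illustration of why a fixed grid simplifies extraction, the sketch below computes starting rectangles for every cell. It is not taken from the implementation: the `Rect` fields, the helper names `splitRow` and `defaultGridRects`, the three equal-height rows, and the six cells spanning the full sheet width are all assumptions. A generated sheet may not follow the grid exactly, so these values would only serve as defaults to adjust during rigging.

```typescript
// Hypothetical rectangle shape; the real `Rect` in src/shared/types.ts may differ.
interface Rect { x: number; y: number; width: number; height: number; }

// Column order follows the grid diagram above (left to right).
const EYE_ORDER = ['NEUTRAL', 'HAPPY', 'SURPRISED', 'ANGRY', 'SAD', 'BLINK'] as const;
const MOUTH_ORDER = ['NEUTRAL', 'HAPPY', 'OPEN_TALK', 'WIDE_OPEN', 'FROWN', 'O_SHAPE'] as const;

// Split one row into equally sized cells, one per name.
function splitRow<K extends string>(
  names: readonly K[],
  y: number,
  sheetWidth: number,
  rowHeight: number,
): Record<K, Rect> {
  const cellWidth = sheetWidth / names.length;
  const cells = {} as Record<K, Rect>;
  names.forEach((name, i) => {
    cells[name] = { x: i * cellWidth, y, width: cellWidth, height: rowHeight };
  });
  return cells;
}

// Assumption: the sheet is divided into three rows of equal height.
function defaultGridRects(sheetWidth: number, sheetHeight: number) {
  const rowHeight = sheetHeight / 3;
  const baseFace: Rect = { x: 0, y: 0, width: sheetWidth, height: rowHeight };
  return {
    baseFace,
    eyes: splitRow(EYE_ORDER, rowHeight, sheetWidth, rowHeight),
    mouth: splitRow(MOUTH_ORDER, 2 * rowHeight, sheetWidth, rowHeight),
  };
}
```

For a 1200×900 sheet under these assumptions, each cell is 200×300 pixels and the `BLINK` eye cell starts at `x = 1000`.
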
### 3. **Expression Types**

#### Eye Expressions (6 types)
| Type | Description | Trigger |
|------|-------------|---------|
| `NEUTRAL` | Normal open eyes, relaxed | Default state |
| `HAPPY` | Eyes curved upward, slightly closed | Smile detection (future) |
| `SURPRISED` | Wide open, circular eyes | Mouth open > 70% |
| `ANGRY` | Eyebrows angled down, narrowed | Emotion detection (future) |
| `SAD` | Eyebrows up, eyes droopy downward | Emotion detection (future) |
| `BLINK` | Both eyes fully closed (curves) | Blink detection (active) |

#### Mouth Expressions (6 types)
| Type | Description | Trigger |
|------|-------------|---------|
| `NEUTRAL` | Small closed mouth, straight line | Mouth open < 10% |
| `HAPPY` | Closed mouth curved upward (labeled SMILE in the grid) | Smile detection (future) |
| `OPEN_TALK` | Medium open mouth for vowels | Mouth open 10-30% |
| `WIDE_OPEN` | Large open mouth for shouting | Mouth open ≥ 30% |
| `FROWN` | Mouth curved downward | Emotion detection (future) |
| `O_SHAPE` | Small circular open mouth | Phoneme detection (future) |

## File Changes

### `src/shared/types.ts`
```typescript
export enum ExpressionType {
  NEUTRAL = 'NEUTRAL',
  HAPPY = 'HAPPY',
  SURPRISED = 'SURPRISED',
  ANGRY = 'ANGRY',
  SAD = 'SAD',
  BLINK = 'BLINK',
  OPEN_TALK = 'OPEN_TALK',
  WIDE_OPEN = 'WIDE_OPEN',
  FROWN = 'FROWN',
  O_SHAPE = 'O_SHAPE',
}

export interface AvatarConfig {
  imageUrl: string;
  baseFace?: Rect;              // Blank face area
  eyes?: {                      // Eye expression rects
    [ExpressionType.NEUTRAL]?: Rect;
    [ExpressionType.HAPPY]?: Rect;
    // ... etc
  };
  mouth?: {                     // Mouth expression rects
    [ExpressionType.NEUTRAL]?: Rect;
    [ExpressionType.HAPPY]?: Rect;
    // ... etc
  };
  riggingReference?: { ... };
  activeEyeExpression?: ExpressionType;
  activeMouthExpression?: ExpressionType;
}
```

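
Because every expression rect is optional, code that reads the config has to handle expressions that were never rigged. The sketch below shows one possible approach: a lookup that falls back to the `NEUTRAL` rect. The `Rect` fields, the `RectMap` alias, the trimmed enum, and the `resolveRect` helper are assumptions made so the example compiles on its own; they are not part of the documented types.

```typescript
// Assumption: Rect is a pixel rectangle; its real definition is not shown in this document.
interface Rect { x: number; y: number; width: number; height: number; }

// Trimmed copy of the enum above, just enough for the example.
enum ExpressionType {
  NEUTRAL = 'NEUTRAL',
  HAPPY = 'HAPPY',
  BLINK = 'BLINK',
}

// Hypothetical alias for the per-expression rect maps (`eyes` and `mouth`).
type RectMap = Partial<Record<ExpressionType, Rect>>;

interface AvatarConfig {
  imageUrl: string;
  eyes?: RectMap;
  mouth?: RectMap;
}

// Hypothetical helper: the rect for `expression`, or the NEUTRAL rect if it is missing.
function resolveRect(map: RectMap | undefined, expression: ExpressionType): Rect | undefined {
  return map?.[expression] ?? map?.[ExpressionType.NEUTRAL];
}

const config: AvatarConfig = {
  imageUrl: 'avatar.png', // placeholder path
  eyes: {
    [ExpressionType.NEUTRAL]: { x: 0, y: 300, width: 200, height: 300 },
    [ExpressionType.BLINK]: { x: 1000, y: 300, width: 200, height: 300 },
  },
};
```

With this config, `resolveRect(config.eyes, ExpressionType.HAPPY)` returns the `NEUTRAL` rect, and `resolveRect(config.mouth, ExpressionType.NEUTRAL)` returns `undefined` because no mouth rects were defined.
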
### `src/renderer/services/geminiService.ts`
Updated prompt to generate:
- Blank base character (no facial features)
- 6 eye expressions in row 2
- 6 mouth expressions in row 3
- Consistent sizing for easy extraction

### `src/renderer/components/RiggingEditor.tsx`
Complete redesign:
- **Tab system**: Switch between Eyes and Mouth rigging
- **Expression selector**: Preview individual expressions
- **Color-coded boxes**: Each expression has a unique color
- **Base Face box**: Define the blank character area
- **12 expression boxes total**: 6 eyes + 6 mouths

### `src/renderer/components/Studio.tsx`
Dynamic expression rendering:
- `getCurrentEyeExpression()`: Maps tracking data to eye expression
- `getCurrentMouthExpression()`: Maps mouth openness to mouth expression
- Automatic expression switching based on:
  - Blink detection → `BLINK`
  - Mouth openness → `NEUTRAL` / `OPEN_TALK` / `WIDE_OPEN`
  - Surprise detection → `SURPRISED` (when mouth very open)

## Expression Switching Logic

### Current Implementation

```typescript
// Eye expression selection
const getCurrentEyeExpression = (): ExpressionType => {
  // A blink on either eye takes priority over every other eye expression
  if (trackingData.isBlinkingLeft || trackingData.isBlinkingRight) {
    return ExpressionType.BLINK;
  }

  if (trackingData.mouthOpen > 0.7) {
    return ExpressionType.SURPRISED;
  }

  return ExpressionType.NEUTRAL; // Default
};

// Mouth expression selection
const getCurrentMouthExpression = (): ExpressionType => {
  const mouthOpen = trackingData.mouthOpen;

  if (mouthOpen < 0.1) return ExpressionType.NEUTRAL;
  if (mouthOpen < 0.3) return ExpressionType.OPEN_TALK;
  return ExpressionType.WIDE_OPEN;
};
```

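
The selectors above read `trackingData` from the enclosing scope. For unit testing the thresholds, a pure variant that takes the data as an argument may be easier to work with. The sketch below is one way to write it: the `TrackingData` shape is inferred from the fields used above, and the names `THRESHOLDS`, `selectEyeExpression`, and `selectMouthExpression` are hypothetical.

```typescript
// Trimmed copy of the enum, limited to the expressions that have active triggers.
enum ExpressionType {
  NEUTRAL = 'NEUTRAL',
  SURPRISED = 'SURPRISED',
  BLINK = 'BLINK',
  OPEN_TALK = 'OPEN_TALK',
  WIDE_OPEN = 'WIDE_OPEN',
}

// Assumption: shape inferred from the fields the current implementation reads.
interface TrackingData {
  mouthOpen: number; // 0 (closed) to 1 (fully open)
  isBlinkingLeft: boolean;
  isBlinkingRight: boolean;
}

// Same numeric thresholds as the current implementation, named in one place.
const THRESHOLDS = { talk: 0.1, wide: 0.3, surprised: 0.7 } as const;

function selectEyeExpression(t: TrackingData): ExpressionType {
  // Blink wins over surprise, matching the order of checks in the original.
  if (t.isBlinkingLeft || t.isBlinkingRight) return ExpressionType.BLINK;
  if (t.mouthOpen > THRESHOLDS.surprised) return ExpressionType.SURPRISED;
  return ExpressionType.NEUTRAL;
}

function selectMouthExpression(t: TrackingData): ExpressionType {
  if (t.mouthOpen < THRESHOLDS.talk) return ExpressionType.NEUTRAL;
  if (t.mouthOpen < THRESHOLDS.wide) return ExpressionType.OPEN_TALK;
  return ExpressionType.WIDE_OPEN;
}
```

Note the boundary behavior this makes visible: a `mouthOpen` of exactly `0.3` selects `WIDE_OPEN`, while exactly `0.7` does not yet select `SURPRISED` because that comparison is strict.
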
### Expression Flow

```
┌──────────────────┐
│  Face Tracking   │
│   Data Input     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ mouthOpen: 0.05  │──────┐
│ isBlinking: true │      │
└────────┬─────────┘      │
         │                │
         ▼                ▼
┌──────────────────┐  ┌──────────────┐
│  Eye Expression  │  │    Mouth     │
│     Selector     │  │  Expression  │
│                  │  │   Selector   │
│ BLINK (priority) │  │   NEUTRAL    │
└────────┬─────────┘  └──────┬───────┘
         │                   │
         └────────┬──────────┘
                  │
                  ▼
         ┌────────────────┐
         │ Render Avatar  │
         │ with selected  │
         │  expressions   │
         └────────────────┘
```

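
The final "Render Avatar" step in the flow can be thought of as stacking layers: the blank base first, then the selected eye sprite, then the selected mouth sprite. The sketch below only builds that ordered list. The `Layer` and `SelectedRects` types and the `buildLayers` function are hypothetical, and how each sprite is positioned on the face (presumably via `riggingReference`) is not covered here because those details are not shown in this document.

```typescript
// Assumption: Rect is a pixel rectangle on the sprite sheet.
interface Rect { x: number; y: number; width: number; height: number; }

// One entry per sprite to draw, in back-to-front order.
interface Layer { name: 'base' | 'eyes' | 'mouth'; source: Rect; }

// Source rects already resolved for the currently selected expressions.
interface SelectedRects { baseFace?: Rect; eye?: Rect; mouth?: Rect; }

// Hypothetical helper: skips any layer whose rect has not been rigged.
function buildLayers(selected: SelectedRects): Layer[] {
  const layers: Layer[] = [];
  if (selected.baseFace) layers.push({ name: 'base', source: selected.baseFace });
  if (selected.eye) layers.push({ name: 'eyes', source: selected.eye });
  if (selected.mouth) layers.push({ name: 'mouth', source: selected.mouth });
  return layers;
}
```

In a canvas-based renderer, each layer's `source` rect could supply the source arguments of a `drawImage` call, drawn in list order so that the eyes and mouth end up on top of the blank face.
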
## Rigging Workflow

### Step 1: Generate Avatar
```
User enters prompt → AI generates sprite sheet with:
- Row 1: Blank character
- Row 2: 6 eye expressions
- Row 3: 6 mouth expressions
```

### Step 2: Rig Expressions
```
1. Adjust Base Face box (yellow) around blank character
2. Switch to "Eyes" tab
3. For each eye expression:
   - Click expression name to highlight
   - Drag/resize box to match asset
4. Switch to "Mouth" tab
5. For each mouth expression:
   - Click expression name to highlight
   - Drag/resize box to match asset
6. Click "Finish Rigging"
```

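
Since the workflow involves 13 boxes in total (the base face plus 12 expressions), a completeness check before "Finish Rigging" could catch boxes that were skipped. The sketch below is hypothetical: nothing in this document says `RiggingEditor.tsx` performs such a check, and the `RiggingDraft` type and `findMissingRects` function are names invented for the example.

```typescript
// Assumption: Rect is a pixel rectangle on the sprite sheet.
interface Rect { x: number; y: number; width: number; height: number; }

const EYE_TYPES = ['NEUTRAL', 'HAPPY', 'SURPRISED', 'ANGRY', 'SAD', 'BLINK'] as const;
const MOUTH_TYPES = ['NEUTRAL', 'HAPPY', 'OPEN_TALK', 'WIDE_OPEN', 'FROWN', 'O_SHAPE'] as const;

// Hypothetical in-progress rigging state.
interface RiggingDraft {
  baseFace?: Rect;
  eyes?: Partial<Record<(typeof EYE_TYPES)[number], Rect>>;
  mouth?: Partial<Record<(typeof MOUTH_TYPES)[number], Rect>>;
}

// Returns a label for every box that has not been defined yet.
function findMissingRects(draft: RiggingDraft): string[] {
  const missing: string[] = [];
  if (!draft.baseFace) missing.push('baseFace');
  for (const type of EYE_TYPES) {
    if (!draft.eyes?.[type]) missing.push(`eyes.${type}`);
  }
  for (const type of MOUTH_TYPES) {
    if (!draft.mouth?.[type]) missing.push(`mouth.${type}`);
  }
  return missing;
}
```

An editor could disable the finish button, or show the returned labels as a reminder, while the list is non-empty.
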
### Step 3: Live Animation
```
System automatically switches expressions based on:
- Your blinks → Eye BLINK
- Your mouth opening → Mouth OPEN_TALK / WIDE_OPEN
- Wide mouth → Eye SURPRISED
```

## Future Enhancements

### Planned Features

1. **Manual Expression Override**
   - Hotkeys to force specific expressions
   - Emotion wheel UI for manual selection

2. **Advanced Triggers**
   ```typescript
   // Future: Audio-based phoneme detection
   if (phoneme === 'AH') return ExpressionType.OPEN_TALK;
   if (phoneme === 'OO') return ExpressionType.O_SHAPE;

   // Future: Eyebrow tracking
   if (eyebrowsRaised) return ExpressionType.SURPRISED;
   if (eyebrowsFurrowed) return ExpressionType.ANGRY;
   ```

3. **Expression Blending**
   - Smooth transitions between expressions
   - Intensity-based blending (e.g., 50% happy + 50% neutral)

4. **Preset Management**
   - Save expression configurations
   - Share rigging presets between avatars

5. **More Expressions**
   - Additional eye variants (wink, heart eyes, etc.)
   - Mouth shapes for specific phonemes
   - Eyebrow-only expressions layer

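
To make the planned expression blending more concrete, the sketch below shows one simple way intensity weights could be computed: a linear cross-fade from the outgoing expression to the incoming one. This is purely illustrative, since blending is not implemented yet; the `BlendWeights` type, the `crossfadeWeights` function, and the choice of a linear ramp are all assumptions.

```typescript
// Opacity weights for the outgoing (`from`) and incoming (`to`) expression sprites.
interface BlendWeights { from: number; to: number; }

// Hypothetical helper: linear cross-fade over `durationMs` milliseconds.
function crossfadeWeights(elapsedMs: number, durationMs: number): BlendWeights {
  // Clamp progress to [0, 1]; a non-positive duration switches instantly.
  const progress = durationMs <= 0
    ? 1
    : Math.min(1, Math.max(0, elapsedMs / durationMs));
  return { from: 1 - progress, to: progress };
}
```

Halfway through a transition this yields equal weights, which corresponds to the "50% happy + 50% neutral" example above. In a canvas renderer the weights could be applied through `globalAlpha` when drawing each sprite.
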
## Testing Tips

### During Rigging
1. **Zoom in** on sprite sheet for precise box placement
2. **Use consistent sizes** for similar expression types
3. **Test all expressions** by clicking through them before finishing
4. **Check the cyan face reference guide** - it should encompass the face area

### During Studio Use
1. **Wait for calibration** (1 second after camera starts)
2. **Good lighting** improves expression detection
3. **Center your face** in camera for best results
4. **Exaggerate expressions** initially to test range

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Expressions don't align | Re-rig with more precise box placement |
| Blinking not detected | Improve lighting and face the camera directly |
| Mouth stuck open | Check the `mouthOpen` thresholds in `Studio.tsx` |
| Wrong expression showing | Verify the `riggingReference` calculation in `RiggingEditor.tsx` |
| Expressions too small/large | Ensure all expression assets are the same size in the sprite sheet |

## Code Architecture

```
src/
├── shared/types.ts              # ExpressionType enum, AvatarConfig interface
├── renderer/
│   ├── services/
│   │   └── geminiService.ts     # AI prompt for expression generation
│   ├── components/
│   │   ├── AvatarCreator.tsx    # Generate/upload avatar
│   │   ├── RiggingEditor.tsx    # Rig all expressions
│   │   └── Studio.tsx           # Dynamic expression switching
│   └── hooks/
│       └── useFaceTracking.ts   # Provides trackingData for triggers
```

## Summary

The new expression system provides:
- ✅ **Full control** over all facial features
- ✅ **Dynamic switching** based on face tracking
- ✅ **Modular design** - easy to add new expressions
- ✅ **Clean separation** - blank base + overlay expressions
- ✅ **Room to grow** - audio and emotion triggers are planned

This is a **major improvement** over the previous 2-expression system (blink and talk only): with 6 eye and 6 mouth expressions, avatars gain a much wider expressive range for VTuber animation.