Why We Built a Lip-Sync Animation System for Every Arabic Sound
Amal uses Rive-powered lip-sync animations that show children exactly how to form each Arabic sound — the character's mouth moves in sync with audio pronunciation. This visual-phonetic approach helps children learn pronunciation intuitively, especially for sounds that don't exist in English (like ع, خ, غ, ح).
The Problem: Arabic Has Sounds English Doesn't
Arabic phonetics include:
- Pharyngeal consonants (ع, ح): produced deep in the throat, no English equivalent
- Uvular consonants (ق, خ, غ): produced at the back of the mouth
- Emphatic consonants (ص, ض, ط, ظ): pronounced with tongue retraction
Children can't learn these sounds from text alone — they need to see mouth position. Traditional approach: a teacher demonstrates in person. Our approach: an AI character demonstrates on screen, infinitely patient and always available.
How the Lip-Sync System Works
The Rive Animation Engine
Rive (formerly Flare) is a 2D animation system with state machine support. We use it because:
- State machines enable smooth transitions between idle → speaking → error → celebration
- Runtime manipulation: we change the mouth position programmatically rather than playing pre-rendered sequences
- Single `.riv` file contains all animation states (vs. hundreds of sprite frames)
- GPU-accelerated, 60fps on mid-range devices
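The idle → speaking → error → celebration flow can be pictured as a small transition table. The sketch below is plain TypeScript and does not call the Rive runtime; in practice this logic would presumably be authored inside the `.riv` state machine itself. The event names (`audioStarted`, `mispronounced`, and so on) and the specific transitions are assumptions made for illustration, not Amal's actual design.

```typescript
// Avatar states named in the list above.
type AvatarState = "idle" | "speaking" | "error" | "celebration";

// Hypothetical events that could trigger a transition (assumed names).
type AvatarEvent =
  | "audioStarted"
  | "audioEnded"
  | "mispronounced"
  | "answeredCorrectly"
  | "reactionFinished";

// Transition table: for each state, the events it reacts to and where they lead.
const AVATAR_TRANSITIONS: Record<AvatarState, Partial<Record<AvatarEvent, AvatarState>>> = {
  idle: { audioStarted: "speaking", mispronounced: "error", answeredCorrectly: "celebration" },
  speaking: { audioEnded: "idle" },
  error: { reactionFinished: "idle", audioStarted: "speaking" },
  celebration: { reactionFinished: "idle" },
};

// Returns the next state, or stays in the current one if the event does not apply.
function nextAvatarState(current: AvatarState, evt: AvatarEvent): AvatarState {
  return AVATAR_TRANSITIONS[current][evt] ?? current;
}

const afterAudioStart = nextAvatarState("idle", "audioStarted");    // "speaking"
const ignoredEvent = nextAvatarState("speaking", "mispronounced");  // stays "speaking"
```

Keeping the transitions in one table makes it easy to see that an unexpected event leaves the character where it is rather than jumping to an unrelated animation.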
Speech Marks Pipeline
- Text-to-speech generates audio for "أَنَا" (I)
- TTS returns "speech marks" — precise timestamps for each phoneme
- Our `lip_sync_avatar.json` maps phonemes → Rive mouth states
- `LipSyncController` drives state machine transitions in sync with playback
- Child sees the character's mouth forming the correct position as they hear the sound
TTS Audio + Speech Marks
↓
[Extract Phoneme Timing]
↓
[Map to Rive States]
↓
[Animate Character Mouth]
↓
[Child Sees Mouth Position]
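As a rough illustration of the middle two steps of this pipeline, the sketch below turns phoneme timestamps into a timeline of mouth poses and looks up the pose for a given playback time. Everything named here is an assumption: the `PhonemeMark` shape, the pose names, the contents of the phoneme map (a stand-in for `lip_sync_avatar.json`), and the timings for "أَنَا" are invented for the example and are not taken from Amal's code or from any particular TTS provider.

```typescript
// Hypothetical speech mark: when a phoneme starts, in ms from the start of the audio.
interface PhonemeMark { timeMs: number; phoneme: string; }

// Placeholder mouth poses; the real set would be defined in the .riv file.
type MouthPose = "rest" | "open" | "tongueUp" | "throat";

// Stand-in for lip_sync_avatar.json: phoneme -> mouth pose (illustrative values only).
const PHONEME_TO_POSE = new Map<string, MouthPose>([
  ["ʔ", "rest"], ["a", "open"], ["aː", "open"],
  ["n", "tongueUp"], ["ʕ", "throat"], ["ħ", "throat"],
]);

interface PoseKeyframe { timeMs: number; pose: MouthPose; }

// Sorts marks by time and maps each phoneme to a pose, falling back to "rest".
function buildPoseTimeline(marks: PhonemeMark[]): PoseKeyframe[] {
  return [...marks]
    .sort((a, b) => a.timeMs - b.timeMs)
    .map((m) => ({ timeMs: m.timeMs, pose: PHONEME_TO_POSE.get(m.phoneme) ?? "rest" }));
}

// Returns the pose of the latest keyframe at or before the playback position.
function poseAt(timeline: PoseKeyframe[], playbackMs: number): MouthPose {
  let current: MouthPose = "rest";
  for (const frame of timeline) {
    if (frame.timeMs > playbackMs) break;
    current = frame.pose;
  }
  return current;
}

// Made-up timings for "أَنَا" (/ʔanaː/).
const anaMarks: PhonemeMark[] = [
  { timeMs: 0, phoneme: "ʔ" }, { timeMs: 60, phoneme: "a" },
  { timeMs: 180, phoneme: "n" }, { timeMs: 260, phoneme: "aː" },
];
const anaTimeline = buildPoseTimeline(anaMarks);
const poseMidWord = poseAt(anaTimeline, 200); // "tongueUp" (the "n" keyframe)
```

A controller could call `poseAt` on every animation frame with the audio player's current position and push the result to the character, which keeps the mouth tied to the audio clock instead of a separate timer.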
Multiple Character Variants
- Main Amal character with full-body and face-only variants
- Friendly auxiliary characters for variety and engagement
- Customizable avatars: children choose head shape, clothing, colors, accessories
- Emotional states: idle, speaking, error (encouraging), celebration (praise)
When children customize their character, that personalized avatar teaches them throughout the app — creating emotional investment.
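One way to picture a customizable avatar is as a small configuration object whose parts are swapped independently. The sketch below is an assumption-laden illustration: the `AvatarConfig` fields mirror the options listed above, but the field names, default values, and `customizeAvatar` helper are hypothetical and not Amal's actual data model.

```typescript
// Hypothetical avatar configuration built from independent components.
interface AvatarConfig {
  headShape: string;
  clothing: string;
  primaryColor: string;
  accessories: string[];
}

// Assumed defaults used when a child has not chosen a component yet.
const DEFAULT_AVATAR: AvatarConfig = {
  headShape: "round",
  clothing: "tshirt",
  primaryColor: "#3A86FF",
  accessories: [],
};

// Merges a child's choices over the defaults without mutating either input.
function customizeAvatar(choices: Partial<AvatarConfig>): AvatarConfig {
  return {
    ...DEFAULT_AVATAR,
    ...choices,
    accessories: [...(choices.accessories ?? DEFAULT_AVATAR.accessories)],
  };
}

const childAvatar = customizeAvatar({ headShape: "square", accessories: ["glasses"] });
```

Because each component is a separate field, the same lip-sync mouth states can be applied regardless of which head shape or outfit the child picked.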
Why Rive (Not Lottie or Sprite Sheets)
| Approach | State Machines | Runtime Control | File Size | Performance | Cost |
|---|---|---|---|---|---|
| Rive | ✓ | ✓ | 1.2 MB | 60fps | Engineering time |
| Lottie | ✗ | Partial | 2-3 MB | 30fps | Animation time |
| Sprites | ✗ | Manual | 50+ MB | 60fps | Asset storage |
| Video | N/A | ✗ | 100+ MB | Variable | Hosting cost |
Rive wins because we need programmatic control, state transitions, and compact file sizes for a mobile app serving 95,000+ children.
Educational Impact
Research shows visual-phonetic learning (seeing mouth position while hearing sound) accelerates pronunciation acquisition. Our internal data:
- Children who see lip-sync learn pronunciation 40% faster
- Pronunciation accuracy improves 3x faster with visual feedback
- Particularly effective for diaspora children without Arabic speakers at home
Why Competitors Can't Match This
Reproducing this requires:
- Phonetics expertise (knowing which mouth positions match which sounds)
- Rive animation skills (not trivial — state machine design is complex)
- TTS speech marks integration (not all TTS providers offer this)
- Mobile optimization (Rive rendering at 60fps across devices)
- Character customization system (component-based avatar architecture)
FAQ
Q: Can my child adjust the animation speed?
A: Yes. Slower speeds help with difficult sounds; faster speeds suit advanced learners. The app adapts based on performance.
Q: Do all exercises have lip-sync animation?
A: Speak-out-loud and pronunciation exercises feature full lip-sync. Other exercise types (games, puzzles) use the character for encouragement and reward animations.
Q: Why does the character sometimes show an error animation?
A: When speech recognition detects mispronunciation, the character gently shows a "let's try again" expression. This is encouraging, not punishing: children learn through iterative attempts.



