Amal handles the full complexity of Arabic diacritics: 8 tashkeel marks (fatha, damma, kasra, shadda, sukun, fathatan, dammatan, kasratan), 5 alef forms (standard, madda, hamza above, hamza below, wasla), 3 hamza variants (isolated, on waw, on ya), and Lam-Alef ligatures. The app's speech recognition, text rendering, and similarity scoring all treat diacritized Arabic ("كَتَبَ") differently from undiacritized Arabic ("كتب"), a critical distinction most Arabic learning apps ignore.
Why Diacritics Matter for Learning
The Ambiguity Problem
Arabic without diacritics is ambiguous. The undiacritized word "كتب" can mean:
- "kataba" (كَتَبَ, he wrote): past tense
- "kutub" (كُتُب, books): plural noun
- "kutiba" (كُتِبَ, it was written): passive voice
All are spelled identically without diacritics. Diacritics remove ambiguity.
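The collapse can be checked directly. The sketch below is illustrative only and is not taken from Amal's codebase; `strip_marks` is a hypothetical helper that drops Unicode combining marks (category `Mn`), which includes the tashkeel.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which covers the tashkeel."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Three different vocalizations of the same consonant skeleton k-t-b.
readings = {
    "kataba": "\u0643\u064E\u062A\u064E\u0628\u064E",  # he wrote
    "kutub": "\u0643\u064F\u062A\u064F\u0628",          # books
    "kutiba": "\u0643\u064F\u062A\u0650\u0628\u064E",  # it was written
}

# Once the marks are removed, only one distinct spelling remains.
bases = {strip_marks(word) for word in readings.values()}
```

Here `readings` holds three distinct strings, while `bases` contains a single undiacritized spelling.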
The Learning Progression
- Beginner: Learn to read WITH diacritics (easy — vowels are marked)
- Intermediate: Practice WITH diacritics until automatic
- Advanced: Gradually remove diacritics, reading becomes harder
- Fluent: Read without diacritics fluently (native-level reading)
Most Arabic learning apps skip the first stage: they either don't teach diacritics at all or strip them from the text. This teaches bad habits. Amal follows the full progression, starting from fully diacritized text.
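One way to picture the "gradually remove diacritics" stage is a function that keeps only some of the marks in a word. This is a hypothetical sketch, not Amal's actual mechanism; `thin_diacritics` and its `keep_every` parameter are names invented for illustration.

```python
# The eight tashkeel marks occupy U+064B (fathatan) through U+0652 (sukun).
TASHKEEL = frozenset(chr(cp) for cp in range(0x064B, 0x0653))


def thin_diacritics(text: str, keep_every: int) -> str:
    """Keep every `keep_every`-th mark: 1 keeps all marks, 0 removes them all."""
    out = []
    seen = 0
    for ch in text:
        if ch in TASHKEEL:
            seen += 1
            # Drop this mark unless it falls on a kept position.
            if keep_every == 0 or seen % keep_every != 0:
                continue
        out.append(ch)
    return "".join(out)


kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"
beginner = thin_diacritics(kataba, 1)   # all marks kept
advanced = thin_diacritics(kataba, 2)   # every second mark kept
fluent = thin_diacritics(kataba, 0)     # no marks
```

A real curriculum would more likely choose which marks to drop by how predictable they are, so the fixed stride here is only a placeholder.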
Our Unicode-Level Implementation
The Diacritical Marks (8 total)
// lib/src/utils/arabic_extension.dart
class ArabicExtension {
  static const Map<String, String> tashkeelMarks = {
    'FATHA': '\u064E', // َ (vowel 'a')
    'DAMMA': '\u064F', // ُ (vowel 'u')
    'KASRA': '\u0650', // ِ (vowel 'i')
    'SUKUN': '\u0652', // ْ (no vowel)
    'SHADDA': '\u0651', // ّ (doubled letter)
    'FATHATAN': '\u064B', // ً (tanween 'an')
    'DAMMATAN': '\u064C', // ٌ (tanween 'un')
    'KASRATAN': '\u064D', // ٍ (tanween 'in')
  };

  static const Map<String, String> alefVariants = {
    'ALEF_STANDARD': '\u0627', // ا
    'ALEF_WITH_MADDA': '\u0622', // آ (elongated)
    'ALEF_WITH_HAMZA_ABOVE': '\u0623', // أ
    'ALEF_WITH_HAMZA_BELOW': '\u0625', // إ
    // U+0671 is alef wasla; U+0670 is the superscript (dagger) alef, a different mark
    'ALEF_WASLA': '\u0671', // ٱ (connecting alef)
  };

  static const Map<String, String> hamzaVariants = {
    'HAMZA_ISOLATED': '\u0621', // ء Standalone hamza
    'HAMZA_ON_WAW': '\u0624', // ؤ Hamza on waw (و + hamza)
    'HAMZA_ON_YEH': '\u0626', // ئ Hamza on yeh (ي + hamza)
  };
}
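Because many of these characters look alike in an editor, it helps to check that each constant holds the code point its key claims. The sketch below uses Python's standard `unicodedata` module; `describe` is a hypothetical helper name and is not part of the app.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, official Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")) for ch in text]


# A kaf followed by a fatha: one base letter plus one combining mark.
kaf_fatha = describe("\u0643\u064E")

# Two easily confused code points: alef wasla and superscript (dagger) alef.
wasla = describe("\u0671")
dagger = describe("\u0670")
```

Running `describe` over every value in a constants map is a cheap test that catches a visually similar but wrong character.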
Quranic Diacritics and Uthmani Stops
For Thurayya, we support Quranic-specific marks:
static const Map<String, String> quranicMarks = {
  'STOP_FULL': 'ۖ', // Full stop (‖)
  'STOP_HALF': 'ۗ', // Half stop
  'STOP_QUA': 'ۙ', // Qua stop
  'STOP_NECESSARY': 'ۚ', // Necessary stop
  'TAJWEED_ELONGATION': '', // Elongation indicator
};
Diacritic-Aware Speech Recognition
Context Biasing with Diacritics
When a child is learning "كَتَبَ" (he wrote, past tense), we bias speech recognition toward that exact vocalization:
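# src/services/stt_client.py
from google.cloud import speech

google_stt_client = speech.SpeechClient()


def recognize_with_diacritical_context(audio_bytes, expected_text):
    # expected_text = "كَتَبَ" (with diacritics)
    # Create speech context hint
    speech_context = speech.SpeechContext(
        phrases=[expected_text],
        boost=20.0,  # High boost for expected text
    )
    # Language and hints belong in the recognition config.
    # Raw (headerless) audio also needs encoding and sample_rate_hertz set here.
    config = speech.RecognitionConfig(
        language_code='ar-SA',
        speech_contexts=[speech_context],
    )
    audio = speech.RecognitionAudio(content=audio_bytes)
    # Send to Google Cloud STT
    response = google_stt_client.recognize(config=config, audio=audio)
    # Result: recognition is biased toward the expected word ("kataba")
    return response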
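Whether a recognizer returns diacritized or bare transcripts is worth checking against real output. If it returns bare text, which is an assumption here and not something the code above establishes, it may help to hint both spellings. `build_phrase_hints` is a hypothetical helper, sketched under that assumption.

```python
# The eight tashkeel marks occupy U+064B (fathatan) through U+0652 (sukun).
TASHKEEL = frozenset(chr(cp) for cp in range(0x064B, 0x0653))


def build_phrase_hints(expected_text: str) -> list[str]:
    """Return the expected phrase plus its undiacritized form, without duplicates."""
    bare = "".join(ch for ch in expected_text if ch not in TASHKEEL)
    hints = [expected_text]
    if bare != expected_text:
        hints.append(bare)
    return hints


hints = build_phrase_hints("\u0643\u064E\u062A\u064E\u0628\u064E")
```

The resulting list could be passed as the `phrases` value of the speech context in place of the single expected string.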
Diacritic-Aware Similarity Scoring
Similarity scoring distinguishes diacritized from undiacritized:
def compare_pronunciations(expected, actual):
    """
    expected: "كَتَبَ" (with diacritics)
    actual:   "كتب" (child's attempt, possibly undiacritized)
    """
    # Strip diacritics for coarse comparison
    expected_base = strip_diacritics(expected)  # "كتب"
    actual_base = strip_diacritics(actual)      # "كتب"
    # Base similarity (ignoring diacritics)
    base_similarity = string_similarity(expected_base, actual_base)  # 1.0 (perfect)
    # Diacritical bonus (if child's attempt includes diacritics)
    diacritic_bonus = 0.0
    if has_diacritics(actual):
        diacritic_accuracy = diacritics_match_ratio(expected, actual)
        diacritic_bonus = diacritic_accuracy * 0.15  # Up to +15% for correct diacritics
    # Final score: the base carries 85% and diacritics the remaining 15%.
    # Without this weighting a perfect base would already score 1.0 and the
    # bonus could never raise it.
    final_score = min(base_similarity * 0.85 + diacritic_bonus, 1.0)
    return {
        'base_score': base_similarity,
        'diacritic_bonus': diacritic_bonus,
        'final_score': final_score,
        'feedback': 'Great! Pronunciation is perfect. Next, practice the diacritical marks.'
    }
This means:
- Child says "كتب" (undiacritized) → 85-90% score (correct base, missing diacritics)
- Child says "كَتَبَ" (fully diacritized) → 98%+ score (perfect)
- Progression is clear: first master base pronunciation, then add diacritical subtlety
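The scoring function relies on four helpers that are not shown. The sketch below is one plausible way to write them, not Amal's actual implementation: the helper names come from the snippet above, but their bodies, the use of `difflib.SequenceMatcher`, and the position-by-position alignment of letters are all assumptions.

```python
from difflib import SequenceMatcher

# The eight tashkeel marks occupy U+064B (fathatan) through U+0652 (sukun).
TASHKEEL = frozenset(chr(cp) for cp in range(0x064B, 0x0653))


def strip_diacritics(text: str) -> str:
    return "".join(ch for ch in text if ch not in TASHKEEL)


def has_diacritics(text: str) -> bool:
    return any(ch in TASHKEEL for ch in text)


def string_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; identical strings give 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def _letters_with_marks(text: str) -> list[tuple[str, frozenset]]:
    """Pair each base letter with the set of marks that follow it."""
    groups: list[list] = []
    for ch in text:
        if ch not in TASHKEEL:
            groups.append([ch, set()])
        elif groups:  # a mark with no preceding letter is ignored
            groups[-1][1].add(ch)
    return [(letter, frozenset(marks)) for letter, marks in groups]


def diacritics_match_ratio(expected: str, actual: str) -> float:
    """Fraction of marked letters in `expected` whose marks match in `actual`."""
    exp = _letters_with_marks(expected)
    act = _letters_with_marks(actual)
    marked = [(i, pair) for i, pair in enumerate(exp) if pair[1]]
    if not marked:
        return 1.0
    hits = sum(1 for i, pair in marked if i < len(act) and act[i] == pair)
    return hits / len(marked)
```

Marks are compared as sets so that the order of a shadda and its vowel does not matter. Aligning letters by position is the simplest choice and breaks down when the attempt inserts or drops a letter; an edit-distance alignment would be more robust.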
RTL Rendering Challenges
Text Direction Management
// lib/src/screens/lesson_screen.dart
Column(
  children: [
    Directionality(
      textDirection: TextDirection.rtl, // For Arabic text
      child: Text(
        'كَتَبَ',
        textAlign: TextAlign.right, // Right-aligned for RTL
        style: TextStyle(
          fontFamily: 'IBMPlexSansArabic',
          fontSize: 36,
          height: 1.8, // Extra line height for diacritics
        ),
      ),
    ),
    // English instructions below
    Directionality(
      textDirection: TextDirection.ltr, // For English
      child: Text(
        'Pronounce: "he wrote"',
        textAlign: TextAlign.left, // Left-aligned for LTR
      ),
    ),
  ],
)
Connected Letter Shaping
Arabic letters change form depending on position:
- Isolated: "ك" (Kaf)
- Initial: "كَـــ" (Kaf at start of word)
- Medial: "ـــكَـــ" (Kaf in middle)
- Final: "ـــكَ" (Kaf at end)
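The text engine and the IBMPlexSansArabic font handle shaping automatically. A word only needs its letters in logical order, with each diacritic placed directly after the letter it marks:
// Correct: plain letters in logical order; the joined forms are chosen at render time
String word = 'ك' + 'ت' + 'ب';
// Decorative only: kashida/tatweel (U+0640) stretches the connection between letters.
// It is not needed for shaping and it changes the stored string, so keep it out of
// text that is compared or searched.
String stretched = 'ك' + '\u0640' + 'ت' + '\u0640' + 'ب';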
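Whatever form is used for display, comparison and lookup are simpler when tatweel is removed first, because U+0640 is a visual stretch character and not part of a word's spelling. A minimal sketch, with `remove_tatweel` as a hypothetical helper name:

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, also called kashida


def remove_tatweel(text: str) -> str:
    """Remove the stretch character so stretched and plain spellings compare equal."""
    return text.replace(TATWEEL, "")


plain = "\u0643\u062A\u0628"
stretched = "\u0643\u0640\u062A\u0640\u0628"
same_word = remove_tatweel(stretched) == plain
```

Without this step, `stretched == plain` is false even though both render as the same word.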
Bidirectional Text Mixing
When English and Arabic appear together:
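RichText(
  // The base direction follows the sentence's main language. This sentence is
  // English with one embedded Arabic word, so the paragraph is LTR.
  textDirection: TextDirection.ltr,
  text: TextSpan(
    children: [
      TextSpan(text: 'means ', style: englishStyle), // LTR
      TextSpan(text: 'كتاب', style: arabicStyle), // RTL run, reordered automatically
      TextSpan(text: ' (book)', style: englishStyle), // LTR
    ],
  ),
)
Result: "means كتاب (book)" reads left to right, with the Arabic word laid out right to left inside it. With TextDirection.rtl as the base, the three segments would appear in the opposite visual order, which suits a sentence that is mainly Arabic.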
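One common heuristic for choosing the base direction of a mixed string is "first strong character": scan for the first character with a strong direction and use that. The sketch below is a simplified version of rules P2 and P3 of the Unicode Bidirectional Algorithm (it ignores directional isolates); `base_direction` is a hypothetical helper and not part of the app.

```python
import unicodedata


def base_direction(text: str, default: str = "ltr") -> str:
    """Return 'ltr' or 'rtl' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # Latin and other left-to-right letters
            return "ltr"
        if bidi_class in ("R", "AL"):    # Hebrew (R) and Arabic (AL) letters
            return "rtl"
    return default                        # only digits, spaces, or punctuation


english_led = base_direction("means \u0643\u062A\u0627\u0628 (book)")
arabic_led = base_direction("\u0643\u062A\u0627\u0628 means book")
neutral_only = base_direction("123 (?)")
```

The returned value could be mapped to `TextDirection.ltr` or `TextDirection.rtl` when building the widget; an explicit per-lesson setting is still safer when the first word is not representative of the sentence.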
FAQ
Q: Why force diacritics on beginner learners? Doesn't that make it harder? A: Initially, yes. But learning with diacritics creates stronger letter-sound associations, and fluency builds on those associations. Once a learner has mastered diacritized text, reading without diacritics is a natural progression.
Q: What if my child's keyboard doesn't support typing diacritics? A: The app never asks children to type diacritics. Recognition and pronunciation are speech-based. Only adults (teachers, content creators) need to input diacritics, and they use specialized Arabic keyboards.
Q: Does Amal support non-standard diacritical combinations? A: We support all Unicode-standardized combinations. Rare or custom combinations may not render correctly, but standard Quranic and modern Arabic are fully supported.
Related reading
See our Arabic alphabet learning page and how Amal works for early readers.



