Inside Our AI Marketing Video Engine
Mohammad Shaker · 7 min read

Alphazed built an AI workflow that finds Arabic education topics, drafts scripts, creates visuals, and publishes finished videos faster.

Quick Answer

Alphazed built an automated AI marketing pipeline that discovers trending Arabic education topics on YouTube, scores them for relevance, generates video scripts in Arabic, creates images with DALL-E, synthesizes voiceovers with ElevenLabs, composes videos with FFmpeg, runs kids-safety compliance checks, and publishes to YouTube. Every step runs without manual work except one: a human approval gate in Slack that each video must pass before anything goes live.

The 13-Step Pipeline

Step 1: Discover Trends
  └─ YouTubeAPI: Fetch trending videos in Arabic education category
     Search queries: "تعليم", "أطفال", "تعلم", "عربي"
     Extract: title, views, velocity (views per day), comments, subscriber engagement

Step 2: Score Trends
  └─ TrendScorer: Weighted formula
     Score = (views × 0.35) + (velocity × 0.30) + (topic_fit × 0.20) + (region × 0.10) + (safety × 0.05)
     Threshold: Only trends scoring 75 or higher proceed

Step 3: Ideate
  └─ ContentIdeator: Generate video concept
     Input: Trending topic (e.g., "تحفيز الأطفال على تعلم العربية")
     Output: Video concept, target age, learning objective

Step 4: Script Generation
  └─ GPT-4o: Generate Arabic video script
     Prompt: "Create a 2-minute YouTube shorts script about [topic] for [age] children in Arabic"
     Output: Scene-by-scene script with narration

Step 5: Hook Variants
  └─ HookGenerator: Create 3 different opening hooks
     Variant 1: Story-based opening
     Variant 2: Question-based opening
     Variant 3: Challenge-based opening
     Later: A/B test the variants to see which hook gets the highest CTR

Step 6: Storyboard
  └─ StoryboardGenerator: Create visual sequence
     Input: Script
     Output: Shot-by-shot breakdown (20-30 shots for 2-minute video)

Step 7: Image Generation
  └─ DALL-E: Generate visuals for each shot
     Prompt: "Child learning Arabic letter ب in a colorful classroom"
     Output: 20-30 images, style-matched

Step 8: Voiceover Synthesis
  └─ ElevenLabs: Generate Arabic narration
     Voice: Female voice, child-friendly, clear articulation
     Language: Arabic (Saudi dialect for broad appeal)
     Output: MP3 audio, speech marks for lip-sync reference

Step 9: Video Composition
  └─ FFmpeg: Assemble video
     Input: Images (step 7) + audio (step 8) + background music
     Output: MP4 video, 1080p, optimized for YouTube Shorts

Step 10: Compliance Check
  └─ KidsSafetyChecker: LLM scan for inappropriate content
     Check: No violence, no inappropriate language, no third-party IP
     Output: Pass/Fail + notes

Step 11: Slack Approval Gate
  └─ SlackBot: Post video preview + metadata
     Team reviews: thumbnail, title, description, transcript
     Approval options: ✓ Publish | 🔄 Revise | ✗ Reject

Step 12: Publish
  └─ YouTubeAPI: Upload to Alphazed channel
     Title, description, tags, thumbnail
     Visibility: Public

Step 13: Analytics Sync
  └─ YouTubeAnalytics: Track performance
     Metrics: Views, CTR, avg. watch duration, shares
     Feedback: Use metrics to improve future scripts
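
The thirteen steps above form a linear chain with two gates: the score threshold after Step 2 and the reviewer decision at Step 11. A minimal sketch of that control flow follows. The `Step` type, `run_pipeline`, and the stand-in step functions are hypothetical names used for illustration, not Alphazed's actual code; real steps would call YouTube, an LLM, FFmpeg, and Slack.

```python
from dataclasses import dataclass
from typing import Callable

# A step receives the shared context dict and returns an updated copy.
StepFn = Callable[[dict], dict]


@dataclass
class Step:
    name: str
    run: StepFn


def run_pipeline(steps: list[Step], context: dict) -> dict:
    """Run steps in order; stop as soon as a step sets context['halt']."""
    context = {**context, "completed": []}
    for step in steps:
        context = step.run(context)
        context["completed"].append(step.name)
        if context.get("halt"):
            break
    return context


# Stand-in steps (hypothetical), each condensed to one line of logic.
def score_gate(ctx: dict) -> dict:
    if ctx["score"] < 75:
        return {**ctx, "halt": "score below threshold"}
    return ctx


def compose_video(ctx: dict) -> dict:
    return {**ctx, "video": f"video for {ctx['topic']}"}


def approval_gate(ctx: dict) -> dict:
    # In production the decision would come from a reviewer in Slack.
    if ctx["decision"] != "publish":
        return {**ctx, "halt": f"reviewer chose {ctx['decision']}"}
    return ctx


def publish(ctx: dict) -> dict:
    return {**ctx, "published": True}


STEPS = [
    Step("score", score_gate),
    Step("compose", compose_video),
    Step("approve", approval_gate),
    Step("publish", publish),
]

result = run_pipeline(STEPS, {"topic": "تعليم العربية", "score": 89, "decision": "revise"})
print(result["completed"], result.get("published", False))
```

Because each gate only sets a `halt` reason, a rejected or low-scoring topic never reaches the publish step, and the reason is kept in the context for logging.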

Trend Scoring Algorithm (Step 2)

The Formula

def score_trend(trend_data):
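    """
    Score a trending video on a 0-100 scale.

    trend_data = {
        'title': 'تعليم الأطفال العربية',
        'views': 500000,
        'days_since_upload': 7,
        'topic': 'تعليم الأطفال العربية',
        'language': 'ar',
        'age_group': '5-12',
        'video_category': 'education',
        'region': 'MENA'  # optional; defaults to 'unknown'
    }
    """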
    
    # Component 1: Raw popularity; 1M views or more earns the full 100 points
    popularity_score = min(trend_data['views'] / 1_000_000, 1.0) * 100
    
    # Component 2: Velocity (average views per day); 100k views/day or more earns 100 points
    # Treat same-day uploads as one day old to avoid dividing by zero
    days = max(trend_data['days_since_upload'], 1)
    velocity = trend_data['views'] / days
    velocity_score = min(velocity / 100_000, 1.0) * 100
    
    # Component 3: Topic fit
    relevant_keywords = ['عربية', 'قرآن', 'أطفال', 'تعليم', 'لغة']
    keyword_matches = sum(1 for kw in relevant_keywords if kw in trend_data['topic'])
    topic_fit_score = (keyword_matches / len(relevant_keywords)) * 100
    
    # Component 4: Regional relevance
    # Videos trending in MENA, South Asia, Malaysia score higher
    region_score = get_region_weight(trend_data.get('region', 'unknown')) * 100
    
    # Component 5: Safety (quick LLM check on the title, or on the topic if no title is present)
    safety_score = 100 if is_kid_safe(trend_data.get('title', trend_data['topic'])) else 0
    
    # Weighted sum
    final_score = (
        popularity_score * 0.35 +
        velocity_score * 0.30 +
        topic_fit_score * 0.20 +
        region_score * 0.10 +
        safety_score * 0.05
    )
    
    return {
        'overall_score': final_score,
        'pass_threshold': final_score >= 75,
        'breakdown': {
            'popularity': popularity_score,
            'velocity': velocity_score,
            'topic_fit': topic_fit_score,
            'region': region_score,
            'safety': safety_score
        }
    }
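
`score_trend` calls two helpers that the article does not show: `get_region_weight` and `is_kid_safe`. The sketch below is one guess at their shape so the function can be run locally. The weight table is an assumption (0.9 for MENA and 0.6 for North America are back-derived from the region scores in the worked examples; the other values are placeholders), and the keyword blocklist is only a stand-in for the LLM safety check, which it does not replicate.

```python
# Hypothetical region weights on a 0-1 scale; score_trend multiplies by 100.
REGION_WEIGHTS = {
    "MENA": 0.9,
    "SOUTH_ASIA": 0.9,      # assumed
    "MALAYSIA": 0.9,        # assumed
    "NORTH_AMERICA": 0.6,
}
DEFAULT_REGION_WEIGHT = 0.3  # assumed fallback for unknown regions


def get_region_weight(region: str) -> float:
    """Look up a region weight, tolerating case and spacing differences."""
    key = region.strip().upper().replace(" ", "_")
    return REGION_WEIGHTS.get(key, DEFAULT_REGION_WEIGHT)


# Placeholder blocklist; the real pipeline uses an LLM check instead.
BLOCKED_TERMS = ("عنف", "violence")


def is_kid_safe(title: str) -> bool:
    """Return False if the title contains any blocked term."""
    lowered = title.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)
```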

Example: Scoring a Trending Video

Trend: "How to teach kids Arabic letters at home"
Metrics:
  - Views: 500,000
  - Days: 7
  - Keywords: Contains "teach", "kids", "Arabic"
  - Region: US + Canada
  - Safety: Clean

Scoring:
  - Popularity: 50 (500k / 1M)
  - Velocity: 71 (500k views / 7 days = ~71k/day)
  - Topic fit: 60 (3 of 5 keywords match)
  - Region: 60 (US diaspora)
  - Safety: 100 (clean)
  
Final: (50 × 0.35) + (71 × 0.30) + (60 × 0.20) + (60 × 0.10) + (100 × 0.05)
      = 17.5 + 21.3 + 12 + 6 + 5
      = 61.8 → FAIL (below 75 threshold)

Higher-Scoring Example

Trend: "تعليم القرآن للأطفال - طرق فعالة"
Metrics:
  - Views: 2,000,000 (viral)
  - Days: 3 (fast growth)
  - Keywords: "قرآن", "أطفال", "تعليم" (3 of 5 match)
  - Region: MENA + South Asia
  - Safety: Clean

Scoring:
  - Popularity: 100 (capped)
  - Velocity: 100 (2M / 3 days = ~667k/day, capped)
  - Topic fit: 60 (3 of 5 keywords match)
  - Region: 90 (MENA + diaspora)
  - Safety: 100
  
Final: (100 × 0.35) + (100 × 0.30) + (60 × 0.20) + (90 × 0.10) + (100 × 0.05)
      = 35 + 30 + 12 + 9 + 5
      = 91 → PASS (excellent fit)

Human-in-the-Loop: Mandatory Approval

Before any video publishes, it goes to Slack for team review:

Slack Notification

🎥 [Pipeline] Ready for Review: Video #47

Title: "كيف تعلم ابنك حروف العربية بسهولة"
Topic Score: 89/100
Estimated Views (ML model): 85,000-120,000

[Preview Video] [View Transcript] [View Analysis]

Compliance Status: ✅ Pass
  - No violence: ✓
  - Age appropriate: ✓
  - No IP violations: ✓

Actions: ✓ Publish | 🔄 Revise | ✗ Reject
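
A notification like the one above maps naturally onto Slack's Block Kit: two text sections plus an `actions` block holding the three buttons. The sketch below only builds the message payload; posting it (for example through Slack's `chat.postMessage` method) and handling the button callbacks are left out. The function name and the `action_id` values are hypothetical.

```python
def build_review_message(video_id: int, title: str, topic_score: int,
                         compliance: dict[str, bool]) -> dict:
    """Build a Block Kit payload for the review notification."""
    checks = "\n".join(
        f"{'✓' if passed else '✗'} {label}" for label, passed in compliance.items()
    )
    status = "✅ Pass" if all(compliance.values()) else "❌ Fail"

    def button(label: str, action_id: str, style: str = "") -> dict:
        element = {
            "type": "button",
            "text": {"type": "plain_text", "text": label},
            "action_id": action_id,     # hypothetical identifiers
            "value": str(video_id),     # lets the handler know which video
        }
        if style:                       # Block Kit accepts "primary" or "danger"
            element["style"] = style
        return element

    header = f"Ready for Review: Video #{video_id}"
    return {
        "text": header,  # plain-text fallback shown in notifications
        "blocks": [
            {"type": "section", "text": {
                "type": "mrkdwn",
                "text": f"*{header}*\nTitle: {title}\nTopic Score: {topic_score}/100",
            }},
            {"type": "section", "text": {
                "type": "mrkdwn",
                "text": f"Compliance Status: {status}\n{checks}",
            }},
            {"type": "actions", "elements": [
                button("Publish", "video_publish", "primary"),
                button("Revise", "video_revise"),
                button("Reject", "video_reject", "danger"),
            ]},
        ],
    }


message = build_review_message(
    47, "كيف تعلم ابنك حروف العربية بسهولة", 89,
    {"No violence": True, "Age appropriate": True, "No IP violations": True},
)
```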

Special Gates

  • Quran content: Extra scholarly review
  • New trends: Additional manual review
  • High-velocity trends: Higher pipeline priority
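
These gates could be expressed as a small rule function that runs before the Slack post. This is a sketch under assumptions: the input flags, the reviewer labels, and the velocity cutoff of 90 are all hypothetical, since the article does not say how a trend is classified as new or high-velocity.

```python
from dataclasses import dataclass


@dataclass
class ReviewPlan:
    reviewers: list[str]
    priority: str


def plan_review(is_quran_content: bool, is_new_trend: bool,
                velocity_score: float,
                high_velocity_cutoff: float = 90.0) -> ReviewPlan:
    """Decide who must review a video and how urgently it is processed."""
    reviewers = ["team"]                    # the standard Slack approval
    if is_quran_content:
        reviewers.append("scholar")         # extra scholarly review
    if is_new_trend:
        reviewers.append("manual")          # additional manual review
    priority = "high" if velocity_score >= high_velocity_cutoff else "normal"
    return ReviewPlan(reviewers, priority)


plan = plan_review(is_quran_content=True, is_new_trend=False, velocity_score=100)
print(plan.reviewers, plan.priority)
```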

Ports & Adapters Architecture

The pipeline is designed to swap providers without touching business logic:

# src/services/content_generation/interfaces.py
from abc import ABC, abstractmethod


class TextGeneratorInterface(ABC):
    """Port: any script generator the pipeline can call."""

    @abstractmethod
    def generate_script(self, topic: str, age_group: str) -> str:
        """Return an Arabic video script for the topic and age group."""


class OpenAIScriptGenerator(TextGeneratorInterface):
    def generate_script(self, topic: str, age_group: str) -> str:
        # Use OpenAI API (call omitted)
        ...


class ClaudeScriptGenerator(TextGeneratorInterface):
    def generate_script(self, topic: str, age_group: str) -> str:
        # Use Anthropic API (call omitted)
        ...


# At runtime, inject the right provider
script_generator = ClaudeScriptGenerator()  # Easy to swap
script = script_generator.generate_script('تعليم العربية', '5-7')

Benefit: If OpenAI goes down, switch to Claude with one config change.
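
For the swap to be a true one-line config change, the provider name can be read from configuration and resolved through a registry instead of being instantiated in code. A minimal sketch, assuming an environment variable: `SCRIPT_PROVIDER`, `PROVIDERS`, and `make_script_generator` are hypothetical names, and the two generator classes here are stubs that return placeholder text rather than calling any API.

```python
import os
from abc import ABC, abstractmethod
from typing import Optional


class TextGeneratorInterface(ABC):
    @abstractmethod
    def generate_script(self, topic: str, age_group: str) -> str:
        """Return a script for the topic and age group."""


class OpenAIScriptGenerator(TextGeneratorInterface):
    def generate_script(self, topic: str, age_group: str) -> str:
        return f"[openai stub] {topic} ({age_group})"  # real adapter calls the API


class ClaudeScriptGenerator(TextGeneratorInterface):
    def generate_script(self, topic: str, age_group: str) -> str:
        return f"[claude stub] {topic} ({age_group})"  # real adapter calls the API


# Registry of available adapters, keyed by the name used in configuration.
PROVIDERS = {
    "openai": OpenAIScriptGenerator,
    "claude": ClaudeScriptGenerator,
}


def make_script_generator(name: Optional[str] = None) -> TextGeneratorInterface:
    """Instantiate the adapter named explicitly or in SCRIPT_PROVIDER."""
    chosen = (name or os.environ.get("SCRIPT_PROVIDER", "openai")).lower()
    if chosen not in PROVIDERS:
        raise ValueError(f"Unknown script provider: {chosen!r}")
    return PROVIDERS[chosen]()


generator = make_script_generator("claude")
print(generator.generate_script("تعليم العربية", "5-7"))
```

Business logic depends only on `TextGeneratorInterface`, so adding a third provider means adding one class and one registry entry.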

Results

Volume

  • Input: 50-100 trending topics per week
  • Threshold pass rate: ~20% (roughly 10-20 trends pass scoring)
  • Published: ~3-4 videos per week
  • Annual output: 150-200 videos

Performance (Actual Data)

  • Average views per video: 12,000-45,000
  • Average CTR: 8-12% (industry: 2-5%)
  • Average watch time: 65-85% of video length (industry: 40-50%)
  • Conversion (views → app installs): 3-5% (industry: 0.5-1%)

Cost

  • AI generation per video: $3-5 (GPT, DALL-E, ElevenLabs)
  • Human review: 15 min × $25/hour = $6.25
  • YouTube hosting: Free
  • Total per video: ~$10
  • Cost per install: ~$2-3 (calculated from 3-5% conversion)
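
The per-video total follows directly from the line items above. A quick check of that arithmetic, using only the figures listed (the variable names are illustrative):

```python
ai_cost_range = (3.00, 5.00)    # USD per video: GPT, DALL-E, ElevenLabs
review_minutes = 15
reviewer_hourly_rate = 25.00    # USD

# Human review cost: a quarter of an hour at the hourly rate.
review_cost = review_minutes / 60 * reviewer_hourly_rate

total_low = ai_cost_range[0] + review_cost
total_high = ai_cost_range[1] + review_cost

print(f"Review: ${review_cost:.2f}")
print(f"Total per video: ${total_low:.2f}-${total_high:.2f}")
```

The range works out to $9.25-$11.25, which is where the "~$10" figure comes from.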

FAQ

Q: What if a generated script contains inaccurate Arabic? A: The human approval gate catches this. If the script has grammatical errors or is culturally insensitive, the reviewer selects "Revise" and adds notes, and the pipeline regenerates the script using that feedback.

Q: Does this violate YouTube's automation policies? A: No. We have human review before publishing (mandatory Slack gate). YouTube allows AI-assisted content as long as it's not fully automated without oversight.

Q: Can AI-generated videos rank well in search? A: Yes, if they're high-quality. The algorithm doesn't penalize AI generation; it rewards watch time, CTR, and engagement. Our videos beat the industry ranges on both CTR and watch time, as shown under Results.

See how we generate learning content at scale and why multilingual reach matters for Arabic families.
