
Business Motivation: Video-to-Video Comparison for Compliance-Driven Content Production

Objective

To accelerate content production and improve consistency and compliance by introducing an automated, segment-aware video comparison feature that removes manual effort, improves feedback precision, and ensures repeatable validation of creative changes.


Business Context

In regulated and brand-sensitive environments—such as retail marketing, pharmaceuticals, or legally regulated advertising—teams routinely manage multiple versions of video content. Verifying what has changed between versions is often a manual, time-consuming, and error-prone process, typically involving side-by-side playback and human interpretation.

With compliance teams under pressure to ensure that no unauthorised changes have occurred—particularly with claims, visual assets, or timing constraints—manual checks become a bottleneck, delaying approvals and increasing the risk of missed errors.


Motivation and Benefits

1. Remove Manual Comparison Tasks

  • Eliminate the need for frame-by-frame human review.
  • Automate detection of changes such as reordering, insertions, deletions, or modifications at the segment level.
  • Enable users to trust the tool to highlight only the relevant differences, saving hours per review cycle.

2. Provide Precision Feedback

  • Deliver exact timestamps, similarity scores, and change classifications (e.g. Added, Removed, Modified).
  • Allow reviewers to comment or tag only on affected segments instead of reviewing entire assets.
  • Generate auditable and exportable reports for cross-functional validation and compliance logs.

3. Ensure Versioning Consistency

  • Reinforce proper version control across media assets.
  • Prevent subtle or unapproved changes from being overlooked by enabling deterministic comparisons.
  • Improve traceability of changes and maintain a single source of truth for asset evolution.

4. Accelerate Compliance Workflows

  • Shorten approval timelines for localisations, adaptations, and re-edits.
  • Empower legal and brand teams to confidently approve content without requiring full creative reviews.
  • Integrate results directly into project management and digital asset management systems for seamless traceability.

Strategic Impact

This feature directly supports content scalability and compliance assurance, while enabling the broader organisation to:

  • Reduce risk of non-compliant or off-brand releases.
  • Improve speed-to-market for campaigns with high volumes of variants.
  • Build a foundation for further automation in creative QA and brand governance.

MVP Scenarios

Side-by-Side Playback (With Locking)

The comparison tool must support side-by-side playback of the old and new video versions, with frame-synchronised locking. This is required for both visual review and debugging of detected segment changes. The UI must highlight the current segment type during playback (e.g. Common, Modified, Added, Removed, Moved).
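The locking maths can be sketched as a proportional mapping between matched ranges. This is a minimal illustration, assuming matched segments are available as hypothetical (old_start, old_end, new_start, new_end) tuples; it is not the actual player implementation.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a matched segment pair:
# (old_start, old_end, new_start, new_end), all in seconds.
SegmentRange = Tuple[float, float, float, float]

def map_old_to_new(t: float, segments: List[SegmentRange]) -> Optional[float]:
    """Map a playhead position in the old video to the locked position
    in the new video, proportionally within the matching segment."""
    for old_start, old_end, new_start, new_end in segments:
        if old_start <= t <= old_end:
            if old_end == old_start:
                return new_start
            fraction = (t - old_start) / (old_end - old_start)
            return new_start + fraction * (new_end - new_start)
    return None  # no matching segment: the content was removed

# Example: a segment moved from 0-8s (old) to 9-17s (new)
print(map_old_to_new(4.0, [(0.0, 8.0, 9.0, 17.0)]))  # 13.0
```

Added or removed segments have no counterpart range, which is why the function returns None when the playhead falls outside every matched segment.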

Tabular Difference View

All detected segments should be shown in a table view. For each segment, display:

  • Segment Type: Common, Modified, Added, Removed, Moved
  • Timestamps in old and new videos (TemporalAnchor)
  • Duration
  • Similarity score (if applicable)
  • Hypothesis (for Modified or Added segments)
  • Navigation link to jump to the segment in the video player
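The table rows above can be derived directly from segment DTOs. A minimal sketch, assuming segments arrive as plain dicts using the field names of the pseudo-code models later in this document; segment_to_row is a hypothetical helper, not part of the design itself.

```python
def segment_to_row(segment: dict) -> dict:
    """Flatten one segment DTO into a row for the tabular difference view.
    Duration and the player jump target are derived from the anchors."""
    old_range = segment.get("old_range")      # absent for AddedSegment
    new_range = segment.get("new_range")      # absent for RemovedSegment
    anchor = new_range or old_range
    return {
        "type": segment["type"],
        "old_timestamps": (old_range["start"], old_range["end"]) if old_range else None,
        "new_timestamps": (new_range["start"], new_range["end"]) if new_range else None,
        "duration": anchor["end"] - anchor["start"],
        "similarity": segment.get("similarity"),  # None for Added/Removed
        "hypothesis": segment.get("hypothesis"),  # None for Common/Moved
        "jump_to": anchor["start"],               # seek target for the navigation link
    }

row = segment_to_row({
    "type": "ModifiedSegment",
    "old_range": {"start": 11, "end": 20},
    "new_range": {"start": 11, "end": 20},
    "similarity": 0.93,
})
print(row["duration"], row["similarity"])  # 9 0.93
```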

Segment Types (Implemented as)

  • CommonSegment: Identical content in both videos
  • ModifiedSegment: Slight variation between matched content (e.g. edit, lighting, object added)
  • AddedSegment: New content appears only in the new video
  • RemovedSegment: Content appears only in the old video
  • MovedSegment: Same content exists but appears in a different position

Use Case 1: Complete Matching Video

Description: The videos are identical. Segments:

  • 1 CommonSegment: 100% similarity, spans 0–10s

Outcome: Match confirmed, no additional insight needed.

Use Case 2: Partial Fine-Grained Modification

Description: Half the video is unchanged, the other half slightly altered. Segments:

  • 1 CommonSegment: 0–10s, similarity 100%
  • 1 ModifiedSegment: 11–20s, similarity ~95%, e.g. colour change or object shift

Outcome: Indicates visual consistency with a specific minor change.

Use Case 3: Segment Replaced by New

Description: A segment in the middle has been removed and a new one inserted. Segments:

  • 1 CommonSegment: 0–10s, similarity 100%
  • 1 RemovedSegment: 11–20s (from old video)
  • 1 AddedSegment: 11–20s (in new video)

Outcome: Clearly shows content substitution.

Use Case 4: Inserted Segment Leads to Longer Video

Description: A new segment was added in the middle; total duration is longer. Segments:

  • 1 CommonSegment: 0–10s
  • 1 AddedSegment: 11–14s (only in new video)
  • 1 CommonSegment: 15–25s

Outcome: Timeline misalignment due to new content.

Use Case 5: Middle Section Changed, Duration Unchanged

Description: One segment replaced by another of the same length. Segments:

  • 1 CommonSegment: 0–8s
  • 1 RemovedSegment: 9–12s (old only)
  • 1 AddedSegment: 9–12s (new only)
  • 1 CommonSegment: 13–20s

Outcome: Content changed but video remains same length.

Use Case 6: Reordered Content

Description: Same segments appear but in different order. Length unchanged. Segments:

  • 1 MovedSegment: Old 0–8s → New 9–16s
  • 1 MovedSegment: Old 9–16s → New 0–8s

Outcome: Reordering is recognised via diagonal segment match patterns.
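The "diagonal match pattern" idea can be sketched as a crossing check over matched start times: if a later piece of old content matches an earlier position in the new video, the pair is off-diagonal and the content has moved. find_moved is a hypothetical helper, not part of the design above.

```python
from typing import List, Tuple

def find_moved(matches: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Given matched segment start times as (old_start, new_start) pairs,
    flag pairs that break monotonic order; a crossing means the content
    appears in a different position in the new video."""
    moved: List[Tuple[float, float]] = []
    ordered = sorted(matches)  # sort by old_start
    for i, (old_a, new_a) in enumerate(ordered):
        for old_b, new_b in ordered[i + 1:]:
            if new_a > new_b:  # later old content appears earlier in new
                for pair in ((old_a, new_a), (old_b, new_b)):
                    if pair not in moved:
                        moved.append(pair)
    return moved

# Use Case 6: old 0s -> new 9s and old 9s -> new 0s cross each other
print(find_moved([(0.0, 9.0), (9.0, 0.0)]))  # [(0.0, 9.0), (9.0, 0.0)]
```

In the identical-order case every match stays on the diagonal and the function returns an empty list.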

Technical Differences

Definition

Technical differences are metadata-level mismatches that do not involve content, e.g.:

  • Duration
  • Resolution
  • Framerate
  • Codec
  • Audio encoding
  • Colour profile

Representation

Stored as TechDifference objects with:

  • type
  • old_value
  • new_value
  • optional description

Example

{
  "type": "resolution",
  "old_value": "1920x1080",
  "new_value": "1280x720"
}
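One plausible way to produce such objects is to compare already-extracted container metadata (e.g. the output of an ffprobe run parsed into flat dicts). tech_differences below is a hypothetical sketch under that assumption, not the actual implementation.

```python
def tech_differences(old_meta: dict, new_meta: dict) -> list:
    """Compare flat metadata dicts for the two videos and emit
    TechDifference-shaped dicts for every mismatched tracked key."""
    tracked = ["duration", "resolution", "frame_rate", "codec", "audio"]
    diffs = []
    for key in tracked:
        old_value, new_value = old_meta.get(key), new_meta.get(key)
        if old_value != new_value:
            diffs.append({"type": key, "old_value": old_value, "new_value": new_value})
    return diffs

print(tech_differences(
    {"resolution": "1920x1080", "codec": "h264"},
    {"resolution": "1280x720", "codec": "h264"},
))  # [{'type': 'resolution', 'old_value': '1920x1080', 'new_value': '1280x720'}]
```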

Audio Differences

Voiceover / Dialogue Changes

  • Detected as speech transcription mismatch
  • Shown as side-by-side transcript + metadata (speaker, language, tone)
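A side-by-side transcript mismatch can be sketched with the standard library's difflib; transcript_diff is a hypothetical helper and assumes the transcripts are already segmented into lines.

```python
import difflib

def transcript_diff(old_lines, new_lines):
    """Align two transcripts and return (tag, old_chunk, new_chunk) tuples;
    'replace' chunks mark changed dialogue, 'equal' chunks are unchanged."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    return [
        (tag, old_lines[i1:i2], new_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]

old = ["Welcome to our store.", "Prices start at ten pounds."]
new = ["Welcome to our store.", "Prices start at nine pounds."]
for tag, a, b in transcript_diff(old, new):
    print(tag, a, b)
# equal ['Welcome to our store.'] ['Welcome to our store.']
# replace ['Prices start at ten pounds.'] ['Prices start at nine pounds.']
```

The replace chunks are exactly what a claims-focused compliance reviewer would need to inspect.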

Background Music Differences

  • Detected via fingerprint mismatch or waveform analysis
  • Can be visualised as audio timeline diff
  • Marked as TechDifference if relevant

Frame-Aligned Audio Diff

  • Differences in tone, pitch, or volume per segment
  • May contribute to segment classification (e.g. ModifiedSegment)
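A minimal sketch of a per-segment volume check, assuming decoded audio samples are available as plain lists; the tolerance value and the promotion rule are illustrative assumptions, not specified behaviour.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def volume_changed(old_samples, new_samples, tolerance=0.1):
    """Flag a segment whose loudness shifted by more than `tolerance`
    (relative); such a segment could be promoted to ModifiedSegment."""
    old_level, new_level = rms(old_samples), rms(new_samples)
    return abs(new_level - old_level) > tolerance * max(old_level, 1e-9)

quiet = [0.1, -0.1, 0.1, -0.1]
loud = [0.5, -0.5, 0.5, -0.5]
print(volume_changed(quiet, loud))   # True
print(volume_changed(quiet, quiet))  # False
```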


Process View

Sequence diagram (participants: User, UI, Postgres, EventQueue, SidekickProcessor, API):

  • Submission Triggered Automatically (Version N vs N-1): the user uploads a new video asset to a Card; the UI creates a comparison submission in Postgres (JSONB DTO, status=pending) and emits a "ComparisonSubmissionCreated" event.
  • Manual Comparison (User-Initiated): the user selects Version A and Version B and triggers the Compare action; the UI creates a comparison submission (JSONB DTO, status=pending) and emits a "ComparisonSubmissionCreated" event.
  • Queue-based Asynchronous Processing: the EventQueue delivers the JSONB payload to the SidekickProcessor, which runs the comparison logic and submits results with status=complete via the API (update by submission ID); the comparison row is updated with status=complete and the result payload.
  • UI Polls or Listens: the UI polls or subscribes for submission status, receives status + results, and displays result segments, scores, technical diffs and time taken.

Data Model Views

Domain Model

Class diagram:

  • ComparisonSession: UUID id, Video oldVideo, Video newVideo, ComparisonResult comparisonResult
  • Video: str id, float duration_seconds
  • ComparisonResult: str oldVideoId, str newVideoId, float similarityScore, List~Segment~ segments, List~TechDifference~ technicalDiffs
  • TemporalAnchor: float start, float end, float? duration, Literal type ("time", "frame", "character")
  • TechDifference: str old_value, str new_value, str? description, Literal type ("duration", "resolution", "frame_rate", "codec", "audio")
  • Segment (union): CommonSegment (old_range, new_range, similarity = 1.0), ModifiedSegment (old_range, new_range, similarity), AddedSegment (new_range, str? hypothesis), RemovedSegment (old_range, str? hypothesis), MovedSegment (old_range, new_range, similarity)
Pseudo Code Implementation


from typing import List, Optional, Union, Literal
from uuid import UUID
from pydantic import BaseModel, Field

# --- Pseudo Code to Capture the Design Session ---
# -- Domain Primitives --

class Video(BaseModel):
    """
    Represents a video asset in the system.
    """
    id: str  # unique identifier, could be path or database ID
    duration_seconds: float

# TODO: Reference from module, also use Union not instance
class TemporalAnchor(BaseModel):
    """
    Represents a temporal anchor within a media object.

    Can be based on time (seconds), frame number, or character offset.
    """
    type: Literal["time", "frame", "character"] = "time"
    start: float = Field(..., description="Start time in seconds, frame number or character offset")
    end: float = Field(..., description="End time in seconds, frame number or character offset")
    duration: Optional[float] = Field(default=None, description="Duration in seconds, frame count or character count")


# -- Segment Match Types --

class CommonSegment(BaseModel):
    """
    A segment that appears unchanged in both videos.

    Indicates 100% similarity for the given temporal anchor ranges.
    """
    old_range: TemporalAnchor
    new_range: TemporalAnchor
    similarity: Literal[1.0] = 1.0  # fixed at 1.0; Field(const=True) was removed in pydantic v2


class ModifiedSegment(BaseModel):
    """
    A segment that exists in both videos but has been slightly altered.

    Example: lighting change, subtitle variation, content trimmed.
    """
    old_range: TemporalAnchor
    new_range: TemporalAnchor
    similarity: float  # < 1.0 and > threshold


class AddedSegment(BaseModel):
    """
    A new segment that is present only in the new video.
    """
    new_range: TemporalAnchor
    hypothesis: Optional[str] = None  # human-interpretable guess


class RemovedSegment(BaseModel):
    """
    A segment that was removed from the old video.
    """
    old_range: TemporalAnchor
    hypothesis: Optional[str] = None


class MovedSegment(BaseModel):
    """
    A segment that is present in both videos but reordered.

    Useful for detecting structural edits (e.g., reordering intro).
    """
    old_range: TemporalAnchor
    new_range: TemporalAnchor
    similarity: float


# -- Segment Union Type --

Segment = Union[
    CommonSegment,
    ModifiedSegment,
    AddedSegment,
    RemovedSegment,
    MovedSegment,
]


# -- Technical Differences --

class TechDifference(BaseModel):
    """
    Captures technical-level differences between the videos
    (e.g., resolution, duration, encoding).

    Does not involve semantic or visual content diff.
    """
    type: Literal['duration', 'resolution', 'frame_rate', 'codec', 'audio']
    old_value: str
    new_value: str
    description: Optional[str] = None


# -- Comparison Result -- These are effectively DTOs to move the results across the wire

class ComparisonResult(BaseModel):
    """
    The outcome of comparing two video assets.

    Includes similarity score, semantic segment differences, and technical differences.
    """
    oldVideoId: str # refer to the object by its full reference Asset::Video
    newVideoId: str # refer to the object by its full reference Asset::Video
    similarityScore: float # quant summary of findings
    segments: List[Segment]
    technicalDiffs: List[TechDifference]
    hypothesis: Optional[str] = None  # human-interpretable guess


# -- Comparison Session --

class ComparisonSession(BaseModel):
    """
    A single instance of comparing two video assets.

    Holds input metadata and the result.
    """
    id: UUID
    oldVideo: Video
    newVideo: Video
    comparisonResult: ComparisonResult

Data Persistence View

The base schema design for the video_comparison_submissions table covers:

  • Persisting JSONB payloads (DTOs) using the Entity-Attribute-Value (EAV) model
  • Supporting both automatic (version n vs n-1) and manual submission modes
  • Tracking submission and completion times
  • Recording status and results
  • Supporting asynchronous processing (via event queue and sidekick processor)

Table: video_comparison_submissions

Column Name       Type         Description
id                UUID (PK)    Primary key – unique ID for the comparison job
card_id           UUID         ID of the card to which the assets belong
submitted_by      UUID (FK)    User who submitted the job (system for automatic)
submission_type   TEXT         auto (version n vs n-1) or manual (user-selected assets)
old_video_id      TEXT         Asset ID of the "before" video
new_video_id      TEXT         Asset ID of the "after" video
status            TEXT         One of: pending, processing, complete, error
submitted_at      TIMESTAMP    Time of submission
started_at        TIMESTAMP    Time processing began (nullable)
completed_at      TIMESTAMP    Time processing completed (nullable)
result_json       JSONB        Full result object (ComparisonResult DTO) in EAV-style JSON
error_message     TEXT         Optional error message if processing fails
metadata          JSONB        Optional metadata (e.g. client ID, project, tags, trace ID)

Key Features

  • Flexibility via JSONB: Allows you to store domain model DTOs such as ComparisonResult including nested segments, technicalDiffs, etc.
  • EAV Ready: Since this is JSONB, it naturally supports nested structures, dynamic keys, and sparse attributes – all consistent with the EAV philosophy.
  • Supports Retry/Diagnostics: Via error_message and timestamps
  • Tracks submission origin: Through submission_type and submitted_by
  • Optimised for eventing: status + submitted_at are enough for triggering processing logic

Example Entry (Simplified)

{
  "oldVideoId": "asset_123",
  "newVideoId": "asset_124",
  "similarityScore": 0.94,
  "segments": [
    {
      "type": "CommonSegment",
      "old_range": {"type": "time", "start": 0, "end": 10},
      "new_range": {"type": "time", "start": 0, "end": 10},
      "similarity": 1.0
    },
    {
      "type": "ModifiedSegment",
      "old_range": {"type": "time", "start": 11, "end": 20},
      "new_range": {"type": "time", "start": 11, "end": 20},
      "similarity": 0.93
    }
  ],
  "technicalDiffs": [
    {
      "type": "duration",
      "old_value": "00:25:00",
      "new_value": "00:27:00"
    }
  ]
}
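A persisted result_json payload like the one above can be summarised for the UI without deserialising it into the full domain model. summarise_result is a hypothetical helper sketch operating on the raw JSON string.

```python
import json

def summarise_result(result_json: str) -> dict:
    """Summarise a stored ComparisonResult payload for quick display:
    counts per segment type, the lowest per-segment similarity seen,
    and the number of technical differences."""
    result = json.loads(result_json)
    counts: dict = {}
    similarities = []
    for segment in result.get("segments", []):
        counts[segment["type"]] = counts.get(segment["type"], 0) + 1
        if "similarity" in segment:
            similarities.append(segment["similarity"])
    return {
        "segment_counts": counts,
        "min_similarity": min(similarities) if similarities else None,
        "tech_diff_count": len(result.get("technicalDiffs", [])),
    }

payload = '{"segments": [{"type": "CommonSegment", "similarity": 1.0}, {"type": "ModifiedSegment", "similarity": 0.93}], "technicalDiffs": [{"type": "duration"}]}'
print(summarise_result(payload))
# {'segment_counts': {'CommonSegment': 1, 'ModifiedSegment': 1}, 'min_similarity': 0.93, 'tech_diff_count': 1}
```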

Suggested Indexes

CREATE INDEX idx_comparison_status ON video_comparison_submissions(status);
CREATE INDEX idx_comparison_card_id ON video_comparison_submissions(card_id);
CREATE INDEX idx_comparison_submitted_at ON video_comparison_submissions(submitted_at);