Metadata for Video Content

Question asked by xkahn on May 21, 2014
Latest reply on May 23, 2014 by xkahn
I store quite a bit of video content in my Alfresco server. In general, users upload videos in one of a few supported formats, and I deliver the videos back using the <video> tag.

A while back, I added the ability to associate multiple video formats together to let the user's browser select the most appropriate version.

But now my users are asking about caption support (SRT) for videos in order to handle translations. I'd like to start thinking about what video metadata /SHOULD/ look like in a more organized way. Ideally, others can use this too!
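
To ground that in markup: the delivery side would presumably end up looking something like the snippet below; the file names and languages are made up for illustration. One wrinkle worth noting is that browsers expect WebVTT for the <track> element, so SRT uploads would probably need converting to .vtt somewhere on the way out.

<!-- Hypothetical markup; the paths are placeholders. The browser picks the first <source> it can play. -->
<video controls>
  <source src="video.webm" type="video/webm" />
  <source src="video.mp4"  type="video/mp4" />
  <!-- Text tracks: kind and srclang are exactly the sort of metadata the repository would need to hold. -->
  <track kind="subtitles" src="video.en.vtt" srclang="en" label="English" default />
  <track kind="subtitles" src="video.fr.vtt" srclang="fr" label="Français" />
</video>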

So, with that in mind, the idea is to create a video aspect, similar to the exif aspect for JPGs. Video formats are much more complex than image formats, so this could be tricky to get right. It may be that multiple aspects will be needed, depending on the video.
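
As a strawman, a custom content model for this could start from something like the skeleton below. Everything here is hypothetical: the vid: namespace, its URI, and the property names are placeholders rather than an existing Alfresco model; only the dictionary XML structure itself is standard.

<!-- Sketch of a custom model definition (e.g. videoModel.xml), registered like any other custom content model. All vid: names are placeholders. -->
<model name="vid:videoModel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
   <description>Container-level video metadata (sketch)</description>
   <version>0.1</version>

   <imports>
      <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
   </imports>

   <namespaces>
      <namespace uri="http://example.org/model/video/1.0" prefix="vid"/>
   </namespaces>

   <aspects>
      <!-- Applied to the video node itself, much as exif:exif is applied to JPGs -->
      <aspect name="vid:video">
         <title>Video</title>
         <properties>
            <property name="vid:containerFormat">
               <type>d:text</type>
            </property>
            <property name="vid:duration">
               <type>d:double</type>
            </property>
         </properties>
      </aspect>
   </aspects>
</model>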

So let's start right off:
Running ffprobe on an OGG video file shows:


$ ffprobe '/home/xkahn/public_html/Bugzilla - Beyond the Basics.ogv'
Input #0, ogg, from '/home/xkahn/public_html/Bugzilla - Beyond the Basics.ogv':
  Duration: 00:36:28.76, start: 0.000000, bitrate: 972 kb/s
    Stream #0:0: Video: theora, yuv420p, 568x576 [SAR 64:45 DAR 568:405], 25 tbr, 25 tbn, 25 tbc
    Metadata:
      MAJOR_BRAND     : mp42
      MINOR_VERSION   : 0
      COMPATIBLE_BRANDS: mp42isomavc1
      CREATION_TIME   : 2012-11-19 01:44:43
      ENCODER         : Lavf53.32.100
    Stream #0:1: Audio: flac, 48000 Hz, stereo, s16
    Metadata:
      MAJOR_BRAND     : mp42
      MINOR_VERSION   : 0
      COMPATIBLE_BRANDS: mp42isomavc1
      CREATION_TIME   : 2012-11-19 01:44:43
      ENCODER         : Lavf53.32.100


And another file:


$ ffprobe '/home/xkahn/Downloads/New year 2014.mp4'
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/xkahn/Downloads/New year 2014.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: mp41mp42isom
    creation_time   : 2014-01-03 01:33:35
  Duration: 00:01:39.60, start: 0.000000, bitrate: 887 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 640x360, 754 kb/s, 30 fps, 30 tbr, 600 tbn, 1200 tbc (default)
    Metadata:
      creation_time   : 2014-01-03 01:33:35
      handler_name    : Core Media Video
    Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 130 kb/s (default)
    Metadata:
      creation_time   : 2014-01-03 01:33:35
      handler_name    : Core Media Audio


Obviously, video files are containers that hold several kinds of content: video, audio, and text streams.

The first set of metadata describes the container itself: container type, total duration, anything else?

Next come one or more video streams: codec, resolution, duration, frames per second, chrominance and luminance sampling (the pixel format), anything else?

Then on to audio: codec, sampling frequency, channels, anything else?

Finally, text. What types of text streams are there? There are subtitles, captions, screen-reader descriptions, chapters, and cues. These tracks can be embedded in the video container or stored externally. Each one carries at least the following video-related metadata: kind, language, anything else?
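
Pulling those four lists together, the per-stream side of the strawman model could look roughly like the fragment below, continuing the hypothetical vid: model sketched earlier; all names remain placeholders. Whether each stream is best represented as an aspect on the video node, as child nodes, or as something else is still an open question, especially for files with several streams of the same type.

<!-- Fragment continuing the hypothetical vid: model; assumes the standard content model (cm:) is also imported. -->
<aspect name="vid:videoStream">
   <properties>
      <property name="vid:videoCodec"><type>d:text</type></property>
      <property name="vid:width"><type>d:int</type></property>
      <property name="vid:height"><type>d:int</type></property>
      <property name="vid:frameRate"><type>d:double</type></property>
      <property name="vid:pixelFormat"><type>d:text</type></property>
   </properties>
</aspect>

<aspect name="vid:audioStream">
   <properties>
      <property name="vid:audioCodec"><type>d:text</type></property>
      <property name="vid:sampleRate"><type>d:int</type></property>
      <property name="vid:channels"><type>d:int</type></property>
   </properties>
</aspect>

<aspect name="vid:textTrack">
   <properties>
      <!-- kind: subtitles, captions, descriptions, chapters, cues -->
      <property name="vid:trackKind"><type>d:text</type></property>
      <property name="vid:trackLanguage"><type>d:text</type></property>
   </properties>
   <!-- Tracks stored as separate files could be linked to the video with a peer association instead of being embedded. -->
   <associations>
      <association name="vid:externalTrack">
         <target>
            <class>cm:content</class>
            <mandatory>false</mandatory>
            <many>true</many>
         </target>
      </association>
   </associations>
</aspect>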

Others have standardized on AudioMD and VideoMD: http://www.loc.gov/standards/amdvmd/

Anyway, this is just a statement of the problem. I'll start trying to sketch out what this would look like next.
