FFmpeg

From David's Wiki

FFmpeg (Fast Forward MPEG) is a set of libraries and tools for encoding and decoding multimedia. You can interact with FFmpeg through its command-line interface or its C API. I find it useful for converting videos to GIFs. You can also extract a video into a sequence of images, or assemble a sequence of images into a video.

CLI

Basic usage is as follows:

ffmpeg -i input_file [-s resolution] [-b:v bitrate] [-ss start_second] [-t duration] [-r output_framerate] output.mp4

x264

x264 is a software H.264 encoder. It does not decode; FFmpeg uses it through the libx264 encoder.
[1]

Changing Pixel Format

Encode to H.264 with the yuv420p pixel format:

ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4

Images to Video

Reference
Assuming the image sequence represents 60 images per second and you want a 30 fps video:

ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4

Crop

ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename


Get a list of encoders/decoders

Reference

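# Lists only the NVIDIA-related entries (NPP, CUVID, NVENC, CUDA).
# Drop the grep to see every encoder, decoder, and filter.
for i in encoders decoders filters; do
    echo "$i:"; ffmpeg -hide_banner "-${i}" | grep -Ei "npp|cuvid|nvenc|cuda"
done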

PSNR/SSIM

Reference
FFmpeg can compare two videos and output the PSNR or SSIM values for each of the Y, U, and V channels.

ffmpeg -i distorted.mp4 -i reference.mp4 \
       -lavfi "ssim;[0:v][1:v]psnr" -f null -

ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi ssim -f null -
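
For intuition about what a PSNR number means, the sketch below computes PSNR for a single 8-bit plane from its mean squared error, using the standard definition PSNR = 10 * log10(255^2 / MSE). It is not FFmpeg code: the function name plane_psnr and the flat std::vector layout are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// PSNR in dB between two equally sized 8-bit planes (hypothetical helper).
// Returns NaN for mismatched or empty inputs and +infinity when the planes
// are identical (MSE == 0).
double plane_psnr(const std::vector<uint8_t>& ref,
                  const std::vector<uint8_t>& dist) {
  if (ref.size() != dist.size() || ref.empty()) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  double sum_sq = 0.0;
  for (std::size_t i = 0; i < ref.size(); ++i) {
    const double d = static_cast<double>(ref[i]) - static_cast<double>(dist[i]);
    sum_sq += d * d;
  }
  const double mse = sum_sq / static_cast<double>(ref.size());
  if (mse == 0.0) return std::numeric_limits<double>::infinity();
  return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

Higher is better: identical planes give an infinite PSNR, and the value drops as the squared error grows.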

C API

A Doxygen reference manual for the C API is available at [2].

Getting Started

Structs

Pixel Formats

Reference
Pixel formats are stored as AVPixelFormat enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.

AV_PIX_FMT_RGB24
  • This is the standard packed RGB format at 24 bits per pixel.
  • In your AVFrame, data[0] will contain your single interleaved buffer: RGBRGBRGB.
  • The linesize is typically \(\displaystyle 3 * width\) bytes per row, at \(\displaystyle 3\) bytes per pixel.
AV_PIX_FMT_YUV420P
  • This is a planar YUV pixel format with chroma subsampling.
  • Each pixel has its own luma component (Y), but each \(\displaystyle 2 \times 2\) block of pixels shares one pair of chrominance components (U, V).
  • In your AVFrame, data[0] will contain your Y image, data[1] will contain your U image, and data[2] will contain your V image.
  • data[0] will typically be \(\displaystyle width * height\) bytes.
  • data[1] and data[2] will typically be \(\displaystyle width * height / 4\) bytes each.
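
The buffer sizes described above can be worked out with a little arithmetic. The sketch below is plain C++ with no FFmpeg dependency; the PlaneSizes struct and both function names are hypothetical. It assumes tightly packed rows (no alignment padding), which real AVFrame buffers often do not satisfy, so treat the results as minimum sizes.

```cpp
#include <array>
#include <cstddef>

// Minimum byte counts per plane, assuming no row padding (hypothetical type).
struct PlaneSizes {
  std::array<std::size_t, 3> bytes;  // sizes of data[0], data[1], data[2]
};

// RGB24: one packed plane, 3 bytes per pixel; data[1] and data[2] are unused.
PlaneSizes rgb24_sizes(std::size_t width, std::size_t height) {
  return {{3 * width * height, 0, 0}};
}

// YUV420P: full-resolution Y plane, with U and V halved in each direction.
// Chroma dimensions round up so odd widths and heights are still covered.
PlaneSizes yuv420p_sizes(std::size_t width, std::size_t height) {
  const std::size_t cw = (width + 1) / 2;
  const std::size_t ch = (height + 1) / 2;
  return {{width * height, cw * ch, cw * ch}};
}
```

For a 1920x1080 frame this gives 6,220,800 bytes for RGB24, versus 2,073,600 bytes of Y plus 518,400 bytes each of U and V for YUV420P, which is half the RGB24 total.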

Muxing to memory

You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.
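
As a rough sketch of the idea, the write callback below appends whatever the muxer hands it to a std::vector. MemorySink and write_to_memory are hypothetical names. The callback follows the shape that avio_alloc_context expects for its write_packet argument; recent FFmpeg releases declare the buffer as const uint8_t* and older ones as uint8_t*, so adjust it to your version. The FFmpeg calls appear only in comments so that the block compiles on its own.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical in-memory sink that collects the muxed bytes.
struct MemorySink {
  std::vector<uint8_t> bytes;
};

// Write callback: FFmpeg passes back the opaque pointer given at setup.
// Returns the number of bytes consumed, or a negative value on error.
int write_to_memory(void* opaque, const uint8_t* buf, int buf_size) {
  if (opaque == nullptr || buf == nullptr || buf_size < 0) return -1;
  auto* sink = static_cast<MemorySink*>(opaque);
  sink->bytes.insert(sink->bytes.end(), buf, buf + buf_size);
  return buf_size;
}

// Assumed wiring (requires libavformat; not compiled here):
//   MemorySink sink;
//   unsigned char* io_buf = static_cast<unsigned char*>(av_malloc(4096));
//   AVIOContext* avio = avio_alloc_context(io_buf, 4096, /*write_flag=*/1,
//                                          &sink, nullptr, write_to_memory,
//                                          nullptr);
//   format_ctx->pb = avio;
//   format_ctx->flags |= AVFMT_FLAG_CUSTOM_IO;
```

No seek callback is supplied here, so this suits formats that can be written sequentially. Muxers that need to seek back, such as MP4 writing its index at the end, would need either a seek callback or a streaming-friendly mode such as fragmented output.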


NVENC

Options Reference
When encoding using NVENC, your codec_ctx->priv_data is a pointer to an NvencContext. To list all of the options you can set in the private data, run the following in bash:

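ffmpeg -hide_banner -h encoder=h264_nvenc

The following C++ snippet creates a CUDA hardware device context and configures an h264_nvenc encoder context for low-latency encoding:
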
  if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                    NULL, 0)) < 0) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
    return;
  }

  if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
         << endl;
    return;
  }
  codec_ctx = avcodec_alloc_context3(codec);
  codec_ctx->bit_rate = 2500000;
  codec_ctx->width = source_codec_ctx->width;
  codec_ctx->height = source_codec_ctx->height;
  codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
  codec_ctx->time_base = source_codec_ctx->time_base;
  input_timebase = source_codec_ctx->time_base;
  codec_ctx->framerate = source_codec_ctx->framerate;
  codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
  codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
  codec_ctx->max_b_frames = 0;
  codec_ctx->delay = 0;
  codec_ctx->gop_size = 0;
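  // TODO: figure out which of these do nothing. av_opt_set returns a negative
  // error code (e.g. AVERROR_OPTION_NOT_FOUND) for option names the encoder
  // does not expose, so checking the return values would reveal them.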
  av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
  av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
  av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
  av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
  av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
  av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);

C++ API

FFmpeg does not have an official C++ API.