From David's Wiki

FFmpeg (Fast Forward MPEG) is a library for encoding and decoding multimedia.

You can interact with FFmpeg through its command-line interface or its C API.
Many tasks that only involve decoding or encoding can be done by calling the CLI application and piping data to its stdin or from its stdout.


You can download static builds of FFmpeg from

If you need nvenc support, you can build FFmpeg with https://github.com/markus-perl/ffmpeg-build-script.

Basic usage is as follows:

ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
  • Use -pattern_type glob for wildcards (e.g. all images in a folder)
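If you call the CLI from a script, it can help to assemble the argument list programmatically. Below is a minimal sketch of the usage pattern above; build_ffmpeg_command is a hypothetical helper, not part of FFmpeg, and it only builds the list without running anything.

```python
def build_ffmpeg_command(input_file, output_file, start=None, resolution=None,
                         bitrate=None, duration=None, framerate=None):
    """Return an ffmpeg argument list following the basic usage pattern."""
    cmd = ["ffmpeg"]
    if start is not None:
        cmd += ["-ss", str(start)]        # seek option placed before -i
    cmd += ["-i", input_file]
    if resolution is not None:
        cmd += ["-s", resolution]         # e.g. "1280x720"
    if bitrate is not None:
        cmd += ["-b", bitrate]            # e.g. "2M"
    if duration is not None:
        cmd += ["-t", str(duration)]
    if framerate is not None:
        cmd += ["-r", str(framerate)]
    cmd.append(output_file)
    return cmd


command = build_ffmpeg_command("input.mp4", "output.mp4",
                               start=5, duration=10, framerate=30)
print(" ".join(command))  # ffmpeg -ss 5 -i input.mp4 -t 10 -r 30 output.mp4
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting problems.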


x264 is a software H.264 encoder. Decoding H.264 is handled by FFmpeg's built-in decoder.

Changing Pixel Format

Encode to h264 with YUV420p pixel format

ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4

Images to Video

Assume the images were captured at 60 images per second and you want a 30 fps video.

# Make sure -framerate is before -i
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4
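To sanity-check the result, you can estimate the duration and frame count of the output. This is a rough sketch with a hypothetical helper; the exact number of frames FFmpeg writes may differ by a frame depending on rounding.

```python
def images_to_video_stats(num_images, input_rate=60, output_rate=30):
    """Estimate the duration (seconds) and output frame count."""
    duration = num_images / input_rate             # -framerate sets the input timing
    output_frames = round(duration * output_rate)  # -r resamples to the output rate
    return duration, output_frames


duration, output_frames = images_to_video_stats(600)
print(duration, output_frames)  # 10.0 300
```

In other words, the video keeps its real-time length and roughly every other image is dropped.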

Video to Images

Extracting frames from a video

ffmpeg -i video.mp4 frames/%d.png
  • Use -ss H:M:S before -i to specify where to start
  • Use -vframes 1 to extract a single frame
  • Use -vf "select=not(mod(n\,10))" to select every 10th frame
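The select expression not(mod(n\,10)) is true whenever the frame index n is a multiple of 10. A small sketch of which frames that keeps (selected_frames is a hypothetical name used only for illustration):

```python
def selected_frames(total_frames, step=10):
    """Frame indices n for which not(mod(n, step)) is true."""
    return [n for n in range(total_frames) if n % step == 0]


kept = selected_frames(35)
print(kept)  # [0, 10, 20, 30]
```

Note that frame numbering starts at 0, so the first frame is always kept.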

Get a list of encoders/decoders


for i in encoders decoders filters; do
    echo "$i:"; ffmpeg -hide_banner "-${i}" | grep -Ei "npp|cuvid|nvenc|cuda"
done


FFmpeg can compare two videos and output the PSNR or SSIM values for each of the Y, U, and V channels.

ffmpeg -i distorted.mp4 -i reference.mp4 \
       -lavfi "ssim;[0:v][1:v]psnr" -f null -

ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  ssim -f null -
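For reference, PSNR is derived from the mean squared error between the two signals. The sketch below shows the standard formula for 8-bit samples on a single plane; it is not FFmpeg's implementation, and the psnr function name and the flat-list input are assumptions for illustration.

```python
import math


def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equal-length sequences of samples."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


reference = [0, 0, 0, 0]
distorted = [1, 1, 1, 1]
value = psnr(reference, distorted)
print(round(value, 2))  # 48.13
```

Higher is better, and identical inputs give infinity.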

Generate Thumbnails

Below is a bash script to generate all thumbnails in a folder

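#!/usr/bin/env bash

# Assumed output location; the original value was not preserved.
OUTPUT_FOLDER="thumbnails"
mkdir -p "$OUTPUT_FOLDER"

for file in *.mp4; do
  # Use the first frame at or after frame 300 as the thumbnail.
  ffmpeg -i "$file" -vf "select=gte(n\,300)" -vframes 1 "$OUTPUT_FOLDER/${file%.mp4}.png"
done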

MP4 to GIF

Normally you can just do

ffmpeg -i my_video.mp4 my_video.gif

If you want better quality, you can use the following filter_complex:


Here is another script from https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality

mp4 to gif script
ffmpeg -i $1 -vf "fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 $2

Pipe to stdout

Below is an example of piping the video only to stdout:

ffmpeg -i video.webm -pix_fmt rgb24 -f rawvideo -

In Python, you can read it as follows:

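import subprocess

import numpy as np

video_width = 1920
video_height = 1080
ffmpeg_command = ["ffmpeg", "-i", "video.webm",
                  "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]
ffmpeg_process = subprocess.Popen(ffmpeg_command, stdout=subprocess.PIPE)
# Each rgb24 frame is width * height * 3 bytes.
raw_image = ffmpeg_process.stdout.read(
              video_width * video_height * 3)
image = (np.frombuffer(raw_image, dtype=np.uint8)
           .reshape(video_height, video_width, 3))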
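To read every frame rather than just the first, you can keep reading fixed-size chunks until the stream runs out. Below is a minimal sketch; iter_frames is a hypothetical helper, and an in-memory io.BytesIO stands in for ffmpeg_process.stdout so the example runs without FFmpeg.

```python
import io

import numpy as np


def iter_frames(stream, width, height, channels=3):
    """Yield (height, width, channels) uint8 arrays from a raw video stream."""
    frame_size = width * height * channels
    while True:
        raw = stream.read(frame_size)
        if len(raw) < frame_size:
            break  # end of stream (or a truncated final frame)
        yield np.frombuffer(raw, dtype=np.uint8).reshape(height, width, channels)


# Stand-in for ffmpeg_process.stdout: two 2x2 RGB frames (24 bytes).
fake_stream = io.BytesIO(bytes(range(24)))
frames = list(iter_frames(fake_stream, width=2, height=2))
print(len(frames), frames[0].shape)  # 2 (2, 2, 3)
```

With a real pipe, pass ffmpeg_process.stdout and the real video dimensions instead.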


Filters are part of the CLI


ffmpeg -i input_filename -vf  "crop=w:h:x:y" output_filename
  • Here x and y are the coordinates of the top-left corner of your crop. w and h are the width and height of the final image or video.
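A common case is cropping the centre of the frame, where x and y follow from the input and output sizes. A small sketch with a hypothetical helper that only builds the filter string:

```python
def center_crop_filter(in_w, in_h, out_w, out_h):
    """Build a crop=w:h:x:y filter string that keeps the centre of the frame."""
    if out_w > in_w or out_h > in_h:
        raise ValueError("crop size must not exceed the input size")
    x = (in_w - out_w) // 2  # left edge of the crop
    y = (in_h - out_h) // 2  # top edge of the crop
    return f"crop={out_w}:{out_h}:{x}:{y}"


crop = center_crop_filter(1920, 1080, 1280, 720)
print(crop)  # crop=1280:720:320:180
```

The string can then be passed to -vf.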


FFmpeg Scaling
scale filter

ffmpeg -i input.avi -vf scale=320:240 output.avi

ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
  • If the aspect ratio is not what you expect, try using the setdar filter.
    • E.g. setdar=ratio=2/1
Resizing with transparent padding

Useful for generating logos

ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
More sizes

ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png


transpose filter

To rotate 180 degrees

ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
  • 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
  • 1 – Rotate by 90 degrees clockwise.
  • 2 – Rotate by 90 degrees counter-clockwise.
  • 3 – Rotate by 90 degrees clockwise and flip vertically.
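To see why two clockwise quarter turns give a 180 degree rotation, here is a tiny sketch on a nested list. rotate_cw is a hypothetical helper that mimics what transpose=1 does to the pixel grid; it does not use FFmpeg.

```python
def rotate_cw(image):
    """Rotate a row-major 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


image = [[1, 2, 3],
         [4, 5, 6]]
once = rotate_cw(image)
twice = rotate_cw(once)
print(once)   # [[4, 1], [5, 2], [6, 3]]
print(twice)  # [[6, 5, 4], [3, 2, 1]]
```

The second result is the original with both the row order and the column order reversed, which is a 180 degree rotation.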

360 Video

See v360 filter

Converting EAC to equirectangular

YouTube sometimes uses an EAC (equi-angular cubemap) format. You can convert this to the traditional equirectangular format as follows:

ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4

Sometimes you may run into errors where height or width is not divisible by 2.
Apply a scale filter to fix this issue.

ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4

Converting to rectilinear

ffmpeg -i input.mp4 -vf "v360=e:rectilinear:h_fov=90:v_fov=90" output.mp4


To add 360 video metadata, you should use Google's spatial-media tool. This will add the following side data, which you can see using ffprobe:

Side data:
 spherical: equirectangular (0.000000/0.000000/0.000000) 

Removing Duplicate Frames

mpdecimate filter

Useful for extracting frames from timelapses.

ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4

Stack and Unstack

To stack, see hstack, vstack.
To unstack, see crop.


Filter complex allows you to create a graph of filters.

Suppose you have 3 inputs: $1, $2, $3.
Then you can access them as streams [0], [1], [2].
The filter syntax allows you to chain multiple filters, where each filter is an edge in the graph.
For example, [0]split[t1][t2] creates two vertices t1 and t2 from input 0. The output of the last filter in your graph will be the output of your command:
E.g. [t1][t2]vstack

ffmpeg -i $1 -i $2 -i $3 -filter_complex "[0]split[t1][t2];[t1][t2]vstack" output.mkv -y
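When graphs get longer, it can be easier to describe them as a list of edges and join them into the filter_complex string. A minimal sketch, assuming a hypothetical build_filtergraph helper and simple labels with no escaping needs:

```python
def build_filtergraph(edges):
    """Join (input_labels, filter_spec, output_labels) edges into one string."""
    parts = []
    for inputs, spec, outputs in edges:
        ins = "".join(f"[{label}]" for label in inputs)
        outs = "".join(f"[{label}]" for label in outputs)
        parts.append(f"{ins}{spec}{outs}")
    return ";".join(parts)  # filters in a graph are separated by semicolons


graph = build_filtergraph([
    (["0"], "split", ["t1", "t2"]),   # duplicate input 0
    (["t1", "t2"], "vstack", []),     # unlabeled output goes to the output file
])
print(graph)  # [0]split[t1][t2];[t1][t2]vstack
```

This reproduces the filter string used in the command above.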

Concatenate Videos

ffmpeg -i part_1.mp4 \
    -i part_2.mp4 \
    -i part_3.mp4 \
    -filter_complex \
     "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[v][a]" \
    -map "[v]" -map "[a]" \
    -vsync 2 \
    all_parts.mp4 -y

Replace transparency

Add a background to transparent images.

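# Assumed reconstruction: the split and white-fill steps were lost from the original; only the final overlay step survives.
ffmpeg -i in.mov -filter_complex \
        "[0]split=2[bg][fg];[bg]drawbox=c=white@1:replace=1:t=fill[bg];[bg][fg]overlay=format=auto" -c:a copy new.mov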

Draw Text


ffmpeg -i input -vf "drawtext=fontfile=Arial.ttf: text='%{frame_num}': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=20: box=1: boxcolor=white: boxborderw=5" -c:a copy output


A Doxygen reference manual for the C API is available at [2].
Note that FFmpeg is licensed under the LGPL by default, and under the GPL when built with GPL-only components such as libx264.
If you only need to do encoding and decoding, you can simply pipe the inputs and outputs of the FFmpeg CLI to your program [3].

Getting Started

The best way to get started is to look at the official examples.


  • AVInputFormat/AVOutputFormat: Represents a container type.
  • AVFormatContext: Represents your specific container.
  • AVStream: Represents a single audio, video, or data stream in your container.
  • AVCodec: Represents a single codec (e.g. H.264).
  • AVCodecContext: Represents your specific codec and contains all associated parameters (e.g. resolution, bitrate, fps).
  • AVPacket: Compressed data.
  • AVFrame: Decoded audio or video data.
  • SwsContext: Used for image scaling, colorspace, and pixel format conversion operations.

Pixel Formats

Pixel formats are stored as AVPixelFormat enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.

AV_PIX_FMT_RGB24
  • This is your standard 24 bits per pixel RGB.
  • In your AVFrame, data[0] will contain your single buffer RGBRGBRGB.
  • The linesize is typically \(\displaystyle 3 * width\) bytes per row, with \(\displaystyle 3\) bytes per pixel.
AV_PIX_FMT_YUV420P
  • This is a planar YUV pixel format with chroma subsampling.
  • Each pixel will have its own luma component (Y) but each \(\displaystyle 2 \times 2\) block of pixels will share chrominance components (U, V).
  • In your AVFrame, data[0] will contain your Y image, data[1] will contain your U image, and data[2] will contain your V image.
  • data[0] will typically be \(\displaystyle width * height\) bytes.
  • data[1] and data[2] will typically be \(\displaystyle width * height / 4\) bytes.
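As a quick check of the sizes above, here is the arithmetic for a 1920x1080 frame. This is a sketch that assumes even dimensions and no row padding (linesize equal to the row width); real AVFrame buffers may be larger because of alignment. The function names are made up for illustration.

```python
def rgb24_size(width, height):
    """Bytes in a packed 24-bit RGB frame without padding."""
    return 3 * width * height


def yuv420p_plane_sizes(width, height):
    """Bytes in the Y, U, and V planes of a 4:2:0 planar frame without padding."""
    luma = width * height
    chroma = (width // 2) * (height // 2)  # one sample per 2x2 block of pixels
    return luma, chroma, chroma


print(rgb24_size(1920, 1080))           # 6220800
print(yuv420p_plane_sizes(1920, 1080))  # (2073600, 518400, 518400)
```

So a YUV 4:2:0 frame takes half the memory of the same frame in 24-bit RGB.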

Muxing to memory

You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.


Options Reference

When encoding using NVENC, your codec_ctx->priv_data is a pointer to a NvencContext.

To list all of the things you can set in the private data, you can type the following in bash

ffmpeg -hide_banner -h encoder=h264_nvenc
NVENC Codec Ctx
  if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                    NULL, 0)) < 0) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
    return;  // assumed error handling; the original was not preserved
  }

  if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
         << endl;
    return;  // assumed error handling; the original was not preserved
  }
  codec_ctx = avcodec_alloc_context3(codec);
  codec_ctx->bit_rate = 2500000;
  codec_ctx->width = source_codec_ctx->width;
  codec_ctx->height = source_codec_ctx->height;
  codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
  codec_ctx->time_base = source_codec_ctx->time_base;
  input_timebase = source_codec_ctx->time_base;
  codec_ctx->framerate = source_codec_ctx->framerate;
  codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
  codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
  codec_ctx->max_b_frames = 0;
  codec_ctx->delay = 0;
  codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
  av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
  av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
  av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
  av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
  av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
  av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);


FFmpeg does not have an official C++ API.
There are wrappers such as Raveler/ffmpeg-cpp which you can use.
However, I recommend just using the C API and wrapping things in smart pointers.

Python API

You can try PyAV, which contains bindings for the library. However, I haven't tried it.
If you just need to call the CLI, you can use ffmpeg-python to help build calls.

JavaScript API

To use FFmpeg in a browser, see ffmpegwasm.
Note: I have not tried this. It uses CLI commands, not library API commands.

My Preferences

My preferences for encoding video


I mostly use H264 when working on projects for compatibility purposes. Here I typically don't need the smallest file size or best quality, prioritizing encoding speed.


ffmpeg -i $1 -c:v libx264 -crf 28 -preset medium -pix_fmt yuv420p -c:a libfdk_aac -b:a 128K $2
  • MP4 is ok


H265/HEVC is used for archival purposes to minimize the file size and maximize the quality.


ffmpeg -i $1 -c:v libx265 -crf 23 -preset slow -pix_fmt yuv444p10le -c:a libopus -b:a 128K $2
  • You need to output to an MKV file
  • The pixel format yuv444p10le is 10 bit color without chroma subsampling. If your source is lower, you can use yuv420p instead for 8-bit color and 4:2:0 chroma subsampling.