FFmpeg

From David's Wiki
\( \newcommand{\P}[]{\unicode{xB6}} \newcommand{\AA}[]{\unicode{x212B}} \newcommand{\empty}[]{\emptyset} \newcommand{\O}[]{\emptyset} \newcommand{\Alpha}[]{Α} \newcommand{\Beta}[]{Β} \newcommand{\Epsilon}[]{Ε} \newcommand{\Iota}[]{Ι} \newcommand{\Kappa}[]{Κ} \newcommand{\Rho}[]{Ρ} \newcommand{\Tau}[]{Τ} \newcommand{\Zeta}[]{Ζ} \newcommand{\Mu}[]{\unicode{x039C}} \newcommand{\Chi}[]{Χ} \newcommand{\Eta}[]{\unicode{x0397}} \newcommand{\Nu}[]{\unicode{x039D}} \newcommand{\Omicron}[]{\unicode{x039F}} \DeclareMathOperator{\sgn}{sgn} \def\oiint{\mathop{\vcenter{\mathchoice{\huge\unicode{x222F}\,}{\unicode{x222F}}{\unicode{x222F}}{\unicode{x222F}}}\,}\nolimits} \def\oiiint{\mathop{\vcenter{\mathchoice{\huge\unicode{x2230}\,}{\unicode{x2230}}{\unicode{x2230}}{\unicode{x2230}}}\,}\nolimits} \)

FFmpeg (Fast Forward MPEG) is a library for encoding and decoding multimedia.

You can interact with FFmpeg using their command-line interface or using their C API.

I find it useful for converting videos to gifs. You can also extract videos into a sequence of images or vice-versa.

CLI

You can download static builds of FFmpeg from

Basic usage is as follows:

ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4

x264

x264 is a software h264 decoder and encoder.
[1]

Changing Pixel Format

Encode to h264 with YUV420p pixel format

ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4

Images to Video

Reference
Assuming 60 images per second and you want a 30 fps video.

# Make sure -framerate is before -i
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4

Video to Images

Extracting frames from a video

ffmpeg -i video.mp4 frames/%d.png
  • Use -ss H:M:S to specify where to start before you input the video
  • Use -vframes 1 to extract one frames
  • Use -vf "select=not(mod(n\,10))" to select every 10th frame

Crop

ffmpeg -i input_filename -vf  "crop=w:h:x:y" output_filename
  • Here x and y are the top left corners of your crop. w and h are the height and width of the final image or video.

Get a list of encoders/decoders

Reference

for i in encoders decoders filters; do
    echo $i:; ffmpeg -hide_banner -${i} | egrep -i "npp|cuvid|nvenc|cuda"
done

PSNR/SSIM

Reference
FFmpeg can compare two videos and output the psnr or ssim numbers for each of the y, u, and v channels.

ffmpeg -i distorted.mp4 -i reference.mp4 \
       -lavfi "ssim;[0:v][1:v]psnr" -f null –

ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  ssim -f null -

Generate Thumbnails

Reference
Below is a bash script to generate all thumbnails in a folder

Script
#!/usr/bin/env bash

OUTPUT_FOLDER="thumbnails"

mkdir -p $OUTPUT_FOLDER
for file in *.mp4;
  do ffmpeg -i "$file" -vf "select=gte(n\,300)" -vframes 1 "$OUTPUT_FOLDER/${file%.mp4}.png";
done

MP4 to GIF

Normally you can just do

ffmpeg -i my_video.mp4 my_video.gif

However, Ruofei has a more advanced script below:

Ruofei's MP4 to GIF
#!/bin/sh
start_time=0:0
duration=17

palette="/tmp/palette.png"

filters="fps=15,scale=320:-1:flags=lanczos"

ffmpeg -v warning -ss $start_time -t $duration -i $1.mp4 -vf "$filters,palettegen" -y $palette
ffmpeg -v warning -ss $start_time -t $duration -i $1.mp4 -i $palette -lavfi "$filters [x]; [x][1:v] paletteuse" -y $1.gif

Here is another script from https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality

mp4 to gif script
#!/bin/sh
ffmpeg -i $1 -vf "fps=10,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 $2

Resizing/Scaling

FFMpeg Scaling
scale filter

ffmpeg -i input.avi -vf scale=320:240 output.avi

ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
Resizing with transparent padding

Useful for generating logos

ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
More sizes


256
ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
512
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png

Rotation

transpose filter

To rotate 180 degrees

ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
  • 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
  • 1 – Rotate by 90 degrees clockwise.
  • 2 – Rotate by 90 degrees counter-clockwise.
  • 3 – Rotate by 90 degrees clockwise and flip vertically.

360 Video

See v360 filter

Converting EAC to equirectangular

Youtube sometimes uses an EAC format. You can convert this to the traditional equirectangular format as follows:

ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4

Sometimes you may run into errors where height or width is not divisible by 2.
Apply a scale filter to fix this issue.

ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4

Removing Duplicate Frames

Reference
mpdecimate filter

Useful for extracting frames from timelapses.

ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4

Filter-Complex

Filter complex allows you to create a graph of filters.

Suppose you have 3 inputs: $1, $2, $3.
Then you can access them as streams [0], [1], [3].
The filter syntax allows you to chain multiple filters where each filter is an edge.
For example, [0]split[t1][t2] creates two vertices t1 and t2 from input 0. The last statement in your edge will be the output of your command:
E.g. [t1][t2]vstack

ffmpeg -i $1 -i $2 -i $3 -filter-complex "..."

C API

A doxygen reference manual for their C api is available at [2].

Getting Started

Structs

  • AVInputFormat/AVOutputFormat Represents a container type.
  • AVFormatContext Represents your specific container.
  • AVStream Represents a single audio, video, or data stream in your container.
  • AVCodec Represents a single codec (e.g. H.264)
  • AVCodecContext Represents your specific codec and contains all associated paramters (e.g. resolution, bitrate, fps).
  • AVPacket Compressed Data.
  • AVFrame Decoded audio or video data.
  • SwsContext Used for image scaling and colorspace and pixel format conversion operations.

Pixel Formats

Reference
Pixel formats are stored as AVPixelFormat enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.

AV_PIX_FMT_RGB24
  • This is your standard 24 bits per pixel RGB.
  • In your AVFrame, data[0] will contain your single buffer RGBRGBRGB.
  • Where the linesize is typically \(\displaystyle 3 * width\) bytes per row and \(\displaystyle 3\) bytes per pixel.
AV_PIX_FMT_YUV420P
  • This is a planar YUV pixel format with chroma subsampling.
  • Each pixel will have its own luma component (Y) but each \(\displaystyle 2 \times 2\) block of pixels will share chrominance components (U, V)
  • In your AVFrame, data[0] will contain your Y image, data[1] will contain your .
  • Data[0] will typically be \(\displaystyle width * height\) bytes.
  • Data[1] and data[2] will typically be \(\displaystyle width * height / 4\) bytes.

Muxing to memory

You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.


NVENC

Options Reference

When encoding using NVENC, your codec_ctx->priv_data is a pointer to a NvencContext.

To list all of the things you can set in the private data, you can type the following in bash

ffmpeg -hide_banner -h encoder=h264_nvenc
NVENC Codec Ctx
  if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                    NULL, 0)) < 0) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
    return;
  }

  if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
         << endl;
    return;
  }
  codec_ctx = avcodec_alloc_context3(codec);
  codec_ctx->bit_rate = 2500000;
  codec_ctx->width = source_codec_ctx->width;
  codec_ctx->height = source_codec_ctx->height;
  codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
  codec_ctx->time_base = source_codec_ctx->time_base;
  input_timebase = source_codec_ctx->time_base;
  codec_ctx->framerate = source_codec_ctx->framerate;
  codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
  codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
  codec_ctx->max_b_frames = 0;
  codec_ctx->delay = 0;
  codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
  av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
  av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
  av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
  av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
  av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
  av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);

C++ API

FFmpeg does not have an official C++ API.
There are wrappers such as Raveler/ffmpeg-cpp which you can use.
However, I recommend just using the C API and wrapping things in smart pointers.

My Preferences

My preferences for encoding video

!#/bin/bash

ffmpeg -i $1 -c:v libx265 -crf 28 -preset medium -c:a libopus -b:a 128K $2
Notes
  • You need to output to a MKV file