FFmpeg
FFmpeg (Fast Forward MPEG) is a library for encoding and decoding multimedia.
You can interact with FFmpeg through its command-line interface or through its C API.
I find it useful for converting videos to GIFs. You can also extract a video into a sequence of images, or assemble a sequence of images into a video.
CLI
You can download static builds of FFmpeg from
- Linux: https://johnvansickle.com/ffmpeg/
- Windows: https://ffmpeg.zeranoe.com/builds/
Basic usage is as follows:
ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
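As a concrete instance of the template above, here is a minimal sketch that fills in every optional flag. The file names, the 1280x720 resolution, the 2M bitrate, and the helper name clip_command are placeholders chosen for illustration; -b:v is the video-specific form of the template's -b. The function only prints the command so you can inspect it before running it.

```shell
#!/usr/bin/env bash
# Print a concrete ffmpeg command built from the usage template.
# Arguments: input file, start second, duration in seconds, output file.
clip_command() {
  local input=$1 start=$2 duration=$3 output=$4
  # -ss before -i seeks in the input; -t limits how much is written.
  echo ffmpeg -ss "$start" -i "$input" -s 1280x720 -b:v 2M -t "$duration" -r 30 "$output"
}

# Take 5 seconds starting at second 10.
clip_command input.mp4 10 5 output.mp4
```

Remove the echo inside the function to run the command instead of printing it.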
x264
x264 is a software H.264 encoder (it does not decode). FFmpeg uses it through the libx264 encoder. [1]
Changing Pixel Format
Encode to h264 with YUV420p pixel format
ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4
Images to Video
Assuming the images were captured at 60 per second and you want a 30 fps video:
# Make sure -framerate is before -i
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4
Video to Images
Extracting frames from a video
ffmpeg -i video.mp4 frames/%d.png
- Use -ss H:M:S before -i to specify where to start.
- Use -vframes 1 to extract a single frame.
- Use -vf "select=not(mod(n\,10))" to select every 10th frame. Add -vsync vfr so that the dropped frames are not duplicated in the output.
Crop
ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename
- Here x and y are the coordinates of the top-left corner of your crop.
- w and h are the width and height of the final image or video.
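To make the four parameters concrete, the sketch below computes x and y for a crop centered in the frame. The helper name center_crop and the example dimensions are assumptions for illustration; you have to supply the input dimensions yourself.

```shell
#!/usr/bin/env bash
# Print a crop filter that takes a centered out_w x out_h window
# from an in_w x in_h frame.
center_crop() {
  local in_w=$1 in_h=$2 out_w=$3 out_h=$4
  # The top-left corner is half of the leftover space on each axis.
  local x=$(( (in_w - out_w) / 2 ))
  local y=$(( (in_h - out_h) / 2 ))
  echo "crop=${out_w}:${out_h}:${x}:${y}"
}

# Crop the middle 1280x720 region out of a 1920x1080 video.
center_crop 1920 1080 1280 720
# Usage: ffmpeg -i input.mp4 -vf "$(center_crop 1920 1080 1280 720)" output.mp4
```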
Get a list of encoders/decoders
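# Lists only the NVIDIA-related entries (NPP, CUVID, NVENC, CUDA); drop the grep to see everything
for i in encoders decoders filters; do
  echo "$i:"; ffmpeg -hide_banner "-$i" | grep -Ei "npp|cuvid|nvenc|cuda"
done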
PSNR/SSIM
FFmpeg can compare two videos and output the PSNR or SSIM values for each of the Y, U, and V channels.
ffmpeg -i distorted.mp4 -i reference.mp4 \
-lavfi "ssim;[0:v][1:v]psnr" -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi ssim -f null -
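For orientation, the standard definition of PSNR for a single channel is sketched below. The symbols are my own notation: \(R\) and \(D\) are the reference and distorted frames, \(W \times H\) is the frame size, and \(\mathrm{MAX}\) is the largest sample value (255 for 8-bit video). This is the textbook formula, not a description of FFmpeg's internals.

```latex
\[
\mathrm{MSE} = \frac{1}{W H} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( R_{ij} - D_{ij} \right)^2,
\qquad
\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}
\]
```

A higher PSNR means the distorted video is closer to the reference; identical frames have an MSE of zero, so the PSNR is infinite.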
Generate Thumbnails
Below is a bash script that generates a thumbnail for every MP4 file in the current folder, using the first frame at or after frame 300:
#!/usr/bin/env bash
OUTPUT_FOLDER="thumbnails"
mkdir -p "$OUTPUT_FOLDER"
for file in *.mp4; do
  ffmpeg -i "$file" -vf "select=gte(n\,300)" -vframes 1 "$OUTPUT_FOLDER/${file%.mp4}.png"
done
MP4 to GIF
Normally you can just do
ffmpeg -i my_video.mp4 my_video.gif
However, Ruofei has a more advanced script below:
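#!/bin/sh
# Usage: pass the video name without its extension (reads NAME.mp4, writes NAME.gif)
start_time=0:0
duration=17
palette="/tmp/palette.png"
filters="fps=15,scale=320:-1:flags=lanczos"
# Pass 1: generate a color palette tuned to this clip
ffmpeg -v warning -ss "$start_time" -t "$duration" -i "$1.mp4" -vf "$filters,palettegen" -y "$palette"
# Pass 2: encode the GIF using that palette
ffmpeg -v warning -ss "$start_time" -t "$duration" -i "$1.mp4" -i "$palette" -lavfi "$filters [x]; [x][1:v] paletteuse" -y "$1.gif"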
Here is another script from https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality
#!/bin/sh
ffmpeg -i "$1" -vf "fps=10,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 "$2"
Resizing/Scaling
ffmpeg -i input.avi -vf scale=320:240 output.avi
ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
- Resizing with transparent padding
Useful for generating logos
ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
- 256
ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
- 512
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
Rotation
To rotate 180 degrees, apply a 90-degree clockwise transpose twice:
ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
The transpose filter accepts the following values:
- 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
- 1 – Rotate by 90 degrees clockwise.
- 2 – Rotate by 90 degrees counter-clockwise.
- 3 – Rotate by 90 degrees clockwise and flip vertically.
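The values above can be combined into a small lookup for the three common rotations. This is a sketch; the helper name rotation_filter is hypothetical, and it only handles clockwise angles of 90, 180, and 270 degrees.

```shell
#!/usr/bin/env bash
# Map a clockwise rotation angle to a transpose filter chain.
rotation_filter() {
  case "$1" in
    90)  echo "transpose=1" ;;              # 90 degrees clockwise
    180) echo "transpose=1,transpose=1" ;;  # two clockwise quarter turns
    270) echo "transpose=2" ;;              # same as 90 degrees counter-clockwise
    *)   echo "unsupported angle: $1" >&2; return 1 ;;
  esac
}

# Usage: ffmpeg -i input.mp4 -vf "$(rotation_filter 270)" output.mp4
rotation_filter 180
```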
360 Video
See v360 filter
- Converting EAC to equirectangular
ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4
Sometimes you may run into errors where the height or width is not divisible by 2.
Apply a scale filter to fix this issue; a height of -2 keeps the aspect ratio and makes the computed height divisible by 2.
ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4
Removing Duplicate Frames
Useful for extracting frames from timelapses.
ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4
C API
A Doxygen reference manual for the C API is available at [2].
Getting Started
Structs
- AVInputFormat/AVOutputFormat: Represents a container type.
- AVFormatContext: Represents your specific container.
- AVStream: Represents a single audio, video, or data stream in your container.
- AVCodec: Represents a single codec (e.g. H.264).
- AVCodecContext: Represents your specific codec and contains all associated parameters.
- AVPacket: Compressed data.
- AVFrame: Decoded audio or video data.
- SwsContext: Used for image scaling, colorspace conversion, and pixel format conversion.
Pixel Formats
Pixel formats are stored as AVPixelFormat enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.
- AV_PIX_FMT_RGB24
  - This is your standard 24 bits per pixel RGB.
  - In your AVFrame, data[0] will contain your single buffer RGBRGBRGB.
  - The linesize is typically \(\displaystyle 3 * width\) bytes per row, with \(\displaystyle 3\) bytes per pixel.
- AV_PIX_FMT_YUV420P
  - This is a planar YUV pixel format with chroma subsampling.
  - Each pixel will have its own luma component (Y) but each \(\displaystyle 2 \times 2\) block of pixels will share chrominance components (U, V).
  - In your AVFrame, data[0] will contain your Y image, data[1] will contain your U image, and data[2] will contain your V image.
  - data[0] will typically be \(\displaystyle width * height\) bytes.
  - data[1] and data[2] will each typically be \(\displaystyle width * height / 4\) bytes.
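The sizes above can be checked with a little arithmetic. The sketch below assumes tightly packed buffers with no alignment padding, and the helper names rgb24_size and yuv420p_sizes are made up for illustration. For odd dimensions it rounds the chroma planes up, which is my assumption about how the subsampled size is derived.

```shell
#!/usr/bin/env bash
# Bytes in a tightly packed RGB24 frame: 3 bytes per pixel.
rgb24_size() {
  echo $(( 3 * $1 * $2 ))
}

# Bytes in the Y, U and V planes of a tightly packed YUV420P frame.
yuv420p_sizes() {
  local w=$1 h=$2
  # Each chroma plane is half the width and half the height, rounded up.
  local cw=$(( (w + 1) / 2 ))
  local ch=$(( (h + 1) / 2 ))
  echo "$(( w * h )) $(( cw * ch )) $(( cw * ch ))"
}

rgb24_size 1920 1080      # one 1080p RGB24 frame
yuv420p_sizes 1920 1080   # Y, U, V plane sizes for the same frame
```

For a 1920x1080 frame this gives 6220800 bytes for RGB24 and 2073600 + 518400 + 518400 = 3110400 bytes for YUV420P, half the size.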
Muxing to memory
You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.
NVENC
When encoding using NVENC, your codec_ctx->priv_data is a pointer to a NvencContext.
To list all of the things you can set in the private data, you can type the following in bash
ffmpeg -hide_banner -h encoder=h264_nvenc
Below is an example of setting up an h264_nvenc encoder context:
// Create the CUDA hardware device context
if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                  NULL, 0)) < 0) {
  cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
  return;
}
// Look up the NVENC H.264 encoder by name
if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
  cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
       << endl;
  return;
}
codec_ctx = avcodec_alloc_context3(codec);
codec_ctx->bit_rate = 2500000;
codec_ctx->width = source_codec_ctx->width;
codec_ctx->height = source_codec_ctx->height;
codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
codec_ctx->time_base = source_codec_ctx->time_base;
input_timebase = source_codec_ctx->time_base;
codec_ctx->framerate = source_codec_ctx->framerate;
codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
codec_ctx->max_b_frames = 0;
codec_ctx->delay = 0;
codec_ctx->gop_size = 0;
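// TODO: figure out which of these have no effect. Checking each av_opt_set
// return value helps: AVERROR_OPTION_NOT_FOUND means the encoder has no such option.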
av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
C++ API
FFmpeg does not have an official C++ API.
There are wrappers such as Raveler/ffmpeg-cpp which you can use.
However, I recommend just using the C API and wrapping things in smart pointers.
My Preferences
My preferences for encoding video
#!/bin/bash
ffmpeg -i "$1" -c:v libx265 -crf 28 -preset medium -c:a libopus -b:a 128K "$2"
- Notes
- You need to output to an MKV file