FFmpeg
FFmpeg (Fast Forward MPEG) is a library for encoding and decoding multimedia. You can interact with FFmpeg using their command-line interface or using their C API. I find it useful for converting videos to gifs. You can also extract videos into a sequence of images or vice-versa.
CLI
Basic usage is as follows:
ffmpeg -i input_file [-s resolution] [-b bitrate] [-ss start_second] [-t time] [-r output_framerate] output.mp4
x264
x264 is a software h264 decoder and encoder.
[1]
Changing Pixel Format
Encode to h264 with YUV420p pixel format
ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4
Images to Video
Reference
Assuming 60 images per second and you want a 30 fps video.
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4
Crop
ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename
Get a list of encoders/decoders
for i in encoders decoders filters; do
echo $i:; ffmpeg -hide_banner -${i} | egrep -i "npp|cuvid|nvenc|cuda"
done
PSNR/SSIM
Reference
FFmpeg can compare two videos and output the psnr or ssim numbers for each of the y, u, and v channels.
ffmpeg -i distorted.mp4 -i reference.mp4 \
-lavfi "ssim;[0:v][1:v]psnr" -f null –
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi ssim -f null -
C API
A doxygen reference manual for their C api is available at [2].
Getting Started
Structs
AVInputFormat
/AVOutputFormat
Represents a container type.AVFormatContext
Represents your specific container.AVStream
Represents a single audio, video, or data stream in your container.AVCodec
Represents a single codec (e.g. H.264)AVCodecContext
Represents your specific codec and contains all associated paramters.AVPacket
Compressed Data.AVFrame
Decoded audio or video data.SwsContext
Used for image scaling and colorspace and pixel format conversion operations.
Pixel Formats
Reference
Pixel formats are stored as AVPixelFormat
enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.
- AV_PIX_FMT_RGB24
- This is your standard 24 bits per pixel RGB.
- In your AVFrame, data[0] will contain your single buffer RGBRGBRGB.
- Where the linesize is typically \(\displaystyle 3 * width\) bytes per row and \(\displaystyle 3\) bytes per pixel.
- AV_PIX_FMT_YUV420P
- This is a planar YUV pixel format with chroma subsampling.
- Each pixel will have its own luma component (Y) but each \(\displaystyle 2 \times 2\) block of pixels will share chrominance components (U, V)
- In your AVFrame, data[0] will contain your Y image, data[1] will contain your .
- Data[0] will typically be \(\displaystyle width * height\) bytes.
- Data[1] and data[2] will typically be \(\displaystyle width * height / 4\) bytes.
Muxing to memory
You can specify a custom AVIOContext
and attach it to your AVFormatContext->pb
to mux directly to memory or to implement your own buffering.
NVENC
Options Reference
When encoding using NVENC, your codec_ctx->priv_data
is a pointer to a NvencContext
.
To list all of the things you can set in the private data, you can type the following in bash
ffmpeg -hide_banner -h encoder=h264_nvenc
if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
NULL, 0)) < 0) {
cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
return;
}
if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
<< endl;
return;
}
codec_ctx = avcodec_alloc_context3(codec);
codec_ctx->bit_rate = 2500000;
codec_ctx->width = source_codec_ctx->width;
codec_ctx->height = source_codec_ctx->height;
codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
codec_ctx->time_base = source_codec_ctx->time_base;
input_timebase = source_codec_ctx->time_base;
codec_ctx->framerate = source_codec_ctx->framerate;
codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
codec_ctx->max_b_frames = 0;
codec_ctx->delay = 0;
codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
C++ API
FFmpeg does not have an official C++ API.