FFmpeg: Difference between revisions

Revision as of 17:45, 29 October 2019

FFmpeg is a library for encoding and decoding multimedia. You can interact with FFmpeg using their command-line interface or using their C API. I find it useful for converting videos to gifs. You can also extract videos into a sequence of images or vice-versa.

CLI

Basic usage is as follows:

ffmpeg -i input_file [-s resolution] [-b bitrate] [-ss start_second] [-t time] [-r output_framerate] output.mp4

x264

x264 is a software h264 decoder and encoder.
[1]

Changing Pixel Format

Encode to h264 with YUV420p pixel format

ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4

Images to Video

Reference
Assuming 60 images per second and you want a 30 fps video.

ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4

Crop

ffmpeg -i input_filename -vf  "crop=w:h:x:y" output_filename

C API

A doxygen reference manual for their C api is available at [2].

Getting Started

Structs

AVFormat Represents a container type.
AVFormatContext Represents your specific container.
AVStream Represents a single audio, video, or data stream in your container.
AVCodec Represents a single codec (e.g. H.264)
AVCodecContext Represents your specific codec and contains all associated paramters.
AVFrame Decoded audio or video data.
SwsContext Used for image scaling and colorspace and pixel format conversion operations.

Muxing to memory

You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.

NVENC

Options Reference When encoding using NVENC, your codec_ctx->priv_data is a pointer to a NvencContext. To list all of the things you can set in the private data, you can type the following in bash

ffmpeg -hide_banner -h encoder=h264_nvenc

  if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                    NULL, 0)) < 0) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
    return;
  }

  if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
         << endl;
    return;
  }
  codec_ctx = avcodec_alloc_context3(codec);
  codec_ctx->bit_rate = 2500000;
  codec_ctx->width = source_codec_ctx->width;
  codec_ctx->height = source_codec_ctx->height;
  codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
  codec_ctx->time_base = source_codec_ctx->time_base;
  input_timebase = source_codec_ctx->time_base;
  codec_ctx->framerate = source_codec_ctx->framerate;
  codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
  codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
  codec_ctx->max_b_frames = 0;
  codec_ctx->delay = 0;
  codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
  av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
  av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
  av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
  av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
  av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
  av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);

@@ Line 33: / Line 33: @@
 ==C API==
 A doxygen reference manual for their C api is available at [https://ffmpeg.org/doxygen/trunk/index.html].
+===Getting Started===
+====Structs====
+* <code>AVFormat</code> Represents a container type.
+* <code>AVFormatContext</code> Represents your specific container.
+* <code>AVStream</code> Represents a single audio, video, or data stream in your container.
+* [https://www.ffmpeg.org/doxygen/trunk/structAVCodec.html <code>AVCodec</code>] Represents a single codec (e.g. H.264)
+* [https://www.ffmpeg.org/doxygen/trunk/structAVCodecContext.html <code>AVCodecContext</code>] Represents your specific codec and contains all associated paramters.
+* [https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html <code>AVFrame</code>] Decoded audio or video data.
+* <code>SwsContext</code> Used for image scaling and colorspace and pixel format conversion operations.
 ===Muxing to memory===