FFmpeg

From David's Wiki
Latest revision as of 14:26, 9 October 2023

FFmpeg (Fast Forward MPEG) is a library for encoding and decoding multimedia.

You can interact with FFmpeg using their command-line interface or using their C API.
Note that a lot of things involving just decoding or encoding can be done by calling their CLI application and piping things to stdin or from stdout.

CLI

You can download static builds of FFmpeg from
  • Linux: https://johnvansickle.com/ffmpeg/
  • Windows: https://ffmpeg.zeranoe.com/builds/

If you need nvenc support, you can build FFmpeg with https://github.com/markus-perl/ffmpeg-build-script.

Basic usage is as follows:

ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
  • Use -pattern_type glob for wildcards (e.g. all images in a folder)
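
The usage pattern above can be sketched as a small Python helper that assembles the argument list in the same order. The function name `build_basic_command` and its parameters are hypothetical, chosen only for illustration; the sketch builds the command and prints it without running FFmpeg.

```python
import shlex


def build_basic_command(input_file, output_file, start=None, duration=None,
                        resolution=None, framerate=None):
    """Build an ffmpeg argument list in the order shown in the usage line."""
    cmd = ["ffmpeg"]
    if start is not None:
        # Placed before -i, -ss acts as an input option and seeks in the input.
        cmd += ["-ss", str(start)]
    cmd += ["-i", input_file]
    if resolution is not None:
        cmd += ["-s", resolution]
    if duration is not None:
        cmd += ["-t", str(duration)]
    if framerate is not None:
        cmd += ["-r", str(framerate)]
    cmd.append(output_file)
    return cmd


cmd = build_basic_command("input.mkv", "output.mp4", start=30, duration=10,
                          resolution="1280x720", framerate=30)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments as a list avoids shell quoting problems with unusual file names.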

x264

x264 is a software H.264 encoder.
[1]

Changing Pixel Format

Encode to h264 with YUV420p pixel format

ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4

Images to Video

Reference
This assumes the images were captured at 60 per second and you want a 30 fps video.

# Make sure -framerate is before -i
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4
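
As a rough sanity check of what the two rates mean, the arithmetic can be sketched in Python. The helper name `images_to_video_stats` is hypothetical, and the output frame count is an approximation of what FFmpeg produces when it drops frames to go from the input rate to the output rate.

```python
def images_to_video_stats(num_images, input_framerate, output_framerate):
    """Return (duration in seconds, approximate number of output frames)."""
    # -framerate sets how long each input image is shown.
    duration = num_images / input_framerate
    # -r sets the output rate; frames are dropped or duplicated to match it.
    output_frames = round(duration * output_framerate)
    return duration, output_frames


duration, output_frames = images_to_video_stats(600, 60, 30)
print(duration, output_frames)
```

With 600 images, the video keeps its real-time length of 10 seconds while about half of the images are dropped.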

Video to Images

Extracting frames from a video

ffmpeg -i video.mp4 frames/%d.png
  • Use -ss H:M:S to specify where to start before you input the video
  • Use -vframes 1 to extract one frame
  • Use -vf "select=not(mod(n\,10))" to select every 10th frame

Get a list of encoders/decoders

Reference

for i in encoders decoders filters; do
    echo $i:; ffmpeg -hide_banner -${i} | egrep -i "npp|cuvid|nvenc|cuda"
done

PSNR/SSIM

Reference
FFmpeg can compare two videos and output the psnr or ssim numbers for each of the y, u, and v channels.

ffmpeg -i distorted.mp4 -i reference.mp4 \
       -lavfi "ssim;[0:v][1:v]psnr" -f null -

ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi  ssim -f null -
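
To make the PSNR number concrete, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), applied to two short runs of 8-bit samples. The function name `psnr` and the sample data are made up for illustration; FFmpeg's filter computes this per plane and per frame, so its exact averaging may differ from this sketch.

```python
import math


def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have no error
    return 10 * math.log10(max_value ** 2 / mse)


reference = bytes([100, 150, 200, 250])
distorted = bytes([101, 149, 202, 248])
print(round(psnr(reference, distorted), 2))
```

Higher values mean the distorted signal is closer to the reference; identical inputs give infinity.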

Generate Thumbnails

Reference
Below is a bash script that generates a thumbnail for every MP4 file in a folder

Script
#!/usr/bin/env bash

OUTPUT_FOLDER="thumbnails"

mkdir -p "$OUTPUT_FOLDER"
# select=gte(n\,300) passes frames from index 300 onward; -vframes 1 keeps only the first of them
for file in *.mp4;
  do ffmpeg -i "$file" -vf "select=gte(n\,300)" -vframes 1 "$OUTPUT_FOLDER/${file%.mp4}.png";
done

MP4 to GIF

Normally you can just do

ffmpeg -i my_video.mp4 my_video.gif

If you want better quality, you can use the following filter_complex:

[0]split=2[v1][v2];[v1]palettegen=stats_mode=full[palette];[v2][palette]paletteuse=dither=sierra2_4a
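
One way to use this graph is to pass it to -filter_complex in a full command. The sketch below only assembles and prints such a command; the helper name `build_gif_command` and the file names are placeholders.

```python
import shlex

# The palette graph from above: split the video, build a palette from one
# copy, then apply that palette to the other copy.
PALETTE_GRAPH = (
    "[0]split=2[v1][v2];"
    "[v1]palettegen=stats_mode=full[palette];"
    "[v2][palette]paletteuse=dither=sierra2_4a"
)


def build_gif_command(video, gif):
    """Return the ffmpeg argument list for a palette-based GIF conversion."""
    return ["ffmpeg", "-i", video, "-filter_complex", PALETTE_GRAPH, gif]


gif_cmd = build_gif_command("my_video.mp4", "my_video.gif")
print(shlex.join(gif_cmd))
```

`shlex.join` quotes the graph so the printed line can be pasted into a shell as-is.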

Here is another script from https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality

mp4 to gif script
#!/bin/sh
ffmpeg -i "$1" -vf "fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 "$2"

Pipe to stdout

Below is an example of piping the video only to stdout:

ffmpeg -i video.webm -pix_fmt rgb24 -f rawvideo -

In Python, you can read it as follows:

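import subprocess

import numpy as np

video_width = 1920
video_height = 1080
ffmpeg_command = ["ffmpeg", "-i", "video.webm",
                  "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]
# Discard stderr so FFmpeg's log output cannot fill an unread pipe and stall.
ffmpeg_process = subprocess.Popen(ffmpeg_command,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.DEVNULL)
# One rgb24 frame is width * height * 3 bytes.
raw_image = ffmpeg_process.stdout.read(
              video_width * video_height * 3)
image = (np.frombuffer(raw_image, dtype=np.uint8)
           .reshape(video_height, video_width, 3))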
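
The snippet above reads a single frame. A minimal sketch of reading every frame is a generator that pulls fixed-size chunks until the stream ends. The name `iter_raw_frames` is hypothetical, and the example feeds it an in-memory stream instead of a real FFmpeg process so it runs on its own.

```python
import io


def iter_raw_frames(stream, width, height, bytes_per_pixel=3):
    """Yield one frame's worth of bytes at a time from a raw video stream."""
    frame_size = width * height * bytes_per_pixel
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:
            break  # end of stream, or a truncated final frame
        yield chunk


# Stand-in for ffmpeg_process.stdout: four 2x3 rgb24 frames plus 5 stray bytes.
fake_stdout = io.BytesIO(bytes(2 * 3 * 3 * 4) + b"\x00" * 5)
frames = list(iter_raw_frames(fake_stdout, width=2, height=3))
print(len(frames), len(frames[0]))
```

Each yielded chunk can be passed to `np.frombuffer` and reshaped to (height, width, 3) as in the snippet above.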

Filters

Filters are part of the CLI
https://ffmpeg.org/ffmpeg-filters.html

Crop

ffmpeg -i input_filename -vf  "crop=w:h:x:y" output_filename
  • Here x and y are the coordinates of the top-left corner of your crop. w and h are the width and height of the final image or video.
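
As an illustration of how the four values fit together, the sketch below computes a centered crop for a known frame size. The helper name `center_crop_filter` is hypothetical; it only builds the filter string.

```python
def center_crop_filter(frame_width, frame_height, crop_width, crop_height):
    """Return a crop filter string for a crop centered in the frame."""
    if crop_width > frame_width or crop_height > frame_height:
        raise ValueError("crop must fit inside the frame")
    # x and y are the top-left corner of the crop rectangle.
    x = (frame_width - crop_width) // 2
    y = (frame_height - crop_height) // 2
    return f"crop={crop_width}:{crop_height}:{x}:{y}"


crop = center_crop_filter(1920, 1080, 1280, 720)
print(crop)
```

The resulting string is what goes after -vf in the command above.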

Resizing/Scaling

FFmpeg Scaling
scale filter

ffmpeg -i input.avi -vf scale=320:240 output.avi

ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
  • If the aspect ratio is not what you expect, try using the setdar filter.
    • E.g. setdar=ratio=2/1
Resizing with transparent padding

Useful for generating logos

ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
More sizes


256
ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
512
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png

Rotation

transpose filter

To rotate 180 degrees

ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
  • 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
  • 1 – Rotate by 90 degrees clockwise.
  • 2 – Rotate by 90 degrees counter-clockwise.
  • 3 – Rotate by 90 degrees clockwise and flip vertically.
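
Using only the transpose values listed above, a plain rotation by a multiple of 90 degrees can be mapped to a filter chain. This is a small sketch; the function name `rotation_filter` is made up for illustration.

```python
def rotation_filter(degrees_clockwise):
    """Return a transpose filter chain for a clockwise rotation, or None for 0."""
    chains = {
        0: None,                          # no rotation needed
        90: "transpose=1",                # 90 degrees clockwise
        180: "transpose=1,transpose=1",   # two clockwise quarter turns
        270: "transpose=2",               # 90 degrees counter-clockwise
    }
    angle = degrees_clockwise % 360
    if angle not in chains:
        raise ValueError("only multiples of 90 degrees are supported")
    return chains[angle]


for angle in (90, 180, -90):
    print(angle, rotation_filter(angle))
```

Negative angles wrap around, so -90 maps to the counter-clockwise value.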

360 Video

See v360 filter

Converting EAC to equirectangular

YouTube sometimes uses an equi-angular cubemap (EAC) format. You can convert this to the traditional equirectangular format as follows:

ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4

Sometimes you may run into errors where height or width is not divisible by 2.
Apply a scale filter to fix this issue.

ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4

Converting to rectilinear

ffmpeg -i input.mp4 -vf "v360=e:rectilinear:h_fov=90:v_fov=90" output.mp4

Metadata

To add 360 video metadata, you should use Google's spatial-media. This will add the following side data, which you can see using ffprobe:

Side data:
 spherical: equirectangular (0.000000/0.000000/0.000000) 

Removing Duplicate Frames

Reference
mpdecimate filter

Useful for extracting frames from timelapses.

ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4

Stack and Unstack

To stack, see hstack, vstack.
To unstack, see crop.
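
As a sketch of unstacking with crop, the helper below produces one crop filter per tile of a vertically stacked video. The name `unstack_vertical_filters` is hypothetical, and it assumes the stacked height divides evenly by the number of tiles.

```python
def unstack_vertical_filters(stacked_height, count=2):
    """Return crop filters that split a vertical stack into `count` tiles."""
    if stacked_height % count != 0:
        raise ValueError("stacked height must divide evenly by count")
    tile_height = stacked_height // count
    # iw keeps the full input width; y moves down one tile at a time.
    return [f"crop=iw:{tile_height}:0:{i * tile_height}" for i in range(count)]


filters = unstack_vertical_filters(2160, count=2)
print(filters)
```

Each string would be used with -vf in a separate command (or as a separate branch after split) to write one tile.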

Filter-Complex

Filter complex allows you to create a graph of filters.

Suppose you have 3 inputs: $1, $2, $3.
Then you can access them as streams [0], [1], [2].
The filter syntax lets you chain multiple filters, where each filter is an edge and each labeled stream is a vertex.
For example, [0]split[t1][t2] creates two vertices t1 and t2 from input 0. The last filter in your graph produces the output of your command:
E.g. [t1][t2]vstack

ffmpeg -i $1 -i $2 -i $3 -filter_complex "[0]split[t1][t2];[t1][t2]vstack" output.mkv -y
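
The example above only uses input 0. To stack all of the inputs instead, the graph can reference each input label in turn. The sketch below builds such a graph string with vstack's inputs option; the helper name `vstack_graph` is made up, and the inputs are assumed to share the same width.

```python
def vstack_graph(num_inputs):
    """Return a filter graph that stacks inputs [0]..[n-1] vertically."""
    if num_inputs < 2:
        raise ValueError("need at least two inputs to stack")
    # One label per input, e.g. [0][1][2].
    labels = "".join(f"[{i}]" for i in range(num_inputs))
    return f"{labels}vstack=inputs={num_inputs}"


graph = vstack_graph(3)
print(graph)
```

The printed string would replace the quoted -filter_complex argument in the command above.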

Concatenate Videos

ffmpeg -i part_1.mp4 \
    -i part_2.mp4 \
    -i part_3.mp4 \
    -filter_complex \
    "[0]scale=1920:1080[0s];\
     [1]scale=1920:1080[1s];\
     [2]scale=1920:1080[2s];\
     [0s][0:a][1s][1:a][2s][2:a]concat=n=3:v=1:a=1[v][a]" \
    -map "[v]" -map "[a]" \
    -vsync 2 \
    all_parts.mp4 -y
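
The graph above follows a regular pattern: scale every video stream, then interleave each scaled video with its audio before concat. A minimal sketch that generates the same graph for any number of parts follows; the function name `concat_graph` is hypothetical.

```python
def concat_graph(num_parts, width=1920, height=1080):
    """Return a filter graph that scales and concatenates num_parts inputs."""
    # One scale filter per input, labeled [0s], [1s], ...
    scales = [f"[{i}]scale={width}:{height}[{i}s]" for i in range(num_parts)]
    # concat expects video and audio interleaved per segment.
    pairs = "".join(f"[{i}s][{i}:a]" for i in range(num_parts))
    concat = f"{pairs}concat=n={num_parts}:v=1:a=1[v][a]"
    return ";".join(scales + [concat])


graph3 = concat_graph(3)
print(graph3)
```

For three parts this reproduces the graph used in the command above, written on a single line.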

Replace transparency

Reference
Add a background to transparent images.

ffmpeg -i in.mov -filter_complex \
       "[0]format=pix_fmts=yuva420p,split=2[bg][fg];[bg]drawbox=c=white@1:replace=1:t=fill[bg];[bg][fg]overlay=format=auto" \
       -c:a copy new.mov

Draw Text

https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg

ffmpeg -i input -vf "drawtext=fontfile=Arial.ttf: text='%{frame_num}': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=20: box=1: boxcolor=white: boxborderw=5" -c:a copy output

C API

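A Doxygen reference manual for the C API is available at https://ffmpeg.org/doxygen/trunk/index.html.
Note that FFmpeg is licensed under the LGPL by default; builds that enable GPL components (e.g. libx264) are covered by the GPL.
If you only need to do encoding and decoding, you can simply pipe the inputs and outputs of the FFmpeg CLI to your program: https://batchloaf.wordpress.com/2017/02/12/a-simple-way-to-read-and-write-audio-and-video-files-in-c-using-ffmpeg-part-2-video/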

Getting Started

The best way to get started is to look at the official examples.

Structs

  • AVInputFormat/AVOutputFormat Represents a container type.
  • AVFormatContext Represents your specific container.
  • AVStream Represents a single audio, video, or data stream in your container.
  • AVCodec Represents a single codec (e.g. H.264)
  • AVCodecContext Represents your specific codec and contains all associated parameters (e.g. resolution, bitrate, fps).
  • AVPacket Compressed Data.
  • AVFrame Decoded audio or video data.
  • SwsContext Used for image scaling and colorspace and pixel format conversion operations.

Pixel Formats

Reference
Pixel formats are stored as AVPixelFormat enums.
Below are descriptions for a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.

AV_PIX_FMT_RGB24
  • This is your standard 24 bits per pixel RGB.
  • In your AVFrame, data[0] will contain your single buffer RGBRGBRGB.
  • Where the linesize is typically \(\displaystyle 3 * width\) bytes per row and \(\displaystyle 3\) bytes per pixel.
AV_PIX_FMT_YUV420P
  • This is a planar YUV pixel format with chroma subsampling.
  • Each pixel will have its own luma component (Y) but each \(\displaystyle 2 \times 2\) block of pixels will share chrominance components (U, V)
  • In your AVFrame, data[0] will contain your Y image, data[1] will contain your U image, and data[2] will contain your V image.
  • data[0] will typically be \(\displaystyle width * height\) bytes.
  • data[1] and data[2] will typically be \(\displaystyle width * height / 4\) bytes each.
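
The sizes described above can be checked with a little arithmetic. This sketch ignores alignment padding (so real linesizes and buffers may be larger), and the helper names are made up for illustration.

```python
def rgb24_size(width, height):
    """Bytes in a tightly packed RGB24 image: 3 bytes per pixel."""
    return width * height * 3


def yuv420p_plane_sizes(width, height):
    """Bytes in the Y, U and V planes of a tightly packed YUV420P image."""
    # Chroma planes are subsampled by 2 in each direction, rounded up.
    chroma_width = (width + 1) // 2
    chroma_height = (height + 1) // 2
    chroma_size = chroma_width * chroma_height
    return (width * height, chroma_size, chroma_size)


print(rgb24_size(1920, 1080))
print(yuv420p_plane_sizes(1920, 1080))
```

For a 1920x1080 frame, the three YUV420P planes together take half the space of the RGB24 buffer.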

Muxing to memory

You can specify a custom AVIOContext and attach it to your AVFormatContext->pb to mux directly to memory or to implement your own buffering.

NVENC

Options Reference

When encoding using NVENC, your codec_ctx->priv_data is a pointer to an NvencContext.

To list all of the things you can set in the private data, you can type the following in bash

ffmpeg -hide_banner -h encoder=h264_nvenc
NVENC Codec Ctx
  if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                    NULL, 0)) < 0) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
    return;
  }

  if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
         << endl;
    return;
  }
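  codec_ctx = avcodec_alloc_context3(codec);
  if (!codec_ctx) {
    cerr << "[VideoEncoder::VideoEncoder] Failed to allocate codec context"
         << endl;
    return;
  }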
  codec_ctx->bit_rate = 2500000;
  codec_ctx->width = source_codec_ctx->width;
  codec_ctx->height = source_codec_ctx->height;
  codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
  codec_ctx->time_base = source_codec_ctx->time_base;
  input_timebase = source_codec_ctx->time_base;
  codec_ctx->framerate = source_codec_ctx->framerate;
  codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
  codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
  codec_ctx->max_b_frames = 0;
  codec_ctx->delay = 0;
  codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
  av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
  av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
  av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
  av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
  av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
  av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);

C++ API

FFmpeg does not have an official C++ API.
There are wrappers such as Raveler/ffmpeg-cpp which you can use.
However, I recommend just using the C API and wrapping things in smart pointers.

Python API

You can try PyAV, which contains bindings for the library. However, I haven't tried it.
If you just need to call the CLI, you can use ffmpeg-python to help build calls.

JavaScript API

To use FFmpeg in a browser, see ffmpegwasm.
Note: I have not tried this. It uses CLI commands, not library API commands.

My Preferences

My preferences for encoding video

H264

I mostly use H264 when working on projects for compatibility purposes. Here I typically don't need the smallest file size or best quality, prioritizing encoding speed.

#!/bin/bash

ffmpeg -i "$1" -c:v libx264 -crf 28 -preset medium -pix_fmt yuv420p -c:a libfdk_aac -b:a 128K "$2"
Notes
  • MP4 is ok

H265/HEVC

H265/HEVC is used for archival purposes to minimize the file size and maximize the quality.

#!/bin/bash

ffmpeg -i "$1" -c:v libx265 -crf 23 -preset slow -pix_fmt yuv444p10le -c:a libopus -b:a 128K "$2"
Notes
  • You need to output to an MKV file
  • The pixel format yuv444p10le is 10 bit color without chroma subsampling. If your source is lower, you can use yuv420p instead for 8-bit color and 4:2:0 chroma subsampling.

AV1