FFmpeg
[https://ffmpeg.org/ FFmpeg] (Fast Forward MPEG) is a library for encoding and decoding multimedia.
I find it useful for converting videos to GIFs. You can also [https://en.wikibooks.org/wiki/FFMPEG_An_Intermediate_Guide/image_sequence extract videos into a sequence of images or vice-versa].
You can interact with FFmpeg using its command-line interface or its [https://ffmpeg.org/doxygen/trunk/index.html C API].
Note that many tasks that only involve decoding or encoding can be done by calling the CLI application and piping data to stdin or from stdout.


==CLI==
You can download static builds of FFmpeg from
* Linux: [https://johnvansickle.com/ffmpeg/ https://johnvansickle.com/ffmpeg/]
* Windows: [https://ffmpeg.zeranoe.com/builds/ https://ffmpeg.zeranoe.com/builds/]
If you need nvenc support, you can build FFmpeg with https://github.com/markus-perl/ffmpeg-build-script.
Basic usage is as follows:
<pre>
ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
</pre>
* Use <code>-pattern_type glob</code> for wildcards (e.g. all images in a folder)
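For example, a sketch of assembling a folder of PNGs into a video (the folder name and framerate are placeholders):
<syntaxhighlight lang="bash">
ffmpeg -framerate 30 -pattern_type glob -i 'frames/*.png' output.mp4
</syntaxhighlight>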


===x264===
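A reasonable starting point for encoding with libx264 (a sketch; lower <code>-crf</code> means higher quality, and slower presets compress better):
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -preset medium -pix_fmt yuv420p -c:a aac -b:a 128k output.mp4
</syntaxhighlight>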
===Video to Images===
Extracting frames from a video
<syntaxhighlight lang="bash">
ffmpeg -i video.mp4 frames/%d.png
</syntaxhighlight>
* Use <code>-ss H:M:S</code> to specify where to start before you input the video
* Use <code>-vframes 1</code> to extract one frame
* Use <code>-vf "select=not(mod(n\,10))"</code> to select every 10th frame
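For example, to grab a single frame 90 seconds into a video as a thumbnail (a sketch combining the flags above; filenames are placeholders):
<syntaxhighlight lang="bash">
ffmpeg -ss 0:01:30 -i video.mp4 -vframes 1 thumbnail.png
</syntaxhighlight>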


===Get a list of encoders/decoders===
<syntaxhighlight lang="bash">
ffmpeg -encoders
ffmpeg -decoders
</syntaxhighlight>


If you want better quality, you can use the following filter_complex:
<pre>
[0]split=2[v1][v2];[v1]palettegen=stats_mode=full[palette];[v2][palette]paletteuse=dither=sierra2_4a
</pre>
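Wrapped into a full command, this might look like the following (a sketch; filenames are placeholders):
<pre>
ffmpeg -i input.mp4 -filter_complex "[0]split=2[v1][v2];[v1]palettegen=stats_mode=full[palette];[v2][palette]paletteuse=dither=sierra2_4a" output.gif
</pre>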
 
Here is another script from [https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality]
{{hidden | mp4 to gif script |
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
#!/bin/sh
#!/bin/sh
start_time=0:0
ffmpeg -i $1 -vf "fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 $2
duration=17
</syntaxhighlight>
}}


===Pipe to stdout===
Below is an example of piping the video only to stdout:
<pre>
ffmpeg -i video.webm -pix_fmt rgb24 -f rawvideo -
</pre>


In Python, you can read it as follows:
<syntaxhighlight lang="python">
import subprocess

import numpy as np

video_width = 1920
video_height = 1080
ffmpeg_command = ["ffmpeg", "-i", "video.webm",
                  "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]
ffmpeg_process = subprocess.Popen(ffmpeg_command,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
# Each frame is width * height * 3 bytes of packed RGB.
raw_image = ffmpeg_process.stdout.read(
    video_width * video_height * 3)
image = (np.frombuffer(raw_image, dtype=np.uint8)
         .reshape(video_height, video_width, 3))
</syntaxhighlight>


==Filters==
Filters are part of the CLI<br>
[https://ffmpeg.org/ffmpeg-filters.html https://ffmpeg.org/ffmpeg-filters.html]


===Crop===
<syntaxhighlight lang="bash">
ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename
</syntaxhighlight>
* Here <code>x</code> and <code>y</code> are the coordinates of the top left corner of your crop. <code>w</code> and <code>h</code> are the width and height of the final image or video.


===Resizing/Scaling===
[https://trac.ffmpeg.org/wiki/Scaling FFMpeg Scaling]<br>
[https://ffmpeg.org/ffmpeg-filters.html#scale scale filter]
<syntaxhighlight lang="bash">
ffmpeg -i input.avi -vf scale=320:240 output.avi
ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
</syntaxhighlight>
* If the aspect ratio is not what you expect, try using the <code>setdar</code> filter.
** E.g. <code>setdar=ratio=2/1</code>
;Resizing with transparent padding
Useful for generating logos
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
{{hidden | More sizes |
;256
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
;512
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
}}
===Rotation===
[https://ffmpeg.org/ffmpeg-filters.html#transpose transpose filter]
To rotate 180 degrees
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
</syntaxhighlight>
* 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
* 1 – Rotate by 90 degrees clockwise.
* 2 – Rotate by 90 degrees counter-clockwise.
* 3 – Rotate by 90 degrees clockwise and flip vertically.
===360 Video===
See [https://ffmpeg.org/ffmpeg-filters.html#v360 v360 filter]
====Converting EAC to equirectangular====
YouTube sometimes uses an equi-angular cubemap (EAC) format. You can convert this to the traditional equirectangular format as follows:
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4
</pre>
Sometimes you may run into errors where height or width is not divisible by 2.<br>
Apply a scale filter to fix this issue.
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4
</pre>
====Converting to rectilinear====
<pre>
ffmpeg -i input.mp4 -vf "v360=e:rectilinear:h_fov=90:v_fov=90" output.mp4
</pre>
====Metadata====
To add 360 video metadata, you should use [https://github.com/google/spatial-media Google's spatial-media].
This will add the following side data, which you can see using <code>ffprobe</code>:
<pre>
Side data:
spherical: equirectangular (0.000000/0.000000/0.000000)
</pre>
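A hedged sketch of injecting the metadata with that tool (the exact invocation is an assumption; check the project's README):
<pre>
git clone https://github.com/google/spatial-media
cd spatial-media
python spatialmedia -i input.mp4 output_injected.mp4
</pre>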
===Removing Duplicate Frames===
[https://stackoverflow.com/questions/37088517/remove-sequentially-duplicate-frames-when-using-ffmpeg Reference]<br>
[https://ffmpeg.org/ffmpeg-filters.html#mpdecimate mpdecimate filter]
Useful for extracting frames from timelapses.
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4
</syntaxhighlight>
===Stack and Unstack===
To stack, see [https://ffmpeg.org/ffmpeg-all.html#hstack <code>hstack</code>], [https://ffmpeg.org/ffmpeg-all.html#vstack <code>vstack</code>]. 
To unstack, see <code>crop</code>.
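A sketch of both directions, assuming two equal-size inputs (filenames are placeholders):
<syntaxhighlight lang="bash">
# Stack two videos side by side.
ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack stacked.mp4
# Split the stacked video back into left and right halves.
ffmpeg -i stacked.mp4 \
    -filter_complex "[0]split[a][b];[a]crop=iw/2:ih:0:0[l];[b]crop=iw/2:ih:iw/2:0[r]" \
    -map "[l]" left_out.mp4 -map "[r]" right_out.mp4
</syntaxhighlight>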
===Filter-Complex===
Filter complex allows you to create a graph of filters.
Suppose you have 3 inputs: $1, $2, $3. 
Then you can access them as streams [0], [1], [2]. 
The filter syntax allows you to chain multiple filters, where each filter consumes labeled streams and produces new ones. 
For example, <code>[0]split[t1][t2]</code> creates two streams, t1 and t2, from input 0.
The last statement in the graph is the output of your command, 
e.g. <code>[t1][t2]vstack</code>
<pre>
ffmpeg -i $1 -i $2 -i $3 -filter_complex "[0]split[t1][t2];[t1][t2]vstack" output.mkv -y
</pre>
===Concatenate Videos===
<pre>
ffmpeg -i part_1.mp4 \
    -i part_2.mp4 \
    -i part_3.mp4 \
    -filter_complex \
    "[0]scale=1920:1080[0s];\
    [1]scale=1920:1080[1s];\
    [2]scale=1920:1080[2s];\
    [0s][0:a][1s][1:a][2s][2:a]concat=n=3:v=1:a=1[v][a]" \
    -map "[v]" -map "[a]" \
    -vsync 2 \
    all_parts.mp4 -y
</pre>
===Replace transparency===
[https://superuser.com/questions/1341674/ffmpeg-convert-transparency-to-a-certain-color Reference]<br>
Add a background to transparent images.<br>
<pre>
ffmpeg -i in.mov -filter_complex "[0]format=pix_fmts=yuva420p,split=2[bg][fg];[bg]drawbox=c=white@1:replace=1:t=fill[bg];[bg][fg]overlay=format=auto" -c:a copy new.mov
</pre>
===Draw Text===
https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
<pre>
ffmpeg -i input -vf "drawtext=fontfile=Arial.ttf: text='%{frame_num}': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=20: box=1: boxcolor=white: boxborderw=5" -c:a copy output
</pre>


==C API==
A doxygen reference manual for their C API is available at [https://ffmpeg.org/doxygen/trunk/index.html].<br>
Note that FFmpeg is licensed under the LGPL, or under the GPL if it is built with GPL components (e.g. libx264).<br>
If you only need to do encoding and decoding, you can simply pipe the inputs and outputs of the FFmpeg CLI to your program [https://batchloaf.wordpress.com/2017/02/12/a-simple-way-to-read-and-write-audio-and-video-files-in-c-using-ffmpeg-part-2-video/].<br>


===Getting Started===
The best way to get started is to look at the [https://ffmpeg.org/doxygen/trunk/examples.html official examples].
====Structs====
* [https://www.ffmpeg.org/doxygen/trunk/structAVInputFormat.html <code>AVInputFormat</code>]/[https://www.ffmpeg.org/doxygen/trunk/structAVOutputFormat.html <code>AVOutputFormat</code>] Represents a container type.
* [https://www.ffmpeg.org/doxygen/trunk/structAVStream.html <code>AVStream</code>] Represents a single audio, video, or data stream in your container.
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodec.html <code>AVCodec</code>] Represents a single codec (e.g. H.264)
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodecContext.html <code>AVCodecContext</code>] Represents your specific codec and contains all associated parameters (e.g. resolution, bitrate, fps).
* [https://www.ffmpeg.org/doxygen/trunk/structAVPacket.html <code>AVPacket</code>] Compressed data.
* [https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html <code>AVFrame</code>] Decoded audio or video data.
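A condensed sketch of how these structs fit together when decoding video (error handling mostly omitted; the official demuxing_decoding example is the authoritative version):
<syntaxhighlight lang="c++">
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}

AVFormatContext *fmt_ctx = nullptr;                        // container
avformat_open_input(&fmt_ctx, "video.mp4", nullptr, nullptr);
avformat_find_stream_info(fmt_ctx, nullptr);

int idx = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
AVStream *stream = fmt_ctx->streams[idx];                  // video stream
const AVCodec *codec = avcodec_find_decoder(stream->codecpar->codec_id);
AVCodecContext *codec_ctx = avcodec_alloc_context3(codec); // codec + params
avcodec_parameters_to_context(codec_ctx, stream->codecpar);
avcodec_open2(codec_ctx, codec, nullptr);

AVPacket *pkt = av_packet_alloc();                         // compressed data
AVFrame *frame = av_frame_alloc();                         // decoded data
while (av_read_frame(fmt_ctx, pkt) >= 0) {
  if (pkt->stream_index == idx) {
    avcodec_send_packet(codec_ctx, pkt);
    while (avcodec_receive_frame(codec_ctx, frame) >= 0) {
      // frame->data now holds decoded pixels
    }
  }
  av_packet_unref(pkt);
}
</syntaxhighlight>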
===Muxing to memory===
You can specify a custom <code>AVIOContext</code> and attach it to your <code>AVFormatContext->pb</code> to mux directly to memory or to implement your own buffering.
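A minimal sketch of the idea, assuming an existing <code>AVFormatContext *fmt_ctx</code> (the growable-buffer strategy here is just one choice):
<syntaxhighlight lang="c++">
extern "C" {
#include <libavformat/avformat.h>
}
#include <vector>

// Called by FFmpeg whenever it wants to write muxed bytes.
// (Newer FFmpeg versions declare buf as const uint8_t *.)
static int write_packet(void *opaque, uint8_t *buf, int buf_size) {
  auto *out = static_cast<std::vector<uint8_t> *>(opaque);
  out->insert(out->end(), buf, buf + buf_size);
  return buf_size;
}

std::vector<uint8_t> output;
constexpr int kBufSize = 4096;
uint8_t *avio_buf = static_cast<uint8_t *>(av_malloc(kBufSize));
AVIOContext *avio_ctx = avio_alloc_context(
    avio_buf, kBufSize, /*write_flag=*/1, &output,
    /*read_packet=*/nullptr, write_packet, /*seek=*/nullptr);
fmt_ctx->pb = avio_ctx;                 // attach before avformat_write_header()
fmt_ctx->flags |= AVFMT_FLAG_CUSTOM_IO;
</syntaxhighlight>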


===NVENC===
[https://superuser.com/questions/1296374/best-settings-for-ffmpeg-with-nvenc Options Reference]
When encoding using NVENC, your <code>codec_ctx->priv_data</code> is a pointer to a <code>NvencContext</code>.
To list all of the things you can set in the private data, you can type the following in bash:
<syntaxhighlight lang="bash">
# Assuming the h264_nvenc encoder; this prints its private options.
ffmpeg -h encoder=h264_nvenc
</syntaxhighlight>


{{ hidden | NVENC Codec Ctx |
<syntaxhighlight lang="c++">
<syntaxhighlight lang="c++">
   if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
   if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
Line 191: Line 342:
   av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
   av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
</syntaxhighlight>
</syntaxhighlight>
}}


==C++ API==
FFmpeg does not have an official C++ API.<br>
There are wrappers such as [https://github.com/Raveler/ffmpeg-cpp Raveler/ffmpeg-cpp] which you can use.<br>
However, I recommend just using the C API and wrapping things in smart pointers.
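For example, a small sketch of wrapping <code>AVFrame</code> in a <code>std::unique_ptr</code> with a custom deleter:
<syntaxhighlight lang="c++">
#include <memory>
extern "C" {
#include <libavutil/frame.h>
}

struct AVFrameDeleter {
  void operator()(AVFrame *f) const { av_frame_free(&f); }
};
using AVFramePtr = std::unique_ptr<AVFrame, AVFrameDeleter>;

AVFramePtr frame{av_frame_alloc()};  // av_frame_free runs on scope exit
</syntaxhighlight>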
 
==Python API==
You can try [https://github.com/PyAV-Org/PyAV pyav], which contains bindings for the library; however, I haven't tried it. 
If you just need to call the CLI, you can use [https://github.com/kkroening/ffmpeg-python ffmpeg-python] to help build calls.
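For example, a small sketch with ffmpeg-python (it assembles and runs an ffmpeg CLI invocation under the hood; filenames are placeholders):
<syntaxhighlight lang="python">
import ffmpeg  # pip install ffmpeg-python

(ffmpeg
 .input("input.mp4")
 .filter("scale", 320, -1)
 .output("output.mp4")
 .run())
</syntaxhighlight>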
 
==JavaScript API==
To use FFmpeg in a browser, see [https://ffmpegwasm.netlify.app/ ffmpegwasm]. 
This is used in https://davidl.me/apps/media/index.html.
 
==My Preferences==
My preferences for encoding video
 
===AV1===
Prefer AV1 for encoding video on modern devices.
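A sketch using SVT-AV1 (assumes an FFmpeg build with <code>libsvtav1</code> enabled; tune <code>-crf</code> and <code>-preset</code> to taste):
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -c:v libsvtav1 -crf 32 -preset 6 -c:a libopus -b:a 128k output.mkv
</syntaxhighlight>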
 
 
===H265/HEVC===
H265/HEVC is now a good tradeoff between size, quality, and compatibility.
HEVC decoding has been supported on devices since Android 5.0 (2014).
<syntaxhighlight lang="bash">
ffmpeg -i $1 -c:v libx265 -crf 23 -preset slow -pix_fmt yuv444p10le -c:a libopus -b:a 128K $2
</syntaxhighlight>
 
;Notes
* The pixel format <code>yuv444p10le</code> is 10 bit color without chroma subsampling. If your source is lower, you can use <code>yuv420p</code> instead for 8-bit color and 4:2:0 chroma subsampling.
 
===H264===
Use H264 if you need compatibility with very old or low-end devices.
<syntaxhighlight lang="bash">
ffmpeg -i $1 -c:v libx264 -crf 28 -preset medium -pix_fmt yuv420p -c:a libfdk_aac -b:a 128K $2
</syntaxhighlight>
 
===Opus===
 
For streaming:
<syntaxhighlight lang="bash">
ffmpeg -i input.wav -c:a libopus -b:a 96k output.opus
</syntaxhighlight>
 
See https://wiki.xiph.org/Opus_Recommended_Settings