FFmpeg

 
[https://ffmpeg.org/ FFmpeg] (Fast Forward MPEG) is a library for encoding and decoding multimedia.


You can interact with FFmpeg through its command-line interface or through its [https://ffmpeg.org/doxygen/trunk/index.html C API].
 
Note that many tasks that only involve decoding or encoding can be done by calling the CLI application and piping data to its stdin or from its stdout.
I find it useful for converting videos to GIFs. You can also [https://en.wikibooks.org/wiki/FFMPEG_An_Intermediate_Guide/image_sequence extract videos into a sequence of images or vice-versa].


==CLI==
* Linux: [https://johnvansickle.com/ffmpeg/ https://johnvansickle.com/ffmpeg/]
* Windows: [https://ffmpeg.zeranoe.com/builds/ https://ffmpeg.zeranoe.com/builds/]
If you need nvenc support, you can build FFmpeg with https://github.com/markus-perl/ffmpeg-build-script.


Basic usage is as follows:
<pre>
ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
</pre>
* Use <code>-pattern_type glob</code> for wildcards (e.g. all images in a folder)
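
If you call the CLI from a script, the basic usage template can be assembled as an argument list. The following is only a minimal sketch: the function name <code>build_ffmpeg_command</code> and its parameters are hypothetical, and it passes the bitrate as <code>-b:v</code> (video bitrate) rather than the bare <code>-b</code> shown in the template.

```python
import shlex


def build_ffmpeg_command(input_file, output_file, start=None, resolution=None,
                         bitrate=None, duration=None, framerate=None):
    """Build an argument list that follows the basic usage template."""
    cmd = ["ffmpeg"]
    if start is not None:
        cmd += ["-ss", str(start)]      # placed before -i, as in the template
    cmd += ["-i", input_file]
    if resolution is not None:
        cmd += ["-s", resolution]       # e.g. "1280x720"
    if bitrate is not None:
        cmd += ["-b:v", bitrate]        # video bitrate, e.g. "2M"
    if duration is not None:
        cmd += ["-t", str(duration)]
    if framerate is not None:
        cmd += ["-r", str(framerate)]
    cmd.append(output_file)
    return cmd


cmd = build_ffmpeg_command("input.mp4", "output.mp4", start=5, duration=10)
print(shlex.join(cmd))  # prints: ffmpeg -ss 5 -i input.mp4 -t 10 output.mp4
```

The list can be passed to <code>subprocess.run(cmd, check=True)</code>. Using a list instead of a single string avoids shell quoting problems with file names that contain spaces.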


===x264===

* Use <code>-vframes 1</code> to extract one frame
* Use <code>-vf "select=not(mod(n\,10))"</code> to select every 10th frame
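
To see which frames the <code>select</code> expression keeps, here is a small Python sketch. It assumes <code>n</code> is the zero-based frame index, so <code>not(mod(n\,10))</code> is true for frames 0, 10, 20, and so on. The helper names are hypothetical.

```python
def selected_frames(num_frames, step=10):
    """Return the zero-based frame indices for which mod(n, step) == 0."""
    return [n for n in range(num_frames) if n % step == 0]


def select_filter(step=10):
    """Build the -vf argument; the comma is escaped with a backslash so that
    ffmpeg does not read it as a separator between filters."""
    return f"select=not(mod(n\\,{step}))"


print(selected_frames(25))   # prints: [0, 10, 20]
print(select_filter(10))     # prints: select=not(mod(n\,10))
```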


===Get a list of encoders/decoders===
</syntaxhighlight>

If you want better quality, you can use the following filter_complex:
<pre>
[0]split=2[v1][v2];[v1]palettegen=stats_mode=full[palette];[v2][palette]paletteuse=dither=sierra2_4a
</pre>

Ruofei has a more advanced script that first generates a palette and then applies it:
{{hidden | Ruofei's MP4 to GIF |
<syntaxhighlight lang="bash">
#!/bin/sh
start_time=0:0
duration=17

palette="/tmp/palette.png"

filters="fps=15,scale=320:-1:flags=lanczos"

ffmpeg -v warning -ss $start_time -t $duration -i $1.mp4 -vf "$filters,palettegen" -y $palette
ffmpeg -v warning -ss $start_time -t $duration -i $1.mp4 -i $palette -lavfi "$filters [x]; [x][1:v] paletteuse" -y $1.gif
</syntaxhighlight>
}}

Here is another script from [https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality]:
{{hidden | mp4 to gif script |
<syntaxhighlight lang="bash">
#!/bin/sh
ffmpeg -i $1 -vf "fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 $2
</syntaxhighlight>
}}
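
The palette-based filter graphs above all have the same shape: preprocess, split, generate a palette from one copy, and apply it to the other. As a rough sketch, the graph string can be generated in Python. The function names and keyword arguments are hypothetical; the filter options used (<code>stats_mode</code>, <code>dither</code>, <code>flags=lanczos</code>) are the ones that appear in the commands above.

```python
def gif_filtergraph(fps=15, width=None, stats_mode=None, dither=None):
    """Build a single-pass palettegen/paletteuse graph for -vf."""
    pre = [f"fps={fps}"]
    if width is not None:
        # -1 lets ffmpeg pick the height that keeps the aspect ratio.
        pre.append(f"scale={width}:-1:flags=lanczos")
    palettegen = "palettegen"
    if stats_mode is not None:
        palettegen += f"=stats_mode={stats_mode}"
    paletteuse = "paletteuse"
    if dither is not None:
        paletteuse += f"=dither={dither}"
    # [s0] feeds the palette generator, [s1] is the video the palette is applied to.
    return (f"{','.join(pre)},split[s0][s1];"
            f"[s0]{palettegen}[p];[s1][p]{paletteuse}")


def gif_command(src, dst, **options):
    """Full argument list for converting src to a looping GIF."""
    return ["ffmpeg", "-i", src, "-vf", gif_filtergraph(**options),
            "-loop", "0", dst]


print(gif_filtergraph())
# prints: fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse
```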


===Pipe to stdout===
Below is an example of piping only the video to stdout:
<pre>
ffmpeg -i video.webm -pix_fmt rgb24 -f rawvideo -
</pre>

In Python, you can read it as follows (the width and height must match the video):
<syntaxhighlight lang="python">
import subprocess

import numpy as np

video_width = 1920
video_height = 1080
ffmpeg_command = ["ffmpeg", "-i", "video.webm",
                  "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]
# stderr is discarded so that ffmpeg's log output cannot fill an unread pipe.
ffmpeg_process = subprocess.Popen(ffmpeg_command,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.DEVNULL)
# Each rgb24 frame is width * height * 3 bytes; this reads the first frame.
raw_image = ffmpeg_process.stdout.read(
              video_width * video_height * 3)
image = (np.frombuffer(raw_image, dtype=np.uint8)
          .reshape(video_height, video_width, 3))
</syntaxhighlight>
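
To read every frame instead of only the first, you can keep reading fixed-size chunks until the stream ends. The following is a minimal sketch with a hypothetical helper, <code>iter_raw_frames</code>. It assumes a buffered binary stream (such as the default <code>stdout</code> of <code>subprocess.Popen</code>) whose <code>read(n)</code> only returns fewer than <code>n</code> bytes at end of stream. An in-memory stream stands in for ffmpeg here so that the example runs on its own.

```python
import io


def iter_raw_frames(stream, width, height, channels=3):
    """Yield one bytes object per frame from a raw video byte stream."""
    frame_size = width * height * channels
    while True:
        raw = stream.read(frame_size)
        if len(raw) < frame_size:
            # End of stream, or a truncated final frame that is dropped.
            break
        yield raw


# Stand-in for ffmpeg_process.stdout: two 2x1 RGB frames plus one stray byte.
fake_stdout = io.BytesIO(bytes(range(12)) + b"\x00")
frames = list(iter_raw_frames(fake_stdout, width=2, height=1))
print(len(frames))  # prints: 2
```

Each yielded chunk can be turned into an image with <code>np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)</code>, as in the example above.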


==Filters==
Filters are part of the CLI<br>
[https://ffmpeg.org/ffmpeg-filters.html https://ffmpeg.org/ffmpeg-filters.html]

===Crop===
<syntaxhighlight lang="bash">
ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename
</syntaxhighlight>

* Here <code>x</code> and <code>y</code> are the coordinates of the top-left corner of your crop. <code>w</code> and <code>h</code> are the width and height of the final image or video.
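
Because the argument order is <code>w:h:x:y</code> rather than <code>x:y:w:h</code>, it is easy to mix up. A small helper can build the string and check that the rectangle fits inside the frame. This is only a sketch; <code>crop_filter</code> and its bounds check are hypothetical, not part of FFmpeg.

```python
def crop_filter(x, y, w, h, frame_w, frame_h):
    """Return a crop filter string for a w x h box whose top-left corner is (x, y)."""
    if w <= 0 or h <= 0:
        raise ValueError("crop width and height must be positive")
    if x < 0 or y < 0 or x + w > frame_w or y + h > frame_h:
        raise ValueError("crop rectangle falls outside the frame")
    # ffmpeg expects width and height first, then the top-left corner.
    return f"crop={w}:{h}:{x}:{y}"


# Cut a 640x480 region starting at (10, 20) out of a 1920x1080 frame.
print(crop_filter(10, 20, 640, 480, frame_w=1920, frame_h=1080))
# prints: crop=640:480:10:20
```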


===Resizing/Scaling===
ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
</syntaxhighlight>
* If the aspect ratio is not what you expect, try using the <code>setdar</code> filter.
** E.g. <code>setdar=ratio=2/1</code>


;Resizing with transparent padding

See [https://ffmpeg.org/ffmpeg-filters.html#v360 v360 filter]


====Converting EAC to equirectangular====
YouTube sometimes uses the equi-angular cubemap (EAC) format. You can convert this to the traditional equirectangular format as follows:
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4
</pre>
The same conversion with an added <code>scale=iw:-2</code> filter, which keeps the width and picks a height that preserves the aspect ratio and is divisible by 2:
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4
</pre>
====Converting to rectilinear====
<pre>
ffmpeg -i input.mp4 -vf "v360=e:rectilinear:h_fov=90:v_fov=90" output.mp4
</pre>
====Metadata====
To add 360 video metadata, you should use [https://github.com/google/spatial-media Google's spatial-media].
This will add the following side data, which you can see using <code>ffprobe</code>:
<pre>
Side data:
spherical: equirectangular (0.000000/0.000000/0.000000)
</pre>
</pre>


ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4
</syntaxhighlight>
===Stack and Unstack===
To stack, see [https://ffmpeg.org/ffmpeg-all.html#hstack <code>hstack</code>], [https://ffmpeg.org/ffmpeg-all.html#vstack <code>vstack</code>]. 
To unstack, see <code>crop</code>.
===Filter-Complex===
Filter complex allows you to create a graph of filters.
Suppose you have 3 inputs: $1, $2, $3. 
Then you can access them as streams [0], [1], [2]. 
The filter syntax allows you to chain multiple filters, where each filter is an edge and each labeled stream is a vertex. 
For example, <code>[0]split[t1][t2]</code> creates two vertices t1 and t2 from input 0.
The last statement in your graph will be the output of your command: 
E.g. <code>[t1][t2]vstack</code>
<pre>
ffmpeg -i $1 -i $2 -i $3 -filter_complex "[0]split[t1][t2];[t1][t2]vstack" output.mkv -y
</pre>
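
As an illustration of how the stream labels compose, the sketch below generates a graph that stacks all inputs side by side or on top of each other using the <code>inputs</code> option of <code>hstack</code>/<code>vstack</code>. The function name and the <code>[out]</code> label are hypothetical, and the inputs are assumed to have matching dimensions along the stacking axis.

```python
def stack_filtergraph(num_inputs, direction="h"):
    """Build a filter_complex string that stacks inputs 0..num_inputs-1."""
    if num_inputs < 2:
        raise ValueError("stacking needs at least two inputs")
    filter_name = {"h": "hstack", "v": "vstack"}[direction]
    # One label per input: [0][1][2]...
    labels = "".join(f"[{i}]" for i in range(num_inputs))
    return f"{labels}{filter_name}=inputs={num_inputs}[out]"


def stack_command(inputs, output, direction="h"):
    """Full argument list; the labeled result is selected with -map."""
    cmd = ["ffmpeg"]
    for path in inputs:
        cmd += ["-i", path]
    cmd += ["-filter_complex", stack_filtergraph(len(inputs), direction),
            "-map", "[out]", output]
    return cmd


print(stack_filtergraph(3))  # prints: [0][1][2]hstack=inputs=3[out]
```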
===Concatenate Videos===
<pre>
ffmpeg -i part_1.mp4 \
    -i part_2.mp4 \
    -i part_3.mp4 \
    -filter_complex \
    "[0]scale=1920:1080[0s];\
    [1]scale=1920:1080[1s];\
    [2]scale=1920:1080[2s];\
    [0s][0:a][1s][1:a][2s][2:a]concat=n=3:v=1:a=1[v][a]" \
    -map "[v]" -map "[a]" \
    -vsync 2 \
    all_parts.mp4 -y
</pre>
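
The graph above grows by one <code>scale</code> line and one <code>[Ns][N:a]</code> pair per part, so it can be generated for any number of files. The following Python sketch uses hypothetical function names, assumes every part has both a video and an audio stream, and leaves out the <code>-vsync 2</code> flag from the command above.

```python
def concat_filtergraph(num_parts, width=1920, height=1080):
    """Scale every video stream to a common size, then concat video and audio."""
    scales = [f"[{i}]scale={width}:{height}[{i}s]" for i in range(num_parts)]
    # concat expects its inputs interleaved: video 0, audio 0, video 1, audio 1, ...
    pairs = "".join(f"[{i}s][{i}:a]" for i in range(num_parts))
    concat = f"{pairs}concat=n={num_parts}:v=1:a=1[v][a]"
    return ";".join(scales + [concat])


def concat_command(parts, output, width=1920, height=1080):
    """Full argument list for concatenating the given files into output."""
    cmd = ["ffmpeg"]
    for path in parts:
        cmd += ["-i", path]
    cmd += ["-filter_complex", concat_filtergraph(len(parts), width, height),
            "-map", "[v]", "-map", "[a]", output, "-y"]
    return cmd


print(concat_filtergraph(2))
# prints: [0]scale=1920:1080[0s];[1]scale=1920:1080[1s];[0s][0:a][1s][1:a]concat=n=2:v=1:a=1[v][a]
```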
===Replace transparency===
[https://superuser.com/questions/1341674/ffmpeg-convert-transparency-to-a-certain-color Reference]<br>
Add a background to transparent images.<br>
<pre>
ffmpeg -i in.mov -filter_complex "[0]format=pix_fmts=yuva420p,split=2[bg][fg];[bg]drawbox=c=white@1:replace=1:t=fill[bg];[bg][fg]overlay=format=auto" -c:a copy new.mov
</pre>
===Draw Text===
https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
<pre>
ffmpeg -i input -vf "drawtext=fontfile=Arial.ttf: text='%{frame_num}': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=20: box=1: boxcolor=white: boxborderw=5" -c:a copy output
</pre>


==C API==
A doxygen reference manual for their C API is available at [https://ffmpeg.org/doxygen/trunk/index.html].<br>
Note that FFmpeg is licensed under the LGPL by default; builds that enable GPL-licensed components (e.g. libx264) fall under the GPL.<br>
If you only need to do encoding and decoding, you can simply pipe the inputs and outputs of the FFmpeg CLI to your program [https://batchloaf.wordpress.com/2017/02/12/a-simple-way-to-read-and-write-audio-and-video-files-in-c-using-ffmpeg-part-2-video/].<br>


===Getting Started===
The best way to get started is to look at the [https://ffmpeg.org/doxygen/trunk/examples.html official examples].
====Structs====
* [https://www.ffmpeg.org/doxygen/trunk/structAVInputFormat.html <code>AVInputFormat</code>]/[https://www.ffmpeg.org/doxygen/trunk/structAVOutputFormat.html <code>AVOutputFormat</code>] Represents a container type.
* [https://www.ffmpeg.org/doxygen/trunk/structAVStream.html <code>AVStream</code>] Represents a single audio, video, or data stream in your container.
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodec.html <code>AVCodec</code>] Represents a single codec (e.g. H.264)
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodecContext.html <code>AVCodecContext</code>] Represents your specific codec and contains all associated parameters (e.g. resolution, bitrate, fps).
* [https://www.ffmpeg.org/doxygen/trunk/structAVPacket.html <code>AVPacket</code>] Compressed data.
* [https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html <code>AVFrame</code>] Decoded audio or video data.
===Muxing to memory===
You can specify a custom <code>AVIOContext</code> and attach it to your <code>AVFormatContext->pb</code> to mux directly to memory or to implement your own buffering.


===NVENC===
[https://superuser.com/questions/1296374/best-settings-for-ffmpeg-with-nvenc Options Reference]
When encoding using NVENC, your <code>codec_ctx->priv_data</code> is a pointer to a <code>NvencContext</code>.
To list all of the things you can set in the private data, you can type the following in bash
<syntaxhighlight lang="bash">
</syntaxhighlight>


{{ hidden | NVENC Codec Ctx |
<syntaxhighlight lang="c++">
   if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
   av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
</syntaxhighlight>
}}


==C++ API==
There are wrappers such as [https://github.com/Raveler/ffmpeg-cpp Raveler/ffmpeg-cpp] which you can use.<br>
However, I recommend just using the C API and wrapping things in smart pointers.
==Python API==
You can try [https://github.com/PyAV-Org/PyAV PyAV], which contains bindings for the library. However, I haven't tried it. 
If you just need to call the CLI, you can use [https://github.com/kkroening/ffmpeg-python ffmpeg-python] to help build calls.
==JavaScript API==
To use FFmpeg in a browser, see [https://ffmpegwasm.netlify.app/ ffmpegwasm]. 
This is used in https://davidl.me/apps/media/index.html.


==My Preferences==
My preferences for encoding video
===AV1===
Prefer AV1 for encoding video on modern devices.
===H265/HEVC===
H265/HEVC is now a good tradeoff between size, quality, and compatibility.
This has been supported on devices since Android 5.0 (2014).
<syntaxhighlight lang="bash">
#!/bin/bash
ffmpeg -i $1 -c:v libx265 -crf 23 -preset slow -pix_fmt yuv444p10le -c:a libopus -b:a 128K $2
</syntaxhighlight>

;Notes
* The pixel format <code>yuv444p10le</code> is 10-bit color without chroma subsampling. If your source is lower, you can use <code>yuv420p</code> instead for 8-bit color and 4:2:0 chroma subsampling.
* You need to output to an MKV file.

===H264===
Use H264 if you need compatibility with very old and low-end devices.
<syntaxhighlight lang="bash">
ffmpeg -i $1 -c:v libx264 -crf 28 -preset medium -pix_fmt yuv420p -c:a libfdk_aac -b:a 128K $2
</syntaxhighlight>

===Opus===
For streaming:
<syntaxhighlight lang="bash">
ffmpeg -i input.wav -c:a libopus -b:a 96k output.opus
</syntaxhighlight>

See https://wiki.xiph.org/Opus_Recommended_Settings