[https://ffmpeg.org/ FFmpeg] (Fast Forward MPEG) is a library for encoding and decoding multimedia.
You can interact with FFmpeg using their command-line interface or their [https://ffmpeg.org/doxygen/trunk/index.html C API].
Note that many tasks that only involve decoding or encoding can be done by calling their CLI application and piping data to stdin or from stdout.
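For example, here is a minimal sketch of feeding raw frames to FFmpeg over stdin; <code>some_program</code> is a hypothetical producer of raw 1920x1080 rgb24 frames, so adjust the size, pixel format, and framerate to match your data.
<syntaxhighlight lang="bash">
# Read raw rgb24 frames from stdin and encode them to H.264.
some_program | ffmpeg -f rawvideo -pix_fmt rgb24 -s 1920x1080 -r 30 -i - -c:v libx264 -pix_fmt yuv420p output.mp4
</syntaxhighlight>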
==CLI==
You can download static builds of FFmpeg from:
* Linux: [https://johnvansickle.com/ffmpeg/ https://johnvansickle.com/ffmpeg/]
* Windows: [https://ffmpeg.zeranoe.com/builds/ https://ffmpeg.zeranoe.com/builds/]
If you need NVENC support, you can build FFmpeg with https://github.com/markus-perl/ffmpeg-build-script.
Basic usage is as follows:
<pre>
ffmpeg [-ss start_second] -i input_file [-s resolution] [-b bitrate] [-t time] [-r output_framerate] output.mp4
</pre>
* Use <code>-pattern_type glob</code> for wildcards (e.g. all images in a folder); see the sketch below.
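For example, a quick sketch that encodes every PNG in the current folder (sorted by name) into a 30 fps video; the filenames and codec settings are placeholders:
<syntaxhighlight lang="bash">
ffmpeg -framerate 30 -pattern_type glob -i '*.png' -c:v libx264 -pix_fmt yuv420p out.mp4
</syntaxhighlight>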
===x264===
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p output.mp4
</syntaxhighlight>
===Images to Video===
[https://en.wikibooks.org/wiki/FFMPEG_An_Intermediate_Guide/image_sequence Reference]<br>
Assuming you have 60 images per second and want a 30 fps video:
<syntaxhighlight lang="bash">
# Make sure -framerate is before -i
ffmpeg -framerate 60 -i image-%03d.png -r 30 video.mp4
</syntaxhighlight>
===Video to Images===
Extracting frames from a video:
<syntaxhighlight lang="bash">
ffmpeg -i video.mp4 frames/%d.png
</syntaxhighlight>
* Use <code>-ss H:M:S</code> before the input to specify where to start
* Use <code>-vframes 1</code> to extract one frame
* Use <code>-vf "select=not(mod(n\,10))"</code> to select every 10th frame (see the sketch below)
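For example, a sketch combining these options; the paths are placeholders and the <code>frames/</code> folder must already exist:
<syntaxhighlight lang="bash">
# Seek to 30 seconds in and grab a single frame.
ffmpeg -ss 0:00:30 -i video.mp4 -vframes 1 frame.png
# Keep every 10th frame; -vsync vfr drops the timestamps of the discarded frames.
ffmpeg -i video.mp4 -vf "select=not(mod(n\,10))" -vsync vfr frames/%d.png
</syntaxhighlight>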
===Get a list of encoders/decoders===
[https://superuser.com/questions/1236275/how-can-i-use-crf-encoding-with-nvenc-in-ffmpeg Reference]
<syntaxhighlight lang="bash">
for i in encoders decoders filters; do
    echo $i:; ffmpeg -hide_banner -${i} | egrep -i "npp|cuvid|nvenc|cuda"
done
</syntaxhighlight>
===PSNR/SSIM===
[https://github.com/stoyanovgeorge/ffmpeg/wiki/How-to-Compare-Video Reference]<br>
FFmpeg can compare two videos and output the PSNR or SSIM numbers for each of the Y, U, and V channels.<br>
<syntaxhighlight lang="bash">
ffmpeg -i distorted.mp4 -i reference.mp4 \
  -lavfi "ssim;[0:v][1:v]psnr" -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi psnr -f null -
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi ssim -f null -
</syntaxhighlight>
===Generate Thumbnails===
[https://superuser.com/questions/1099491/batch-extract-thumbnails-with-ffmpeg Reference]<br>
Below is a bash script which generates a thumbnail for every mp4 in a folder.
{{hidden|Script|
<syntaxhighlight lang="bash">
#!/usr/bin/env bash
OUTPUT_FOLDER="thumbnails"
mkdir -p "$OUTPUT_FOLDER"
for file in *.mp4;
  do ffmpeg -i "$file" -vf "select=gte(n\,300)" -vframes 1 "$OUTPUT_FOLDER/${file%.mp4}.png";
done
</syntaxhighlight>
}}
===MP4 to GIF===
Normally you can just do:
<syntaxhighlight lang="bash">
ffmpeg -i my_video.mp4 my_video.gif
</syntaxhighlight>
If you want better quality, you can use the following filter_complex:
<pre>
[0]split=2[v1][v2];[v1]palettegen=stats_mode=full[palette];[v2][palette]paletteuse=dither=sierra2_4a
</pre>
Here is another script from [https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality https://superuser.com/questions/556029/how-do-i-convert-a-video-to-gif-using-ffmpeg-with-reasonable-quality].
{{hidden | mp4 to gif script |
<syntaxhighlight lang="bash">
#!/bin/sh
ffmpeg -i "$1" -vf "fps=15,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 "$2"
</syntaxhighlight>
}}
===Pipe to stdout===
Below is an example of piping only the video to stdout:
<pre>
ffmpeg -i video.webm -pix_fmt rgb24 -f rawvideo -
</pre>
In Python, you can read it as follows:
<syntaxhighlight lang="python">
import subprocess
import numpy as np

video_width = 1920
video_height = 1080
ffmpeg_command = ["ffmpeg", "-i", "video.webm",
                  "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]
ffmpeg_process = subprocess.Popen(ffmpeg_command,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
# Each rgb24 frame is width * height * 3 bytes.
raw_image = ffmpeg_process.stdout.read(
    video_width * video_height * 3)
image = (np.frombuffer(raw_image, dtype=np.uint8)
         .reshape(video_height, video_width, 3))
</syntaxhighlight>
==Filters==
Filters are part of the CLI.<br>
[https://ffmpeg.org/ffmpeg-filters.html https://ffmpeg.org/ffmpeg-filters.html]
===Crop===
<syntaxhighlight lang="bash">
ffmpeg -i input_filename -vf "crop=w:h:x:y" output_filename
</syntaxhighlight>
* Here <code>x</code> and <code>y</code> specify the top-left corner of your crop, and <code>w</code> and <code>h</code> are the width and height of the final image or video; see the example below.
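For instance, a sketch cropping a 640x480 region; if <code>x</code> and <code>y</code> are omitted, the crop filter centers the crop:
<syntaxhighlight lang="bash">
# Crop a 640x480 region starting at the top-left corner.
ffmpeg -i input.mp4 -vf "crop=640:480:0:0" corner.mp4
# With x and y omitted, the crop is centered.
ffmpeg -i input.mp4 -vf "crop=640:480" centered.mp4
</syntaxhighlight>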
===Resizing/Scaling===
[https://trac.ffmpeg.org/wiki/Scaling FFmpeg Scaling]<br>
[https://ffmpeg.org/ffmpeg-filters.html#scale scale filter]
<syntaxhighlight lang="bash">
ffmpeg -i input.avi -vf scale=320:240 output.avi
ffmpeg -i input.jpg -vf scale=iw*2:ih input_double_width.png
</syntaxhighlight>
* If the aspect ratio is not what you expect, try using the <code>setdar</code> filter.
** E.g. <code>setdar=ratio=2/1</code> (see the sketch below)
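For example, a sketch that scales to 320x240 and then forces a 4:3 display aspect ratio; the sizes and ratio are placeholders:
<syntaxhighlight lang="bash">
ffmpeg -i input.avi -vf "scale=320:240,setdar=ratio=4/3" output.avi
</syntaxhighlight>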
;Resizing with transparent padding
Useful for generating logos:
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=128:w=128:force_original_aspect_ratio=decrease,pad=128:128:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
{{hidden | More sizes |
;256
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=256:w=256:force_original_aspect_ratio=decrease,pad=256:256:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
;512
<syntaxhighlight lang="bash">
ffmpeg -i icon.svg -vf "scale=h=512:w=512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" -y icon.png
</syntaxhighlight>
}}
===Rotation===
[https://ffmpeg.org/ffmpeg-filters.html#transpose transpose filter]<br>
To rotate 180 degrees:
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" output.mp4
</syntaxhighlight>
* 0 – Rotate by 90 degrees counter-clockwise and flip vertically.
* 1 – Rotate by 90 degrees clockwise.
* 2 – Rotate by 90 degrees counter-clockwise.
* 3 – Rotate by 90 degrees clockwise and flip vertically.
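For example, to rotate a video by 90 degrees clockwise:
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -vf "transpose=1" output.mp4
</syntaxhighlight>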
===360 Video===
See the [https://ffmpeg.org/ffmpeg-filters.html#v360 v360 filter].
====Converting EAC to equirectangular====
YouTube sometimes uses an EAC (equi-angular cubemap) format. You can convert this to the traditional equirectangular format as follows:
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e" output.mp4
</pre>
Sometimes you may run into errors where the height or width is not divisible by 2.<br>
Apply a scale filter to fix this issue.
<pre>
ffmpeg -i input.mp4 -vf "v360=eac:e,scale=iw:-2" output.mp4
</pre>
====Converting to rectilinear====
<pre>
ffmpeg -i input.mp4 -vf "v360=e:rectilinear:h_fov=90:v_fov=90" output.mp4
</pre>
====Metadata====
To add 360 video metadata, you should use [https://github.com/google/spatial-media Google's spatial-media].
This will add the following side data, which you can see using <code>ffprobe</code>:
<pre>
Side data:
  spherical: equirectangular (0.000000/0.000000/0.000000)
</pre>
===Removing Duplicate Frames===
[https://stackoverflow.com/questions/37088517/remove-sequentially-duplicate-frames-when-using-ffmpeg Reference]<br>
[https://ffmpeg.org/ffmpeg-filters.html#mpdecimate mpdecimate filter]<br>
Useful for extracting frames from timelapses.
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4
</syntaxhighlight>
===Stack and Unstack===
To stack, see [https://ffmpeg.org/ffmpeg-all.html#hstack <code>hstack</code>] and [https://ffmpeg.org/ffmpeg-all.html#vstack <code>vstack</code>].
To unstack, see <code>crop</code>. A sketch of both is below.
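As a sketch (the inputs are assumed to share the same height and pixel format), the first command stacks two videos side by side and the second recovers the left half with <code>crop</code>:
<syntaxhighlight lang="bash">
# Stack left.mp4 and right.mp4 horizontally.
ffmpeg -i left.mp4 -i right.mp4 -filter_complex "[0:v][1:v]hstack" stacked.mp4
# Cut the stacked video back into its left half.
ffmpeg -i stacked.mp4 -vf "crop=iw/2:ih:0:0" left_again.mp4
</syntaxhighlight>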
===Filter-Complex===
Filter complex allows you to create a graph of filters.
Suppose you have 3 inputs: $1, $2, $3.
Then you can access them as streams [0], [1], [2].
The filter syntax allows you to chain multiple filters, where each filter is an edge.
For example, <code>[0]split[t1][t2]</code> creates two vertices (streams) [t1] and [t2] from input [0].
The last statement in your graph will be the output of your command,
e.g. <code>[t1][t2]vstack</code>.
<pre>
ffmpeg -i $1 -i $2 -i $3 -filter_complex "[0]split[t1][t2];[t1][t2]vstack" output.mkv -y
</pre>
===Concatenate Videos===
<pre>
ffmpeg -i part_1.mp4 \
  -i part_2.mp4 \
  -i part_3.mp4 \
  -filter_complex \
"[0]scale=1920:1080[0s];\
[1]scale=1920:1080[1s];\
[2]scale=1920:1080[2s];\
[0s][0:a][1s][1:a][2s][2:a]concat=n=3:v=1:a=1[v][a]" \
  -map "[v]" -map "[a]" \
  -vsync 2 \
  all_parts.mp4 -y
</pre>
===Replace transparency===
[https://superuser.com/questions/1341674/ffmpeg-convert-transparency-to-a-certain-color Reference]<br>
Add a background to transparent images.<br>
<pre>
ffmpeg -i in.mov -filter_complex "[0]format=pix_fmts=yuva420p,split=2[bg][fg];[bg]drawbox=c=white@1:replace=1:t=fill[bg];[bg][fg]overlay=format=auto" -c:a copy new.mov
</pre>
===Draw Text===
[https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg Reference]
<pre>
ffmpeg -i input -vf "drawtext=fontfile=Arial.ttf: text='%{frame_num}': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=20: box=1: boxcolor=white: boxborderw=5" -c:a copy output
</pre>
==C API==
A Doxygen reference manual for their C API is available at [https://ffmpeg.org/doxygen/trunk/index.html].<br>
Note that FFmpeg is licensed under the LGPL, with some optional components under the GPL.<br>
If you only need to do encoding and decoding, you can simply pipe the inputs and outputs of the FFmpeg CLI to your program [https://batchloaf.wordpress.com/2017/02/12/a-simple-way-to-read-and-write-audio-and-video-files-in-c-using-ffmpeg-part-2-video/].<br>
===Getting Started===
The best way to get started is to look at the [https://ffmpeg.org/doxygen/trunk/examples.html official examples].
====Structs====
* [https://www.ffmpeg.org/doxygen/trunk/structAVInputFormat.html <code>AVInputFormat</code>]/[https://www.ffmpeg.org/doxygen/trunk/structAVOutputFormat.html <code>AVOutputFormat</code>] Represents a container type.
* [https://www.ffmpeg.org/doxygen/trunk/structAVFormatContext.html <code>AVFormatContext</code>] Represents your specific container.
* [https://www.ffmpeg.org/doxygen/trunk/structAVStream.html <code>AVStream</code>] Represents a single audio, video, or data stream in your container.
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodec.html <code>AVCodec</code>] Represents a single codec (e.g. H.264).
* [https://www.ffmpeg.org/doxygen/trunk/structAVCodecContext.html <code>AVCodecContext</code>] Represents your specific codec and contains all associated parameters (e.g. resolution, bitrate, fps).
* [https://www.ffmpeg.org/doxygen/trunk/structAVPacket.html <code>AVPacket</code>] Compressed data.
* [https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html <code>AVFrame</code>] Decoded audio or video data.
* [https://www.ffmpeg.org/doxygen/trunk/structSwsContext.html <code>SwsContext</code>] Used for image scaling, colorspace, and pixel format conversion operations.
====Pixel Formats====
[https://www.ffmpeg.org/doxygen/4.0/pixfmt_8h.html Reference]<br>
Pixel formats are stored as <code>AVPixelFormat</code> enums.<br>
Below are descriptions of a few common pixel formats.
Note that the exact sizes of buffers may vary depending on alignment.
;AV_PIX_FMT_RGB24
* This is your standard 24 bits per pixel RGB.
* In your AVFrame, data[0] will contain a single buffer of interleaved RGB values: RGBRGBRGB.
* The linesize is typically <math>3 * width</math> bytes per row, i.e. <math>3</math> bytes per pixel.
;AV_PIX_FMT_YUV420P
* This is a planar YUV pixel format with chroma subsampling.
* Each pixel has its own luma component (Y), but each <math>2 \times 2</math> block of pixels shares the chrominance components (U, V).
* In your AVFrame, data[0] will contain your Y plane, while data[1] and data[2] will contain your U and V planes.
* data[0] will typically be <math>width * height</math> bytes.
* data[1] and data[2] will typically be <math>width * height / 4</math> bytes each.
===Muxing to memory===
You can specify a custom <code>AVIOContext</code> and attach it to your <code>AVFormatContext->pb</code> to mux directly to memory or to implement your own buffering.
===NVENC===
[https://superuser.com/questions/1296374/best-settings-for-ffmpeg-with-nvenc Options Reference]<br>
When encoding using NVENC, your <code>codec_ctx->priv_data</code> is a pointer to a <code>NvencContext</code>.
To list all of the options you can set in the private data, run the following in bash:
<syntaxhighlight lang="bash">
ffmpeg -hide_banner -h encoder=h264_nvenc
</syntaxhighlight>
{{ hidden | NVENC Codec Ctx |
<syntaxhighlight lang="c++">
if ((ret = av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_CUDA, NULL,
                                  NULL, 0)) < 0) {
  cerr << "[VideoEncoder::VideoEncoder] Failed to create hw context" << endl;
  return;
}
if (!(codec = avcodec_find_encoder_by_name("h264_nvenc"))) {
  cerr << "[VideoEncoder::VideoEncoder] Failed to find h264_nvenc encoder"
       << endl;
  return;
}
codec_ctx = avcodec_alloc_context3(codec);
codec_ctx->bit_rate = 2500000;
codec_ctx->width = source_codec_ctx->width;
codec_ctx->height = source_codec_ctx->height;
codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
codec_ctx->time_base = source_codec_ctx->time_base;
input_timebase = source_codec_ctx->time_base;
codec_ctx->framerate = source_codec_ctx->framerate;
codec_ctx->pix_fmt = AV_PIX_FMT_CUDA;
codec_ctx->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
codec_ctx->max_b_frames = 0;
codec_ctx->delay = 0;
codec_ctx->gop_size = 0;
// Todo: figure out which ones of these do nothing
av_opt_set(codec_ctx->priv_data, "cq", "23", AV_OPT_SEARCH_CHILDREN);
av_opt_set(codec_ctx->priv_data, "preset", "llhp", 0);
av_opt_set(codec_ctx->priv_data, "tune", "zerolatency", 0);
av_opt_set(codec_ctx->priv_data, "look_ahead", "0", 0);
av_opt_set(codec_ctx->priv_data, "zerolatency", "1", 0);
av_opt_set(codec_ctx->priv_data, "nb_surfaces", "0", 0);
</syntaxhighlight>
}}
==C++ API==
FFmpeg does not have an official C++ API.<br>
There are wrappers such as [https://github.com/Raveler/ffmpeg-cpp Raveler/ffmpeg-cpp] which you can use.<br>
However, I recommend just using the C API and wrapping things in smart pointers.
==Python API==
You can try [https://github.com/PyAV-Org/PyAV pyav], which contains bindings for the library; however, I haven't tried it.
If you just need to call the CLI, you can use [https://github.com/kkroening/ffmpeg-python ffmpeg-python] to help build calls.
==JavaScript API==
To use FFmpeg in a browser, see [https://ffmpegwasm.netlify.app/ ffmpegwasm].
This is used in https://davidl.me/apps/media/index.html.
==My Preferences==
My preferences for encoding video:
===AV1===
Prefer AV1 for encoding video on modern devices.
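For example, a sketch using the SVT-AV1 encoder; treat the <code>-crf</code> and <code>-preset</code> values as placeholders to tune, and note that <code>-crf</code> support for <code>libsvtav1</code> requires a reasonably recent FFmpeg:
<syntaxhighlight lang="bash">
ffmpeg -i input.mp4 -c:v libsvtav1 -crf 30 -preset 6 -c:a libopus -b:a 128k output.mkv
</syntaxhighlight>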
===H265/HEVC===
H.265/HEVC is now a good tradeoff between size, quality, and compatibility.
It has been supported on devices since Android 5.0 (2014).
<syntaxhighlight lang="bash">
ffmpeg -i $1 -c:v libx265 -crf 23 -preset slow -pix_fmt yuv444p10le -c:a libopus -b:a 128K $2
</syntaxhighlight>
;Notes
* The pixel format <code>yuv444p10le</code> is 10-bit color without chroma subsampling. If your source is lower quality, you can use <code>yuv420p</code> instead for 8-bit color and 4:2:0 chroma subsampling.
===H264===
Use this if you need compatibility with very old and low-end devices.
<syntaxhighlight lang="bash">
ffmpeg -i $1 -c:v libx264 -crf 28 -preset medium -pix_fmt yuv420p -c:a libfdk_aac -b:a 128K $2
</syntaxhighlight>
===Opus===
For streaming:
<syntaxhighlight lang="bash">
ffmpeg -i input.wav -c:a libopus -b:a 96k output.opus
</syntaxhighlight>
See https://wiki.xiph.org/Opus_Recommended_Settings