
Unity is a game engine and game development framework. It has a gentle learning curve and is very capable, supporting most popular platforms, including AR and VR.

Shaders

Unity shaders are written in HLSL. Unity supports the standard vertex, geometry, and fragment shader pipeline.
Unity also has its own higher-level variant called surface shaders, which handle lighting automatically.
Compute shaders are useful for doing parallel computation on the GPU.
Results from compute shaders can be used by the graphics shaders without being copied back to the CPU.

Compute Shaders

To use a compute shader, add a ComputeShader reference to your C# script. To copy data to and from the GPU, use a ComputeBuffer. You can copy standard float arrays as well as arrays of Unity Vector2, Vector3, and Vector4 structs.
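As a rough illustration of the round trip described above, here is a minimal sketch. It assumes a compute shader asset containing a kernel named "CSMain" declared with [numthreads(64,1,1)] that writes to an RWStructuredBuffer<float3> named "positions". The kernel name, buffer name, and element count are all hypothetical.

```csharp
using UnityEngine;

// Sketch only: "CSMain" and "positions" are assumed names that must match
// whatever is declared in your own .compute file.
public class ComputeExample : MonoBehaviour
{
    public ComputeShader shader;        // assigned in the Inspector

    const int Count = 1024;
    const int ThreadsPerGroup = 64;     // must match numthreads in the kernel

    void Start()
    {
        var data = new Vector3[Count];

        // The stride is the size of one element in bytes (3 floats per Vector3).
        var buffer = new ComputeBuffer(Count, sizeof(float) * 3);
        buffer.SetData(data);                           // CPU -> GPU

        int kernel = shader.FindKernel("CSMain");
        shader.SetBuffer(kernel, "positions", buffer);

        // Round up so every element is covered by a thread group.
        int groups = (Count + ThreadsPerGroup - 1) / ThreadsPerGroup;
        shader.Dispatch(kernel, groups, 1, 1);

        buffer.GetData(data);                           // GPU -> CPU, waits for the GPU
        buffer.Release();                               // must be released explicitly
        Debug.Log(data[0]);
    }
}
```

Reading results back with GetData stalls until the GPU has finished, so it is best avoided when the data is only needed by other shaders.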

Optimization

Loading Files

All loading of streaming assets should be done in a background thread. See C# Multithreading for more details on multithreading.
I've found that Unity's job system doesn't perform as well as C#'s ThreadPool when stressed with thousands of small tasks, so I recommend using C# APIs over Unity APIs whenever possible.
Ciela Spike's Thread Ninja may be used to run coroutines in the background when you only need a single, but complex, task.
Much of Unity's API for GPU assets, such as new Mesh() and new Texture2D(), cannot be called from background threads. I suggest caching textures in system memory as byte[] or Color32[], which can then be uploaded on the main thread with Texture2D.LoadRawTextureData() (for byte[]) or Texture2D.SetPixels32() (for Color32[]). Similarly, you can hold a mesh in system memory by using a struct of vertices, triangles, normals, and tangents.
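The split between background file IO and main-thread texture creation might look something like the sketch below. It assumes a desktop platform where StreamingAssets is an ordinary folder, and a hypothetical file "example.raw" holding uncompressed RGBA32 pixels of a known size. It also relies on Unity resuming the code after an await on the main thread when the method was started there.

```csharp
using System.IO;
using System.Threading.Tasks;
using UnityEngine;

// Sketch only: the file name, dimensions, and pixel format are assumptions.
public class TextureLoader : MonoBehaviour
{
    public string fileName = "example.raw";
    public int width = 256;
    public int height = 256;

    async void Start()
    {
        // Resolve the path on the main thread before handing work to the pool.
        string path = Path.Combine(Application.streamingAssetsPath, fileName);

        // The file read runs on a ThreadPool thread.
        byte[] pixels = await Task.Run(() => File.ReadAllBytes(path));

        // Back on the main thread, where GPU assets may be created.
        // The byte count must equal width * height * 4 for RGBA32.
        var texture = new Texture2D(width, height, TextureFormat.RGBA32, false);
        texture.LoadRawTextureData(pixels);
        texture.Apply();                                // upload to the GPU

        GetComponent<Renderer>().material.mainTexture = texture;
    }
}
```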

Instancing

If you need multiple instances of the same object, make sure that GPU instancing is enabled on the material so that all of the instances can be drawn in a single draw call.
See Unity: GPUInstancing.
Do not attach scripts to thousands of objects. Instead use one script to control all instanced objects.
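One way to combine both points, a single controlling script and an instanced material, is to skip GameObjects entirely and submit the instances with Graphics.DrawMeshInstanced. The sketch below is one possible arrangement, not the only one. The grid layout and instance count are arbitrary, and the material is assumed to have "Enable GPU Instancing" ticked.

```csharp
using UnityEngine;

// Sketch only: one script draws every instance, with no per-object components.
public class InstancedDrawer : MonoBehaviour
{
    public Mesh mesh;
    public Material material;           // GPU instancing must be enabled on it

    const int Count = 1000;             // at most 1023 matrices per call
    readonly Matrix4x4[] matrices = new Matrix4x4[Count];

    void Start()
    {
        // Lay the instances out on a 32-wide grid (arbitrary placement).
        for (int i = 0; i < Count; i++)
        {
            var position = new Vector3(i % 32, 0f, i / 32);
            matrices[i] = Matrix4x4.TRS(position, Quaternion.identity, Vector3.one * 0.5f);
        }
    }

    void Update()
    {
        // The draw request only lasts one frame, so it is issued every frame.
        Graphics.DrawMeshInstanced(mesh, 0, material, matrices, Count);
    }
}
```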

Merging Meshes

If you need thousands of simple objects, one approach is to merge all of the meshes.
For simple objects, you can create a mesh with one vertex per object and expand each vertex into the full shape in the geometry shader. Otherwise, merging the meshes on the CPU is more efficient.
Both methods will ensure all objects are drawn in a single draw call. Be aware of the limitations of using a single draw call, though: transparency can become tricky because the triangles inside one mesh are not depth-sorted against each other.
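For the CPU route, Unity's Mesh.CombineMeshes does most of the work. The following is a minimal sketch that assumes every source object shares one material and that the object holding this script has a MeshRenderer using that material. The sources array is a hypothetical field filled in the Inspector.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Sketch only: merges the listed meshes into one mesh on this object.
[RequireComponent(typeof(MeshFilter))]
public class MeshMerger : MonoBehaviour
{
    public MeshFilter[] sources;        // hypothetical: the objects to merge

    void Start()
    {
        var combine = new CombineInstance[sources.Length];
        for (int i = 0; i < sources.Length; i++)
        {
            combine[i].mesh = sources[i].sharedMesh;
            // Bake each object's transform, expressed relative to this object.
            combine[i].transform =
                transform.worldToLocalMatrix * sources[i].transform.localToWorldMatrix;
            sources[i].gameObject.SetActive(false);     // hide the original
        }

        var merged = new Mesh();
        // The default 16-bit index format caps a mesh at 65,535 vertices.
        merged.indexFormat = IndexFormat.UInt32;
        merged.CombineMeshes(combine, true, true);      // one submesh, apply matrices

        GetComponent<MeshFilter>().sharedMesh = merged;
    }
}
```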

After merging a mesh, you can use a compute shader to animate and move the objects instead of using Unity scripts.
The compute shader edits a single compute buffer which can be read by your vertex or geometry shader without passing data back to the CPU.
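On the C# side, sharing the buffer amounts to binding the same ComputeBuffer to both the compute shader and the material. The sketch below assumes a kernel named "Animate" with [numthreads(64,1,1)], a float uniform named "elapsed", and a buffer named "offsets" declared in both the compute shader and the material's shader. All of these names are hypothetical.

```csharp
using UnityEngine;

// Sketch only: the compute shader writes per-object offsets each frame and
// the material's vertex or geometry shader reads them; nothing is read back.
public class BufferAnimator : MonoBehaviour
{
    public ComputeShader shader;
    public Material material;           // material used by the merged mesh

    const int Count = 4096;             // number of merged objects (assumed)
    const int ThreadsPerGroup = 64;     // must match numthreads in the kernel

    ComputeBuffer offsets;
    int kernel;

    void Start()
    {
        offsets = new ComputeBuffer(Count, sizeof(float) * 3);
        offsets.SetData(new Vector3[Count]);            // start with zero offsets

        kernel = shader.FindKernel("Animate");
        shader.SetBuffer(kernel, "offsets", offsets);   // written by the compute shader
        material.SetBuffer("offsets", offsets);         // read by the graphics shader
    }

    void Update()
    {
        shader.SetFloat("elapsed", Time.time);
        shader.Dispatch(kernel, (Count + ThreadsPerGroup - 1) / ThreadsPerGroup, 1, 1);
    }

    void OnDestroy()
    {
        offsets?.Release();
    }
}
```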

Resources