Notes on Depth of Field

"Distributed Ray Tracing" Cook 1984

http://artis.inrialpes.fr/Enseignement/TRSA/CookDistributed84.pdf

"Depth of Field: A Survey of Techniques" GPU Gems 1

Summary

CoC: light that spreads into a circle instead of converging to a point
- Anything that's not at this exact distance projects to a region (instead of to a point) on the film. This region is known as the circle of confusion (CoC).
An image looks in focus because the CoC is smaller than the resolution of the film
- there is a range of distances within which the CoC is smaller than the resolution of the film, photographers and cinematographers refer to this range as being in focus;
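
As a rough illustration of the two quotes above, the CoC diameter can be estimated with the standard thin-lens model and compared against the film resolution. The function names, parameter names, and the example numbers (a 50 mm lens at f/2 focused at 2 m, a 0.03 mm film resolution) are assumptions made for this sketch and do not come from the cited articles.

```python
def coc_diameter(focal_length, f_number, focus_dist, object_dist):
    """Thin-lens circle-of-confusion diameter on the film (same unit as the inputs)."""
    aperture = focal_length / f_number  # aperture diameter
    return (aperture * focal_length * abs(object_dist - focus_dist)
            / (object_dist * (focus_dist - focal_length)))


def in_focus(coc, film_resolution):
    """A point looks sharp while its CoC is smaller than what the film resolves."""
    return coc < film_resolution


# Hypothetical setup: all lengths in millimetres.
for dist in (1000.0, 1950.0, 2000.0, 2050.0, 4000.0):
    c = coc_diameter(50.0, 2.0, 2000.0, dist)
    print(dist, round(c, 4), in_focus(c, 0.03))
```

Only the distances close to the 2 m focus distance pass the `in_focus` test, which is the "range of distances" the quote describes.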
Common approaches
- (A) Ray tracing (B) Summing into an accumulation buffer (C) Splitting into layers
(D) Scatter approach: Forward-Mapped Z-Buffer Depth of Field
- Draw sprites
  - Using forward mapping (that is, rendering sprites) to approximate depth of field allows for a depth-of-field effect that works for arbitrary scenes.
- Blending method
  - The pixels are blended into the frame buffer as circles whose colors are the pixels' colors, whose diameter equals the CoC, and with alpha values inversely proportional to the circles' areas.
- Problem: a sprite disappears abruptly when its source point becomes hidden in the depth buffer
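
The blending rule quoted above (alpha inversely proportional to the circle's area) can be sketched as follows. The function name and the clamp to a one-pixel minimum diameter are assumptions for illustration; the article only states the proportionality.

```python
import math


def sprite_alpha(coc_diameter_px):
    """Alpha inversely proportional to the sprite's area.

    Assumption: a CoC of one pixel or less is drawn fully opaque.
    """
    radius = max(coc_diameter_px, 1.0) / 2.0
    unit_area = math.pi * 0.5 ** 2  # area of a circle one pixel in diameter
    return unit_area / (math.pi * radius ** 2)


for diameter in (1.0, 2.0, 4.0, 8.0):
    print(diameter, sprite_alpha(diameter))
```

With this choice, alpha times area stays constant, so a pixel contributes the same total energy whether it is drawn sharp or spread over a large circle.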

(E) Gather approach: Reverse-Mapped Z-Buffer Depth of Field
- Blur the color buffer according to Z, for example by using mipmaps
  - but instead of rendering blended sprites onto the screen, this technique blurs the rendered scene by varying amounts per pixel, depending on the depth found in the z-buffer.
- Downsampling the color buffer
  - Some techniques used the same texture bound multiple times with different mip-level biases, and some explicitly created new blurred textures, often with better down-sampling filters (3x3 or 5x5 rather than 2x2). These variants would typically perform three lookups: the original scene, the quarter-sized (1/4 x 1/4) texture, and the sixteenth-sized (1/16 x 1/16) texture, and blend between them based upon the pixel's z value. Instead of letting the hardware generate mipmaps, it's also possible to render-to-texture to each mipmap with better filter kernels to get better-quality mipmaps.
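
A minimal sketch of the three-lookup blend described above. The mapping from a normalized blur amount (derived from the pixel's z value) to the three weights is an assumption; the quoted text only says that the three images are blended based on z.

```python
def mip_blend_weights(blur):
    """Map blur in [0, 1] to weights for (full-res, 1/4 x 1/4, 1/16 x 1/16) lookups."""
    level = min(max(blur, 0.0), 1.0) * 2.0  # 0..2 across the three images
    w_full = max(0.0, 1.0 - level)
    w_sixteenth = max(0.0, level - 1.0)
    w_quarter = 1.0 - w_full - w_sixteenth
    return w_full, w_quarter, w_sixteenth


def blend(full, quarter, sixteenth, blur):
    """Blend three scalar samples of the same pixel with the weights above."""
    wf, wq, ws = mip_blend_weights(blur)
    return full * wf + quarter * wq + sixteenth * ws


for blur in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(blur, mip_blend_weights(blur))
```

The weights always sum to one, so a flat-colored region keeps its brightness regardless of the blur amount.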
Problems with the gather approach
- If the midground reads depth from only a single point, it never picks up the FG (the alien), so the FG (alien) blur does not spread over the midground
  - Because only a single z value is read to determine how much to blur the image (that is, which mip level to choose), a foreground (blurry) object won't blur out over midground (sharp) objects. See Figure 23-10.

f:id:gregory-igehy:20181031020431j:plain

- Read multiple depth values -> produces a halo effect
  - One solution is to read multiple depth values, although unless you can read a variable number of depth values, you will have a halo effect of decreasing blur of fixed radius, which is nearly as objectionable as the depth discontinuity itself.
- Blur the depth buffer according to the CoC
  - Another solution is to read a blurred depth buffer, but this causes artifacts that are just as objectionable: blurring along edges of sharp objects and sharpening along edges of blurry objects. Extra blur can be added to the whole scene by uniformly biasing the calculated CoC.

- Bilinear interpolation artifacts in the color buffer -> jittering?

f:id:gregory-igehy:20181031020410p:plain

- Color bleeding
  - When the color buffer is blurred indiscriminately, in-focus content bleeds into out-of-focus areas
  - Because the color image is blurred indiscriminately, areas in focus can incorrectly bleed into nearby areas out of focus.

f:id:gregory-igehy:20181031021017j:plain

"Practical Post-Process Depth of Field" GPU Gems 3

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch28.html

"Real-Time Depth of Field Simulation" Guennadi Riguer, Natalya Tatarchuk and John Isidoro. ShaderX2. 2003

http://www.realtimerendering.com/resources/shaderx/Introductions_and_Tutorials_with_DirectX_9.pdf

Terminology

When a point is far from the focal distance, its cone of light rays intersects the image plane in a conic section, which is approximated by a circle called the circle of confusion
- However, if a given point on an object in the scene is not near the focal distance, the cone of light rays will intersect the image plane in an area shaped like a conic section. Typically, the conic section is approximated by a circle called the circle of confusion.

Post-process DOF method (A): sampling with a circular kernel (gather approach)

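Pass 1: scene rendering
- VS: perform vertex lighting and also output the depth v.fDepth
- PS: compute the CoC in the pixel shader from the depth v.fDepth
  - Output color to MRT0; output linear depth, blurriness (from the CoC), 0, 0 to MRT1
Pass 2: gather DOF
- VS: pass the viewport coordinates and UVs used to look up the scene color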
- PS :

f:id:gregory-igehy:20181104170836p:plain

Problem 0: in-focus objects bleed into the blurred background, which produces halos around the in-focus objects
- One of the problems commonly associated with all post-filtering methods is leaking of color from sharp objects onto the blurry backgrounds. This results in faint halos around sharp objects, as can be seen on the left side of Figure 8.
Cause 0: with a large kernel, the blurred background picks up samples from in-focus objects
- The color leaking happens because the filter for the blurry background will sample color from the sharp object in the vicinity due to the large filter size.
Fix 0: do not use an in-focus sample if it lies in front of the blurred background (center) sample
- To solve this problem, we will discard the outer samples that can contribute to leaking according to the following criteria: If the outer sample is in focus and it is in front of the blurry center sample, it should not contribute to the blurred color.
Problem 1: popping occurs when objects move into or out of focus
- This can introduce a minor popping effect when objects go in or out of focus.
Fix 1: use the blurriness of the outer sample as its weight
- To combat sample popping, the outer sample blurriness factor is used as a sample weight to fade out its contribution gradually. The right side of Figure 8 shows a portion of a scene fragment with color leaking eliminated.

f:id:gregory-igehy:20181104171313p:plain
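
Fix 0 and Fix 1 can be combined into a single per-sample weight. The following is a CPU-side sketch, not the article's shader: the data layout (dicts with `color`, `depth`, `blur`), the scalar color, and the convention that a larger depth means farther away are all assumptions.

```python
def gather_pixel(center, samples):
    """Average the center with its kernel samples while rejecting leaking samples.

    center and each sample: {'color': float, 'depth': float, 'blur': float in [0, 1]}.
    """
    color_sum = center["color"]
    weight_sum = 1.0
    for s in samples:
        if s["depth"] >= center["depth"]:
            weight = 1.0        # sample is behind the center: always contributes
        else:
            weight = s["blur"]  # sample is in front: fade by its own blurriness
        color_sum += s["color"] * weight
        weight_sum += weight
    return color_sum / weight_sum


background = {"color": 0.0, "depth": 10.0, "blur": 1.0}
sharp_foreground = {"color": 1.0, "depth": 1.0, "blur": 0.0}
print(gather_pixel(background, [sharp_foreground]))  # the sharp sample is ignored
```

Because the weight is the continuous blurriness rather than a hard on/off test, a foreground sample that drifts out of focus fades in gradually instead of popping.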

Post-process DOF method (G): referencing a Gaussian-blurred image

Pass 1: scene rendering
- VS: vertex lighting, plus o.fBlur (blurriness computed from the near and far planes)
- PS: output vColor.rgb and v.fBlur to the color buffer

Pass 2: downsample to 1/2 x 1/2 with a bilinear filter

Pass 3: Gaussian blur along X
- 13 taps, covering 25 sample points
- VS: also outputs the texture coordinates of the taps
- PS: outputs float4(vColorSum, fWeightSum)

f:id:gregory-igehy:20181104173039p:plain
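
One common way for 13 taps to cover 25 texels is to place each side tap between two texels so that hardware bilinear filtering fetches both at once. The sketch below shows that general trick under stated assumptions: the Gaussian sigma and the helper names are hypothetical, and the shader in these notes additionally reweights each tap by its blurriness, which is not modelled here.

```python
import math


def gaussian_weights(radius, sigma):
    """Normalized weights for texel offsets 0..radius (one side of a symmetric kernel)."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(radius + 1)]
    total = raw[0] + 2.0 * sum(raw[1:])
    return [w / total for w in raw]


def merge_into_bilinear_taps(weights):
    """Fold texel pairs (1,2), (3,4), ... into one bilinear fetch each.

    Returns (offset_in_texels, weight) for the center and the positive side.
    """
    taps = [(0.0, weights[0])]
    for i in range(1, len(weights) - 1, 2):
        pair_weight = weights[i] + weights[i + 1]
        offset = (i * weights[i] + (i + 1) * weights[i + 1]) / pair_weight
        taps.append((offset, pair_weight))
    return taps


weights = gaussian_weights(radius=12, sigma=4.0)  # sigma is an assumed value
taps = merge_into_bilinear_taps(weights)
total_taps = 1 + 2 * (len(taps) - 1)              # center + mirrored side taps
print(total_taps, "taps cover", 2 * 12 + 1, "texels")
```

Each merged offset falls between its two texels, closer to the one with the larger Gaussian weight, so a single bilinear fetch reproduces the weighted sum of the pair.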

float4 filter_gaussian_x_ps(PS_INPUT_TEX7 v) : COLOR
{
// Samples
float4 s0, s1, s2, s3, s4, s5, s6;
float4 vWeights4;
float3 vWeights3;

// Accumulated color and weights
float3 vColorSum;
float fWeightSum;
// Sample taps with coordinates from VS
s0 = tex2D(renderTexture, v.vTap0);
s1 = tex2D(renderTexture, v.vTap1);
s2 = tex2D(renderTexture, v.vTap2);
s3 = tex2D(renderTexture, v.vTap3);
s4 = tex2D(renderTexture, v.vTap1Neg);
s5 = tex2D(renderTexture, v.vTap2Neg);
s6 = tex2D(renderTexture, v.vTap3Neg);

// Compute weights for first 4 samples (including center tap)
// by thresholding blurriness (in sample alpha)
// (.w holds the center-tap weight, matching vWeights0.w in the Y pass)
vWeights4.x = saturate(s1.a - vThresh0.x);
vWeights4.y = saturate(s2.a - vThresh0.y);
vWeights4.z = saturate(s3.a - vThresh0.z);
vWeights4.w = saturate(s0.a - vThresh0.w);

// Accumulate weighted samples, pairing each sample with its own weight
vColorSum = s0 * vWeights4.w + s1 * vWeights4.x +
s2 * vWeights4.y + s3 * vWeights4.z;
// Sum weights using DOT
fWeightSum = dot(vWeights4, 1);
// Compute weights for three remaining samples
vWeights3.x = saturate(s4.a - vThresh0.x);
vWeights3.y = saturate(s5.a - vThresh0.y);
vWeights3.z = saturate(s6.a - vThresh0.z);
// Accumulate weighted samples
vColorSum += s4 * vWeights3.x + s5 * vWeights3.y + s6 * vWeights3.z;
// Sum weights using DOT
fWeightSum += dot(vWeights3, 1);
//

// Compute weights for 3 samples
vWeights3.x = saturate(s3.a - vThresh1.x);
vWeights3.y = saturate(s4.a - vThresh1.y);
vWeights3.z = saturate(s5.a - vThresh1.z);
// Accumulate weighted samples
vColorSum += s3 * vWeights3.x + s4 * vWeights3.y + s5 * vWeights3.z;
// Sum weights using DOT
fWeightSum += dot(vWeights3, 1);
// Divide weighted sum of samples by sum of all weights
vColorSum /= fWeightSum;
// Color and weights sum output scaled (by 1/256)
// to fit values in 16 bit 0 to 1 range
return float4(vColorSum, fWeightSum) * 0.00390625;
}

Pass 4: Gaussian blur along Y

float4 filter_gaussian_y_ps(PS_INPUT_TEX7 v) : COLOR
{
// Samples
float4 s0, s1, s2, s3, s4, s5, s6;
// Accumulated color and weights
float4 vColorWeightSum;
// Sample taps with coordinates from VS
s0 = tex2D(blurredXTexture, v.vTap0);
s1 = tex2D(blurredXTexture, v.vTap1);
s2 = tex2D(blurredXTexture, v.vTap2);
s3 = tex2D(blurredXTexture, v.vTap3);
s4 = tex2D(blurredXTexture, v.vTap1Neg);
s5 = tex2D(blurredXTexture, v.vTap2Neg);
s6 = tex2D(blurredXTexture, v.vTap3Neg);
// Modulate sampled color values by the weights stored
// in the alpha channel of each sample
s0.rgb = s0.rgb * s0.a;
s1.rgb = s1.rgb * s1.a;
s2.rgb = s2.rgb * s2.a;
s3.rgb = s3.rgb * s3.a;
s4.rgb = s4.rgb * s4.a;
s5.rgb = s5.rgb * s5.a;
s6.rgb = s6.rgb * s6.a;

// Aggregate all samples weighting them with predefined
// kernel weights, weight sum in alpha
vColorWeightSum = s0 * vWeights0.w + (s1 + s4) * vWeights0.x + (s2 + s5) * vWeights0.y + (s3 + s6) * vWeights0.z;

// Compute tex coords for other taps
float2 vTap4 = v.vTap0 + vertTapOffs[4];
float2 vTap5 = v.vTap0 + vertTapOffs[5];
float2 vTap6 = v.vTap0 + vertTapOffs[6];
float2 vTap4Neg = v.vTap0 - vertTapOffs[4];
float2 vTap5Neg = v.vTap0 - vertTapOffs[5];
float2 vTap6Neg = v.vTap0 - vertTapOffs[6];
s0 = tex2D(blurredXTexture, vTap4);
s1 = tex2D(blurredXTexture, vTap5);
s2 = tex2D(blurredXTexture, vTap6);
s3 = tex2D(blurredXTexture, vTap4Neg);
s4 = tex2D(blurredXTexture, vTap5Neg);
s5 = tex2D(blurredXTexture, vTap6Neg);

// Modulate sampled color values by the weights stored
// in the alpha channel of each sample
s0.rgb = s0.rgb * s0.a;
s1.rgb = s1.rgb * s1.a;
s2.rgb = s2.rgb * s2.a;
s3.rgb = s3.rgb * s3.a;
s4.rgb = s4.rgb * s4.a;
s5.rgb = s5.rgb * s5.a;

// Aggregate all samples weighting them with predefined
// kernel weights, weight sum in alpha
vColorWeightSum += (s0 + s3) * vWeights1.x + (s1 + s4) * vWeights1.y + (s2 + s5) * vWeights1.z;
// Average combined sample for all samples in the kernel
vColorWeightSum.rgb /= vColorWeightSum.a;
// Account for scale factor applied in previous pass
// (blur along the X axis) to output values
// in 16 bit 0 to 1 range
return vColorWeightSum * 256.0;
}

Pass 5: composite pass

float4 final_pass_ps(float2 Tex: TEXCOORD0) : COLOR
{
// Sample Gaussian-blurred image
float4 vBlurred = tex2D(blurredXYTexture, Tex);
// Sample full-resolution scene rendering result
float4 vFullres = tex2D(renderTexture, Tex);
// Interpolate between original full-resolution and
// blurred images based on blurriness
float3 vColor = lerp(vFullres, vBlurred, vFullres.a);
return float4(vColor, 1.0);
}
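
For reference, the composite pass above reduces to a per-channel linear interpolation driven by the blurriness stored in the full-resolution alpha channel. This is a CPU-side sketch with hypothetical function names, not a replacement for the shader.

```python
def lerp(a, b, t):
    """Linear interpolation, same definition as the HLSL intrinsic."""
    return a + (b - a) * t


def composite(fullres_rgb, blurred_rgb, blurriness):
    """Pick the sharp image where blurriness is 0 and the blurred image where it is 1."""
    return tuple(lerp(f, b, blurriness) for f, b in zip(fullres_rgb, blurred_rgb))


sharp = (1.0, 0.0, 0.0)
blurred = (0.0, 0.0, 1.0)
for blurriness in (0.0, 0.5, 1.0):
    print(blurriness, composite(sharp, blurred, blurriness))
```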

"Improved Depth-Of-Field Rendering" ShaderX3

http://shiba.hpe.sh.cn/jiaoyanzu/wuli/soft/Hlsl/ShaderX3.pdf

"Real-Time Depth-of-Field Implemented with a Post-Processing only Technique" David Gilham. ShaderX5. 2007

http://www.shaderx5.com/TOC.html

"The Skylanders SWAP Force Depth-of-Field Shader" GPU Pro 4. 2013

"Star Ocean 4 - Flexible Shader Management and Post-processing" GDC 2009

http://www.slideshare.net/DAMSIGNUP/so4-flexible-shadermanagmentandpostprocessing
A Gaussian blur loses edge sharpness, especially in high-luminance areas

f:id:gregory-igehy:20181109040349p:plain

Sample mipmapped textures
Use trilinear filtering to balance tap count against sharpness

f:id:gregory-igehy:20181109040428p:plain

Blending and per-pixel blur

f:id:gregory-igehy:20181109040400p:plain

The downscaled color buffer uses a fixed tap count; the mipmapped color buffer is for cases that need a high sample count
Build a mask from the depth buffer

f:id:gregory-igehy:20181109040415p:plain

Detailed flow diagram

f:id:gregory-igehy:20181109040436p:plain

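Solving the bleeding problem
- It occurs at high-luminance boundaries and is caused by the bilinear filter
- Build the blurred image from the color multiplied by the mask image
- However, foreground blur and background blur then need separate processing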

f:id:gregory-igehy:20181109040424p:plain

Creating the near-field mask
- Sample 10 surrounding points for dilation
- Gaussian blur with 4 fetches (9 taps)

f:id:gregory-igehy:20181109040419p:plain

GatherDOF : "Moving to the Next Generation - The Rendering Technology of Ryse" GDC 2014

Comparison of pass 1 and pass 2
- Pass 1: 7x7 = 49 taps (0.426 msec)
- Pass 2: 3x3 = 9 taps (0.094 msec)

f:id:gregory-igehy:20181109040457p:plain

Composite with a union operation to obtain the final shape

f:id:gregory-igehy:20181109040506p:plain

Concentric sampling and aperture-shaped sampling

f:id:gregory-igehy:20181109040510p:plain

Overall algorithm
- Pass 1: 7x7 sampling
- Pass 2: 3x3 sampling
- Downscaled color buffer in R11G11B10F; CoC in R8G8
- Half resolution
- Far and near are processed in the same pass
- Limit the offset range to prevent undersampling

f:id:gregory-igehy:20181109040515p:plain

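Final composite
- Far-field blur
  - Use a bilateral filter
  - Take 4 taps from the half-resolution CoC and compare them against full resolution
  - Weight with a bicubic filter
  - Use the far CoC for blending
- Near-field blur
  - Upscale carefully
  - Use the half-resolution near CoC for blending
  - Bleeding is acceptable here, so use a bicubic filter
- Be careful how you blend
  - Linear blending does not look good (frequency)
  - Use a non-linear blend instead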

f:id:gregory-igehy:20181109040520p:plain

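Far and near
- Both take the half-resolution color buffer as input
  - Caveat: downscaling is a source of error because of the bilinear filter
  - So use a custom bilinear filter (a bilateral filter) for the downscale
- Far
  - Scale the kernel size and the sample weights by the far CoC
  - Pre-multiply the layer by the far CoC
  - This prevents bleeding caused by bilinear and separable filters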

f:id:gregory-igehy:20181109040531p:plain

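- Near
  - Approximate scatter with a gather
  - Scale the kernel size and the sample weights for the near CoC by the tile MaxCoC
  - Pre-multiply the layer by the near CoC: only near fragments should be blurred (a cheap approximation of partial occlusion)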

f:id:gregory-igehy:20181109040525p:plain
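
The pre-multiplication mentioned for the far layer can be illustrated on a 1D row of pixels. This is a simplified sketch under assumptions: a box filter stands in for the real kernel, colors are scalars, and the function names are hypothetical.

```python
def blur_naive(colors, radius=1):
    """Plain box blur: in-focus colors bleed into their blurred neighbors."""
    n = len(colors)
    out = []
    for i in range(n):
        window = colors[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def blur_premultiplied(colors, cocs, radius=1):
    """Blur color * CoC and CoC together, then divide to renormalize."""
    n = len(colors)
    out = []
    for i in range(n):
        color_sum = 0.0
        coc_sum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            color_sum += colors[j] * cocs[j]
            coc_sum += cocs[j]
        out.append(color_sum / coc_sum if coc_sum > 0.0 else colors[i])
    return out


colors = [1.0, 1.0, 0.0, 0.0]    # bright in-focus pixels next to a dark far field
far_coc = [0.0, 0.0, 1.0, 1.0]   # only the dark pixels belong to the far layer
print(blur_naive(colors))
print(blur_premultiplied(colors, far_coc))
```

In the naive result the first far-field pixel picks up a third of the bright neighbor, while the pre-multiplied version keeps it dark. The values written at in-focus positions do not matter here, since the final composite weights the far layer by the far CoC.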

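Tile Min/Max CoC
- Downscale the CoC target k times (k = tile count)
- Use the Min fragment for far and the Max fragment for near
- Store the result in R8G8
Handling near and far in the same pass
- Use dynamic branching and output the tile Min/Max CoC
- Does this balance the cost between far and near?
- Also used for scatter-as-gather in the near field
- Reusable for other post-processes
  - The downscaled HDR buffer is the same one used for bloom
  - The HDR inputs for the near and far fields are also packed into R11G11B10F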
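
A CPU-side sketch of the tile Min/Max reduction described above. It assumes each downscale step halves the resolution with a 2x2 reduction and that the CoC target is a plain 2D list; the function names are hypothetical.

```python
def downscale_min_max(grid):
    """Halve a 2D grid of (min, max) pairs, reducing each 2x2 block to one pair."""
    out = []
    for y in range(len(grid) // 2):
        row = []
        for x in range(len(grid[0]) // 2):
            block = [grid[2 * y + dy][2 * x + dx] for dy in (0, 1) for dx in (0, 1)]
            row.append((min(b[0] for b in block), max(b[1] for b in block)))
        out.append(row)
    return out


def tile_min_max(coc, k):
    """Downscale the CoC target k times; each texel ends up as a (min, max) pair."""
    grid = [[(c, c) for c in row] for row in coc]
    for _ in range(k):
        grid = downscale_min_max(grid)
    return grid


coc = [[float(4 * y + x) for x in range(4)] for y in range(4)]
print(tile_min_max(coc, 1))
print(tile_min_max(coc, 2))
```

The two values of each pair map naturally onto the two channels of an R8G8 target once they are quantized.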

f:id:gregory-igehy:20181109040535p:plain

McIntosh's paper "Efficiently Simulating the Bokeh of Polygonal Apertures in a Post-Process Depth of Field Shader". 2012

- http://ivizlab.sfu.ca/media/DiPaolaMcIntoshRiecke2012.pdf
- http://ivizlab.sfu.ca/papers/cgf2012.pdf

f:id:gregory-igehy:20181109031733p:plain

The Min(x, y) function returns the lesser (i.e. least bright) of x and y at every pixel, and thus approximates a boolean intersection by preserving bright pixels only in the areas where bright CoC coincide in both images.
The Max(x, y) function returns the greater (i.e. most bright) of x and y at every pixel, and thus approximates a boolean union by preserving bright pixels wherever they exist in either image (see Figure 9).
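
The two quoted statements can be checked on tiny binary images. The 3x3 bar shapes below are made-up test data, not shapes from the paper.

```python
def pixelwise(op, a, b):
    """Apply a two-argument function to every pair of corresponding pixels."""
    return [[op(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


horizontal_bar = [[0, 0, 0],
                  [1, 1, 1],
                  [0, 0, 0]]
vertical_bar = [[0, 1, 0],
                [0, 1, 0],
                [0, 1, 0]]

intersection = pixelwise(min, horizontal_bar, vertical_bar)  # bright where both are bright
union = pixelwise(max, horizontal_bar, vertical_bar)         # bright where either is bright
print(intersection)
print(union)
```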

Cinematic Depth of Field

Circular Separable Convolution Depth of Field “Circular Dof” EA DICE, GDC2018

https://twvideo01.ubm-us.net/o1/vault/gdc2018/presentations/Garcia_Kleber_CircularDepthOf.pdf

Rendering Tricks in Dead Space 3

https://www.gdcvault.com/play/1017718/Rendering-Tricks-in-Dead-Space
https://archive.org/details/GDC2013Andreev
Draw 3 passes with a 6-sample ring-shaped kernel
This gives an effective 6^3 = 216 samples
Build a mask of the in-focus range and use it to prevent leaking

f:id:gregory-igehy:20180911230917p:plain
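
The sample-count arithmetic above can be made concrete by enumerating which source offsets end up contributing after three passes. The ring radii (4, 2, 1) are hypothetical values chosen for this sketch; the talk's actual kernel scales are not recorded in these notes.

```python
import math


def ring_kernel(samples, radius):
    """Offsets of `samples` points evenly spaced on a circle of the given radius."""
    return [(radius * math.cos(2.0 * math.pi * i / samples),
             radius * math.sin(2.0 * math.pi * i / samples))
            for i in range(samples)]


def combined_offsets(kernels):
    """Every source offset reached after applying the kernels one pass after another."""
    offsets = [(0.0, 0.0)]
    for kernel in kernels:
        offsets = [(ox + kx, oy + ky) for ox, oy in offsets for kx, ky in kernel]
    return offsets


# Hypothetical radii: each pass uses a smaller ring than the one before.
passes = [ring_kernel(6, 4.0), ring_kernel(6, 2.0), ring_kernel(6, 1.0)]
print(len(combined_offsets(passes)))  # 6 * 6 * 6 combinations
```

Each pass costs only 6 fetches per pixel, so 18 fetches in total stand in for the 216 combinations.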

Crysis Next Gen Effects 2008

The first pass blurs with 8 samples
The next pass blurs the already blurred image with 8 samples, for an effective 8^2 = 64 samples
https://www.slideshare.net/TiagoAlexSousa/crysis-nextgen-effects-gdc-2008

f:id:gregory-igehy:20180906041148j:plain

Gather DOF "Killzone Shadow Fall" Demo Postmortem

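Two passes, FarDOF and NearDOF, at 1/2 x 1/2 resolution
The gather kernel is 13x13
The sample count should be reduced for small CoCs
- Sample in concentric circles and loop along the radial direction
However, because this is a gather, each pixel needs to know the largest CoC among the neighboring pixels that affect it
- Build a max tree of the CoC; 4 mips are enough for 13x13
- This cut the cost to 1/8 of the previous version, or 1/4 in heavy scenes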
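
A small sketch of the CoC max tree, assuming a power-of-two buffer stored as a 2D list and a 2x2 max reduction per mip level (the function name and test data are hypothetical). After 4 levels one texel summarizes a 16x16 area, which is why 4 mips are enough to cover a 13x13 kernel.

```python
def build_max_mips(coc, levels):
    """Return [coc, mip1, ...]; each mip texel holds the max of the 2x2 block below it."""
    mips = [coc]
    for _ in range(levels):
        prev = mips[-1]
        mips.append([[max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                          prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
                      for x in range(len(prev[0]) // 2)]
                     for y in range(len(prev) // 2)])
    return mips


# Hypothetical 16x16 CoC buffer with a single large CoC at x=5, y=9.
coc = [[0.0] * 16 for _ in range(16)]
coc[9][5] = 6.5
mips = build_max_mips(coc, levels=4)
print(len(mips[1]), len(mips[2]), len(mips[3]), len(mips[4]))
print(mips[4])
```

A pixel can read the coarse mip first and shrink its radial loop when the maximum CoC in its neighborhood is small, which matches the goal of spending fewer samples on small CoCs.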