Experiments
This folder contains the configuration files for the view synthesis experiments presented in "Photo-realistic depth image-based view synthesis with multi-input modality" (Sarah Fachada, 2023). To run them, place each dataset in a folder whose path corresponds to the one given in the corresponding configuration file.
Acknowledgments
Sarah Fachada is a Research Fellow of the Fonds de la Recherche Scientifique - FNRS, Belgium.
Requirements
Datasets
The open-source datasets used in these experiments can be found at:
Software
The software can be found at:
- [3] Shearlet Transform with cycle consistency
- [6] Instant Neural Graphics Primitives
- [7] Test Model for MPEG Immersive Video (TMIV)
- [10] Raytrix Lenslet Content Convertor
JSON configuration files for RVS
The JSON file gathers the (undistorted) camera parameters of a dataset. The camera type can be perspective, equirectangular, plenoptic or XSlit.
{
"Version": "3.0",
"Content_name": "dataset_name",
"BoundingBox_center": [0, 0, 0],
"Fps": 30,
"Frames_number": number_of_frames_in_yuv_file,
"Informative": {[...]},
"lengthsInMeters": true,
"sourceCameraNames": [ "input_camera1_name", [...] ],
"cameras": [
{
"Name": "perspective_camera1_name",
"Position": [x, y, z],
"Rotation": [ yaw, pitch, roll ],
"Depthmap": 1,
"Background": 0,
"Depth_range": [near, far],
"Resolution": [w, h],
"Projection": "Perspective",
"Focal": [ (fx), (fy)],
"Principle_point": [ (ppx), (ppy)],
"BitDepthColor": 8,
"BitDepthDepth": 8,
"ColorSpace": "YUV420",
"DepthColorSpace": "YUV420"
},
{
"Name": "equirectangular_camera2_name",
[...],
"Projection": "Equirectangular",
"Hor_range": [ (theta_min), (theta_max)],
"Ver_range": [ (phi_min), (phi_max)],
[...]
},
{
"Name": "plenoptic_camera3_name",
[...],
"Projection": "Plenoptic",
"Focal": [fx, fy],
"Principle_point": [ppx, ppy],
"plenopticDiameter": microimage_diameter,
"plenopticCenter": [microimage_center_x,microimage_center_y],
"plenopticPixelSize": pixsize_in_mm,
"plenopticS": distance_MLA-sensor,
"plenopticD": distance_mainlens-MLA,
"plenopticF": mainlens_focal,
"plenopticConfig": 0-horizontal-layout_1-vertical-layout,
[...]
},
{
"Name": "xslit_camera4_name",
[...],
"Projection": "XSlit",
"Slit1": [l01, l02, l03, l12, l13, l23],
"Slit2": [m01, m02, m03, m12, m13, m23],
[...]
},
{
[...]
}
]
}
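For concreteness, a minimal configuration with a single perspective input camera could look as follows; all numerical values are illustrative placeholders and must be replaced by the calibration of the actual dataset.
{
"Version": "3.0",
"Content_name": "toy_dataset",
"BoundingBox_center": [0, 0, 0],
"Fps": 30,
"Frames_number": 1,
"lengthsInMeters": true,
"sourceCameraNames": [ "v0" ],
"cameras": [
{
"Name": "v0",
"Position": [0.0, 0.0, 0.0],
"Rotation": [0.0, 0.0, 0.0],
"Depthmap": 1,
"Background": 0,
"Depth_range": [0.5, 10.0],
"Resolution": [1920, 1080],
"Projection": "Perspective",
"Focal": [1546.7, 1546.7],
"Principle_point": [960.0, 540.0],
"BitDepthColor": 8,
"BitDepthDepth": 8,
"ColorSpace": "YUV420",
"DepthColorSpace": "YUV420"
}
]
}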
Depth Estimation
RDE
To estimate a depth map from YUV input images using RDE [12]:
RDE.exe {config_files/dataset_depth.json}
View synthesis
RVS
Prepare a folder for the input images and another for the depth maps, corresponding to the folders in the configuration file. The input images can be in YUV420 or PNG format. The input depth maps can be in MPEG's disparity format in YUV400/YUV420, or as floating-point depth values in EXR format.
RVS.exe {dataset_RVS.json}
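As a sketch, a possible folder layout is given below; the folder and file names are illustrative and must match the paths written in dataset_RVS.json.
dataset/
    dataset_RVS.json
    texture/    (input images, YUV420 or PNG)
    depth/      (input depth maps, YUV400/YUV420 disparity or EXR)
    output/     (synthesized views are written here)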
TMIV
Prepare a folder for the input images and another for the depth maps, corresponding to the folders in the configuration file. The input images have to be in YUV420 format. The input depth maps have to be in MPEG's disparity format in YUV400/YUV420.
TmivRenderer.exe -s {dataset} -n 1 -N 1 -f 0 -r R0 -p configDirectory {config_files} -p inputDirectory {inputdir} -p outputDirectory {outputdir} -c {config_files/dataset_render.json} -v v0 ... -v vn
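If the input views are only available as PNG sequences, one possible way to produce the required raw YUV420 files is with ffmpeg; the file names, frame numbering and resolution below are illustrative.
ffmpeg -i {inputdir}/v0/%03d.png -pix_fmt yuv420p {inputdir}/v0_texture_1920x1080_yuv420p.yuv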
ShearletTransform
Prepare a folder containing all the images of the camera array that you wish to interpolate (including non-input images). The images should be rescaled so that the disparity between two adjacent input images is below 32 pixels (a possible rescaling command is sketched below). Interpolate the light field in a planar array of cameras:
python predict.py --path_base={folder} --name_lf={dataset_resized} --angu_res_gt={#images} --dmin={dmin} --dmax={dmax} --interp_rate={interpolation rate} --im_h {H} --im_w {W} --full_parallax
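The rescaling mentioned above can be done, for example, with ffmpeg; {W_small} and {H_small} stand for whatever resolution brings the disparity between two adjacent input views below 32 pixels.
ffmpeg -i {dataset}/%03d.png -vf scale={W_small}:{H_small} {dataset_resized}/%03d.png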
LLFF
Prepare a folder containing a file poses_bounds.npy, and the input images in a subfolder images_{scalefactor} (see the example layout below).
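For example, with a scale factor of 2 the expected layout would be:
dataset/
    poses_bounds.npy
    images_2/
        000.png
        001.png
        ...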
1. Compute the MPIs:
python imgs2mpis.py {dataset/mpis} --numplanes 32 --factor 2
2. Render a video along the target poses:
python mpis2video.py {dataset/mpis} {renderposes.npy} {rendered.mp4}
3. Extract the rendered frames:
ffmpeg -i {rendered.mp4} -s {WxH} {dataset/%03d.png}
NeRF and EikonalFields
Prepare a folder containing a file poses_bounds.npy, and the input images in a subfolder images_{scalefactor}. The configuration parameters are written in config.txt.
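As a sketch, and assuming the usual nerf-pytorch style of configuration file, config.txt could contain entries such as the following; the keys and values are assumptions to adapt to the actual code base and dataset.
expname = toy_scene
basedir = ./logs
datadir = {dataset}
dataset_type = llff
factor = 2
llffhold = 8
N_rand = 1024
N_samples = 64
N_importance = 64
use_viewdirs = True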
1. Train NeRF:
python run_nerf.py --config {dataset.txt} --factor 2
2. Find the bounding box:
python find_bounding_box.py --config {dataset.txt} --factor 2
3. Train the index of refraction (IOR) field:
python run_ior.py --config {dataset.txt} --N_rand 32000 --N_samples 128 --chunk 131072 --netchunk 65536 --factor 2
4. Render the trained model along a path of poses:
python render_model.py --config {dataset.txt} --mode {0 (NeRF) or 1 (IOR)} --render_from_path {folder_with_npy} --chunk 131072 --netchunk 65536 --factor 2 --N_rand 32000 --N_samples 128
NGP
For training, prepare a folder {dataset} containing a file transform.json, and the input images in a subfolder images_{scalefactor}.
For the rendering, prepare a folder {dataset_poses_to_render} containing a file transform.json, and the target images in a subfolder images_{scalefactor}.
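A minimal sketch of such a transform.json, assuming the NeRF-style format read by instant-ngp (pixel intrinsics plus one 4x4 camera-to-world matrix per frame); all values are illustrative:
{
"fl_x": 1546.7,
"fl_y": 1546.7,
"cx": 960.0,
"cy": 540.0,
"w": 1920,
"h": 1080,
"aabb_scale": 4,
"frames": [
{
"file_path": "images_2/000.png",
"transform_matrix": [
[1.0, 0.0, 0.0, 0.0],
[0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 1.0, 0.0],
[0.0, 0.0, 0.0, 1.0]
]
}
]
}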
- Train NeRF:
python scripts/run.py --scene {dataset} --save_snapshot {dataset/snap10000.msgpack} --n_steps 10000 --mode nerf
- Render images:
python scripts/run.py --mode nerf --scene {dataset} --load_snapshot {dataset/snap10000.msgpack} --width {W} --height {H} --screenshot_transforms {dataset_poses_to_render/transform.json} --screenshot_dir {dataset/out}
Nex
Prepare a folder containing a file poses_bounds.npy, and the input images in a subfolder images_{scalefactor}.
1. Train Nex:
python train.py -scene {dataset} -model_dir {dataset} -http -layers 6 -sublayers 3 -hidden 256
2. Render the predicted views:
python train.py -scene {dataset} -model_dir {dataset} -http -layers 6 -sublayers 3 -hidden 256 -predict -no_webgl
Plenoptic View synthesis
PlenopticToolbox
- Convert from a real to a virtual image, if necessary:
python real_to_virtual.py -r {input.png} -cx {w} -cy {h} -o {output.png}
- Disparity estimation:
python disparity_sample.py {path_to_file.xml} -dmin {dmin} -dmax {dmax}
- Rendering:
python render_view_3g.py {parameters.json} -hv 5 -vv 5 -j {J} -spl {S}
LLMV and RLC
{LLMV/RLC}.exe configuration.cfg
References
[1] Fachada, S., Bonatto, D., Teratani, M., & Lafruit, G. (2022). View Synthesis Tool for VR Immersive Video. [IntechOpen](https://www.intechopen.com/online-first/80515).
[2] Kroon, B. (2018). Reference View Synthesizer (RVS) manual [N18068], MPEG-I, ISO/IEC JTC1/SC29/WG11.
[3] Gao, Y., Bregović, R., Gotchev, A. (2020). Self-supervised light field reconstruction using shearlet transform and cycle consistency. IEEE Signal Processing Letters, 27, 1425-1429.
[4] Mildenhall, B., Srinivasan, P. P., Ortiz-Cayon, R., Kalantari, N. K., Ramamoorthi, R., Ng, R., Kar, A. (2019). Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (TOG), 38(4), 1-14.
[5] Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., Ng, R. (2021). Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99-106.
[6] Müller, T., Evans, A., Schied, C., & Keller, A. (2022). Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG), 41(4), 1-15.
[7] Boyce, J. M., et al. (2021). MPEG immersive video coding standard. Proceedings of the IEEE, 109(9), 1521-1536.
[8] Wizadwongsa, S., Phongthawee, P., Yenphraphai, J., Suwajanakorn, S. (2021). Nex: Real-time view synthesis with neural basis expansion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8534-8543).
[9] Bemana, M., Myszkowski, K., Revall Frisvad, J., Seidel, H. P., Ritschel, T. (2022, July). Eikonal fields for refractive novel-view synthesis. In ACM SIGGRAPH 2022 Conference Proceedings (pp. 1-9).
[10] Fujita, S., Mikawa, S., Panahpourtehrani, M., Takahashi, K., & Fujii, T. (2019, March). Extracting multi-view images from multi-focused plenoptic camera. In International Forum on Medical Imaging in Asia 2019 (Vol. 11050, pp. 17-22). SPIE.
[11] Palmieri, L., Koch, R., & Veld, R. O. H. (2018, October). The plenoptic 2.0 toolbox: Benchmarking of depth estimation methods for MLA-based focused plenoptic cameras. In 2018 25th IEEE International Conference on Image Processing (ICIP) (pp. 649-653). IEEE.
[12] Rogge, S., Bonatto, D., Sancho, J., Salvador, R., Juarez, E., Munteanu, A., & Lafruit, G. (2019). MPEG-I depth estimation reference software. In 2019 International Conference on 3D Immersion (IC3D). IEEE.