This folder contains the configuration files for the view synthesis experiments presented in "Photo-realistic depth image-based view synthesis with multi-input modality" (Sarah Fachada, 2023). To run them, place each dataset in a folder whose path matches the one given in the corresponding configuration file.
Sarah Fachada is a Research Fellow of the Fonds de la Recherche Scientifique - FNRS, Belgium.
The open-source datasets used in these experiments can be found at:
The software can be found at:
[3], Shearlet Transform with cycle consistency
[6], Instant Neural Graphics Primitives
[7], Test model for MPEG Immersive Video
[10], Raytrix Lenslet Content Convertor
JSON configuration files for RVS
The JSON format gathers the (undistorted) camera parameters of a dataset. The camera type can be perspective, equirectangular, plenoptic or XSlit.
{
  "Version": "3.0",
  "Content_name": "dataset_name",
  "BoundingBox_center": [0, 0, 0],
  "Fps": 30,
  "Frames_number": number_of_frames_in_yuv_file,
  "Informative": {[...]},
  "lengthsInMeters": true,
  "sourceCameraNames": [ "input_camera1_name", [...] ],
  "cameras": [
    {
      "Name": "perspective_camera1_name",
      "Position": [x, y, z],
      "Rotation": [ yaw, pitch, roll ],
      "Depthmap": 1,
      "Background": 0,
      "Depth_range": [near, far],
      "Resolution": [w, h],
      "Projection": "Perspective",
      "Focal": [ (fx), (fy)],
      "Principle_point": [ (ppx), (ppy)],
      "BitDepthColor": 8,
      "BitDepthDepth": 8,
      "ColorSpace": "YUV420",
      "DepthColorSpace": "YUV420"
    },
    {
      "Name": "equirectangular_camera2_name",
      "Projection": "Equirectangular",
      "Hor_range": [ (theta_min), (theta_max)],
      "Ver_range": [ (phi_min), (phi_max)],
    },
    {
      "Name": "plenoptic_camera3_name",
      "Projection": "Plenoptic",
      "Focal": [fx, fy],
      "Principle_point": [ppx, ppy],
      "plenopticDiameter": microimage_diameter,
      "plenopticCenter": [microimage_center_x,microimage_center_y],
      "plenopticPixelSize": pixsize_in_mm,
      "plenopticS": distance_MLA-sensor,
      "plenopticD": distance_mainlens-MLA,
      "plenopticF": mainlens_focal,
      "plenopticConfig": 0-horizontal-layout_1-vertical-layout,
    },
    {
      "Name": "xslit_camera4_name",
      "Projection": "XSlit",
      "Slit1": [l01, l02, l03, l12, l13, l23],
      "Slit2": [m01, m02, m03, m12, m13, m23],
    }
  ]
}
Depth Estimation
To estimate a depth map from YUV input images using RDE [12]:
RDE.exe {config_files/dataset_depth.json}
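Since the inputs are raw YUV files, the frame count declared in the "Frames_number" field has to agree with the file size. The sketch below estimates that count, assuming headerless planar YUV 4:2:0 files in which samples deeper than 8 bits are stored on 2 bytes. The function names are illustrative and are not part of RDE.

```python
def yuv420_frame_bytes(width, height, bit_depth=8):
    """Bytes in one planar YUV 4:2:0 frame: full-size luma plus two
    chroma planes subsampled by 2 in each direction."""
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    luma = width * height
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return (luma + 2 * chroma) * bytes_per_sample


def count_frames(file_size, width, height, bit_depth=8):
    """Number of whole frames in a raw YUV 4:2:0 file of the given size."""
    frame = yuv420_frame_bytes(width, height, bit_depth)
    if file_size % frame != 0:
        raise ValueError("file size is not a whole number of frames")
    return file_size // frame


# Example: a 1920x1080, 8-bit file holding 30 frames.
size = yuv420_frame_bytes(1920, 1080) * 30
print(count_frames(size, 1920, 1080))
```

In practice the file size would come from `os.path.getsize` on the YUV file, and the resolution and bit depth from the camera parameter file.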
View synthesis
Prepare a folder for the input images and another for the depth maps, corresponding to the folders in the configuration file. The input images can be in YUV420 or PNG format. The input depth maps can be in MPEG's disparity format (YUV400/YUV420) or stored as floating-point depth values in EXR format.
RVS.exe {dataset_RVS.json}
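MPEG's disparity format stores normalized disparity rather than metric depth. Assuming the usual convention, in which the maximum level maps to the near plane and level 0 to the far plane of "Depth_range", the conversion can be sketched as follows. `disparity_to_depth` is an illustrative helper, not an RVS function, and the RVS manual [2] remains the authoritative description.

```python
def disparity_to_depth(level, near, far, bit_depth=8):
    """Convert a normalized-disparity level to depth (same unit as near/far).

    Assumed convention: 1/z varies linearly with the level, from 1/far at
    level 0 to 1/near at the maximum level.
    """
    max_level = (1 << bit_depth) - 1
    if not 0 <= level <= max_level:
        raise ValueError("level outside the range allowed by bit_depth")
    normalized = level / max_level
    inverse_depth = normalized * (1.0 / near - 1.0 / far) + 1.0 / far
    return 1.0 / inverse_depth


# Example with Depth_range = [0.5, 10.0] and 8-bit depth maps.
for level in (0, 128, 255):
    print(level, disparity_to_depth(level, near=0.5, far=10.0))
```

Because the mapping is linear in inverse depth, most of the available levels are spent on objects close to the camera.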
Prepare a folder for the input images and another for the depth maps, corresponding to the folders in the configuration file. The input images have to be in YUV420 format. The input depth maps have to be in MPEG's disparity format in YUV400/YUV420.
TmivRenderer.exe -s {dataset} -n 1 -N 1 -f 0 -r R0 -p configDirectory {config_files} -p inputDirectory {inputdir} -p outputDirectory {outputdir} -c {config_files/dataset_render.json} -v v0 ... -v vn
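When many target views are rendered, the repeated -v options are easier to generate than to type. The sketch below only assembles the argument list shown above; it uses no option beyond those in that command, and the helper name and example paths are hypothetical.

```python
def tmiv_renderer_args(dataset, config_dir, input_dir, output_dir,
                       render_config, views):
    """Build the TmivRenderer argument list with one -v option per view."""
    args = [
        "TmivRenderer.exe",
        "-s", dataset,
        "-n", "1", "-N", "1", "-f", "0", "-r", "R0",
        "-p", "configDirectory", config_dir,
        "-p", "inputDirectory", input_dir,
        "-p", "outputDirectory", output_dir,
        "-c", render_config,
    ]
    for view in views:
        args += ["-v", view]
    return args


args = tmiv_renderer_args(
    dataset="dataset",
    config_dir="config_files",
    input_dir="inputdir",
    output_dir="outputdir",
    render_config="config_files/dataset_render.json",
    views=["v0", "v1", "v2"],
)
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` once the executable is on the path.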
Prepare a folder containing all the images of the camera array that you wish to interpolate (including non-input images). The images should be rescaled so that the disparity between two input images is below 32 pixels. Then interpolate the light field over a planar array of cameras:
python predict.py --path_base={folder} --name_lf={dataset_resized} --angu_res_gt={#images} --dmin={dmin} --dmax={dmax} --interp_rate={interpolation rate} --im_h {H} --im_w {W} --full_parallax
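The 32-pixel disparity limit mentioned above determines how much the images need to be downscaled. Since disparity scales linearly with image size, a minimal sketch, assuming an integer downscale factor is wanted, could look like this; the function names are illustrative and not part of the shearlet software.

```python
import math


def downscale_factor(max_disparity_px, limit_px=32):
    """Smallest integer factor f such that max_disparity_px / f < limit_px."""
    if max_disparity_px < limit_px:
        return 1
    return math.floor(max_disparity_px / limit_px) + 1


def resized_resolution(width, height, factor):
    """Resolution after dividing both dimensions by the integer factor."""
    return width // factor, height // factor


# Example: a maximum disparity of 100 px between two input images.
factor = downscale_factor(100)
print(factor, resized_resolution(1920, 1080, factor))
```

The resulting width and height would then be the values given to --im_w and --im_h.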
Prepare a folder containing a file poses_bounds.npy and the input images in a subfolder images_{scalefactor}.
1. Compute MPIs:
python imgs2mpis.py {dataset/mpis} --numplanes 32 --factor 2
python mpis2video.py {dataset/mpis} {renderposes.npy} {rendered.mp4}
ffmpeg -i {rendered.mp4} -s {WxH} {dataset/%03d.png}
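The poses_bounds.npy file used above holds one row of 17 numbers per image. Assuming the LLFF convention [4], the first 15 values form a 3x5 matrix (a 3x4 camera-to-world pose followed by a column with image height, width and focal length) and the last two are the near and far depth bounds. The sketch below splits one such row using plain Python lists; `split_pose_row` is a hypothetical helper and the example values are invented.

```python
def split_pose_row(row):
    """Split one 17-value row into (pose 3x4, (height, width, focal), (near, far))."""
    if len(row) != 17:
        raise ValueError("expected 17 values per image")
    matrix = [list(row[r * 5:(r + 1) * 5]) for r in range(3)]
    pose = [line[:4] for line in matrix]             # 3x4 camera-to-world
    height, width, focal = (line[4] for line in matrix)
    near, far = row[15], row[16]
    return pose, (height, width, focal), (near, far)


# Invented example: identity rotation at the origin, 1920x1080 image.
row = [1, 0, 0, 0, 1080,
       0, 1, 0, 0, 1920,
       0, 0, 1, 0, 1000,
       0.5, 10.0]
pose, hwf, bounds = split_pose_row(row)
print(pose, hwf, bounds)
```

With NumPy installed, the rows can be obtained from `numpy.load("poses_bounds.npy")`, which returns an array of shape (number of images, 17).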
NeRF and EikonalFields
Prepare a folder containing a file poses_bounds.npy and the input images in a subfolder images_{scalefactor}. The configuration parameters are written in config.txt.
1. Train NeRF:
python run_nerf.py --config {dataset.txt} --factor 2
python find_bounding_box.py --config {dataset.txt} --factor 2
python run_ior.py --config {dataset.txt} --N_rand 32000 --N_samples 128 --chunk 131072 --netchunk 65536 --factor 2
python render_model.py --config {dataset.txt} --mode {0 (NeRF), 1 (IOR)} --render_from_path {folder_with_npy} --chunk 131072 --netchunk 65536 --factor 2 --N_rand 32000 --N_samples 128
For training, prepare a folder {dataset} containing a file transform.json and the input images in a subfolder images_{scalefactor}.
For the rendering, prepare a folder {dataset_poses_to_render} containing a file transform.json and the target images in a subfolder images_{scalefactor}.
- Train NeRF:
python scripts/run.py --scene {dataset} --save_snapshot {dataset/snap10000.msgpack} --n_steps 10000 --mode nerf
- Render images:
python scripts/run.py --mode nerf --scene {dataset} --load_snapshot {dataset/snap10000.msgpack} --width {W} --height {H} --screenshot_transforms {dataset_poses_to_render/transform.json} --screenshot_dir {dataset/out}
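Before rendering, it can be useful to check the pose file passed to --screenshot_transforms. The sketch below assumes the NeRF-style layout, in which a "frames" list holds one "file_path" and one 4x4 camera-to-world "transform_matrix" per image; the helper name and the example content are hypothetical.

```python
import json


def read_render_poses(text):
    """Return [(file_path, camera position)] from a NeRF-style transform file."""
    data = json.loads(text)
    poses = []
    for frame in data["frames"]:
        matrix = frame["transform_matrix"]
        if len(matrix) != 4 or any(len(row) != 4 for row in matrix):
            raise ValueError(f"{frame['file_path']}: expected a 4x4 matrix")
        # In a camera-to-world matrix, the last column holds the camera position.
        position = [matrix[r][3] for r in range(3)]
        poses.append((frame["file_path"], position))
    return poses


# Invented example with a single target view.
example = json.dumps({
    "frames": [
        {
            "file_path": "images_2/000.png",
            "transform_matrix": [
                [1.0, 0.0, 0.0, 0.1],
                [0.0, 1.0, 0.0, 0.2],
                [0.0, 0.0, 1.0, 0.3],
                [0.0, 0.0, 0.0, 1.0],
            ],
        }
    ]
})
poses = read_render_poses(example)
print(poses)
```

Intrinsic parameters are deliberately left out of the sketch, since their field names depend on how the file was generated.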
Prepare a folder containing a file poses_bounds.npy and the input images in a subfolder images_{scalefactor}.
1. Train Nex:
python train.py -scene {dataset} -model_dir {dataset} -http -layers 6 -sublayers 3 -hidden 256
python train.py -scene {dataset} -model_dir {dataset} -http -layers 6 -sublayers 3 -hidden 256 -predict -no_webgl
Plenoptic View synthesis
- Convert from real image to virtual if necessary:
python real_to_virtual.py -r {input.png} -cx {w} -cy {h} -o {output.png}
- Disparity estimation:
python disparity_sample.py {path_to_file.xml} -dmin {dmin} -dmax {dmax}
- Rendering:
python render_view_3g.py {parameters.json} -hv 5 -vv 5 -j {J} -spl {S}
{LLMV/RLC}.exe configuration.cfg
[1] Fachada, S., Bonatto, D., Teratani, M., & Lafruit, G. (2022). View Synthesis Tool for VR Immersive Video. [IntechOpen](https://www.intechopen.com/online-first/80515)
[2] Kroon, B. (2018). Reference View Synthesizer (RVS) manual [N18068], MPEG-I, ISO/IEC JTC1/SC29/WG11.
[3] Gao, Y., Bregović, R., Gotchev, A. (2020). Self-supervised light field reconstruction using shearlet transform and cycle consistency. IEEE Signal Processing Letters, 27, 1425-1429.
[4] Mildenhall, B., Srinivasan, P. P., Ortiz-Cayon, R., Kalantari, N. K., Ramamoorthi, R., Ng, R., Kar, A. (2019). Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (TOG), 38(4), 1-14.
[5] Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., Ng, R. (2021). Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99-106.
[6] Müller, T., Evans, A., Schied, C., & Keller, A. (2022). Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG), 41(4), 1-15.
[7] Boyce, J. M., et al. (2021). MPEG immersive video coding standard. Proceedings of the IEEE, 109(9), 1521-1536.
[8] Wizadwongsa, S., Phongthawee, P., Yenphraphai, J., Suwajanakorn, S. (2021). Nex: Real-time view synthesis with neural basis expansion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8534-8543).
[9] Bemana, M., Myszkowski, K., Revall Frisvad, J., Seidel, H. P., Ritschel, T. (2022, July). Eikonal fields for refractive novel-view synthesis. In ACM SIGGRAPH 2022 Conference Proceedings (pp. 1-9).
[10] Fujita, S., Mikawa, S., Panahpourtehrani, M., Takahashi, K., & Fujii, T. (2019, March). Extracting multi-view images from multi-focused plenoptic camera. In International Forum on Medical Imaging in Asia 2019 (Vol. 11050, pp. 17-22). SPIE.
[11] Palmieri, L., Koch, R., & Veld, R. O. H. (2018, October). The plenoptic 2.0 toolbox: Benchmarking of depth estimation methods for mla-based focused plenoptic cameras. In 2018 25th IEEE International Conference on Image Processing (ICIP) (pp. 649-653). IEEE.
[12] Rogge, S., Bonatto, D., Sancho, J., Salvador, R., Juarez, E., Munteanu, A., & Lafruit, G. (2019). MPEG-I depth estimation reference software. In 2019 International Conference on 3D Immersion (IC3D). IEEE.