Researchers from the Inria Sophia Antipolis centre, in collaboration with Adobe and UC Berkeley, have developed a method that can alter the lighting of your pictures and videos in a few clicks. This innovative algorithm uses a neural network that was trained on synthetic images but also works on real images. The results show it is possible to take a picture at noon and automatically make it look as if it had been taken at sunset, for instance.
Being at the right spot at the right time to take a photograph is always challenging. Light is nonetheless one of the key components in achieving a professional look. Catching the sunset light, or aligning sun rays with certain objects, often requires the photographer to wait long hours for the best conditions. With smartphones fitted with increasingly better cameras and the advent of social networks, more and more users take and share pictures. They often use filters to beautify their pictures, but these filters cannot change the image’s content. Thanks to machine learning, more specifically a deep neural network, researchers have developed a method that pushes the boundary of what is possible with image filters and allows the lighting of a photo or video to be modified after capture.
The first results presented at the SIGGRAPH conference
Julien Philip, a Ph.D. student supervised by Dr George Drettakis in the GraphDeco project-team at Inria Sophia Antipolis, is the main contributor to the paper describing the method. He presented the results in Los Angeles at the SIGGRAPH conference in early August. SIGGRAPH (Special Interest Group on Computer GRAPHics and Interactive Techniques) is a major international conference on computer graphics: since 1974, it has brought together every year a large number of industrial, artistic and scientific players in the field, whether they work on animated films, special effects, video games or 3D modelling software.
The method remains experimental but already produces stunning results, says Julien. “Often neural networks can only process small images with a quality that is insufficient for photographic applications, or they focus on lower level tasks like denoising. We give back control of the lighting to the users: they can now decide whether the picture will look as if it were shot in the morning, at noon or at sunset, long after capturing it. As a matter of fact, users can let their creativity drive them and imagine lightings that are beyond reality”.
Supported by Michaël Gharbi, a researcher at Adobe, Tinghui Zhou and Alexei Efros from UC Berkeley, and his thesis advisor George Drettakis, he showed that a single photo is not sufficient to obtain convincing results with current methods. To overcome this difficulty, they use several photos (or “views”) of the same place to estimate its underlying 3D geometry and guide the relighting. These multiple “views” can be obtained by recording a video, taking several pictures, or even downloading images of the same place from the internet.
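To give a rough idea of this multi-view step, here is a minimal sketch, not the authors' pipeline, of how a set of photos of one place can be turned into approximate 3D using the open-source COLMAP tool. It assumes the `colmap` command-line binary is installed; the paper's actual reconstruction process may differ.

```python
# Illustrative only: reconstruct rough 3D geometry from a folder of photos
# of the same place using COLMAP's automatic reconstructor.
import subprocess
from pathlib import Path

def reconstruct_scene(image_dir: str, workspace_dir: str) -> None:
    """Run COLMAP on a set of "views" (photos or video frames) of one scene."""
    Path(workspace_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        [
            "colmap", "automatic_reconstructor",
            "--workspace_path", workspace_dir,  # cameras, points and meshes are written here
            "--image_path", image_dir,          # the multiple "views" of the place
        ],
        check=True,
    )

# Example: a dozen photos shot around a monument, or frames extracted from a video.
# reconstruct_scene("photos/monument", "workspace/monument")
```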
The algorithm can then be used to alter a photo, generate a time-lapse effect or edit a video. The method can also be adapted to traditional multi-view pipelines such as “Image-Based Rendering (IBR)” or photogrammetry, which are notably used in visual effects and could lead to future industrial applications.
A light-and-shadow game
Producing a realistic modification of cast shadows is a core challenge when trying to alter the lighting of a shot. The method is able to remove and modify them to simulate another lighting. The researchers guide their algorithm using 3D and apply methods close to those commonly used in video games for shadow computation. Unfortunately, these methods are not directly applicable. Julien comments that “The 3D we obtain is not sufficiently good to remove or synthesize realistic shadows, though it provides solid cues. That is where the neural network comes in: it was taught to correct the errors due to the low quality of the 3D”. To train their artificial intelligence to estimate the transformations to apply, they needed examples of places under a wide variety of lighting conditions. This type of data is hard and expensive to acquire, so instead of real photos they used highly realistic rendering methods that simulate the physics of light. This allowed them to gather enough data for the neural network to properly learn to relight images despite poor 3D reconstruction.
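To make the general idea concrete, here is a minimal, purely illustrative sketch in PyTorch: a toy convolutional network receives the original photo together with approximate shadow buffers rendered from the reconstructed 3D (one for the captured sun position, one for the target one) and predicts the relit image. The architecture, channel layout and names are hypothetical and far simpler than the network described in the paper.

```python
# Toy sketch (not the authors' network): a small CNN that uses geometry-derived
# shadow buffers as guidance and learns to correct their imperfections.
import torch
import torch.nn as nn

class RelightNet(nn.Module):
    def __init__(self, in_channels: int = 3 + 2, hidden: int = 32):
        # in_channels: RGB photo + shadow buffer for the source light + shadow buffer for the target light
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 3, 3, padding=1),  # predicted relit RGB image
        )

    def forward(self, image, src_shadow, dst_shadow):
        # Concatenate the photo with approximate shadow masks computed from the
        # reconstructed 3D under the original and the desired sun positions.
        x = torch.cat([image, src_shadow, dst_shadow], dim=1)
        return self.net(x)

# Toy usage with random tensors standing in for a real photo and shadow buffers.
model = RelightNet()
image = torch.rand(1, 3, 128, 128)       # input photo
src_shadow = torch.rand(1, 1, 128, 128)  # shadows under the captured lighting
dst_shadow = torch.rand(1, 1, 128, 128)  # shadows under the target lighting
relit = model(image, src_shadow, dst_shadow)
print(relit.shape)  # torch.Size([1, 3, 128, 128])
```

In practice such a network would be trained on synthetic renderings, as described above, so that it learns to fix the artefacts caused by the imperfect reconstructed geometry.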
This work was funded by the European H2020 project EMOTIVE and the ERC Advanced Grant FUNGRAPH.
For more details visit the project link (https://repo-sam.inria.fr/fungraph/deep-relighting/).
Before this work was published, Inria’s GraphDeco team and Adobe Research had already collaborated on several projects, leading to numerous joint publications since 2009.
This blog post was originally posted on Inria's website: "Inria’s deep learning algorithm that can relight photos and videos", 31 October 2019.