Inpainting to hide structures in satellite images
I recently generalized this computer vision model to forecast time series (link to the repo).
Google Earth has made sophisticated satellite imagery accessible to anyone with a digital device and an internet connection. However, sensitive locations, which used to be well hidden from the public, are now fully exposed. Easy access to satellite images thus poses a potential threat to national security and personal privacy.
On the one hand, it is necessary to completely hide the details of secret facilities. Usually those locations are censored with single-color or pixelation patches that do not blend in with the surrounding structures. On the other hand, such conspicuous “cover-ups” often pique people’s curiosity and attract more public attention (check out this blog post).

To tackle this dilemma, I applied photo-retouching techniques to satellite images, replacing the sensitive structures with textures mimicking the surrounding structures. This process was automated with a dense partial convolutional neural network.

The architecture is a modified version of UNet in which the encoding convolutional layers are replaced by partial convolutional layers together with partial batch normalization. A partial convolution layer takes (i) an image (or its feature map) with a hole and (ii) a mask indicating the location of the hole; its output is a partial convolution feature map that skips the hole region, together with a mask for the remaining hole in the feature map.
`class PartialConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=0.1)`

Parameters:

- `bias` – if `True`, adds a learnable bias to the output. Default: `True`
- `eps` – threshold for updating the mask, `new_mask = AvgPool2d(mask.float()) > eps`. Default: `eps=0.1`

Shape:

- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: feature map `(batch, out_channel, new_height, new_width)` and mask `(1, 1, new_height, new_width)`
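To make the mechanics concrete, here is a minimal sketch of such a layer in PyTorch, assuming the mask-update rule above and a plain masked convolution (the repo's actual implementation may differ, e.g. in how valid pixels are renormalized):

```python
import torch
import torch.nn as nn

class PartialConv2d(nn.Module):
    """Convolution that skips the hole region indicated by a binary mask."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True, eps=0.1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              stride=stride, padding=padding,
                              dilation=dilation, groups=groups, bias=bias)
        # Average pooling with the same geometry measures, for each output
        # location, the fraction of its receptive field that is valid.
        # (dilation > 1 is not handled by this simplified mask update.)
        self.pool = nn.AvgPool2d(kernel_size, stride=stride, padding=padding)
        self.eps = eps

    def forward(self, img, mask):
        # Zero out the hole so it contributes nothing to the convolution.
        out = self.conv(img * mask)
        # A location stays a hole unless enough of its receptive field is valid.
        new_mask = (self.pool(mask.float()) > self.eps).float()
        return out, new_mask

# Usage: a 64x64 image with a square hole.
layer = PartialConv2d(3, 16, kernel_size=3, padding=1)
img = torch.randn(4, 3, 64, 64)
mask = torch.ones(1, 1, 64, 64)
mask[:, :, 20:40, 20:40] = 0
out, new_mask = layer(img, mask)  # (4, 16, 64, 64) and (1, 1, 64, 64)
```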
`class PConvBlock(in_channel, out_channel, conv_para, pool_para, eps=0.1)`

The input image and mask are passed to a partial convolution and then partial batch normalization (avoiding the hole region). The output feature map is then passed to a ReLU layer and a MaxPool2d layer; the mask is downsampled with the same MaxPool2d layer.
- Pooling: `torch.nn.MaxPool2d`
- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: feature map `(batch, out_channel, new_height, new_width)` and mask `(1, 1, new_height, new_width)`
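A minimal sketch of this block, assuming `conv_para` and `pool_para` are dicts of keyword arguments forwarded to the convolution and pooling layers (an assumption about the repo's API) and substituting ordinary batch normalization for the partial variant:

```python
import torch.nn as nn

class PConvBlock(nn.Module):
    """Encoder block: partial conv -> batch norm -> ReLU -> max pooling."""

    def __init__(self, in_channel, out_channel, conv_para, pool_para, eps=0.1):
        super().__init__()
        # conv_para / pool_para as kwargs dicts is an assumption of this sketch.
        self.pconv = PartialConv2d(in_channel, out_channel, eps=eps, **conv_para)
        self.bn = nn.BatchNorm2d(out_channel)  # stand-in for partial batch norm
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(**pool_para)

    def forward(self, img, mask):
        out, mask = self.pconv(img, mask)
        out = self.relu(self.bn(out))
        # The same MaxPool2d downsamples the feature map and the mask alike.
        return self.pool(out), self.pool(mask)
```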

`class PConvNet(n_hidden=8)`

There are 8 partial convolution blocks `PConvBlock` downsampling the images and 8 transposed convolution blocks reconstructing them. Each transposed convolution block is composed of the following (a sketch follows the list):
- `torch.nn.ConvTranspose2d` functioning as the reverse of the `torch.nn.MaxPool2d` in the mirroring partial convolution block
- `torch.nn.BatchNorm2d` 2D batch normalization
- `torch.nn.ConvTranspose2d` functioning as the reverse of `PartialConv2d`
- `torch.nn.ReLU` ReLU layer
- concatenation of the output of the mirroring `PConvBlock` to the output of the ReLU layer
- `torch.nn.Conv2d` merging the concatenated feature maps

Shape of the full network:

- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: image `(batch, in_channel, height, width)`
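Here is a sketch of one such decoder block under the same assumptions; the class name, kernel sizes, and stride are hypothetical, chosen only to mirror a stride-2 pooling and a 3x3 partial convolution:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Decoder block mirroring a PConvBlock (hypothetical parameters)."""

    def __init__(self, in_channel, out_channel):
        super().__init__()
        # Reverses the MaxPool2d of the mirroring encoder block.
        self.unpool = nn.ConvTranspose2d(in_channel, in_channel,
                                         kernel_size=2, stride=2)
        self.bn = nn.BatchNorm2d(in_channel)
        # Reverses the PartialConv2d of the mirroring encoder block.
        self.deconv = nn.ConvTranspose2d(in_channel, out_channel,
                                         kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        # Merges the concatenated skip connection from the mirroring PConvBlock.
        self.fuse = nn.Conv2d(2 * out_channel, out_channel, kernel_size=1)

    def forward(self, x, skip):
        x = self.relu(self.deconv(self.bn(self.unpool(x))))
        # Concatenate the stored encoder output along the channel axis.
        return self.fuse(torch.cat([x, skip], dim=1))
```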
The original images are samples from Paris-SpaceNet.

| Camouflage | Original |
|---|---|
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |