Inpainting to hide structures in satellite images
I recently generalized this computer vision model to forecast time series (link to the repo).
Google Earth has made high-resolution satellite images accessible to anyone with a digital device and an internet connection. However, sensitive locations that used to be well hidden from the public are now fully exposed. Easy access to satellite imagery thus poses a potential threat to national security and personal privacy.
On the one hand, it is necessary to completely hide the details of secret facilities; usually those locations are censored with single-color or pixelated patches that do not blend in with the surrounding structures. On the other hand, such conspicuous “cover-ups” often pique people’s curiosity and attract more public attention (check out this blog post).
To tackle this dilemma, I applied photo-retouching techniques to satellite images: the sensitive structures are replaced by textures mimicking the surrounding structures. The process is automated with a dense partial convolutional neural network.
The architecture is a modified version of UNet in which the encoding convolutional layers are replaced by partial convolutional layers together with partial batch normalization. A partial convolution layer takes (i) an image (or its feature map) with a hole, and (ii) a mask indicating the location of the hole; the output is a partial-convolution feature map that skips the hole region, together with the mask for the remaining hole in the feature map.
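For intuition, the canonical partial-convolution update from Liu et al. (2018) is shown below; the layer in this repo follows the same idea, and its exact mask-update rule is given in the class description that follows:

$$
x' = \begin{cases}
\mathbf{W}^\top (\mathbf{X} \odot \mathbf{M})\, \dfrac{\operatorname{sum}(\mathbf{1})}{\operatorname{sum}(\mathbf{M})} + b, & \text{if } \operatorname{sum}(\mathbf{M}) > 0,\\
0, & \text{otherwise,}
\end{cases}
$$

where $\mathbf{X}$ holds the pixels under the current sliding window, $\mathbf{M}$ is the matching binary mask (0 inside the hole), $\odot$ is element-wise multiplication, and the ratio $\operatorname{sum}(\mathbf{1})/\operatorname{sum}(\mathbf{M})$ compensates for the varying number of valid pixels per window.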
`class PartialConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=0.1)`

Parameters:
- `bias` (bool, optional): if `True`, adds a learnable bias to the output. Default: `True`
- `eps` (float, optional): threshold for updating the mask, `new_mask = AvgPool2d(mask.float()) > eps`. Default: `eps=0.1`

Shape:
- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: image `(batch, out_channel, new_height, new_width)` and mask `(1, 1, new_height, new_width)`
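Below is a minimal sketch of how such a layer can be implemented, assuming the interface above. It is illustrative rather than the repo's actual code, and it assumes `dilation=1` so the mask pooling matches the convolution's window geometry:

```python
import torch
import torch.nn as nn

class PartialConv2d(nn.Module):
    """Convolution that skips hole pixels and rescales by window coverage (sketch)."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True, eps=0.1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                              padding, dilation, groups, bias=False)
        self.bias = nn.Parameter(torch.zeros(out_channels)) if bias else None
        # Same window geometry as the convolution, so the pooled mask tracks
        # the fraction of valid (non-hole) pixels under each output location.
        self.mask_pool = nn.AvgPool2d(kernel_size, stride, padding)
        self.eps = eps

    def forward(self, x, mask):
        mask = mask.float()
        out = self.conv(x * mask)         # convolve valid pixels only
        coverage = self.mask_pool(mask)   # fraction of valid pixels per window
        # Renormalize for partial coverage; zero out fully-hole windows.
        out = out / coverage.clamp(min=1e-8) * (coverage > 0)
        if self.bias is not None:
            out = out + self.bias.view(1, -1, 1, 1)
        new_mask = (coverage > self.eps).float()  # documented mask-update rule
        return out, new_mask
```

For example, `PartialConv2d(3, 16, 3, padding=1)` applied to a `(4, 3, 256, 256)` image and a `(1, 1, 256, 256)` mask returns a `(4, 16, 256, 256)` feature map and an updated `(1, 1, 256, 256)` mask.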
`class PConvBlock(in_channel, out_channel, conv_para, pool_para, eps=0.1)`

The input image and mask are passed to a partial convolution and then partial batch normalization (which avoids the hole region). The output feature map is then passed to a ReLU layer and a `MaxPool2d` layer; the mask is downsampled with the same `torch.nn.MaxPool2d` layer.

Shape:
- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: image `(batch, out_channel, new_height, new_width)` and mask `(1, 1, new_height, new_width)`
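A minimal sketch of this block, using the `PartialConv2d` sketch above, follows. It assumes `conv_para` and `pool_para` are keyword-argument dicts for the convolution and pooling layers (an assumption; the repo may pass them differently), and it substitutes a plain `nn.BatchNorm2d` for the partial batch normalization:

```python
import torch.nn as nn

class PConvBlock(nn.Module):
    """Encoder block: partial conv -> batch norm -> ReLU -> max pool (sketch)."""
    def __init__(self, in_channel, out_channel, conv_para, pool_para, eps=0.1):
        super().__init__()
        self.pconv = PartialConv2d(in_channel, out_channel, eps=eps, **conv_para)
        self.bn = nn.BatchNorm2d(out_channel)  # stand-in for partial batch norm
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(**pool_para)

    def forward(self, x, mask):
        x, mask = self.pconv(x, mask)
        x = self.pool(self.relu(self.bn(x)))
        mask = self.pool(mask)  # downsample the mask with the same MaxPool2d
        return x, mask
```

For instance, `PConvBlock(3, 16, {"kernel_size": 3, "padding": 1}, {"kernel_size": 2})` halves the spatial resolution while updating the mask alongside the features.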
`class PConvNet(n_hidden=8)`

There are 8 partial convolution blocks (`PConvBlock`) downsampling the images, and 8 transposed convolution blocks reconstructing the images. Each transposed convolution block is composed of the following (a sketch of one such block appears after the shape specification):

- `torch.nn.ConvTranspose2d`: functions as the reverse of the `torch.nn.MaxPool2d` in the mirroring partial convolution block.
- `torch.nn.BatchNorm2d`: 2D batch normalization.
- `torch.nn.ConvTranspose2d`: functions as the reverse of `PartialConv2d`.
- `torch.nn.ReLU`: ReLU layer.
- Skip connection: concatenates the feature map from the mirroring `PConvBlock` to the output of the ReLU layer.
- `torch.nn.Conv2d`: applied to the concatenated feature maps.

Shape:
- Input: image `(batch, in_channel, height, width)` and mask `(1, 1, height, width)`
- Output: image `(batch, in_channel, height, width)`
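To make the list concrete, here is a sketch of one decoder block. The class name `UpBlock`, the kernel sizes, and the channel handling are illustrative assumptions, not the repo's code:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """Decoder block mirroring one PConvBlock (illustrative sketch)."""
    def __init__(self, in_channel, skip_channel, out_channel):
        super().__init__()
        # Reverse of the mirroring block's MaxPool2d: upsample by 2.
        self.unpool = nn.ConvTranspose2d(in_channel, in_channel,
                                         kernel_size=2, stride=2)
        self.bn = nn.BatchNorm2d(in_channel)
        # Reverse of the mirroring block's PartialConv2d.
        self.deconv = nn.ConvTranspose2d(in_channel, out_channel,
                                         kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        # Final convolution applied to the concatenated feature maps.
        self.fuse = nn.Conv2d(out_channel + skip_channel, out_channel,
                              kernel_size=1)

    def forward(self, x, skip):
        x = self.relu(self.deconv(self.bn(self.unpool(x))))
        # Skip connection: concatenate the mirroring PConvBlock's output.
        x = torch.cat([x, skip], dim=1)
        return self.fuse(x)
```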
The original images are samples from the Paris-SpaceNet dataset.
*Results: side-by-side comparison of camouflaged (left) and original (right) tiles.*