CNN Style Transfer Paper Reproduction

# NeuralStyle

My own implementation of the CVPR 2016 paper *Image Style Transfer Using Convolutional Neural Networks*. This work is, I think, simple yet elegant (I mean the paper, not my implementation), with good interpretability.

2021.11.15 addendum: I have no intention of analyzing or explaining this paper, because it is simple and I already know it well, so there is no point in recording anything about it on the blog. Original GitHub repo: Github🔗: Enigmatisms/NeuralStyle. This post is exactly the README.md of that repo.


## To run the code

Make sure you have PyTorch and Tensorboard on your device. CUDA is supported as well, though I failed to use it myself (not enough GPU memory, but the API path works). I am currently using PyTorch 1.7.0 + CUDA 10.1 (cu101).

On the first run, the code may need to download the pretrained VGG-19 network, which requires a network connection.
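
For reference, the download is typically triggered by the first construction of the pretrained model through torchvision. A minimal sketch (not the repo's exact code):

```python
import torch
import torchvision.models as models

# The first call downloads the pretrained VGG-19 weights into the local torch cache.
vgg = models.vgg19(pretrained=True).features.eval()

# Check whether CUDA is actually usable before moving the model to the GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vgg = vgg.to(device)
```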


## Tree - Working Directory

  • folder content: where I keep the content images.
  • folder imgs: to which the output images go.
  • folder style:
    • lossTerm.py: style loss and content loss are implemented here (a short sketch of both terms follows this list).
    • precompute.py: VGG-19 utilities: the style and content feature extractors.
    • transfer.py: the executable script.
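
To make the roles of lossTerm.py and precompute.py concrete, here is a minimal sketch of the two ideas from the paper: content loss as an MSE between feature maps, style loss as an MSE between Gram matrices of feature maps, and a frozen VGG-19 wrapper that exposes intermediate activations. Function names and layer indices below are illustrative, not the repo's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    """Channel-wise Gram matrix of an (N, C, H, W) feature map, normalized by its size."""
    n, c, h, w = feat.shape
    flat = feat.view(n, c, h * w)
    return flat @ flat.transpose(1, 2) / (c * h * w)

def content_loss(gen_feat: torch.Tensor, content_feat: torch.Tensor) -> torch.Tensor:
    """MSE between generated-image and content-image features at one layer."""
    return F.mse_loss(gen_feat, content_feat)

def style_loss(gen_feats, style_grams) -> torch.Tensor:
    """Sum of MSEs between Gram matrices over the chosen style layers."""
    return sum(F.mse_loss(gram_matrix(g), s) for g, s in zip(gen_feats, style_grams))

class VGGExtractor(nn.Module):
    """Frozen VGG-19 that returns the activations at the requested layer indices."""
    def __init__(self, layer_ids=(1, 6, 11, 20, 29)):  # illustrative ReLU layer indices
        super().__init__()
        self.vgg = models.vgg19(pretrained=True).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.layer_ids = set(layer_ids)

    def forward(self, x):
        feats = []
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.layer_ids:
                feats.append(x)
        return feats
```

The style image's Gram matrices only need to be computed once before optimization starts, which is presumably the kind of work precompute.py covers.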

## A Little Help

Always run `transfer.py` from the `style/` folder. With `python ./transfer.py -h` you'll get:

```
usage: transfer.py [-h] [--alpha ALPHA] [--epoches EPOCHES]
                   [--max_iter MAX_ITER] [--save_time SAVE_TIME] [-d] [-g]
                   [-c]

optional arguments:
  -h, --help            show this help message and exit
  --alpha ALPHA         Ratio of content loss in the total loss
  --epoches EPOCHES     Training lasts for <epoches> epoches (for LBFGS)
  --max_iter MAX_ITER   LBFGS max iteration number
  --save_time SAVE_TIME
                        Save image every <save_time> epoches
  -d, --del_dir         Delete dir ./logs and start new tensorboard records
  -g, --gray            Using grayscale image as initialization for generated
                        image
  -c, --cuda            Use CUDA to speed up training
```
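
For example, a typical run could look like the following (all parameter values here are just illustrative; append `-c` if your GPU memory suffices):

```
cd style/
python ./transfer.py --epoches 40 --max_iter 20 --alpha 5e-3 --save_time 5 -g -d
```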

## Requirements

  • Run:

```
python3 -m pip install -r requirements.py
```

to find out.


## Training Process

  • Something strange happened: the loss exploded twice, but recovered both times. Tensorboard graphs:

Accordingly, the generated (parameter) image changes like this (initialized with a grayscale image):

(Figures: first few epochs · exploded, 2nd row · recovered)
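
For context on how `--epoches` and `--max_iter` interact: with LBFGS, each "epoch" is one `optimizer.step(closure)` call, and `max_iter` bounds the inner LBFGS iterations inside that call. Below is a minimal sketch of such a loop, reusing `content_loss`, `style_loss`, and `VGGExtractor` from the earlier sketch; the weighting and layer choices are illustrative, not the repo's exact code.

```python
import torch
from torch.utils.tensorboard import SummaryWriter

def run_transfer(gen_img, vgg_extractor, content_feat, style_grams,
                 epoches: int = 40, max_iter: int = 20, alpha: float = 5e-3):
    """Optimize the generated image itself with LBFGS (illustrative helper)."""
    gen_img = gen_img.clone().requires_grad_(True)
    opt = torch.optim.LBFGS([gen_img], max_iter=max_iter)
    writer = SummaryWriter("./logs")

    for epoch in range(epoches):
        def closure():
            opt.zero_grad()
            feats = vgg_extractor(gen_img)
            c_loss = content_loss(feats[-1], content_feat)  # deepest extracted layer for content (illustrative)
            s_loss = style_loss(feats, style_grams)         # all chosen layers for style
            loss = alpha * c_loss + s_loss                  # alpha weights the content term
            loss.backward()
            return loss

        loss = opt.step(closure)                            # one "epoch" = one LBFGS step call
        writer.add_scalar("loss/total", loss.item(), epoch)
    writer.close()
    return gen_img.detach()
```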

## Results

  • CPU training is painfully slow. It took me 2+ hours for 800 iterations (i5-8250U, 8th Gen @ 1.60 GHz).

(Figures: style · content · output after 800 iterations)

  • I've also done a style transfer of Van Gogh's self-portrait for my dad, which is not appropriate to display here, but it worked.

## Possible TODOs

  • Try adding InstanceNorm into VGG-19? Would it be useful or meaningful? (A rough sketch of what this could mean follows below.)
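
As a purely hypothetical illustration of that TODO (not something the repo currently does), one could rebuild the VGG-19 feature stack with an InstanceNorm2d layer after every convolution:

```python
import torch.nn as nn
import torchvision.models as models

def vgg19_with_instance_norm() -> nn.Sequential:
    """Hypothetical variant: insert InstanceNorm2d after each conv layer of VGG-19."""
    layers = []
    for layer in models.vgg19(pretrained=True).features:
        layers.append(layer)
        if isinstance(layer, nn.Conv2d):
            layers.append(nn.InstanceNorm2d(layer.out_channels, affine=False))
    return nn.Sequential(*layers)
```

Note that inserting layers shifts the feature-map indices, so the content/style layer choices would have to be remapped accordingly.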