Creating mask images from annotation images with Pillow


Fri, Feb 24, 2017

Goal

I want to create a mask image for each label from the annotation images produced by JS Segmentation Annotator.

Pillow

Pillow is a fork of PIL, whose development has stopped. It also supports Python 3.

Installation

pip install Pillow

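That is all. See the official documentation for the packages it depends on.

Image modes (Concepts | Pillow)

The mode defines the type and "depth" of the pixels in an image. As of 2017-02-24, the following modes are supported.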

1 (1-bit pixels, black and white, stored with one pixel per byte)
L (8-bit pixels, black and white)
P (8-bit pixels, mapped to any other mode using a color palette)
RGB (3x8-bit pixels, true color)
RGBA (4x8-bit pixels, true color with transparency mask)
CMYK (4x8-bit pixels, color separation)
YCbCr (3x8-bit pixels, color video format)
LAB (3x8-bit pixels, the Lab color space)
HSV (3x8-bit pixels, Hue, Saturation, Value color space)
I (32-bit signed integer pixels)
F (32-bit floating point pixels)

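When a mode is passed as an argument, it must be passed as a __string__ (watch out for 1 in particular: it is the string "1", not the integer 1).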
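As a small illustration of passing modes as strings, here is a minimal sketch using `Image.new` and `convert`. The image sizes and fill values are arbitrary choices made for the example.

```python
from PIL import Image

# Modes are passed as strings, including "1" for 1-bit images.
binary = Image.new("1", (4, 4), 0)             # 1-bit black and white
gray = Image.new("L", (4, 4), 128)             # 8-bit grayscale
color = Image.new("RGB", (4, 4), (255, 0, 0))  # 3x8-bit true color

# The mode of an existing image is reported as a string too.
modes = [im.mode for im in (binary, gray, color)]
print(modes)

# convert() also takes the target mode as a string.
gray_from_color = color.convert("L")
print(gray_from_color.mode)
```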

Supported file formats (Image file formats | Pillow)

As of 2017-02-24, the following formats are supported.

#### Fully supported formats
- BMP
- EPS
- GIF
- ICNS
- ICO
- IM
- JPEG
- JPEG 2000
- MSP
- PCX
- PNG
- PPM
- SGI
- SPIDER
- TIFF
- WebP
- XBM

These are the most important file formats. Other formats are supported for reading only or for writing only.

#### Read-only formats

CUR
DCX
DDS
FLI, FLC
FPX
FTEX
GBR
GD
IMT
IPTC/NAA
MCIDAS
MIC
MPO
PCD
PIXAR
PSD
TGA
WAL
XPM

#### Write-only formats

PALM
PDF
XV Thumbnails

#### Identify-only formats

BUFR
FITS
GRIB
HDF5
MPEG
WMF

Each format supports a different set of modes and options, so take care.

A bug with binary images?

Reference: How to convert image which mode is “1” between PIL and numpy? | stackoverflow
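To see which formats a given installation can read or write, and how formats differ in the modes they accept, something like the following sketch may help. It assumes that inspecting the module-level registries `Image.OPEN` and `Image.SAVE` is acceptable; their contents depend on the Pillow version and on how it was built. The RGBA-to-JPEG case is only one example of a mode that a format does not accept.

```python
import io

from PIL import Image

# Load all format plugins so that the registries are fully populated.
Image.init()

readable = set(Image.OPEN)  # formats with a registered reader
writable = set(Image.SAVE)  # formats with a registered writer

print("read and write:", sorted(readable & writable))
print("write only:", sorted(writable - readable))

# Formats differ in the modes they accept. JPEG has no alpha channel, and
# recent Pillow versions raise OSError when asked to save an RGBA image as JPEG.
rgba = Image.new("RGBA", (2, 2), (255, 0, 0, 128))
try:
    rgba.save(io.BytesIO(), format="JPEG")
    jpeg_accepts_rgba = True
except OSError:
    jpeg_accepts_rgba = False
print("JPEG accepts RGBA:", jpeg_accepts_rgba)

# Converting to a mode the format supports makes the save succeed.
buffer = io.BytesIO()
rgba.convert("RGB").save(buffer, format="JPEG")
```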

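Round-tripping a mode='1' image between numpy.array and PIL.Image.Image behaves suspiciously. There may be a reason for this, but for now it is left as future work. In this post, the image is converted to a binary image only at the very end, and mode='L' and dtype=uint8 are used in the intermediate steps.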
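The workaround can be sketched as follows: always pass through mode "L" when crossing between Pillow and NumPy, and convert to mode "1" only as the last step. The tiny test image here is made up for the example.

```python
import numpy as np
from PIL import Image

# A small binary image in mode "1" with a single white pixel at x=1, y=0.
binary = Image.new("1", (4, 2), 0)
binary.putpixel((1, 0), 255)

# PIL -> numpy: go through mode "L" so the array is plain uint8 (0 or 255).
arr = np.asarray(binary.convert("L"))
print(arr.dtype, arr.shape)

# numpy -> PIL: build a mode "L" image from uint8, then convert to "1" last.
restored = Image.fromarray(arr).convert("1")

# Compare through mode "L" again, so no mode "1" image is read directly.
same = np.array_equal(arr, np.asarray(restored.convert("L")))
print(same)
```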

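Decoding the annotation image

Now for the main topic. Starting from an annotation image labeled with JS Segmentation Annotator, output the mask image corresponding to each label.

The labels produced by JS Segmentation Annotator are encoded as RGB values in a png file. To get the label information back, the image has to be decoded. The encoding and decoding method is described in the Matlab tips section of the GitHub repository.

For how to use JS Segmentation Annotator itself, see README.md in the GitHub repository and the online demo.

Loading and decoding the png file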
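For a single pixel, the packing can be sketched in plain Python as below. The layout (R as the low byte, then G, then B) is taken from the decoding code in this post; the function names `encode_label` and `decode_label` are hypothetical and exist only for this illustration.

```python
def encode_label(label):
    """Pack an integer label (0 .. 2**24 - 1) into an (R, G, B) tuple."""
    return (label & 0xFF, (label >> 8) & 0xFF, (label >> 16) & 0xFF)


def decode_label(r, g, b):
    """Recover the integer label from 8-bit R, G, B values."""
    return r | (g << 8) | (b << 16)


print(encode_label(3))        # small labels only use the red channel
print(decode_label(1, 1, 0))  # 1 + 256
```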

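import numpy as np
from PIL import Image

# Load the png image (PATH_TO_ANNOTATION_IMG) and convert it to RGB (drops the alpha channel)
img = Image.open(PATH_TO_ANNOTATION_IMG).convert("RGB")

# Split into the R, G and B channels and convert each to a numpy.ndarray.
# Cast to uint32: shifting a uint8 array left by 8 or 16 bits would lose the data.
r, g, b = img.split()
rArray = np.asarray(r, dtype=np.uint32)
gArray = np.asarray(g, dtype=np.uint32)
bArray = np.asarray(b, dtype=np.uint32)

# Decode: label = R | (G << 8) | (B << 16)
annotation = rArray
annotation = np.bitwise_or(annotation, np.left_shift(gArray, 8))
annotation = np.bitwise_or(annotation, np.left_shift(bArray, 16))

# Label of the pixel at coordinates (X, Y); the array is indexed as [row, column], i.e. [Y, X]
annotation[Y, X]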
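To check the decoding without a real annotation file, it can be tried on a small synthetic image. This is only a sketch: `decode_annotation` is a hypothetical helper name, and the 3x2 image and its labels are invented for the test. `np.unique` lists the labels that actually occur in the image.

```python
import numpy as np
from PIL import Image


def decode_annotation(img):
    """Return a 2-D array of integer labels decoded from an RGB(A) image."""
    # uint32 leaves room for the 8-bit and 16-bit left shifts.
    rgb = np.asarray(img.convert("RGB"), dtype=np.uint32)  # shape (H, W, 3)
    return rgb[..., 0] | (rgb[..., 1] << 8) | (rgb[..., 2] << 16)


# Synthetic 3x2 annotation image: label 0 everywhere except two pixels.
demo = Image.new("RGB", (3, 2), (0, 0, 0))
demo.putpixel((1, 0), (2, 0, 0))  # label 2 at x=1, y=0
demo.putpixel((2, 1), (1, 1, 0))  # label 1 + 256 = 257 at x=2, y=1

annotation = decode_annotation(demo)
labels = np.unique(annotation)    # sorted labels present in the image
print(labels)
print(annotation[1, 2])           # indexed as [row, column] = [y, x]
```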

Output

The mask is written out as a binary image. Until just before saving, it is handled as dtype=uint8 and mode=L.

imgArray = np.where(annotation == IDX, 255, 0).astype(np.uint8)
img = Image.fromarray(imgArray).convert('1')
img.save(OUTPUTPATH)

IDX is the index of the label to extract.

This outputs the mask image for the desired label.

Future work

Check how Pillow handles binary images
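Since the goal is one mask per label, the same step can be repeated for every label that occurs in the annotation. The following is a minimal sketch: the helper `save_masks`, the `mask_<label>.png` file naming and the temporary output directory are all assumptions made for the example, and the small array stands in for a decoded annotation.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def save_masks(annotation, out_dir):
    """Save one binary mask per label found in `annotation`; return the paths."""
    paths = []
    for idx in np.unique(annotation):
        # Work in uint8 / mode "L" and convert to mode "1" only when saving.
        mask = np.where(annotation == idx, 255, 0).astype(np.uint8)
        path = os.path.join(out_dir, "mask_{}.png".format(int(idx)))
        Image.fromarray(mask).convert("1").save(path)
        paths.append(path)
    return paths


# Tiny stand-in for a decoded annotation array with labels 0 and 1.
annotation = np.array([[0, 1, 1],
                       [0, 0, 1]], dtype=np.uint32)
out_dir = tempfile.mkdtemp()
paths = save_masks(annotation, out_dir)
print(paths)
```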

Last updated: Fri, Feb 24, 2017
