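I have a PASCAL VOC dataset and want to use it to build a deep learning model in TensorFlow. I think I need to convert it to the TFRecord file format first, but I'm not sure whether that is right. If it is, what code converts PASCAL VOC to TFRecord? If not, how would you suggest loading this PASCAL VOC dataset to build a model in TensorFlow? Here is a sample from my PASCAL VOC dataset: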
<annotation>
    <filename>000000000.jpg</filename>
    <source>
        <annotation>ArcGIS Pro 2.1</annotation>
    </source>
    <size>
        <width>256</width>
        <height>256</height>
        <depth>3</depth>
    </size>
    <object>
        <name>0</name>
        <bndbox>
            <xmin>209.62</xmin>
            <ymin>3.86</ymin>
            <xmax>256.00</xmax>
            <ymax>70.93</ymax>
        </bndbox>
    </object>
    <object>
        <name>0</name>
        <bndbox>
            <xmin>120.92</xmin>
            <ymin>126.09</ymin>
            <xmax>200.23</xmax>
            <ymax>209.97</ymax>
        </bndbox>
    </object>
    <object>
        <name>0</name>
        <bndbox>
            <xmin>237.72</xmin>
            <ymin>136.02</ymin>
            <xmax>256.00</xmax>
            <ymax>214.18</ymax>
        </bndbox>
    </object>
</annotation>
Posted on 2019-05-07 23:16:30

The TensorFlow Object Detection API provides a tool for this. You can run the following command:

python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=<path/to/label/map.pbtxt> \
    --data_dir=<path/to/data/dir> --year=<year_directory_name> --set=<train|test|val> \
    --output_path=pascal_<train|test|val>.record

This requires a directory tree of the following form:

data_dir
    |- year_dir
        |- Annotations
            |- *.xml
        |- ImageSets
            |- Layout
                |- test.txt
                |- train.txt
                |- val.txt
                |- trainval.txt
            |- Main
                |- *.txt
        |- JPEGImages
            |- *.jpg

For example, for the usual PASCAL dataset, the command would be:

python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit --year=VOC2012 --set=val \
    --output_path=pascal_val.record

Posted on 2019-05-08 00:51:51

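VOC2007 is available in the latest tensorflow-datasets==1.0.2 release (not yet available on pip).
To install it, run the following commands in a terminal:

git clone https://github.com/tensorflow/datasets
cd datasets
python setup.py build
python setup.py install

Usage example (plotting in Jupyter):
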
# Note: this example uses the TensorFlow 1.x API (tf.Session, make_one_shot_iterator).
import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw

%matplotlib inline

OUTLINE = (0, 255, 0)  # RGB color of the box outline

builder = tfds.builder('voc2007')
builder.download_and_prepare()
datasets = builder.as_dataset()
train_data, test_data = datasets['train'], datasets['test']

iterator = train_data.repeat(1).batch(1).make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for _ in range(1):
        batch = sess.run(next_batch)
        image = batch['image']
        bboxes = batch['objects']['bbox']
        # Drop the batch dimension added by batch(1).
        bboxes, image = np.squeeze(bboxes), np.squeeze(image)
        pil_image = Image.fromarray(image.astype('uint8'), 'RGB')
        draw = ImageDraw.Draw(pil_image)
        height, width = image.shape[:2]
        try:
            # With a single object, squeeze() leaves a flat array of four
            # floats, so wrap it back into a list of one box.
            if (isinstance(bboxes[0], np.float32)
                    or isinstance(bboxes[0], np.float64)):
                bboxes = [bboxes]
            for bbox in bboxes:
                # Boxes are normalized to [0, 1]; scale them to pixels.
                ymin, xmin, ymax, xmax = bbox
                xmin *= width
                xmax *= width
                ymin *= height
                ymax *= height
                c1 = (xmin, ymin)
                c2 = (xmax, ymin)
                c3 = (xmax, ymax)
                c4 = (xmin, ymax)
                draw.line([c1, c2, c3, c4, c1],
                          fill=OUTLINE,
                          width=3)
            asnumpy = np.array(pil_image)
            # figsize is (width, height) in inches.
            figure = plt.figure(figsize=(width / 50, height / 50))
            plt.imshow(asnumpy)
        except TypeError:
            pass
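
If neither the stock script nor a ready-made dataset fits your annotations (for example, because they lack some fields found in the official VOC files), you can also write the TFRecord yourself. Below is a minimal sketch, not a definitive implementation. It assumes annotations shaped like the one in the question (filename, size, and object boxes in pixels). The function names `parse_voc_annotation`, `to_tf_example`, and `write_tfrecord`, and the `label_to_id` mapping, are hypothetical names chosen for illustration. The feature keys follow the naming the Object Detection API reads, with boxes normalized to [0, 1].

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Parse one VOC-style annotation; boxes are normalized to [0, 1]."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = float(size.findtext("width"))
    height = float(size.findtext("height"))
    record = {
        "filename": root.findtext("filename"),
        "width": int(width),
        "height": int(height),
        "labels": [], "xmins": [], "ymins": [], "xmaxs": [], "ymaxs": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        record["labels"].append(obj.findtext("name"))
        record["xmins"].append(float(box.findtext("xmin")) / width)
        record["ymins"].append(float(box.findtext("ymin")) / height)
        record["xmaxs"].append(float(box.findtext("xmax")) / width)
        record["ymaxs"].append(float(box.findtext("ymax")) / height)
    return record


def to_tf_example(record, encoded_jpeg, label_to_id):
    """Build a tf.train.Example from a parsed record and raw JPEG bytes."""
    import tensorflow as tf  # lazy import: the parser above needs no TensorFlow

    def floats(values):
        return tf.train.Feature(float_list=tf.train.FloatList(value=values))

    def ints(values):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

    def raw(values):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

    labels = record["labels"]
    feature = {
        "image/height": ints([record["height"]]),
        "image/width": ints([record["width"]]),
        "image/filename": raw([record["filename"].encode("utf8")]),
        "image/encoded": raw([encoded_jpeg]),
        "image/format": raw([b"jpeg"]),
        "image/object/bbox/xmin": floats(record["xmins"]),
        "image/object/bbox/ymin": floats(record["ymins"]),
        "image/object/bbox/xmax": floats(record["xmaxs"]),
        "image/object/bbox/ymax": floats(record["ymaxs"]),
        "image/object/class/text": raw([l.encode("utf8") for l in labels]),
        "image/object/class/label": ints([label_to_id[l] for l in labels]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(pairs, output_path, label_to_id):
    """Write one Example per (xml_path, jpeg_path) pair to output_path."""
    import tensorflow as tf  # lazy import, see above

    with tf.io.TFRecordWriter(output_path) as writer:
        for xml_path, jpeg_path in pairs:
            with open(xml_path, encoding="utf8") as f:
                record = parse_voc_annotation(f.read())
            with open(jpeg_path, "rb") as f:
                encoded_jpeg = f.read()
            example = to_tf_example(record, encoded_jpeg, label_to_id)
            writer.write(example.SerializeToString())
```

For the annotation in the question, whose only class name is `0`, a call might look like `write_tfrecord(pairs, "train.record", {"0": 1})`, where `pairs` is your own list of annotation and image paths. The id starts at 1 here because the Object Detection API's label maps conventionally reserve id 0 for the background class, so adjust the mapping to match your own label map.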

https://stackoverflow.com/questions/56025263