TensorRT Summary 2
Notes from studying the TensorFlow-TensorRT project.
https://github.com/ardianumam/Tensorflow-TensorRT
- Read input TensorFlow model
- Convert to frozen model ".pb"
- Convert (optimize) to TensorRT model
- Inference using TensorRT model
Train on the MNIST dataset with Keras, then convert the model with TensorRT to speed up inference.
1. Set up training and read the images
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
samplewise_center=False,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
zca_epsilon=1e-06,
rotation_range=0,
width_shift_range=0.0,
height_shift_range=0.0,
brightness_range=None,
shear_range=0.0,
zoom_range=0.0,
channel_shift_range=0.0,
fill_mode='nearest',
cval=0.0,
horizontal_flip=False,
vertical_flip=False,
rescale=None,
preprocessing_function=None,
data_format=None,
validation_split=0.0,
dtype=None)
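Generates batches of tensor image data with real-time data augmentation. The data is looped over indefinitely.
rescale: rescaling factor. Defaults to None. If None or 0, no rescaling is applied; otherwise the data is multiplied by the given value (applied after all other transformations, according to the current Keras documentation).
Here the rescaling factor is set to 1/255. Scaling pixel values into the range 0 to 1 helps the model converge and helps avoid "dead" neurons.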
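As a minimal sketch of what rescale=1/255 does to the data, the snippet below applies the same multiplication by hand with NumPy. The array `fake_image` is made up for illustration and is not part of the original project.

```python
import numpy as np

# Hypothetical 2x2 grayscale image with 8-bit pixel values (illustration only).
fake_image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# ImageDataGenerator(rescale=1/255) multiplies every pixel by this factor;
# here the same multiplication is done manually.
rescale = 1.0 / 255
scaled = fake_image.astype(np.float32) * rescale

# All values now lie between 0 and 1.
print(scaled.min(), scaled.max())
```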
Create train_generator and testing_generator.
2. Define the network
Define the Keras network with Sequential: input_tensor --> Conv2D_1 --> Conv2D_2 --> Conv2D_3 --> Flatten --> Dense --> output_tensor
Layer (type) Output Shape Param #
=================================================================
input_tensor (Conv2D) (None, 28, 28, 20) 520
_________________________________________________________________
activation (Activation) (None, 28, 28, 20) 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 20) 0
_________________________________________________________________
conv2d (Conv2D) (None, 14, 14, 20) 10020
_________________________________________________________________
activation_1 (Activation) (None, 14, 14, 20) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 20) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 7, 7, 20) 10020
_________________________________________________________________
activation_2 (Activation) (None, 7, 7, 20) 0
_________________________________________________________________
flatten (Flatten) (None, 980) 0
_________________________________________________________________
dense (Dense) (None, 10) 9810
_________________________________________________________________
output_tensor (Dense) (None, 3) 33
=================================================================
Total params: 30,403
Trainable params: 30,403
Non-trainable params: 0
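The parameter counts in the summary can be checked by hand. The sketch below assumes 5x5 kernels and a single-channel input, which is what the first layer's count implies ((5*5*1 + 1) * 20 = 520). The kernel size is inferred from the numbers, not stated in these notes, and the helper functions are hypothetical names.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units


# Kernel size 5 is an assumption inferred from the counts in the summary.
layer_params = {
    "input_tensor": conv2d_params(5, 1, 20),
    "conv2d": conv2d_params(5, 20, 20),
    "conv2d_1": conv2d_params(5, 20, 20),
    "dense": dense_params(7 * 7 * 20, 10),   # Flatten gives 7*7*20 = 980 inputs
    "output_tensor": dense_params(10, 3),
}
total = sum(layer_params.values())
print(layer_params, total)
```

Activation, pooling, and Flatten layers have no weights, which is why they show 0 parameters in the summary.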
3. Train the network
model.fit_generator
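4. Inference
Load the network with load_model.
Convert the input to a NumPy array with np.asarray.
Difference between predict_classes and predict: predict() returns numeric values giving the probability that the sample belongs to each class, and numpy.argmax() can then pick the class with the highest probability as the predicted label. predict_classes() returns the class index directly, i.e. the predicted label of the sample.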
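To make the relationship concrete, here is a small sketch of the argmax step. The `probabilities` array is a made-up stand-in for what predict() would return for two samples over three classes; no model is loaded here.

```python
import numpy as np

# Hypothetical per-class probabilities for two samples (illustration only).
probabilities = np.array([[0.1, 0.7, 0.2],
                          [0.8, 0.1, 0.1]])

# Index of the largest probability in each row, i.e. the predicted class.
predicted_classes = np.argmax(probabilities, axis=1)
print(predicted_classes)
```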
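Problems encountered
Q1
a bytes-like object is required,not 'str'
This error appeared when saving the model. Fiddling with the encoding for a long time did not help; upgrading hdf5 from version 2.4.0 to 2.5.0 finally fixed it.
Q2
Cuda Error in nvinfer1::cudnn::findFastestTactic
Possibly caused by a CUDA memory management issue leading to an overflow. It went away after a few retries.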
==================== YOLOv3 =====================
Start from the weight file yolov3_gpu_nms
and convert it to TensorRT_YOLOv3_2.pb.
Read the model's input and output tensors:
input_tensor, output_tensors = \
utils.read_pb_return_tensors(tf.get_default_graph(),
TENSORRT_YOLOv3_MODEL,
["Placeholder:0", "concat_9:0", "mul_9:0"])
Run inference in a Session:
boxes, scores = sess.run(output_tensors,
feed_dict={input_tensor:
np.expand_dims(
img_resized, axis=0)})
boxes, scores, labels = utils.cpu_nms(boxes,
scores,
num_classes,
score_thresh=0.4,
iou_thresh=0.5)
image = utils.draw_boxes(image, boxes, scores, labels,
classes, SIZE, show=False)
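The score_thresh and iou_thresh arguments above control non-maximum suppression. The following is a minimal single-class sketch of greedy NMS in NumPy, written only to illustrate the two thresholds. It is not the project's utils.cpu_nms, which works per class. `simple_nms`, `iou`, and the sample boxes are hypothetical.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def simple_nms(boxes, scores, score_thresh=0.4, iou_thresh=0.5):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    candidates = np.where(scores >= score_thresh)[0]
    # Process candidates from highest to lowest score.
    order = candidates[np.argsort(-scores[candidates])]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept one too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]
    return keep


boxes = np.array([[0, 0, 10, 10],     # highest score
                  [1, 1, 11, 11],     # overlaps the first box heavily
                  [20, 20, 30, 30],   # separate object
                  [40, 40, 50, 50]],  # score below the threshold
                 dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7, 0.2], dtype=np.float32)
kept = simple_nms(boxes, scores)
print(kept)
```

In this sample the second box is suppressed because its IoU with the first is about 0.68, above iou_thresh, and the fourth is dropped by score_thresh before NMS starts.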