Journal of Jinggangshan University (Natural Science)

Article Abstract

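Tang Pengjie, Tan Yunlan, Li Jinzhong, Tan Bin. Image classification and recognition based on deep two stream mixed CNN features [J]. Journal of Jinggangshan University (Natural Science), 2015, (5): 53-59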

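Image Classification and Recognition Based on Two-Stream Mixed-Transformation CNN Features (translation of the Chinese title)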

IMAGE CLASSIFICATION AND RECOGNITION BASED ON DEEP TWO STREAM MIXED CNN FEATURES

Received: 2015-05-13 Revised: 2015-07-14

DOI: 10.3969/j.issn.1674-8085.2015.05.011

Chinese keywords (translated): image classification; recognition; two-stream; mixed; CNN

English keywords: image classification; recognition; two stream; mixed transformation; CNN

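Funding: Science and Technology Program of the Jiangxi Provincial Department of Education (GJJ14561); Jinggangshan University Research Fund (JZ14012)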

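Author	Affiliation	E-mail
Tang Pengjie (汤鹏杰)	School of Mathematics and Physics, Jinggangshan University, Ji'an, Jiangxi 343009	5tangpengjie@tongji.edu.cn
Tan Yunlan (谭云兰)	School of Electronics and Information Engineering, Jinggangshan University, Ji'an, Jiangxi 343009
Li Jinzhong (李金忠)	School of Electronics and Information Engineering, Jinggangshan University, Ji'an, Jiangxi 343009
Tan Bin (谭彬)	School of Electronics and Information Engineering, Jinggangshan University, Ji'an, Jiangxi 343009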

Abstract views: 3518

Full-text downloads: 4820

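Chinese abstract (translated):

Features with stronger representational power and discriminability are the key to image classification and recognition. Deep CNN features pass through multiple intermediate non-linear transformations, which makes them more robust, and they have brought major progress in image classification and recognition. Traditional CNN models, however, only add more transformation layers, and each layer's transformation depends on the output of the preceding layer. As a result, the intermediate features have low redundancy, and the final feature vector is not rich enough in information. This paper proposes a CNN model based on two-stream mixed transformation, named DTM-CNN. The model first uses convolution kernels with receptive fields of different sizes to extract different intermediate features from the image. During the subsequent deep transformations, the intermediate features are mixed and flow between the streams. After several mixed transformations, the model produces a 1024-dimensional feature vector, which is classified with the Softmax regression function. Experimental results show that, after repeated convolution, pooling, and activation transformations, the extracted features are more abstract, carry richer semantic and structural information, and represent and discriminate images more strongly, so the model performs well in image classification and recognition.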

English abstract:

Features with strong representational power and discriminability are key to image classification and recognition. Deep CNN features pass through multiple non-linear transformations, which makes them more robust than other features, and CNN-based methods have achieved major breakthroughs in image classification and recognition. However, traditional CNN models only increase the number of transformation layers, and each layer relies on the output of the layer before it. As a result, the intermediate features have low redundancy, and the final feature does not carry enough information. In this paper, we propose a novel CNN model based on two-stream mixed transformation. The model first extracts intermediate features using different convolution kernels. Then, as the deep transformations are applied, the intermediate features are mixed and flow forward. Finally, we obtain a 1024-D feature vector and classify it with the Softmax regression function. The experiments show that, after repeated convolution, pooling, and activation transformations, the features extracted by the model are more abstract and carry richer structural and semantic information. The model therefore achieves better classification and recognition performance than other comparable models.
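
The pipeline outlined in the abstract (two convolution streams with different kernels, mixing of intermediate features, and Softmax classification of the final feature vector) might be sketched roughly as follows. This is only an illustrative toy in plain Python, not the authors' DTM-CNN: the abstract does not give the layer configuration, the kernel sizes, or the mixing rule, so the 3x3 and 5x5 averaging kernels, the element-wise average in `mix`, the weight values, and every function name here are assumptions. The toy feature vector has 4 values rather than 1024.

```python
import math

def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation with a square, odd-sized kernel."""
    h, w, k = len(image), len(image[0]), len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += image[y][x] * kernel[di][dj]
    return out

def relu(fmap):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2(fmap):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]

def mix(a, b):
    """ASSUMED mixing rule: element-wise average of the two streams."""
    return [[(x + y) / 2.0 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def two_stream_features(image, stages):
    """Run both streams stage by stage, mixing their outputs after each stage."""
    stream_a, stream_b = image, image
    for k_small, k_large in stages:
        a = max_pool2(relu(conv2d_same(stream_a, k_small)))  # small receptive field
        b = max_pool2(relu(conv2d_same(stream_b, k_large)))  # large receptive field
        mixed = mix(a, b)
        stream_a, stream_b = mixed, mixed  # both streams continue from the mixed map
    return [v for row in stream_a for v in row]  # flatten to a feature vector

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights):
    """Softmax regression: one linear score per class, then softmax."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(scores)

# Toy 8x8 input, two stages of (3x3, 5x5) averaging kernels, three classes.
image = [[float((i + j) % 3) for j in range(8)] for i in range(8)]
k3 = [[1.0 / 9] * 3 for _ in range(3)]
k5 = [[1.0 / 25] * 5 for _ in range(5)]
features = two_stream_features(image, [(k3, k5), (k3, k5)])
weights = [[0.5, -0.2, 0.1, 0.3], [-0.1, 0.4, 0.2, -0.3], [0.0, 0.1, -0.4, 0.2]]
probs = classify(features, weights)
predicted_class = probs.index(max(probs))
```

In a trained model the kernels and the classifier weights would be learned from data; the fixed values above only make the data flow visible.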
