Conclusion: This paper presents a depthwise separable convolutional neural network accelerator designed specifically for ShuffleNetV2. Based on the characteristics of ShuffleNetV2, the network structure is optimized, improving accuracy by 1.09% while reducing the parameter count by 0.18M. The paper also proposes a reconfigurable hardware accelerator that supports both pointwise convolution (PwC) and depthwise convolution (DwC). The accelerator consumes only 7.3 W while achieving a power efficiency of 13.45 GOPS/W and a frame rate of 675.7 fps.
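The reported figures can be related to one another with simple arithmetic. The sketch below is illustrative only: it assumes that the power efficiency is defined as throughput divided by the 7.3 W power figure, and that all throughput is spent on inference frames. The function names are hypothetical, and the derived values are not results stated in the paper.

```python
# Illustrative arithmetic on the reported figures (not part of the original design).
# Assumption: power efficiency = throughput / power, using the same 7.3 W figure.

POWER_W = 7.3                   # reported power consumption
EFFICIENCY_GOPS_PER_W = 13.45   # reported power efficiency
FRAME_RATE_FPS = 675.7          # reported frame rate


def throughput_gops(power_w: float, efficiency_gops_per_w: float) -> float:
    """Implied throughput in GOPS: power (W) times efficiency (GOPS/W)."""
    return power_w * efficiency_gops_per_w


def ops_per_frame_gop(throughput: float, fps: float) -> float:
    """Implied operations per frame in GOP: throughput (GOPS) divided by fps."""
    return throughput / fps


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Implied energy per frame in millijoules: power (W) / fps, scaled by 1000."""
    return power_w / fps * 1000.0


tp = throughput_gops(POWER_W, EFFICIENCY_GOPS_PER_W)
per_frame = ops_per_frame_gop(tp, FRAME_RATE_FPS)
energy = energy_per_frame_mj(POWER_W, FRAME_RATE_FPS)

print(f"Implied throughput: {tp:.1f} GOPS")
print(f"Implied work per frame: {per_frame * 1000:.0f} MOP")
print(f"Implied energy per frame: {energy:.1f} mJ")
```

Under these assumptions, the figures imply a throughput of roughly 98 GOPS, about 145 million operations per frame, and about 10.8 mJ per frame.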
Acknowledgments: The authors thank the Science and Technology Program of Guangdong Province for its support under Grant 2022B0701180001.