XMem-SimAM based semi-supervised video segmentation of pigs
Affiliation:

1. College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; 2. Ministry of Agriculture and Rural Affairs Key Laboratory of Smart Farming for Agricultural Animals, Wuhan 430070, China; 3. College of Engineering, Huazhong Agricultural University, Wuhan 430070, China; 4. Hubei Hongshan Laboratory, Wuhan 430070, China

CLC Number:

TP391.4

    Abstract:

    The dynamic feeding and growth process of breeding pigs during performance testing was taken as the study object to address the difficulty of accurately segmenting pigs, which arises from complex pig-farm environments, the dynamic growth of pigs, and changes in body size. A pig video dataset consisting of 234 video sequences was constructed, and an XMem-SimAM based semi-supervised video segmentation method for pigs was proposed. SimAM attention was introduced for multi-scale feature fusion, which improved the model's ability to extract temporal information at different scales and captured the temporal features of the pigs' dynamic movements. A spatial-channel attention module was used to strengthen the model's weighting of temporal semantic features. The multi-scale feature fusion strategy and the upsampling module were optimized. The temporal correlation information in video sequences was fully exploited to improve the fine-grained segmentation accuracy of pigs in videos. Testing and comparison showed that the Jaccard index, contour accuracy F-score, average metric J&F, and Dice coefficient of the XMem-SimAM model on the pig video dataset were 96.9, 95.8, 98.0, and 98.0, respectively, superior to those of video object segmentation methods including MiVOS, STCN, DEVA, and XMem++. The processing speed reached 58.5 frames per second with a memory consumption of 795 MB during inference, a good balance between processing efficiency and resource utilization. The proposed method can be applied to video segmentation of dynamically growing pigs in the complex environments of a pig farm.
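    The region-based metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary masks given as nested lists (1 for pig pixels, 0 for background) and uses the standard definitions of the Jaccard index (intersection over union) and the Dice coefficient; the function names and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not taken from the paper. The contour accuracy F-score requires boundary matching and is not covered here.

```python
def _flatten(mask):
    """Flatten a 2-D mask (list of rows) into a flat list of booleans."""
    return [bool(v) for row in mask for v in row]


def jaccard_index(pred, target):
    """Intersection over union of two binary masks of equal size."""
    p, t = _flatten(pred), _flatten(target)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    union = sum(a or b for a, b in zip(p, t))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def dice_coefficient(pred, target):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    p, t = _flatten(pred), _flatten(target)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * intersection / total


if __name__ == "__main__":
    # Hypothetical 2x3 masks, for illustration only.
    predicted = [[0, 1, 1], [0, 1, 0]]
    reference = [[0, 1, 0], [0, 1, 1]]
    print(jaccard_index(predicted, reference))
    print(dice_coefficient(predicted, reference))
```

    Both functions return values in [0, 1]; the scores quoted in the abstract appear to be on a 0 to 100 scale, which would correspond to multiplying by 100. For a single pair of masks the two metrics are related by Dice = 2J / (1 + J).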

    Fig.1 Data acquisition device
    Fig.2 Pixel level annotation style
    Fig.3 Data augmentation example
    Fig.4 Overall framework diagram of network model
    Fig.5 Query encoder framework diagram
    Fig.6 Value encoder framework diagram
    Fig.7 Comparison of segmentation effects
    Fig.8 Comparison of segmentation performance between XMem-SimAM and semi-supervised video object segmentation networks
    Fig.9 Comparison of segmentation results for small target areas
    Fig.10 Grad-CAM feature visualization
    Table 1 Organizational form of pig video dataset
    Table 2 Comparison of evaluation indicators for validation set networks
    Table 3 Comparison of evaluation indicators
    Table 4 Comparison of complexity
    Table 5 Comparison of small target segmentation performance
    Table 6 Ablation study results comparison
Get Citation

陈萌放,徐迪红,李国亮,刘小磊,周明彦,黎煊. XMem-SimAM based semi-supervised video segmentation of pigs[J]. Journal of Huazhong Agricultural University,2025,44(2):17-28.

History
  • Received: December 16, 2024
  • Online: April 02, 2025