Fig. 2

Architecture of the proposed multi-granularity sequence encoding module, illustrating joint-level temporal convolution and part-level semantic pooling for structured feature extraction.