Action-aware Linguistic Skeleton Optimization Network for Non-autoregressive Video Captioning