简体中文 | English
💖 Scan the QR code to join the group discussion 💖
- Scan the QR code below with WeChat and reply "video" to join the official technical exchange group. We look forward to your participation.
PaddleVideo is a toolset for video tasks, built for both industry and academia. This repository provides examples and best-practice guidelines for exploring deep learning algorithms in the video domain.
- Please refer to the Installation guide and Usage doc before using the model zoo.
| Action recognition method | ||||
| PP-TSM (PP series) | PP-TSN (PP series) | PP-TimeSformer (PP series) | TSN (2D) | TSM (2D) |
| SlowFast (3D) | TimeSformer (Transformer) | VideoSwin (Transformer) | AttentionLSTM (RNN) | |
| Skeleton based action recognition | ||||
| ST-GCN (Custom) | AGCN (Adaptive) | |||
| Sequence action detection method | ||||
| BMN (One-stage) | ||||
| Spatio-temporal motion detection method | ||||
| SlowFast+Fast R-CNN | ||||
| Multimodal | ||||
| ActBERT (Learning) | T2VLAD (Retrieval) | |||
| Video target segmentation | ||||
| CFBI (Semi) | MA-Net (Supervised) | |||
| Monocular depth estimation | ||||
| ADDS (Unsupervised) | ||||
| Action Recognition | |||
| Kinetics-400 (Homepage) (CVPR'2017) | UCF101 (Homepage) (CRCV-IR-12-01) | ActivityNet (Homepage) (CVPR'2015) | YouTube-8M (Homepage) (CVPR'2017) |
| Action Localization | |||
| ActivityNet (Homepage) (CVPR'2015) | |||
| Spatio-Temporal Action Detection | |||
| AVA (Homepage) (CVPR'2018) | |||
| Skeleton-based Action Recognition | |||
| NTU RGB+D (Homepage) (IEEE CS'2016) | FSD (Homepage) | ||
| Depth Estimation | |||
| Oxford-RobotCar (Homepage) (IJRR'2017) | |||
| Text-Video Retrieval | |||
| MSR-VTT (Homepage) (CVPR'2016) | |||
| Text-Video Pretrained Model | |||
| HowTo100M (Homepage) (ICCV'2019) | |||
| Applications | Descriptions |
|---|---|
| FootballAction | Football action detection solution |
| BasketballAction | Basketball action detection solution |
| TableTennis | Table tennis action recognition solution |
| FigureSkating | Figure skating action recognition solution |
| VideoTag | 3000-category large-scale video classification solution |
| MultimodalVideoTag | Multimodal video classification solution |
| VideoQualityAssessment | Video quality assessment solution |
| PP-Care | 3D MRI medical image recognition solution |
| EIVideo | Interactive video segmentation tool |
| Anti-UAV | UAV detection solution |
| AbnormalActionDetection | Abnormal action detection solution |
- AI-Studio Tutorial
- [Official] Paddle 2.1 implementation of the optimized video understanding model PP-TSM
- [Official] Paddle 2.1 implementation of the optimized video understanding model PP-TSN
- [Official] Paddle 2.1 implementation of the classic video understanding model TSN
- [Official] Paddle 2.1 implementation of the classic video understanding model TSM
- BMN video action localization
- ST-GCN tutorial for skeleton-based figure skating action recognition
- [Practice] Video understanding transformer model TimeSformer
- Contribute code
- Skeleton-based figure skating action recognition with PaddlePaddle, AI Studio projects, video course
- Table tennis action proposal localization based on PaddlePaddle
- CCKS 2021: Knowledge Augmented Video Semantic Understanding
PaddleVideo is released under the Apache 2.0 license.
- Many thanks to mohui37, zephyr-fun, and voipchina for contributing the prediction code.

