[Optimization] Deduplicate shared image/video utilities across VL processors#6988
luukunn wants to merge 13 commits into PaddlePaddle:develop
Thanks for your contribution!
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #6988   +/-   ##
==========================================
  Coverage           ?   73.87%
==========================================
  Files              ?      399
  Lines              ?    56041
  Branches           ?     8842
==========================================
  Hits               ?    41398
  Misses             ?    11724
  Partials           ?     2919
…into merge_processor
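    """Shared video utilities: VideoReaderWrapper, read_video_decord, and sample_frames."""

Although the PR description includes a Motivation section, the Modifications / Usage or Command / Accuracy Tests sections are still empty (or lack a concrete, reproducible way to verify the change). Please describe how behavior was verified to be unchanged after extracting the shared modules (for example, which unit tests or example commands to run).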
    video_path.seek(0)
    tmp_bytes = video_path.read()
    video_path.seek(0)
    if _is_gif(tmp_bytes):

In VideoReaderWrapper.__init__, the GIF check for io.BytesIO input calls video_path.read(), which loads the entire stream into memory just to inspect the magic header. For large video inputs this costs unnecessary memory and time. Consider reading only the first 6 bytes for the check (and restoring the stream position) instead of reading the whole stream.
Suggested change
    - video_path.seek(0)
    - tmp_bytes = video_path.read()
    - video_path.seek(0)
    - if _is_gif(tmp_bytes):
    + # Only read the first 6 bytes to check GIF header and restore position
    + current_pos = video_path.tell()
    + header = video_path.read(6)
    + video_path.seek(current_pos)
    + if _is_gif(header):
    +     video_path.seek(0)
    +     tmp_bytes = video_path.read()
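    video_meta contains keys: "fps", "duration", "num_of_frame".
    """

The save_to_disk parameter of read_video_decord is never used in the function body, which can mislead callers into thinking the switch has an effect. If it is kept for compatibility with the old signature, state clearly in the docstring that the parameter is currently ignored. Otherwise, remove it or absorb it with **kwargs to reduce misuse.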
Suggested change
    - video_meta contains keys: "fps", "duration", "num_of_frame".
    - """
    + video_meta contains keys: "fps", "duration", "num_of_frame".
    + Note:
    +     The `save_to_disk` argument is currently ignored and kept only for
    +     backward compatibility with older function signatures. Passing
    +     save_to_disk=True will not change the behavior.
    + """
    + if save_to_disk:
    +     data_processor_logger.warning(
    +         "Argument `save_to_disk` in `read_video_decord` is deprecated and "
    +         "currently ignored. The video is not saved to disk."
    +     )
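The "keep the parameter but warn when it is set" pattern from this suggestion can be tried out with the standard library alone. The sketch below is illustrative only: `read_video_sketch` and the `video_utils_sketch` logger are hypothetical stand-ins for `read_video_decord` and `data_processor_logger`, and the decoding step is replaced by a placeholder.

```python
import logging

# Hypothetical logger; the PR itself uses data_processor_logger.
logger = logging.getLogger("video_utils_sketch")


def read_video_sketch(video_path, save_to_disk: bool = False):
    """Hypothetical stand-in for read_video_decord.

    Note:
        `save_to_disk` is accepted only for backward compatibility and is
        ignored. Passing True logs a warning and changes nothing else.
    """
    if save_to_disk:
        logger.warning(
            "Argument `save_to_disk` is deprecated and currently ignored. "
            "The video is not saved to disk."
        )
    # Real decoding is omitted in this sketch; return the input unchanged.
    return video_path


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    read_video_sketch("demo.mp4", save_to_disk=True)
```

Logging a warning keeps existing call sites working while making the no-op visible. Removing the parameter outright would be cleaner but would break callers that still pass it.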
Motivation
- … (smart_resize, round/ceil/floor_by_factor, is_scaled_image) into fastdeploy/input/image_processors/common.py, covering both qwen and paddleocr variants
- … (VideoReaderWrapper, read_video_decord, sample_frames) into fastdeploy/input/video_utils.py, with separate sample_frames_qwen and sample_frames_paddleocr variants
- … image_processor.py files to import from the new shared modules instead of maintaining local copies
- … qwen_vl, qwen3_vl, paddleocr_vl processor files to use sample_frames from video_utils.py; remove now-empty qwen_vl_processor/process_video.py and paddleocr_vl_processor/process_video.py
Modifications
Usage or Command
Accuracy Tests
Checklist
- … [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- … pre-commit before commit.
- … release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.