Sparse4D notes

张开发

• 2026/4/11 5:40:13 • 15 min read


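Introduction to the instance bank

num_anchor: total number of anchors, including the historical ones
num_cur: number of queries predicted from the current frame
num_temp: number of queries kept from history
num_anchor = num_cur + num_temp

get: called in forward before any decoder layer runs.
    if history exists: transform the historical results into the current frame with the ego pose and update self.cached_anchor. This works like a prepare step; cached_anchor should hold the num_temp anchors kept from history.
    else (no history): reset

update: called after the first layer's refine. Takes the top num_cur queries predicted from the current frame and concatenates them with the num_temp historical ones. Returns the concatenated num_anchor queries and does not update self.cached_anchor.

cache: called after all decoder layers have finished. Keeps the top-k queries and updates self.cached_anchor.

def forward():
    query = instancebank.get()
    for i in range(num_decoder_layer):
        query = decoder_layer(query)  # predicts num_anchor queries from the current frame
        if i == 0:
            query = instancebank.update(query)
    instancebank.cache(query)  # saves num_temp queries into the instance bank

Operation order: the [2:] slice drops the leading gnn and norm, leaving
[deformable, ffn, norm, refine, (temp_gnn, gnn, norm, deformable, ffn, norm, refine) ×5]

Full call flow

forward() starts
│
├─ instance_bank.get()
│    returns:
│    instance_feature       [B, num_anchor, C]   ← learnable parameters, no history
│    anchor                 [B, num_anchor, D]
│    temp_instance_feature  = cached_feature     ← cache from the previous frame, may be None
│    temp_anchor            = cached_anchor
│    time_interval
│
├─ concatenate dn_anchor → anchor / instance_feature become [B, num_anchor + num_dn, *]
│
│  ════════ Single-frame decoder (round 0, second half only) ════════
│
├─ [op: deformable]  sample image features, update instance_feature
├─ [op: ffn]         feed-forward network
├─ [op: norm]        Layer Norm
├─ [op: refine]      predict anchor/cls/quality
│    prediction = [anchor_0], classification = [cls_0]
│    len(prediction) == num_single_frame_decoder (both 1 here)  ← triggers
│    │
│    └─ instance_bank.update(instance_feature, anchor, cls)
│         ┌─ if cached_feature is None (first frame) → return the inputs unchanged
│         └─ otherwise
│              topk selects the top (num_anchor - num_temp) instances of the current frame
│              concatenate [cached_feature (history) | selected_feature (current)]
│              instance_feature becomes the [history | current] mix; shape is still [B, num_anchor, C]
│
│  ════════ Temporal decoder ×5 ════════
│
│  ┌─ [op: temp_gnn]  ← cross-frame cross-attention
│  │    query     = instance_feature (already contains history)
│  │    key/value = temp_instance_feature (pure history, cached_feature)
│  │    → fuses temporal information and updates instance_feature
│  │
│  ├─ [op: gnn]         intra-frame self-attention
│  ├─ [op: norm]
│  ├─ [op: deformable]  sample image features
│  ├─ [op: ffn]
│  ├─ [op: norm]
│  └─ [op: refine]      predict anchor/cls/quality, append to prediction/classification
│     repeated ×5
│
├─ instance_bank.cache(instance_feature, anchor, cls)
│    topk selects the num_temp_instances highest-confidence instances
│    stores them in cached_feature / cached_anchor  ← used by temp_gnn in the next frame
│
└─ [inference only] instance_bank.get_instance_id(cls, anchor)
     assigns or inherits cross-frame instance IDs for tracking

Summary of key design points

┌─────────────────────────────┬──────────────────────────────────────────────────────────┐
│ Stage                       │ Contents of instance_feature                             │
├─────────────────────────────┼──────────────────────────────────────────────────────────┤
│ After get()                 │ Learnable parameters only, no history                    │
├─────────────────────────────┼──────────────────────────────────────────────────────────┤
│ During single-frame decoder │ Driven by current-frame image features                   │
├─────────────────────────────┼──────────────────────────────────────────────────────────┤
│ After update()              │ [:num_temp_instances] history, [num_temp:] current top-N │
└─────────────────────────────┴──────────────────────────────────────────────────────────┘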
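The get / update / cache cycle and the [2:] slice of the operation order described above can be tried out with a small pure-Python sketch. This is only an illustration under stated assumptions, not the Sparse4D implementation: ToyInstanceBank, toy_decoder_layer and build_operation_order are hypothetical names, instances are plain (score, name) tuples instead of tensors, the ego-pose projection inside get() is omitted, and the scores are made up from list positions.

```python
def build_operation_order(num_decoder=6, num_single_frame_decoder=1):
    """Build the per-op schedule, then drop the leading gnn and norm with [2:]."""
    single = ["gnn", "norm", "deformable", "ffn", "norm", "refine"]
    temporal = ["temp_gnn"] + single
    order = (single * num_single_frame_decoder
             + temporal * (num_decoder - num_single_frame_decoder))
    return order[2:]


class ToyInstanceBank:
    """Hypothetical stand-in for the instance bank; instances are (score, name)."""

    def __init__(self, num_anchor, num_temp):
        self.num_anchor = num_anchor
        self.num_temp = num_temp
        self.num_cur = num_anchor - num_temp  # num_anchor = num_cur + num_temp
        self.cached = None  # plays the role of cached_feature / cached_anchor

    def get(self, frame):
        # Fresh queries with no history. The real get() would also project the
        # cached anchors into the current frame; that step is omitted here.
        return [(0.0, f"f{frame}_q{i}") for i in range(self.num_anchor)]

    def update(self, instances):
        # First frame: nothing cached, return the input unchanged.
        if self.cached is None:
            return instances
        top_cur = sorted(instances, key=lambda x: x[0], reverse=True)[: self.num_cur]
        # [history | current top-N]; the cache itself is not modified.
        return self.cached + top_cur

    def cache(self, instances):
        # Keep the num_temp highest-scoring instances for the next frame.
        ranked = sorted(instances, key=lambda x: x[0], reverse=True)
        self.cached = ranked[: self.num_temp]


def toy_decoder_layer(instances):
    # Assumption: a made-up score derived from the list position stands in
    # for the refine step; names are passed through untouched.
    return [((idx * 37 % 11) / 11, name) for idx, (_, name) in enumerate(instances)]


def forward(bank, frame, num_decoder=6):
    query = bank.get(frame)
    for i in range(num_decoder):
        query = toy_decoder_layer(query)
        if i == 0:  # after the single-frame decoder's refine
            query = bank.update(query)
    bank.cache(query)
    return query


if __name__ == "__main__":
    print(build_operation_order())
    bank = ToyInstanceBank(num_anchor=6, num_temp=2)
    for frame in range(2):
        print(frame, [name for _, name in forward(bank, frame)])
```

Running it on two consecutive frames shows the layout from the summary table: on the second frame the first num_temp entries carry names from frame 0 and the remaining entries come from the current frame.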
