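Active learning in the deep learning era, in my view, started with MC dropout (2016, with a patched version in 2019) [1] and peaked with coreset (2018) [2], with adversarial methods [3] and a handful of reinforcement-learning methods [4] in between. Over the last two years the field seems to have entered a cooling-off period: there is still some work, but on closer inspection most of it is patching, for example swapping coreset's distance metric for the Wasserstein distance [5]. Meanwhile, the most awkward problem facing active learning still has no good solution: it is not effective enough compared with its competitors, and it is hard to combine with techniques that have proven very effective elsewhere, such as advanced data augmentation like RandAugment, or even semi-supervised learning.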
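For readers who have not seen coreset-style selection, here is a minimal sketch of the greedy k-center idea behind [2]: keep picking the pool point that is farthest from everything already covered. This is only an illustration under my own assumptions: the function name `k_center_greedy`, plain Euclidean distance on precomputed feature vectors, and the toy data are all hypothetical, and it is not the full method from the paper. Swapping the distance metric, as in the kind of follow-up work mentioned above, would mean changing the two `np.linalg.norm` lines.

```python
import numpy as np

def k_center_greedy(features, labeled_idx, budget):
    """Greedy k-center: repeatedly pick the point farthest from the current centers.

    features    : (N, D) array of feature vectors (assumed precomputed by some model)
    labeled_idx : indices of already labeled points, used as the initial centers
    budget      : number of new points to select
    """
    features = np.asarray(features, dtype=float)
    # Distance from every point to its nearest labeled point.
    diffs = features[:, None, :] - features[labeled_idx][None, :, :]
    min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
    selected = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))  # farthest point from all current centers
        selected.append(idx)
        # The new center can only shrink each point's nearest-center distance.
        new_dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return selected

# Toy 2-D "features": two points near the origin, two far away, one in between.
toy = [[0, 0], [1, 0], [10, 0], [10, 1], [5, 5]]
picked = k_center_greedy(toy, labeled_idx=[0], budget=2)
```

Note that the selection depends only on geometry in feature space, not on how uncertain the model is about each point.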
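Some of the recent work: sparse coreset (NeurIPS 2019) [6], which extends coreset to the posterior over model parameters; BADGE (ICLR 2020) [7], which combines uncertainty with feature-space coverage; NN classifier (AAAI 2021) [8], which changes perspective and looks at the influence of the classifier; MPART (ICML 2021) [9], for online active semi-supervised learning; and VaB-AL (CVPR 2021) [10], which handles class imbalance. Next I want to talk about why AL is so lukewarm, and whether it can still put up a fight.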
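In this alchemy business, no matter how nice the story sounds, you have to show results. So the reason for the lukewarm state inevitably includes falling behind the competitors on accuracy: semi-supervised learning, and self-supervised learning plus fine-tuning on a few labeled samples. A quick look at the numbers on CIFAR-10: semi-supervised learning reaches about 95% with 250 labeled images, and self-supervised pretraining plus fine-tuning on 250 labeled images reaches about 88%. How many samples does active learning need to get that high? With 9,000 labeled images it only reaches about 90% accuracy (NN classifier, coreset, Learning Loss, VAAL). A gap this large is crushing for active learning morale. Of course, one might think: if active learning alone cannot keep up, why not combine the two and build some kind of active semi-supervised method? That runs into an even nastier problem. Semi-supervised and self-supervised learning combine effortlessly: neither needs much change, and stacking the components multiplies the gains, as in S4L. Active learning does not combine well with other components, and can even combine badly.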
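Combining with semi-supervised learning is actually a very natural idea: active learning optimizes which samples get labeled, semi-supervised learning exploits the unlabeled ones, so putting them together should be even better. But search the literature and you find that when people tried the combination, adding active learning brought no clear gain and sometimes even hurt. Here is a figure to look at (source: [11]).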
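Even the active learning papers published this year still use ancient experimental settings, for example data augmentation limited to random flips and crops. But the current pitfall of active learning is precisely that its gains collapse once it is combined with strong data augmentation, so what good is it to keep experimenting under weak augmentation? Someone even wrote a paper specifically to call this out [12].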
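Can AL still put up a fight? I think so, but only if we switch to the right setting. To make a simple analogy, active learning and semi-supervised learning are like two play styles in the game Civilization. Semi-supervised learning is going wide: it does not care whether each city site is great, it wins by sheer numbers. Active learning is going tall: it wants each site to be as good as possible, but in some versions, such as Civilization VI, it falls further and further behind going wide. So rather than studying how to go tall in this version, why not switch versions, or switch games, to a setting that suits it better? In other words, step outside the overly classic setting of active learning for classification and find a better fit: places where not only labeling is more expensive, but even acquiring a data point is more expensive. Examples include, in robotics, active environment exploration [13] and active environmental monitoring [14], as well as electromagnetics [15], and so on.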
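Take a piece of work by a senior labmate in my current group as an example. Put simply, it uses active learning to choose sampling points and plan the actual paths of an autonomous underwater vehicle (AUV) [14]. This kind of setting is clearly a better fit for active learning, because no other sampled points exist at all, which removes the threat from the more brute-force competitor, semi-supervised learning. More concretely: the task is to build a habitat model of a large area of ocean, i.e. which parts are coral, which are sand, and which are reef. We have coarse-grained sonar bathymetry covering the area and want to pick the best locations to deploy the vehicle to collect optical images. Here the cost is not just labeling a sample; acquiring the data point is itself expensive: chartering a ship that can carry the AUV, paying a team of experimenters, and so on. So we badly want to collect the most informative data, and that is where active learning comes in. The active learning method used is also simple: the MC dropout recipe, find where the epistemic uncertainty is high, and deploy the vehicle there.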
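To make "find the epistemic uncertainty" a bit more concrete, here is a minimal sketch of one common MC dropout score: the mutual information between the prediction and the model parameters (the BALD score, one of the acquisition functions used in [1]). I am assuming we already have class probabilities from T stochastic forward passes with dropout left on, stacked into an array of shape (T, N, C) for N candidate sites and C habitat classes. The names `bald_scores` and `pick_sites` and the toy numbers are hypothetical, and I am not claiming this is the exact score used in [14].

```python
import numpy as np

def bald_scores(mc_probs, eps=1e-12):
    """Mutual information per candidate from MC dropout samples.

    mc_probs : (T, N, C) class probabilities from T stochastic forward passes
    returns  : (N,) scores; higher means the passes disagree more (epistemic)
    """
    mc_probs = np.asarray(mc_probs, dtype=float)
    mean_probs = mc_probs.mean(axis=0)  # (N, C) averaged prediction
    # Entropy of the averaged prediction: total uncertainty.
    predictive_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)
    # Average entropy of each individual pass: the aleatoric part.
    expected_entropy = -(mc_probs * np.log(mc_probs + eps)).sum(axis=2).mean(axis=0)
    return predictive_entropy - expected_entropy

def pick_sites(mc_probs, k):
    """Indices of the k candidates with the highest mutual information."""
    return np.argsort(-bald_scores(mc_probs))[:k].tolist()

# Toy example: 2 passes, 3 candidate sites, 2 classes.
toy_probs = np.array([
    [[0.5, 0.5], [1.0, 0.0], [1.0, 0.0]],  # pass 1
    [[0.5, 0.5], [0.0, 1.0], [1.0, 0.0]],  # pass 2
])
scores = bald_scores(toy_probs)
best = pick_sites(toy_probs, k=1)
```

In the toy data, site 0 is ambiguous in every pass (noise that more data from that site would not remove), site 2 is confidently the same class in every pass, and only site 1 has passes that disagree with each other, so it is the one that gets picked.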
[1] Gal, Yarin, et al. “Deep Bayesian Active Learning with Image Data.” Proceedings of the 34th International Conference on Machine Learning - Volume 70, 2017, pp. 1183–1192.
[2] Sener, Ozan, and Silvio Savarese. “Active Learning for Convolutional Neural Networks: A Core-Set Approach.” International Conference on Learning Representations, 2018.
[3] Sinha, Samarth, et al. “Variational Adversarial Active Learning.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 5972–5981.
[4] Casanova, Arantxa, et al. “Reinforced Active Learning for Image Segmentation.” ICLR 2020: Eighth International Conference on Learning Representations, 2020.
[5] Shui, Changjian, et al. “Deep Active Learning: Unified and Principled Method for Query and Training.” International Conference on Artificial Intelligence and Statistics, 2020, pp. 1308–1318.
[6] Pinsler, Robert, et al. “Bayesian Batch Active Learning as Sparse Subset Approximation.” Advances in Neural Information Processing Systems, vol. 32, 2019, pp. 6356–6367.
[7] Ash, Jordan T., et al. “Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds.” ICLR 2020: Eighth International Conference on Learning Representations, 2020.
[8] Wan, Fang, et al. “Nearest Neighbor Classifier Embedded Network for Active Learning.” AAAI, 2021, pp. 10041–10048.
[9] Kim, Taehyeong, et al. “Message Passing Adaptive Resonance Theory for Online Active Semi-Supervised Learning.” ICML 2021: 38th International Conference on Machine Learning, 2021, pp. 5519–5529.
[10] Choi, Jongwon, et al. “VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 6749–6758.
[11] Mittal, Sudhanshu, et al. “Parting with Illusions about Deep Active Learning.” ArXiv Preprint ArXiv:1912.05361, 2019.
[12] Munjal, Prateek, et al. “Towards Robust and Reproducible Active Learning Using Neural Networks.” ArXiv Preprint ArXiv:2002.09564, 2020.
[13] Liu, Liyang, et al. “Active and Interactive Mapping With Dynamic Gaussian Process Implicit Surfaces for Mobile Manipulators.” IEEE Robotics and Automation Letters, vol. 6, no. 2, 2021, pp. 3679–3686.
[14] Shields, Jackson, et al. “Towards Adaptive Benthic Habitat Mapping.” 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 9263–9270.
[15] Yan, Tianxu, et al. “Scattering Modeling for Complex Radar Target Based on Space Mapping Technique.” 2020 XXXIIIrd General Assembly and Scientific Symposium of the International Union of Radio Science, 2020.