<ul class="dashed" data-apple-notes-indent-amount="0"><li><span style="font-family: '.PingFangUITextSC-Regular'">文章标题:</span>Cross Initialization for Personalized Text-to-Image Generation</li><li><span style="font-family: '.PingFangSC-Regular'">文章地址:</span><a href="https://arxiv.org/abs/2312.15905">https://arxiv.org/abs/2312.15905</a> </li><li>CVPR 2024</li></ul> <img src="https://res.cloudinary.com/montaigne-io/image/upload/v1733132004/0697A456-D72E-4366-807B-0CE199B66FB8.png" style="background-color:initial;max-width:min(100%,2352px);max-height:min(978px);;background-image:url(https://res.cloudinary.com/montaigne-io/image/upload/v1733132004/0697A456-D72E-4366-807B-0CE199B66FB8.png);height:auto;width:100%;object-fit:cover;background-size:cover;display:block;" width="2352" height="978"><ul class="dashed" data-apple-notes-indent-amount="0"><li></li></ul> 方法很简单,就是将TI的初始化的embedding改成经过text encoder的hiddenstate。 作者通过对TI的学习中的embedding的大小和方向进行分析,发现其优化过程中,大小与方向与经过text encoder之后的embedding越来越接近,并且最终得到的embedding与初始的差别很大,因此作者就提出了一个新的初始化方法,该方法不仅仅加快了优化速度,还提高了生成图像的质量以及可编辑性。 <img src="https://res.cloudinary.com/montaigne-io/image/upload/v1733132004/B56C01FF-DFE0-4406-8A2C-8F7BAF77C122.png" style="background-color:initial;max-width:min(100%,2350px);max-height:min(1032px);;background-image:url(https://res.cloudinary.com/montaigne-io/image/upload/v1733132004/B56C01FF-DFE0-4406-8A2C-8F7BAF77C122.png);height:auto;width:100%;object-fit:cover;background-size:cover;display:block;" width="2350" height="1032"> <ul class="dashed" data-apple-notes-indent-amount="0"><li>数据:测试时微调</li><li>指标:人脸相似度(Arcface);文本相似度(CLIP)</li><li>硬件:1 A800/bs8</li><li>开源:<a href="https://github.com/lyuPang/CrossInitialization">https://github.com/lyuPang/CrossInitialization</a> </li></ul>