<ul class="dashed" data-apple-notes-indent-amount="0"><li><span style="font-family: '.PingFangSC-Regular'">Title: </span>PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding</li><li><span style="font-family: '.PingFangSC-Regular'">Paper URL: </span><a href="https://arxiv.org/abs/2312.04461">https://arxiv.org/abs/2312.04461</a> </li><li>CVPR 2024</li></ul>
<img src="https://res.cloudinary.com/montaigne-io/image/upload/v1723624690/506A32E8-3A1A-4552-9DEF-222C9DB07010.png" style="background-color:initial;max-width:min(100%,2448px);max-height:min(1460px);;background-image:url(https://res.cloudinary.com/montaigne-io/image/upload/v1723624690/506A32E8-3A1A-4552-9DEF-222C9DB07010.png);height:auto;width:100%;object-fit:cover;background-size:cover;display:block;" width="2448" height="1460">
The authors propose PhotoMaker, an efficient text-to-image model for generating customized human photos. It encodes several input face images of the same person into one embedding per image, then fuses each of them with the embedding of the corresponding noun (the class word, e.g. "man" or "woman") in the text prompt. The fused embeddings together form what the authors call the Stacked ID Embedding. This embedding replaces the embedding of that noun in the original text embedding, and the result is fed to the diffusion model (DM) as the condition for generation. The authors also describe the pipeline they use to construct the training data.
<img src="https://res.cloudinary.com/montaigne-io/image/upload/v1726491053/DE538AF0-9907-4429-8C20-522DCCA61526.png" style="background-color:initial;max-width:min(100%,2100px);max-height:min(1676px);;background-image:url(https://res.cloudinary.com/montaigne-io/image/upload/v1726491053/DE538AF0-9907-4429-8C20-522DCCA61526.png);height:auto;width:100%;object-fit:cover;background-size:cover;display:block;" width="2100" height="1676">
<div style="text-align: justify"><ul class="dashed" data-apple-notes-indent-amount="0"><li>Data: constructed by the authors</li><li>Metrics: CLIP-T, DINO, Face-Sim, Face-Div</li><li>Hardware: 8× A100, batch size 48</li><li>Open source:</li></ul></div>
<div style="text-align: justify"><ul class="dashed" data-apple-notes-indent-amount="0"><li><a href="https://github.com/TencentARC/PhotoMaker">https://github.com/TencentARC/PhotoMaker</a></li></ul></div>
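The data flow summarized above (encode each ID image, fuse it with the class-word embedding, stack the results, and substitute them for the class word in the text embedding) can be sketched in a few lines of plain Python. This is only a toy illustration, not the authors' implementation: the function names are hypothetical, embeddings are short lists of floats, and the fusion step is an element-wise mean standing in for whatever learned fusion the model actually uses. Splicing the stack in place of a single token is also an assumption of this sketch; it makes the sequence grow by the number of ID images minus one.

```python
from typing import List

Vector = List[float]


def fuse(id_embedding: Vector, class_embedding: Vector) -> Vector:
    """Placeholder fusion: element-wise mean of the two vectors.

    Assumption: the real model learns this fusion; averaging is only a
    stand-in so the data flow can be followed.
    """
    if len(id_embedding) != len(class_embedding):
        raise ValueError("embedding dimensions must match")
    return [(a + b) / 2.0 for a, b in zip(id_embedding, class_embedding)]


def stacked_id_embedding(id_embeddings: List[Vector],
                         class_embedding: Vector) -> List[Vector]:
    """Fuse every ID-image embedding with the class-word embedding and
    keep the fused vectors as one stack (one row per input image)."""
    return [fuse(e, class_embedding) for e in id_embeddings]


def replace_class_token(text_embeddings: List[Vector],
                        class_index: int,
                        stacked: List[Vector]) -> List[Vector]:
    """Return a new sequence in which the class-word token at
    class_index is replaced by the rows of the stacked embedding."""
    if not 0 <= class_index < len(text_embeddings):
        raise IndexError("class_index is outside the text sequence")
    return (text_embeddings[:class_index]
            + stacked
            + text_embeddings[class_index + 1:])


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for the prompt "a man smiling".
    text = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    class_index = 1  # position of the class word "man"
    # Toy embeddings of two reference photos of the same person.
    ids = [[3.0, 1.0], [5.0, 3.0]]

    stacked = stacked_id_embedding(ids, text[class_index])
    condition = replace_class_token(text, class_index, stacked)
    print(condition)
```

In a real pipeline the resulting sequence would be passed to the diffusion model as the text condition; here it is simply printed so the substitution can be inspected.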