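TL;DR: RSEdit adapts pretrained diffusion models into a remote-sensing image editing framework through channel-wise and in-context token concatenation, enabling precise edits for scenarios such as disaster impact and urban expansion while preserving geospatial consistency. Abstract: General-domain text-guided image editors can achieve a high degree of photorealism, but they introduce artifacts, hallucinate objects ...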
VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data including ...
Abstract: We present Style-Editor, a text-driven, object-centric style editing model that guides style editing at the object level using textual inputs. The core of ...