Can AI Accurately Translate Text in Images While Keeping the Original Style?
We’re working on an Image-to-Image Translation Model that extracts, translates, and reinserts text into images while keeping the original style.
So far, our pipeline involves:
- OCR (PaddleOCR) for text extraction
- Inpainting to remove the original text
- Overlaying translated text in a matching font
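For concreteness, here's a minimal sketch of that pipeline. The `translate()` stub, the input filename, and the font path are placeholders, and classical OpenCV inpainting stands in for whatever inpainter we end up using:

```python
import cv2
import numpy as np
from paddleocr import PaddleOCR
from PIL import Image, ImageDraw, ImageFont


def translate(text: str) -> str:
    # Placeholder: swap in your MT API of choice here.
    return text


# 1. OCR: detect text boxes and recognize the source strings
ocr = PaddleOCR(use_angle_cls=True, lang="en")
results = ocr.ocr("sign.jpg", cls=True)

image = cv2.imread("sign.jpg")
mask = np.zeros(image.shape[:2], dtype=np.uint8)

regions = []
for box, (text, score) in results[0]:
    pts = np.array(box, dtype=np.int32)
    cv2.fillPoly(mask, [pts], 255)  # mark text pixels for removal
    regions.append((pts, text))

# 2. Inpainting: erase the original text (classical inpainting as a
#    stand-in for a diffusion-based inpainter)
clean = cv2.inpaint(image, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)

# 3. Overlay the translated text, sized to the original box
out = Image.fromarray(cv2.cvtColor(clean, cv2.COLOR_BGR2RGB))
draw = ImageDraw.Draw(out)
for pts, text in regions:
    x, y = pts.min(axis=0)
    box_h = int(pts[:, 1].max() - pts[:, 1].min())
    # Font path is an assumption; matching the detected style is the hard part.
    font = ImageFont.truetype("NotoSans-Regular.ttf", size=max(box_h, 12))
    draw.text((int(x), int(y)), translate(text), font=font, fill=(0, 0, 0))

out.save("sign_translated.jpg")
```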
Where we’re going:
- Non-Latin scripts (e.g., Hindi, Arabic, Chinese)
- Text with complex orientations (curved layouts, stylized fonts)
- Seamless rendering that preserves the original aesthetics
We’re exploring diffusion models, ControlNet, and GlyphControl, but we’re still figuring out the best approach.
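To show the kind of direction we're prototyping (not a settled design): a sketch using diffusers' ControlNet inpainting pipeline to regenerate the text region with more context-awareness than classical inpainting. Model IDs, file names, and the prompt are illustrative; GlyphControl-style conditioning on rendered glyph images would be the next step for actually drawing the translated text in-style.

```python
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline
from PIL import Image


def make_inpaint_condition(image: Image.Image, mask: Image.Image) -> torch.Tensor:
    # Control-image prep for the SD 1.5 inpaint ControlNet:
    # masked pixels are set to -1 so the controlnet knows what to regenerate.
    img = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    m = np.array(mask.convert("L")).astype(np.float32) / 255.0
    img[m > 0.5] = -1.0
    return torch.from_numpy(img[None].transpose(0, 3, 1, 2))


controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("sign.jpg").convert("RGB")
mask = Image.open("text_mask.png").convert("L")  # white where the original text was
control = make_inpaint_condition(image, mask)

result = pipe(
    prompt="a street sign with clean lettering, matching the surrounding style",
    image=image,
    mask_image=mask,
    control_image=control,
    num_inference_steps=30,
).images[0]
result.save("sign_regenerated.png")
```

This only covers removing and regenerating the region; rendering the translated glyphs so they match the original typography is the part we're still evaluating GlyphControl (and similar glyph-conditioned models) for.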
Has anyone worked on this or have insights on in-scene text translation?
Full thoughts here: https://jigsawstack.com/blog/diffusion-model-text-rendering