MicromOne: Reverse Engineering Google’s SynthID: What This Project Teaches Us (and How to Use It)

As AI-generated content becomes indistinguishable from reality, companies like Google have introduced systems to keep things traceable. One of the most important of these is SynthID, a watermarking technology designed to invisibly tag AI-generated media.

But an open-source project called reverse-SynthID, created by Aloshdenny, is showing that even sophisticated watermarking systems can be analyzed from the outside.

This article explains what the project actually does, how it works in simple terms, and how you can experiment with it yourself.

What SynthID Actually Does (in simple terms)

SynthID is not a visible watermark like a logo or text overlay. Instead, it modifies content in a way that humans cannot perceive.

For images, the system works in the frequency domain. Instead of making changes you could spot in individual pixels, it slightly alters patterns that only show up when you mathematically transform the image, for example with a Fourier Transform.

Think of it like this:

The image you see stays the same
Hidden underneath, there’s a structured signal
That signal can later be detected by specialized tools

This is much harder to remove than metadata, because it’s embedded inside the content itself.
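To make the idea concrete, here is a minimal sketch in Python with NumPy. It does not reproduce SynthID's actual embedding, which is not public. The `embed_sinusoid` helper, the chosen frequencies, and the amplitude are all hypothetical. It only illustrates how a pattern that changes each pixel by at most one gray level can still produce a sharp peak after a Fourier Transform.

```python
import numpy as np

def embed_sinusoid(image, fx, fy, amplitude):
    """Add a faint 2-D cosine pattern (a hypothetical stand-in for a watermark)."""
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    pattern = amplitude * np.cos(2 * np.pi * (fx * x / w + fy * y / h))
    return image + pattern

rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(64, 64))      # stand-in for a grayscale image
marked = embed_sinusoid(image, fx=10, fy=5, amplitude=1.0)

# In pixel space the change is tiny: at most about one gray level out of 255.
max_pixel_change = np.max(np.abs(marked - image))

# In frequency space the same change is concentrated in two mirrored bins.
diff_spectrum = np.abs(np.fft.fft2(marked - image))
peak_row, peak_col = np.unravel_index(np.argmax(diff_spectrum), diff_spectrum.shape)
```

The peak lands at row 5, column 10 or at its mirror image, because the spectrum of a real-valued signal is symmetric. This example looks at the difference between the two images, which an outside observer does not have. That limitation is what the uniform-image trick described in the next section is meant to work around.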

What reverse-SynthID Discovered

The key idea behind reverse-SynthID is surprisingly simple:
If you generate enough AI images and analyze them carefully, patterns start to emerge.

The developer used a clever trick:

Generate uniform images (for example, completely black images)
Since the image itself contains almost no information, any remaining signal is likely the watermark
Convert the image into frequency space
Look for consistent patterns across multiple samples

What emerged is that the watermark is not random noise. It has structure:

It appears at specific frequency points
The pattern is consistent across images
It changes depending on resolution

This means the watermark can be studied, mapped, and partially predicted.
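The averaging idea can be sketched on synthetic data. The code below is an illustration under stated assumptions, not the project's actual code: the "images" are random noise plus one faint fixed cosine pattern, and `mean_magnitude_spectrum` is a hypothetical helper name. Random noise lands in different places in every sample, while the fixed pattern always lands in the same frequency bins, so it stands out once the spectra are averaged.

```python
import numpy as np

def mean_magnitude_spectrum(images):
    """Average the FFT magnitude over a list of same-sized grayscale arrays."""
    total = None
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        arr = arr - arr.mean()                 # remove the DC term (overall brightness)
        mag = np.abs(np.fft.fft2(arr))
        total = mag if total is None else total + mag
    return total / len(images)

# Synthetic stand-in for nearly uniform generations: random noise plus one
# faint fixed pattern (hypothetical; this is not SynthID's real signal).
rng = np.random.default_rng(1)
h, w = 64, 64
y, x = np.mgrid[0:h, 0:w]
fixed_pattern = 0.2 * np.cos(2 * np.pi * (12 * x / w + 6 * y / h))
samples = [rng.normal(0.0, 2.0, size=(h, w)) + fixed_pattern for _ in range(50)]

avg = mean_magnitude_spectrum(samples)
peak = np.unravel_index(np.argmax(avg), avg.shape)   # strongest averaged bin
```

With these settings the strongest averaged bin is the one where the fixed pattern was planted (row 6, column 12, or its mirrored partner). Real images would need to share the same resolution for this comparison to make sense.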

Why This Matters

This doesn’t mean SynthID is useless. It still works well in many real-world cases.

But it reveals something important:

Any system that produces consistent outputs can be reverse engineered
Even invisible signals leave detectable traces
Security based only on secrecy is never permanent

In practice, the project suggests that it may be easier to disrupt detection or confuse classifiers than to completely remove the watermark.

How reverse-SynthID Works (high level)

The project relies on classic signal processing rather than AI.

Main steps:

Generate a dataset of AI images
Apply a Fourier Transform (FFT)
Identify repeating frequency components
Isolate the watermark signal
Analyze or manipulate it

No neural networks, no proprietary access—just math and observation.

How to Use reverse-SynthID (Practical Guide)

If you want to try it yourself, here’s a simple walkthrough.

Clone the repository

Open your terminal and run:

git clone https://github.com/aloshdenny/reverse-SynthID.git
cd reverse-SynthID 

Install dependencies

The project typically uses Python with scientific libraries:

pip install -r requirements.txt

If there’s no requirements file, you’ll likely need:

numpy
scipy
matplotlib
pillow

Collect sample images

You’ll need AI-generated images, for example from:

Gemini
Imagen
other generative tools

Tip: Use simple images (solid colors or minimal detail) to make the watermark easier to detect.
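Since the pattern reportedly changes with resolution, it can help to sort your samples by size before comparing them. The helper below is a hypothetical sketch (`group_by_resolution` is not part of the repository). It takes grayscale arrays that are already loaded, for instance with Pillow, and stacks those that share a shape.

```python
from collections import defaultdict

import numpy as np

def group_by_resolution(images):
    """Group 2-D grayscale arrays by (height, width) and stack each group."""
    groups = defaultdict(list)
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        groups[arr.shape].append(arr)
    # Each value becomes an array of shape (count, height, width).
    return {shape: np.stack(arrs) for shape, arrs in groups.items()}

# Tiny synthetic example: two 4x4 "images" and one 8x8 "image".
demo = [np.zeros((4, 4)), np.ones((4, 4)), np.zeros((8, 8))]
by_size = group_by_resolution(demo)
```

Each stacked group can then be analyzed on its own, so that spectra from different resolutions are never mixed.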

Run frequency analysis

The core of the project is applying FFT to images.

Example (conceptually):

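import numpy as np
from PIL import Image

# Load the image and convert it to 8-bit grayscale ("L" mode)
img = Image.open("image.png").convert("L")
data = np.array(img, dtype=np.float64)

# 2-D FFT, then move the zero-frequency (DC) term to the center of the array
fft = np.fft.fft2(data)
fft_shift = np.fft.fftshift(fft)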

This transforms the image into frequency space, where periodic patterns show up as isolated bright peaks in the magnitude spectrum, np.abs(fft_shift).
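Raw FFT magnitudes span a huge range, so they are usually viewed on a log scale. The function below is a small sketch of that step. The name `log_magnitude_spectrum` is an assumption for illustration, and it is tested here on a synthetic striped image rather than a real file.

```python
import numpy as np

def log_magnitude_spectrum(gray):
    """Return the centered log-magnitude spectrum of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.float64)
    centered = gray - gray.mean()              # drop overall brightness (DC term)
    spectrum = np.fft.fftshift(np.fft.fft2(centered))
    return np.log1p(np.abs(spectrum))          # log(1 + |F|) compresses the range

# Synthetic check: vertical stripes with 4 cycles across a 32x32 image.
h, w = 32, 32
x = np.arange(w)
stripes = np.tile(np.cos(2 * np.pi * 4 * x / w), (h, 1))

spec = log_magnitude_spectrum(stripes)
peak = np.unravel_index(np.argmax(spec), spec.shape)
```

After `fftshift`, the center of a 32x32 spectrum is at index 16, so the stripes appear as two peaks on the middle row, 4 columns to either side of the center. To look at a spectrum, you can pass the result to matplotlib, for example `plt.imshow(spec, cmap="gray")`.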

Look for consistent patterns

Repeat the process across multiple images and compare results.

What you’re looking for:

bright points in the same positions
structured grids or patterns
signals that persist across images

These are likely parts of the watermark.
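One way to make "signals that persist across images" measurable is to count how often each frequency bin ranks among the strongest in an image. This is a sketch under assumptions, not the repository's method: `persistent_peaks`, `top_k`, and `min_fraction` are hypothetical names and thresholds, and the data is synthetic noise with a planted pattern.

```python
from collections import Counter

import numpy as np

def persistent_peaks(images, top_k=5, min_fraction=0.8):
    """Return frequency bins that rank among the top_k strongest in most images."""
    counts = Counter()
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        mag = np.abs(np.fft.fft2(arr - arr.mean()))
        strongest = np.argsort(mag, axis=None)[-top_k:]       # flat indices
        rows, cols = np.unravel_index(strongest, mag.shape)
        counts.update(zip(rows.tolist(), cols.tolist()))
    needed = min_fraction * len(images)
    return sorted(bin_ for bin_, n in counts.items() if n >= needed)

# Synthetic data: 20 noisy images sharing one planted pattern (hypothetical).
rng = np.random.default_rng(2)
h, w = 64, 64
y, x = np.mgrid[0:h, 0:w]
planted = np.cos(2 * np.pi * (3 * x / w + 7 * y / h))
images = [rng.normal(0.0, 2.0, size=(h, w)) + planted for _ in range(20)]

found = persistent_peaks(images)
```

Here `found` contains only the planted bin and its mirrored partner, given as (row, column) in unshifted FFT coordinates. Bins that are strong in just one or two images are filtered out as noise.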

Experiment with removal (advanced)

Once identified, you can try:

masking specific frequencies
adding noise
compressing or resizing

This doesn’t always remove the watermark completely, but it can degrade detection.
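Masking specific frequencies is a textbook notch filter. The sketch below applies it to a synthetic array with a planted peak. `mask_frequencies` is a hypothetical helper, and nothing here is known to affect a real SynthID detector.

```python
import numpy as np

def mask_frequencies(gray, bins):
    """Zero the given (row, col) FFT bins and their mirrored partners."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    spectrum = np.fft.fft2(gray)
    for row, col in bins:
        spectrum[row % h, col % w] = 0
        spectrum[(-row) % h, (-col) % w] = 0   # keeps the result real-valued
    return np.real(np.fft.ifft2(spectrum))

# Synthetic image: mid-gray with noise plus one planted pattern (hypothetical).
rng = np.random.default_rng(3)
h, w = 64, 64
y, x = np.mgrid[0:h, 0:w]
image = rng.normal(128.0, 2.0, size=(h, w)) + np.cos(2 * np.pi * (12 * x / w + 6 * y / h))

before = np.abs(np.fft.fft2(image))[6, 12]      # strength of the planted bin
filtered = mask_frequencies(image, [(6, 12)])
after = np.abs(np.fft.fft2(filtered))[6, 12]    # same bin after masking
```

In this toy case the planted bin drops to essentially zero. That only shows the filter did what it was told; it does not show that a real detector, which may rely on other cues, would be affected.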

Limitations You Should Know

Before jumping to conclusions, it’s important to be realistic.

This is not a full SynthID breaker
Results depend heavily on the dataset
Detection systems may still work even if the signal is altered
Real-world robustness is still debated

In short:
You’re exploring behavior—not defeating the system entirely.

reverse-SynthID is a great example of how open-source research can challenge assumptions about AI safety systems.

It shows that:

even invisible mechanisms can be studied
simple mathematical tools remain incredibly powerful
transparency and robustness must go hand in hand
