From Grainy to Crystal Clear: How AI Ultrasound Enhancement Works
The Beauty Problem with Ultrasound
Ultrasounds are extraordinary medical tools. They let us peer inside the womb without a single incision, measuring heartbeats, checking organ development, and monitoring growth—all using nothing but sound waves. But there's a reason even the most emotional parents sometimes squint at the screen, tilt their head, and ask "Which part is the face again?"
That's where FirstGlimpse AI comes in. We use a cutting-edge form of artificial intelligence to take the structural data locked inside your ultrasound—the exact shape of your baby's nose, the curve of their cheeks, the arch of their brow—and transform it into a photorealistic portrait.
What Is an Ultrasound Actually Capturing?
Before we get to the AI side, it helps to know what an ultrasound image actually is. An ultrasound machine emits high-frequency sound waves that reflect off different tissues to different degrees. Dense tissue (like bone) reflects strongly and appears bright white. Softer tissue reflects less and appears grey. Fluid reflects almost nothing and appears black.
A 3D ultrasound takes this a step further by sweeping the sound beam across the scanning area and combining many 2D slices, allowing software to reconstruct a volumetric surface map of the baby's face. It knows the geometry—depth, distance, curvature. But it has no information about skin texture, colour, or the subtle details that make a face seem alive.
The Technology: Depth-Guided Diffusion Models
The AI powering FirstGlimpse is built on a class of models called Depth-Guided Latent Diffusion Models. Here's what that means in plain English:
Step 1: Depth Map Extraction
When you upload your ultrasound, the AI first generates a depth map—a grayscale image where brighter pixels mean "closer to the camera" and darker pixels mean "further away." This encodes the 3D geometry of your baby's face in a format the generative AI can work with. Crucially, this preserves the shape of your baby's specific features, not a generic baby template.
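If you're curious what that looks like in practice, here is a minimal sketch using an off-the-shelf monocular depth estimator from the open-source transformers library. The model choice, file names, and variable names are illustrative assumptions; FirstGlimpse's production pipeline is proprietary and will differ.

from PIL import Image
from transformers import pipeline

# Off-the-shelf depth estimator (an illustrative choice, not the production model).
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")

scan = Image.open("ultrasound_scan.png").convert("RGB")
result = depth_estimator(scan)

# "depth" is a grayscale image: brighter pixels are closer to the viewer.
depth_map = result["depth"]
depth_map.save("depth_map.png")

The saved depth map is what guides the next stage of generation.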
Step 2: Iterative Denoising
Diffusion models work by learning to reverse a process of gradually adding noise to images. During training, the model is shown millions of images with varying amounts of noise added and learns to predict and remove that noise. At inference time (when you upload your scan), it starts with random noise and iteratively refines it into a coherent image—guided by both the depth map and a text prompt describing the desired characteristics.
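In open-source terms, this kind of depth-guided generation can be sketched with the diffusers library and a depth ControlNet. The model IDs, prompt, and step count below are illustrative stand-ins, not the settings FirstGlimpse actually uses.

import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# A depth-conditioned ControlNet steers generation to match the scan's geometry.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

depth_map = Image.open("depth_map.png").convert("RGB")
prompt = "photorealistic newborn portrait, soft lighting, eyes gently closed"

# Starting from pure noise, the model denoises step by step while the depth
# map keeps the generated face aligned with the scan's actual geometry.
portrait = pipe(prompt, image=depth_map, num_inference_steps=30).images[0]
portrait.save("portrait.png")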
Step 3: Personalisation via Prompt Conditioning
This is where the magic of personalisation happens, and it's unique to FirstGlimpse. Rather than generating a generic baby, our system uses a sophisticated skin tone blending algorithm. When you provide parent skin tone swatches, the system calculates the average colour value of the two swatches and embeds the resulting hex code into the generation prompt. This gives the AI a very specific target for how the baby's skin should look—not just "light" or "dark" but a mathematically derived blend of the actual colours you provide.
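The exact blending algorithm is FirstGlimpse's own, but the core averaging idea can be illustrated in a few lines of Python; the function name and example swatches below are made up purely for illustration.

def blend_hex(parent_a: str, parent_b: str) -> str:
    # Split each hex swatch into its red, green, and blue channels.
    a = [int(parent_a.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4)]
    b = [int(parent_b.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4)]
    # Average the channels and re-encode as a single hex colour.
    mixed = [(x + y) // 2 for x, y in zip(a, b)]
    return "#{:02x}{:02x}{:02x}".format(*mixed)

blended = blend_hex("#f1c6a8", "#8d5a3b")
print(f"skin tone exactly {blended}")  # skin tone exactly #bf9071

That final hex string is the kind of precise target that gets embedded in the generation prompt.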
Why Does It Sometimes Look "Not Quite Right"?
Honest answer: because we're asking AI to do something extraordinarily difficult. Ultrasounds, especially 2D ones, have significant artifacts—shadow bands, acoustic noise, areas of poor signal. When the structural input is ambiguous, the AI fills in with its best guess based on statistical patterns learned during training. This can sometimes produce features that look a little "off" or anatomically awkward.
That's exactly why we built the "Fix Anatomy" mode—a second-pass generation that applies a stronger creative override with specific prompting to straighten a wonky head shape, remove hands covering the face, or reduce scan artifacts. Think of it as asking the AI to do a second draft.
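One common way to implement this kind of second pass is image-to-image refinement, sketched below with the diffusers library; the corrective prompt and strength value are illustrative assumptions rather than the actual Fix Anatomy settings.

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

first_draft = Image.open("portrait.png").convert("RGB")
fix_prompt = (
    "photorealistic newborn portrait, naturally rounded head shape, "
    "hands away from the face, no scan artifacts"
)

# strength controls how much creative override the second pass is allowed:
# higher values let the model deviate further from the first draft.
second_draft = pipe(
    prompt=fix_prompt, image=first_draft, strength=0.6, guidance_scale=9.0
).images[0]
second_draft.save("portrait_fixed.png")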
Is the Output "Real"?
We prefer the term hyper-realistic prediction. The face geometry—the underlying structure—is derived directly from your specific ultrasound scan. If your baby has a button nose, the AI will generate a button nose. If they have chubby cheeks, you'll see chubby cheeks. The skin texture (pores, fine hairs, eyelash length) is synthesised by AI because sound waves simply do not contain that level of detail. Think of it like a master sculptor's clay model being brought to life by a painter: the sculptor used your baby's actual face as the mould; the painter added the final realism.
What Happens to My Scan?
Your uploaded scan is sent to a secure, temporary cloud storage bucket for processing only. It is not stored permanently in our systems, not used to train any future AI models, and not shared with third parties. The generated portrait URL is also temporary and expires after 24 hours. We take your pregnancy data seriously.
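For readers who want a concrete picture of what a link that "expires after 24 hours" means, here is one common pattern, a presigned cloud storage URL; the provider, bucket, and key names are placeholders, not details of FirstGlimpse's infrastructure.

import boto3

s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-portrait-bucket", "Key": "portrait_fixed.png"},
    ExpiresIn=24 * 60 * 60,  # the link stops working after 24 hours
)
print(url)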
Ready to see your baby's face? Try the AI Portrait Generator—it takes less than 2 minutes.
Written by
FirstGlimpse Editorial Team
