I was going through old backups when I found a folder from a dental appointment: a cone beam CT from early 2024. The clinic had given me the scan on a CD; I’d backed it up at some point and mostly forgotten about it. The zip was 696 MB. For a single dental scan that felt wrong, so I opened it.

Inside: 752 .dcm files from a Vatech PHT-60CFO scanner, a bundled Windows DICOM viewer (Vatech’s Ez3D-i Simple Viewer), Visual C++ redistributables, and locale files for 100+ languages including Kyrgyz and Yemeni Arabic.

DICOM is a container, not an image format

This was my first surprise. DICOM works like an MP4: the outer file holds metadata (scan date, scanner model, slice position) and the pixel data inside can be encoded in different formats. The encoding is declared by a Transfer Syntax UID, a dotted numeric identifier (an OID-style string, not a UUID) that maps to a specific encoding.

Mine said 1.2.840.10008.1.2.1: Explicit VR Little Endian. Translation: raw uncompressed pixels. Every pixel written out flat.
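To make the container idea concrete, here is a minimal stdlib-only sketch of pulling the Transfer Syntax UID out of a file's header. It assumes a standard DICOM Part 10 file (128-byte preamble, the `DICM` marker, then group 0002 elements in explicit VR little endian). The function name `read_transfer_syntax` and the small lookup table are mine, not part of any library, and the table only covers the syntaxes this post talks about.

```python
import struct

# A few transfer syntaxes relevant to this post (UIDs from the DICOM standard).
TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2": "Implicit VR Little Endian (uncompressed)",
    "1.2.840.10008.1.2.1": "Explicit VR Little Endian (uncompressed)",
    "1.2.840.10008.1.2.5": "RLE Lossless",
    "1.2.840.10008.1.2.4.80": "JPEG-LS Lossless",
    "1.2.840.10008.1.2.4.90": "JPEG 2000 Lossless Only",
}

# VRs that use a 2-byte reserved field followed by a 4-byte length.
LONG_VRS = {b"OB", b"OW", b"OF", b"SQ", b"UT", b"UN"}


def read_transfer_syntax(data: bytes) -> str:
    """Return the Transfer Syntax UID (0002,0010) from raw file bytes."""
    if data[128:132] != b"DICM":
        raise ValueError("not a DICOM Part 10 file")
    pos = 132
    while pos + 8 <= len(data):
        group, element = struct.unpack_from("<HH", data, pos)
        if group != 0x0002:  # left the file meta group without finding it
            break
        vr = data[pos + 4:pos + 6]
        if vr in LONG_VRS:
            (length,) = struct.unpack_from("<I", data, pos + 8)
            value_start = pos + 12
        else:
            (length,) = struct.unpack_from("<H", data, pos + 6)
            value_start = pos + 8
        if (group, element) == (0x0002, 0x0010):
            raw = data[value_start:value_start + length]
            # UIDs are padded with a trailing NUL to an even length.
            return raw.rstrip(b"\x00 ").decode("ascii")
        pos = value_start + length
    raise ValueError("Transfer Syntax UID not found")
```

Usage would look like `read_transfer_syntax(open("slice.dcm", "rb").read())`, where `slice.dcm` is a placeholder filename. If pydicom is installed, the same value is available as `ds.file_meta.TransferSyntaxUID` after `pydicom.dcmread`.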

752 slices × 752×752 pixels × 16 bits = ~812 MB of raw scan data.
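The arithmetic is easy to check. The sketch below just multiplies the numbers out; the only assumption is that each pixel is stored in a full 2 bytes. Note that the result lines up with "~812 MB" only when megabytes are counted in binary units (MiB); the small remainder is presumably per-file header overhead, which is my guess rather than something I measured.

```python
slices, rows, cols = 752, 752, 752
bytes_per_pixel = 2  # 16 bits per pixel

raw_bytes = slices * rows * cols * bytes_per_pixel

print(f"{raw_bytes:,} bytes")                # 850,518,016 bytes
print(f"{raw_bytes / 1e6:.1f} MB decimal")   # 850.5 MB decimal
print(f"{raw_bytes / 2**20:.1f} MiB binary") # 811.1 MiB binary
```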

Lossless JPEG is a real thing

When I first saw “JPEG-LS lossless” I assumed it was a contradiction. JPEG means lossy compression, that’s the whole point, that’s why you get artifacts on heavily compressed photos. But JPEG is a family of standards, and one branch (JPEG-LS) was designed specifically for lossless and near-lossless compression of continuous-tone images. It has nothing to do with the lossy variant most people know.

Lossless compression has been part of the DICOM standard since at least 2000: JPEG-LS lossless was added that year, and JPEG 2000 lossless followed in 2002.

So why didn’t the clinic use it?

The DICOM spec only requires receivers to support one transfer syntax: raw uncompressed. Everything else is optional. When a clinic burns a CD or USB for a patient, they have no idea what software you’ll use to open it. Uncompressed is the one guaranteed safe choice.

Once a scan lands in a hospital’s actual PACS (Picture Archiving and Communication System) it typically does get recompressed to JPEG-LS or JPEG 2000. That’s standard archival practice. The copy handed to the patient stays uncompressed for portability.

The scanner also has other things to do right after acquisition. Writing 752 raw slices fast is easier than compressing each one on the way out.

The numbers

dcmtk has a binary called dcmcjpls that does JPEG-LS lossless in one shot per file:

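brew install dcmtk
mkdir -p /path/to/output/CT   # make sure the output folder exists

# -print0 / -0 keep filenames with spaces or quotes intact;
# each path is passed to sh as $1 instead of being pasted into the script.
find /path/to/CT -name '*.dcm' -print0 | xargs -0 -P8 -n1 sh -c \
  'dcmcjpls "$1" "/path/to/output/CT/$(basename "$1")"' _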

8 parallel workers, 752 slices, done in 6 seconds on an M1-series Mac.
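If you would rather drive the same batch from Python (for example to get a list of the files that failed), something like the following should work. This is a sketch, not what I ran: it assumes `dcmcjpls` is on your PATH and is called as `dcmcjpls <input> <output>` exactly as in the shell version, and the function names are made up for illustration. Threads are enough here because the real work happens in the external process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_jobs(src_dir, out_dir):
    """Pair every .dcm file in src_dir with a same-named output path."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    return [(p, out_dir / p.name) for p in sorted(src_dir.glob("*.dcm"))]


def compress_one(job, run=subprocess.run):
    """Run dcmcjpls on one slice and return (filename, exit code)."""
    src, dst = job
    result = run(["dcmcjpls", str(src), str(dst)])
    return src.name, result.returncode


def compress_all(src_dir, out_dir, workers=8, run=subprocess.run):
    """Compress all slices in parallel; return names of files that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    jobs = build_jobs(src_dir, out_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda job: compress_one(job, run), jobs))
    return [name for name, code in results if code != 0]
```

Calling `compress_all("/path/to/CT", "/path/to/output/CT")` returns an empty list when every slice compressed cleanly. The `run` parameter exists only so the function can be exercised without dcmtk installed.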

Results sampled from 4 slices at different depths:

Format              % of original
Uncompressed        100%
JPEG-LS lossless    25%
JPEG 2000 lossless  24%
RLE lossless        44%

The full scan went from 812 MB to 212 MB. I was expecting 50-70% off based on what I’d read. It came out closer to 75%.

Why JPEG-LS crushes it: it predicts each pixel from its neighbors and only stores the difference. Medical scans have smooth gradients (most of your jaw bone is the same shade of gray), so those differences are tiny. RLE just collapses repeated values into counts, which only helps in flat regions like the air around the skull. JPEG 2000 exploits the same smoothness by a different route, a wavelet transform instead of pixel prediction, which is why their numbers land so close.
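Here is a toy illustration of the prediction idea, using the median edge detector (MED) predictor that JPEG-LS is built on. It is heavily simplified: neighbors outside the image are treated as 0, and the context modeling, bias correction, run mode, and Golomb-Rice coding of the real codec are all left out. The 8×8 gradient is synthetic test data, not taken from my scan.

```python
def med_predict(a, b, c):
    """MED predictor: a = left, b = above, c = above-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def neighbors(img, y, x):
    # Simplification: neighbors that fall outside the image count as 0.
    a = img[y][x - 1] if x > 0 else 0
    b = img[y - 1][x] if y > 0 else 0
    c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def to_residuals(img):
    """Replace each pixel with (actual value - predicted value)."""
    return [[img[y][x] - med_predict(*neighbors(img, y, x))
             for x in range(len(img[0]))] for y in range(len(img))]


def from_residuals(res):
    """Rebuild the image in raster order from residuals alone."""
    out = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            out[y][x] = res[y][x] + med_predict(*neighbors(out, y, x))
    return out


def run_lengths(row):
    """Collapse a row into [value, count] runs, as plain RLE would."""
    runs = []
    for v in row:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


# A smooth synthetic gradient: no two adjacent pixels are equal.
gradient = [[1000 + 3 * x + 2 * y for x in range(8)] for y in range(8)]
residuals = to_residuals(gradient)

print(run_lengths(gradient[1]))   # eight runs of length 1: RLE gains nothing
print(run_lengths(residuals[1]))  # [[2, 8]]: one run of a tiny value
print(from_residuals(residuals) == gradient)  # True: nothing was lost
```

On a gradient like this, RLE has nothing to collapse, while the residuals are all 2 or 3 apart from the very first pixel. Small, repetitive numbers are cheap to encode, and the round trip shows why the scheme can be exactly lossless.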

Verifying it’s lossless

Before deleting anything I ran a pixel comparison across all 752 slices:

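# Decoding JPEG-LS pixel data needs a pydicom decoder plugin installed,
# e.g. pylibjpeg with pylibjpeg-libjpeg, pyjpegls, or GDCM.
import numpy as np
import pydicom
from pathlib import Path

src, out = Path('original/CT'), Path('compressed/CT')

# Collect the name of every slice whose decoded pixels differ from the original.
failures = [
    f.name for f in sorted(src.glob('*.dcm'))
    if not np.array_equal(
        pydicom.dcmread(str(f)).pixel_array,
        pydicom.dcmread(str(out / f.name)).pixel_array
    )
]
print("all good" if not failures else failures)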

All 752 passed. The decompressed pixels are identical to the originals.

Does it actually open?

The scan came with Vatech’s Ez3D-i Simple Viewer for Windows. I copied the recompressed files to a Windows 11 ARM virtual machine on VMware Fusion and tried opening them.

It took long enough to load that I thought it had hung. Once it did, it was painfully slow. No dedicated GPU in a VM means everything is software-rendered. But it worked: the 3D skull reconstruction came up, the MPR planes populated, and I could scroll through slices. Same as the originals would have.

I only tested the one viewer that shipped with the scan. The embedded viewer handled JPEG-LS without complaints. There are other DICOM viewers out there (Horos, OsiriX Lite, 3D Slicer on the Mac side) that support standard transfer syntaxes if you need something that runs natively.

What I ended up with

Two different numbers worth keeping straight:

The scan data itself went from 812 MB of raw pixels to 212 MB JPEG-LS. That’s 74% off, lossless.

The final zip went from 696 MB to 510 MB. That’s a smaller saving, because a large share of the zip isn’t scan data: it’s the bundled Vatech viewer (DLLs, the Lite installer’s Setup.exe, locales). Zip can’t squeeze those much further, and zip’s deflate was already doing real work on the raw DCMs in the original (about 55% off on raw pixels). The new gain comes from swapping deflate-on-raw-pixels for JPEG-LS, which understands image structure (predicting pixels from neighbors) instead of just hunting for repeated byte sequences.

I also stripped the locales down to English and Spanish variants and dropped the Visual C++ redistributables, which trimmed another ~32 MB off the viewer side.
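The two sets of numbers can be reconciled with a rough back-of-the-envelope check. The sketch below makes two assumptions that are mine: the 510 MB figure already includes the ~32 MB viewer trim, and the JPEG-LS files don't shrink any further inside the zip. The 55% deflate saving is the approximate figure quoted above, so treat the result as a sanity check rather than an exact accounting.

```python
raw_scan_mb = 812        # uncompressed slices
jpegls_scan_mb = 212     # same slices as JPEG-LS
deflate_saving = 0.55    # approx. saving zip got on the raw pixels
viewer_trim_mb = 32      # locales and redistributables removed
old_zip_mb, new_zip_mb = 696, 510

# What the raw slices occupied inside the original zip.
deflated_scan_mb = raw_scan_mb * (1 - deflate_saving)

# Gain from storing JPEG-LS instead of deflated raw pixels.
codec_gain_mb = deflated_scan_mb - jpegls_scan_mb

estimated_saving_mb = codec_gain_mb + viewer_trim_mb
observed_saving_mb = old_zip_mb - new_zip_mb

print(f"scan inside old zip: ~{deflated_scan_mb:.0f} MB")
print(f"gain from JPEG-LS:   ~{codec_gain_mb:.0f} MB")
print(f"estimated saving:    ~{estimated_saving_mb:.0f} MB")
print(f"observed saving:      {observed_saving_mb} MB")
```

Under those assumptions the estimate lands within a megabyte or so of the observed difference, which suggests the two effects account for essentially all of the shrinkage.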

If you have a CBCT or similar scan from a clinic sitting around, this is the whole workflow:

brew install dcmtk
mkdir -p /output/CT
find /path/to/CT -name '*.dcm' -print0 | xargs -0 -P8 -n1 sh -c \
  'dcmcjpls "$1" "/output/CT/$(basename "$1")"' _

Then run the pydicom check above before deleting your originals.