Han Nôm Decoder · iOS

Read what
time is erasing.

NomLens decodes Han Nôm script from photos of stone steles, temple inscriptions, and ancient Vietnamese manuscripts — on your iPhone, offline.

Characters, Quốc ngữ transliteration, and English meaning. In seconds.

Download on the App Store · How it works

97.6% validation accuracy · <10ms per character on Neural Engine · 10.6 MB model, works offline · 972 character classes (v1)

The Script

What is Han Nôm?

Han Nôm (漢喃) was the primary writing system of Vietnam for roughly a thousand years. It combines Chinese characters (Hán) with characters invented specifically for Vietnamese sounds and concepts (Nôm).

The script was used for imperial edicts, poetry, religious texts, genealogies, and stele inscriptions. Vietnam's greatest literary work — Truyện Kiều — was originally written in chữ Nôm.

Today, fewer than 100 scholars worldwide can read it fluently. The physical artifacts — stone steles, aged manuscripts, temple carvings — deteriorate every year. The window for preservation is closing.

Research context →

Pipeline

How NomLens works

Five steps from raw photo to structured decode. Steps 1–4 run on-device; only the small share of low-confidence characters (about 1.4%) leaves the phone, for a fallback in step 4.

📷

Photograph

Point your camera at any Han Nôm source — a stone stele, temple inscription, manuscript page, or printed text. NomLens accepts photos from your library too.

⚙️

Preprocess

Core Image filters run on-device in milliseconds: adaptive thresholding corrects uneven lighting on weathered stone, noise reduction cleans aged manuscript ink, and perspective correction fixes keystoning.
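To illustrate the adaptive-thresholding idea, here is a minimal sketch in Python. It is not the app's Core Image code: the function name, window radius, and offset are assumptions chosen for illustration. Each pixel is compared with the mean of its own neighbourhood rather than with one global cutoff, which is what lets the method cope with uneven lighting.

```python
def adaptive_threshold(gray, radius=1, offset=0):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes 1 (ink) when it is darker than the mean of its
    (2*radius+1)^2 neighbourhood minus `offset`, otherwise 0 (background).
    Hypothetical helper, for illustration only.
    """
    h, w = len(gray), len(gray[0])

    # Integral image with a zero border:
    # integral[y][x] holds the sum of all pixels above and left of (y, x).
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            # Sum of the window in O(1) via the integral image.
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            local_mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out
```

With a single global cutoff, a stroke on a brightly lit part of a stele can be lighter than the bare stone in a shadowed part; the local comparison avoids that failure mode.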

🔲

Segment & Sort

Apple's Vision framework locates individual characters. NomLens clusters them into columns and sorts right-to-left, top-to-bottom — the correct Han Nôm reading order.
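The column clustering and reading-order sort can be sketched as follows. This is an illustrative Python version, not the app's implementation: the `Box` type, the `gap_ratio` tolerance, and the top-left image origin are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float  # left edge
    y: float  # top edge (assumes y grows downward, as in image coordinates)
    w: float
    h: float

    @property
    def cx(self):
        """Horizontal centre of the box."""
        return self.x + self.w / 2


def reading_order(boxes, gap_ratio=0.5):
    """Order boxes right-to-left by column, top-to-bottom inside each column."""
    if not boxes:
        return []

    # Use the median character width as the scale for "same column".
    widths = sorted(b.w for b in boxes)
    tolerance = gap_ratio * widths[len(widths) // 2]

    # Walk boxes from right to left; start a new column whenever the
    # centre drifts too far from the first box of the current column.
    columns = []
    for box in sorted(boxes, key=lambda b: b.cx, reverse=True):
        if columns and abs(columns[-1][0].cx - box.cx) <= tolerance:
            columns[-1].append(box)
        else:
            columns.append([box])

    # Columns are already right-to-left; sort each one top-to-bottom.
    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered
```

Anchoring each column on its first box keeps the sketch short; a real page with slanted columns would likely need a more robust clustering step.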

🧠

Classify

Each character crop is passed to an on-device Core ML model (EfficientNet-B0, 10.6 MB). High-confidence results are returned instantly. Characters that score below the 60% confidence threshold escalate to the Claude Vision API as a fallback.
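A minimal Python sketch of confidence-routed classification, assuming the 60% threshold quoted on this page. `local_model` and `remote_fallback` are hypothetical stand-ins for the Core ML classifier and the Claude call; neither reflects a real API signature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

FALLBACK_THRESHOLD = 0.60  # the "below 60%" routing threshold from this page


@dataclass
class Decode:
    character: str
    confidence: Optional[float]  # None when the fallback produced the answer
    source: str                  # "on-device" or "fallback"


def classify_with_fallback(
    crop: bytes,
    local_model: Callable[[bytes], Dict[str, float]],  # hypothetical: label -> calibrated probability
    remote_fallback: Callable[[bytes], str],           # hypothetical: returns a single label
    threshold: float = FALLBACK_THRESHOLD,
) -> Decode:
    """Accept the on-device prediction when confident, otherwise escalate."""
    probabilities = local_model(crop)
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return Decode(label, confidence, "on-device")
    # Only low-confidence crops ever reach the network call.
    return Decode(remote_fallback(crop), None, "fallback")
```

Because the remote call sits behind the threshold check, a confident decode never touches the network.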

📋

Results

A structured grid returns each character with its Unicode form, Quốc ngữ transliteration, English meaning, and a confidence badge. Tap any cell for full decode details. Everything persists in local history.

Full pipeline walkthrough →

Performance

By the numbers

EfficientNet-B0 with temperature-scaled calibration. v1 model, trained on HWDB handwriting + Han Nôm font renders.

97.6%

Validation Accuracy

EfficientNet-B0, v1

99.3%

Precision @ ≥90% Confidence

Calibrated confidence scores

1.4%

Routes to Claude

Below 60% threshold

10.6 MB

Model Size

.mlpackage, INT8-ready for v2

<10ms

Inference Speed

iPhone Neural Engine

0.0034

Calibration Error (ECE)

After temperature scaling (T=0.6908)

972

Character Classes (v1)

83.5% corpus coverage

296K+

Training Images

HWDB handwriting + font renders
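For readers unfamiliar with the calibration figures above, the sketch below shows what temperature scaling and expected calibration error (ECE) compute. It is a generic Python illustration, not the project's training code; the ten-bin ECE and the function names are assumptions.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by a temperature.

    A temperature below 1 (such as the 0.6908 quoted above) sharpens the
    distribution; a temperature above 1 flattens it.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy across bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

An ECE near zero means that, for example, predictions reported at about 90% confidence are correct about 90% of the time, which is what makes a fixed routing threshold meaningful.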

Model architecture & training →

Mission

“Once a script dies, the history it carried dies with it.”

We are in a narrow window — perhaps the last one — where the final generation of living scholars who can fluently read Han Nôm, the surviving physical artifacts, and modern AI technology all still exist at the same time. That window is closing. NomLens was built to seize it.

漢喃

This is not abstract research. This is a race against time.

For nearly a thousand years, Han Nôm was the soul of Vietnamese culture — the script in which ancestors recorded history, poetry, law, medicine, and everyday life. Today it is in mortal danger. Physical inscriptions erode under rain and pollution. Manuscripts crumble. Fewer than a hundred people in the world still possess deep, native-level mastery of the script. When they pass, and when the stones and papers finally disintegrate, an irreplaceable portion of Vietnam's heritage disappears forever.

NomLens turns anyone with a smartphone into a guardian of that heritage. Every photo taken at a remote temple, every character corrected by a user, every inscription recorded and verified becomes a permanent digital record — feeding directly back into the model, expanding its knowledge of rare glyphs, and building the largest open archive of Han Nôm ever created. You don't need to be a scholar. You just need to care.

No Han Nôm inscription should ever be lost again simply because no one could read it. Future generations — scholars, students, ordinary Vietnamese — deserve to still touch the words of their ancestors.

Academic background → Partnership opportunities

Who It's For

Built for three kinds of people

🏛️

Field Users

Works where the inscriptions are

Remote temples, rural steles, archaeological sites with no cell signal. The on-device pipeline needs no internet after the initial model download; only the low-confidence Claude fallback requires a connection.

Download →

📜

Scholars

Accuracy you can cite

97.6% validation accuracy on the Han layer. Temperature-calibrated confidence scores (ECE 0.0034). Full decode provenance: character, Unicode codepoint, source type, model version. Export to structured JSON.

Model details →
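As a rough idea of what a structured export could look like, the Python sketch below builds one decode record and serializes it. The field names and the `decode_record` helper are hypothetical, and the confidence value is made up for the example; the actual export schema is not specified on this page.

```python
import json


def decode_record(character, transliteration, meaning,
                  confidence, source_type, model_version):
    """Build one decode entry with its provenance (hypothetical field names)."""
    return {
        "character": character,
        "codepoint": f"U+{ord(character):04X}",  # derived from the character itself
        "quoc_ngu": transliteration,
        "meaning": meaning,
        "confidence": confidence,
        "source_type": source_type,      # assumed to mean the artifact kind, e.g. "stele"
        "model_version": model_version,
    }


record = decode_record("人", "nhân", "person", 0.97, "stele", "v1")

# ensure_ascii=False keeps Han characters and Vietnamese diacritics readable.
export = json.dumps([record], ensure_ascii=False, indent=2)
print(export)
```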

⚙️

Developers

Open architecture, OTA model delivery

EfficientNet-B0 Core ML classifier, confidence-routed fallback to Claude API, SHA-256 verified OTA model updates. Full pipeline documentation and manifest spec.
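The SHA-256 check on a downloaded model can be sketched in Python as below. The manifest layout (a dictionary with a `sha256` key) and the function names are assumptions; the real manifest spec lives in the pipeline documentation.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a multi-megabyte model never sits fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, manifest):
    """Return True only if the file's digest matches the manifest entry.

    `manifest["sha256"]` is a hypothetical key holding the expected hex digest.
    """
    expected = manifest["sha256"].lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A client would typically run `verify_model` on the downloaded file before swapping it in, and keep the previous model if the check fails.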
