20260527-OCR artifacts swept out of fourteen translations
44 total fixes!
Forty-four translation lines were corrected today across fourteen translations, syncing the live database to the master ia_all.xml after several months of accumulated edits. About three quarters of the changes share a single underlying cause — OCR artifacts in translations that were originally digitized from print — and the remainder are a small assortment of orthography and citation cleanups.
OCR artifacts in scanned-print translations
Several of the older translations on the site were OCR’d from paper originals years ago. The pass was good enough to be readable, but it left scattered character substitutions that have surfaced one at a time as readers and proofreaders looked at individual ayat. Today’s batch cleaned up a long-accumulated set. The signature is unmistakable: the letter l misread as the digit 1, the letter o as the digit 0, and the letter m as the sequences n1 or rr1.
| Translation | Lines fixed | Representative artifacts |
|---|---|---|
| Thomas Cleary | 14 | set the1n → set them; frorr1 → from; on1nipresent → omnipresent; peop1e → people; sp1inter → splinter; encJ,.1ntn1ent → enchantment |
| Abdul Majid Daryabadi | 5 | Wor1d → World; spoi1 → spoil; Mercifu1 → Merciful; therefrcm → therefrom |
| Amatul Rahman Omar | 5 | houghty → haughty (recurring in passages on the jinn) |
| Hamid S. Aziz | 4 | space restored in Biblical cross-references (John7:16 → John 7:16, Deuteronomy18 → Deuteronomy 18, etc.) |
| Maududi | 3 | unlawfu1 → unlawful; Lord1 → Lord; A1-Furgan → Al-Furqan |
| [Al-Muntakhab] | 3 | sou1 → soul; c1othing → clothing; missing sentence-final period restored |
| Bijan Moeinian | 2 | world1 → world; (Mohammad0 → (Mohammad) |
| Mubarakpuri | 1 | Alla0h → Allah (three occurrences on one ayah) |
| Muhammad Mahmoud Ghali | 1 | (who1..are) → (who are) |
| Dr. Laleh Bakhtiar | 1 | stray verse-number text removed (signs we37:181re → signs were) |
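Most of the artifacts in the table share a mechanical signature: a digit 0 or 1 sitting inside an otherwise alphabetic word. A minimal sketch of how such tokens could be flagged for human review follows. The function name `suspect_tokens` and the heuristic itself are illustrative assumptions, not the tooling actually used for this batch.

```python
import re

# Hypothetical heuristic: split a line into alphanumeric runs and flag
# any run that mixes letters with the digits 0 or 1.
_WORD = re.compile(r"[A-Za-z0-9]+")

def suspect_tokens(line):
    """Return tokens that contain at least one letter and a 0 or 1."""
    hits = []
    for tok in _WORD.findall(line):
        has_letter = any(c.isalpha() for c in tok)
        has_zero_or_one = any(c in "01" for c in tok)
        if has_letter and has_zero_or_one:
            hits.append(tok)
    return hits

# Sample fragments taken from the table above.
print(suspect_tokens("set the1n free frorr1 the peop1e"))
print(suspect_tokens("(Mohammad0 said"))
```

A check like this only produces candidates. It would also flag legitimate tokens such as "1st", and it misses letter-for-letter errors such as therefrcm or houghty, so every hit still needs a proofreader's eye.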
Non-OCR corrections
A smaller set were orthographic or transliteration cleanups rather than OCR:
The Study Quran (58:2). Transliteration glyphs restored: ?ihar → ẓihār, bringing back the dot-below and long-vowel marking on a verse that turns on the term.
Hilali – Khan (20:135). Missing apostrophe: Allahs religion → Allah's religion.
Sayyed Abbas Sadr-Ameli (28:11). Smart quotes restored where placeholder characters had crept in: :�Follow him.� → : "Follow him.", with a missing terminal period also added.
Abdul Hye (7:160). A u/a typo: your stuff → your staff (Moses striking the rock).
A technical note on the second pass
The first 37 of the 44 corrections went in cleanly on the first transactional batch. Seven failed the dry-run match, not because the rows were missing but because the source diff carried XML-escaped entities (&quot;) while the database stores the plain character ("). After a single normalization pass on the match text and a second dry-run confirming all seven, they committed cleanly. The fix has been folded into our standing diff-to-SQL workflow so the same trip-up won’t recur on future syncs.
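The normalization step can be illustrated with a small sketch. It assumes a hypothetical table `lines(id, text)` in an in-memory SQLite database and uses a made-up sample row; the real schema and the actual diff-to-SQL workflow are not described here. Python's standard `html.unescape` decodes entities such as `&quot;` before the match text is compared against stored rows.

```python
import html
import sqlite3

def normalize_match_text(text):
    """Decode XML/HTML entities (e.g. &quot;) so the text matches stored rows."""
    return html.unescape(text)

def dry_run(conn, fixes):
    """Return ids of fixes whose old text does not match exactly one row.

    Nothing is written; this only checks that each fix would find its target.
    """
    failed = []
    for row_id, old_text, _new_text in fixes:
        count = conn.execute(
            "SELECT COUNT(*) FROM lines WHERE id = ? AND text = ?",
            (row_id, old_text),
        ).fetchone()[0]
        if count != 1:
            failed.append(row_id)
    return failed

# Hypothetical schema and sample row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("INSERT INTO lines VALUES (?, ?)", (1, 'He said: "Follow hirn."'))

# The diff carries the escaped form; the database holds the plain character.
raw = [(1, "He said: &quot;Follow hirn.&quot;", 'He said: "Follow him."')]
normalized = [(i, normalize_match_text(old), new) for i, old, new in raw]

print(dry_run(conn, raw))         # the escaped match text finds no row
print(dry_run(conn, normalized))  # the normalized match text finds its row
```

Running the dry-run both before and after normalization, as above, makes the failure mode visible without touching any data.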
Thanks
These corrections were sent in by Eric Pement in a very useful electronic format; many, many thanks.