Bug 152227 - Add support for Zarnegar format
Status: NEW
Alias: None
Product: Document Liberation Project
Classification: Unclassified
Component: General
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Reported: 2022-11-25 23:24 UTC by Hossein
Modified: 2022-11-26 00:29 UTC (History)

Sample Zarnegar75 file (5.94 KB, application/octet-stream)
2022-11-25 23:24 UTC, Hossein

Description Hossein 2022-11-25 23:24:13 UTC
Created attachment 183793
Sample Zarnegar75 file


Zarnegar was a commercial word processor from SinaSoft Co. for DOS, and later Windows, that supported the Persian and Arabic languages. It was the most prominent word processor in Iran in the 1990s.
Libraries and other institutions may hold many Zarnegar files as part of their heritage. Being able to import and use these files would help preserve them.
More information about this format can be found here:

Zarnegar (word processor)

The original software can be downloaded from here:

Two different versions of the program are available:
* Zarnegar 5.2 (Windows): Has problems on Windows 10. The software interface is in Persian.
* Zarnegar 76 (DOS): Can be used with DOSBox.

Character encoding and examples:

There are two common Zarnegar formats: Zarnegar1 and Zarnegar75. Quoting from the wiki article above:

"Zarnegar1 character set
Zarnegar used an Iran System-based character encoding system, named Zarnegar1, with text file formats for its early versions, up to the Zarnegar 75 version. The Zarnegar1 character set is a two-form left-to-right (visual) encoding, meaning that every Perso-Arabic letter receives different character codes based on its cursive joining form, but most letters receive only two forms, because of the limited code-points available."
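
To make the quoted description concrete, here is a minimal Python sketch of what a two-form visual encoding implies for a converter: both shapes of a letter map to the same Unicode letter, and the stored order has to be reversed to recover reading order. The byte values in `HYPOTHETICAL_TABLE` are invented for illustration and are not the real Zarnegar1 code points. The function name `visual_to_logical` is also an assumption, and the sketch treats the whole line as right-to-left text, which a real importer could not assume once digits or Latin text are mixed in.

```python
# HYPOTHETICAL code table: these byte values are invented for illustration
# and are NOT the real Zarnegar1 code points.
HYPOTHETICAL_TABLE = {
    0x20: " ",
    0x90: "\u0627",  # ALEF (a single shape in this toy table)
    0x91: "\u0628",  # BEH, shape used at the end of a word or standing alone
    0x92: "\u0628",  # BEH, shape used at the start or in the middle of a word
    0x93: "\u0633",  # SEEN, end/alone shape
    0x94: "\u0633",  # SEEN, start/middle shape
}


def visual_to_logical(line: bytes) -> str:
    """Convert one all-right-to-left line from visual to logical order."""
    # Both shapes of a letter collapse to the same Unicode letter; a Unicode
    # renderer chooses the joining shape again at display time.
    # Unknown codes become U+FFFD (REPLACEMENT CHARACTER).
    letters = [HYPOTHETICAL_TABLE.get(code, "\ufffd") for code in line]
    # Visual left-to-right storage puts the last-read letter first, so
    # reversing the line restores reading order.
    return "".join(reversed(letters))


# A word read as BEH then ALEF appears on screen as ALEF (left), BEH (right),
# so the visual store holds the ALEF code first.
print(visual_to_logical(bytes([0x90, 0x92])))
```

A real importer would need the actual code table and would have to split each line into right-to-left and left-to-right runs before reversing anything.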

A sample from the Python Zarnegar1 converter:

Also from the same article:

"Zarnegar75 character set
With the Zarnegar 75 version, a new character encoding system was introduced, and the file format was changed to a binary format. The Zarnegar75 character set is a four-form bidirectional visual encoding, meaning that every Perso-Arabic letter receives a one, two, or four character code, depending on its cursive joining form, and these letters are stored in the memory in the semantic order."
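
The "one, two, or four" codes per letter in the quote follow from how Perso-Arabic letters join their neighbours. The Python sketch below is not Zarnegar75 data. It only classifies Unicode letters in logical order by the joining form they would take, which is the distinction a four-form encoding stores explicitly. The letter sets are a simplified subset of Unicode's joining classes, the names (`joining_forms` and its helpers) are hypothetical, and diacritics are not handled.

```python
# Letters that join only to the preceding letter (two shapes: isolated, final).
# A simplified subset of Unicode's right-joining letters, not Zarnegar data.
RIGHT_JOINING = set(
    "\u0622\u0623\u0624\u0625\u0627\u0629\u062f\u0630\u0631\u0632\u0648\u0698"
)
NON_JOINING = {"\u0621"}  # HAMZA: one shape only
# PEH, TCHEH, JEH, KEHEH, GAF, FARSI YEH
PERSIAN_EXTRA = set("\u067e\u0686\u0698\u06a9\u06af\u06cc")


def is_letter(ch: str) -> bool:
    # Simplification: basic Arabic letter range plus common Persian letters.
    return "\u0621" <= ch <= "\u064a" or ch in PERSIAN_EXTRA


def joins_next(ch: str) -> bool:
    """True if the letter can connect to the letter that follows it."""
    return is_letter(ch) and ch not in RIGHT_JOINING and ch not in NON_JOINING


def joins_prev(ch: str) -> bool:
    """True if the letter can connect to the letter that precedes it."""
    return is_letter(ch) and ch not in NON_JOINING


def joining_forms(text: str) -> list[str]:
    """Return the joining form of each character of logical-order text."""
    forms = []
    for i, ch in enumerate(text):
        before = i > 0 and joins_next(text[i - 1]) and joins_prev(ch)
        after = i + 1 < len(text) and joins_next(ch) and joins_prev(text[i + 1])
        if before and after:
            forms.append("medial")
        elif before:
            forms.append("final")
        elif after:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


# SEEN, LAM, ALEF, MEEM in reading order
print(joining_forms("\u0633\u0644\u0627\u0645"))
```

In this model a dual-joining letter such as SEEN can take all four forms, a right-joining letter such as ALEF only two, and HAMZA one. An importer would go the other way: map each form-specific code back to its base letter and let the renderer redo the shaping.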

A sample file in the Zarnegar75 binary format is attached.


Some converters are available for the Zarnegar file format. For example, this one for the Zarnegar1 format is written in Python and provides some examples:
Converter for Zarnegar Encoding and File Format to Unicode Text

There is another (closed-source) converter that can convert the Zarnegar format to RTF:
Comment 1 Eike Rathke 2022-11-26 00:29:03 UTC
This could be a candidate for The Document Liberation Project. It's also tracked here, so I'm setting its product.
See https://www.documentliberation.org/

Note that the Python code is under GPLv3+ and can't be used.