Inspired by the Adventures with Morse Code post by u/vk6flab, I decided to code some experiments comparing Morse with an ASCII-based encoding.
Results and disclaimer: Morse is optimized for humans. A 5-bit ASCII-based protocol used roughly 40% to 55% fewer bits in my samples (about 1.7x to 2.2x the efficiency), but would be too difficult to use without a machine interpreting the stream on each end.
Morse can be converted to binary to measure its efficiency. According to the spec, a dit is the shortest unit, equivalent to a single bit (1). A dah has the duration of three dits, so requires three binary bits (111). The gap between the dits and dahs of a single letter has a minimum silence duration of one dit (0), three between letters (000), and seven between words (0000000).
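To make those timing rules concrete, here's a minimal sketch of the conversion (not the code from the gist). The names `MORSE`, `letter_to_bits` and `morse_to_bits` are made up for illustration, and it assumes ASCII `.` and `-` for dits and dahs and a single space between words. Each letter carries its own trailing three-bit gap, so a word boundary only adds four more zeros to reach seven.

```python
# Sketch only: international Morse for A-Z plus the period.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..", ".": ".-.-.-",
}


def letter_to_bits(symbol):
    """Dit -> 1, dah -> 111, one 0 between elements, 000 after the letter."""
    elements = ["1" if e == "." else "111" for e in MORSE[symbol]]
    return "0".join(elements) + "000"


def morse_to_bits(text):
    """Join letters directly; add 0000 between words (3 + 4 = 7 bits of gap)."""
    words = text.upper().split(" ")
    return "0000".join(
        "".join(letter_to_bits(c) for c in word) for word in words
    )


print(letter_to_bits("A"))                 # 10111000
print(len(morse_to_bits("HELLO WORLD.")))  # 134
```

Characters outside the table raise a `KeyError`, which is fine for a sketch but not for real traffic.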
ASCII is normally stored in 8 bits, but we can cover upper-case letters and basic punctuation with only 5 bits, accommodating 32 characters: 26 letters plus 6 punctuation marks (space . , ! ? '). See the bottom of this post for a note about how digits 0-9 are handled. I did this by truncating the normal ASCII ordinals to their 5 least significant bits and manually defining punctuation in the unused slots. Because this is fixed-width, extra silence bits between characters and words aren't required. The space between words is encoded as a regular character, included in the set of punctuation.
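A rough sketch of that truncation, for illustration only. Masking with `0b11111` keeps the 5 least significant bits, which maps A-Z to 1-26 and leaves slots 0 and 27-31 free. Putting space in slot 0 matches the sample output further down; which of the remaining slots each punctuation mark gets is my assumption here and may differ from the gist. `PUNCT`, `five_bit` and `encode` are hypothetical names.

```python
# Assumed slot assignment: only space = 0 is taken from the sample output.
PUNCT = {" ": 0, ".": 27, ",": 28, "!": 29, "?": 30, "'": 31}


def five_bit(ch):
    """Return the 5-bit code (0-31) for one character."""
    if ch in PUNCT:
        return PUNCT[ch]
    if "A" <= ch <= "Z":
        return ord(ch) & 0b11111  # keep the 5 least significant bits
    raise ValueError(f"no 5-bit code for {ch!r}")


def encode(text):
    """Encode text as space-separated 5-bit groups (spaces are for reading only)."""
    return " ".join(format(five_bit(c), "05b") for c in text.upper())


print(encode("HELLO WORLD"))
```

Since every character is exactly 5 bits, the total is just `5 * len(text)`.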
Morse tried to optimize by making the most frequent letters shorter, but that was done before the time when computers could ingest entire libraries at once to determine true frequency. The most frequently used letters in English text, in descending order, are ETAOINSHR.
Letter | Morse | Morse binary | Morse duration | 5-ASCII binary |
---|---|---|---|---|
E | . | 1000 | 4 | 01000101 |
T | – | 111000 | 6 | 01010100 |
A | .– | 10111000 | 8 | 01000001 |
O | ––– | 11101110111000 | 14 | 01001111 |
I | .. | 101000 | 6 | 01001001 |
N | –. | 11101000 | 8 | 01001110 |
S | ... | 10101000 | 8 | 01010011 |
H | .... | 1010101000 | 10 | 01001000 |
R | .–. | 1011101000 | 10 | 01010010 |
And the last few.
Letter | Morse | Morse binary | Morse duration | 5-ASCII binary |
---|---|---|---|---|
J | .––– | 1011101110111000 | 16 | 01001010 |
X | –..– | 11101010111000 | 14 | 01011000 |
Q | ––.– | 1110111010111000 | 16 | 01010001 |
Z | ––.. | 11101110101000 | 14 | 01011010 |
The tables show the full 8-bit ASCII value for each letter; 5-ASCII keeps only the last 5 bits, so it is of course 5 bits of duration in each case. The "Morse binary" I've written here is not just "kind of" equivalent to Morse; it's exactly equivalent. If you fed that into a signal generator that stepped through each bit slowly enough for a human to hear and interpret, it would generate a perfect Morse signal.
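As a sketch of what such a signal generator would do with the bit string: collapse runs of identical bits into key-down and key-up intervals. `bits_to_keying` is a made-up name, and the 60 ms dit length is just an assumed default (it corresponds to 20 WPM under the common PARIS timing convention).

```python
from itertools import groupby


def bits_to_keying(bits, dit_ms=60):
    """Collapse a Morse bit string into (key_down, duration_ms) runs."""
    runs = []
    for bit, group in groupby(bits):
        units = len(list(group))  # how many dit-lengths this run lasts
        runs.append((bit == "1", units * dit_ms))
    return runs


for key_down, ms in bits_to_keying("10111000"):  # the letter A
    print("tone   " if key_down else "silence", ms, "ms")
```

Feeding each run to a tone generator (on for key-down, off otherwise) would reproduce the dit, gap, dah, gap pattern of the letter.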
Here are some practical efficiency comparisons. Spaces in the binary are included for reading convenience, but are not part of the data stream.
"HELLO WORLD."
- Morse: .... . .–.. .–.. ––– .–– ––– .–. .–.. –.. .–.–.–
- Morse binary, 134 bits: 1010101000 1000 101110101000 101110101000 11101110111000 0000 101110111000 11101110111000 1011101000 101110101000 1110101000 10111010111010111000
- 5-ASCII, 60 bits: 01000 00101 01100 01100 01111 00000 10111 01111 10010 01100 00100 00001
Article 1 of the Universal Declaration of Human Rights: "ALL HUMAN BEINGS ARE BORN FREE AND EQUAL IN DIGNITY AND RIGHTS. THEY ARE ENDOWED WITH REASON AND CONSCIENCE AND SHOULD ACT TOWARDS ONE ANOTHER IN A SPIRIT OF BROTHERHOOD."
- Morse: .– .–.. .–.. .... ..– –– .– –. –... . .. –. ––. ... .– .–. . –... ––– .–. –. ..–. .–. . . .– –. –.. . ––.– ..– .– .–.. .. –. –.. .. ––. –. .. – –.–– .– –. –.. .–. .. ––. .... – ... .–.–.– – .... . –.–– .– .–. . . –. –.. ––– .–– . –.. .–– .. – .... .–. . .– ... ––– –. .– –. –.. –.–. ––– –. ... –.–. .. . –. –.–. . .– –. –.. ... .... ––– ..– .–.. –.. .– –.–. – – ––– .–– .– .–. –.. ... ––– –. . .– –. ––– – .... . .–. .. –. .– ... .––. .. .–. .. – ––– ..–. –... .–. ––– – .... . .–. .... ––– ––– –.. .–.–.–
- Morse binary, 1422 bits: 10111000 101110101000 101110101000 0000 1010101000 1010111000 1110111000 10111000 11101000 0000 111010101000 1000 101000 11101000 111011101000 10101000 0000 10111000 1011101000 1000 0000 111010101000 11101110111000 1011101000 11101000 0000 101011101000 1011101000 1000 1000 0000 10111000 11101000 1110101000 0000 1000 1110111010111000 1010111000 10111000 101110101000 0000 101000 11101000 0000 1110101000 101000 111011101000 11101000 101000 111000 1110101110111000 0000 10111000 11101000 1110101000 0000 1011101000 101000 111011101000 1010101000 111000 10101000 10111010111010111000 0000 111000 1010101000 1000 1110101110111000 0000 10111000 1011101000 1000 0000 1000 11101000 1110101000 11101110111000 101110111000 1000 1110101000 0000 101110111000 101000 111000 1010101000 0000 1011101000 1000 10111000 10101000 11101110111000 11101000 0000 10111000 11101000 1110101000 0000 11101011101000 11101110111000 11101000 10101000 11101011101000 101000 1000 11101000 11101011101000 1000 0000 10111000 11101000 1110101000 0000 10101000 1010101000 11101110111000 1010111000 101110101000 1110101000 0000 10111000 11101011101000 111000 0000 111000 11101110111000 101110111000 10111000 1011101000 1110101000 10101000 0000 11101110111000 11101000 1000 0000 10111000 11101000 11101110111000 111000 1010101000 1000 1011101000 0000 101000 11101000 0000 10111000 0000 10101000 10111011101000 101000 1011101000 101000 111000 0000 11101110111000 101011101000 0000 111010101000 1011101000 11101110111000 111000 1010101000 1000 1011101000 1010101000 11101110111000 11101110111000 1110101000 10111010111010111000
- 5-ASCII, 850 bits: 00001 01100 01100 00000 01000 10101 01101 00001 01110 00000 00010 00101 01001 01110 00111 10011 00000 00001 10010 00101 00000 00010 01111 10010 01110 00000 00110 10010 00101 00101 00000 00001 01110 00100 00000 00101 10001 10101 00001 01100 00000 01001 01110 00000 00100 01001 00111 01110 01001 10100 11001 00000 00001 01110 00100 00000 10010 01001 00111 01000 10100 10011 00001 00000 10100 01000 00101 11001 00000 00001 10010 00101 00000 00101 01110 00100 01111 10111 00101 00100 00000 10111 01001 10100 01000 00000 10010 00101 00001 10011 01111 01110 00000 00001 01110 00100 00000 00011 01111 01110 10011 00011 01001 00101 01110 00011 00101 00000 00001 01110 00100 00000 10011 01000 01111 10101 01100 00100 00000 00001 00011 10100 00000 10100 01111 10111 00001 10010 00100 10011 00000 01111 01110 00101 00000 00001 01110 01111 10100 01000 00101 10010 00000 01001 01110 00000 00001 00000 10011 10000 01001 10010 01001 10100 00000 01111 00110 00000 00010 10010 01111 10100 01000 00101 10010 01000 01111 01111 00100 00001
Note about digits: In my 5-bit ASCII-based protocol, a number would be introduced by a ! character, followed by one binary value per digit (e.g. 1 = 00001, 2 = 00010), and closed by a final ! that switches back to normal mode and automatically inserts a space after the number. Optional dots and commas within the number would use their normal encoding but with a 1 as the first bit so they aren't interpreted as digits. This is proposed but not actually implemented in the following code, since it wasn't needed for my sample texts.
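Since the digit scheme is only proposed, here is one hedged reading of it as an encoder sketch. Everything here is an assumption layered on the note above: the punctuation slots (including the code used for `!`), the name `encode_with_digits`, and the choice to drop a space that directly follows a number because the closing `!` already implies it. Dots and commas inside numbers are left out.

```python
# Assumed slot assignment; the post does not specify these codes.
PUNCT = {" ": 0, ".": 27, ",": 28, "!": 29, "?": 30, "'": 31}
DIGITS = "0123456789"


def encode_with_digits(text):
    """Encode text to 5-bit groups, wrapping digit runs in ! ... !."""
    text = text.upper()
    codes = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in DIGITS:
            codes.append(PUNCT["!"])            # enter number mode
            while i < len(text) and text[i] in DIGITS:
                codes.append(int(text[i]))      # 0-9 -> 00000-01001
                i += 1
            codes.append(PUNCT["!"])            # leave number mode
            if i < len(text) and text[i] == " ":
                i += 1                          # closing ! implies this space
        elif ch in PUNCT:
            codes.append(PUNCT[ch])
            i += 1
        elif "A" <= ch <= "Z":
            codes.append(ord(ch) & 0b11111)
            i += 1
        else:
            raise ValueError(f"no 5-bit code for {ch!r}")
    return [format(c, "05b") for c in codes]


print(" ".join(encode_with_digits("CALL 73 NOW")))
```

One limitation of this reading: a literal `!` in the text is indistinguishable from the mode switch, so a real protocol would need a rule for that.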
My rough, proof-of-concept code:
https://gist.github.com/serif/68fa1b389e90072d9c1b377de123a92d
edit: updated the code and post after receiving this from u/vk6flab:
> FYI, the spacing between words is inclusive of the space at the end of a symbol.
The opening of the Universal Declaration of Human Rights is now 1422 bits, down from 1509, and HELLO WORLD is similarly fixed. That makes 5-ASCII's 850 bits a 40% reduction.
I've also written new tests for two more encodings, to better answer the original Morse vs. ASCII question: 6-ASCII, which offers 64 characters without shift codes and so could be considered more directly in parity with Modern International Morse, and standard 8-bit ASCII. Testing the opening of the Universal Declaration of Human Rights:
- 5-ASCII: 850 bits
- 6-ASCII: 1020 bits
- 8-ASCII: 1360 bits
- Morse: 1422 bits
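The fixed-width figures follow directly from the character count (170 characters including spaces and periods), and the 40% figure from comparing against the Morse total. A small sketch of that arithmetic, with `fixed_width_bits` and `reduction` as made-up helper names and the 1422-bit Morse total taken from the results above:

```python
TEXT = (
    "ALL HUMAN BEINGS ARE BORN FREE AND EQUAL IN DIGNITY AND RIGHTS. "
    "THEY ARE ENDOWED WITH REASON AND CONSCIENCE AND SHOULD ACT TOWARDS "
    "ONE ANOTHER IN A SPIRIT OF BROTHERHOOD."
)
MORSE_BITS = 1422  # Morse total reported above


def fixed_width_bits(text, width):
    """Every character costs exactly `width` bits, spaces included."""
    return len(text) * width


def reduction(bits, baseline):
    """Fraction of bits saved relative to the baseline."""
    return 1 - bits / baseline


for width in (5, 6, 8):
    bits = fixed_width_bits(TEXT, width)
    print(f"{width}-ASCII: {bits} bits, {reduction(bits, MORSE_BITS):.0%} fewer than Morse")
```

Even plain 8-bit ASCII comes out slightly ahead of Morse on this text, though only by a few percent.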