The Art & Skill of Radio-Telegraphy

-Second Revised Edition-
William G. Pierpont N0HFF


Chapter 28 - Letter Frequency Counts


The letter frequency counts (left-most column) are taken from one of the common books on cryptanalysis and give the number of
occurrences per thousand letters of normal English text.  Each character is broken down ("Structure") into equal time units:
1 for the minimum signal duration (one dit), 111 (three units) for a dah, and 0 (zero) for each unit of silence.  The required three units of silence separating characters (000) are appended to each one below.  "Units" is the resulting length of each character, and "Total" is that length multiplied by the letter's frequency.

Freq.    Letter           Structure    Units          Total
130        E                 1000        4             520
 92        T               111000        6             552
 79        N             11101000        8             632
 76        R           1011101000       10             760
 75        O       11101110111000       14            1050
 74        A             10111000        8             592
 74        I               101000        6             444
 61        S             10101000        8             488
 42        D           1110101000       10             420
 36        L         101110101000       12             432
 34        H           1010101000       10             340
 31        C       11101011101000       14             434
 28        F         101011101000       12             336
 27        P       10111011101000       14             378
 26        U           1010111000       10             260
 25        M           1110111000       10             250
 19        Y     1110101110111000       16             304
 16        G         111011101000       12             192
 16        W         101110111000       12             192
 15        V         101010111000       12             180
 10        B         111010101000       12             120
  5        X       11101010111000       14              70
  3        Q     1110111010111000       16              48
  3        K         111010111000       12              36
  2        J     1011101110111000       16              32
  1        Z       11101110101000       14              14

1000     Ave. structure length 11.23     Weighted ave. 9.076     9076
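
The table can be rebuilt mechanically from the dit/dah patterns. The following Python sketch is not part of the original text: it assumes the standard International Morse patterns for the 26 letters and copies the per-thousand frequencies from the table above. The names MORSE, FREQ and structure are chosen for illustration only.

```python
# Illustrative sketch: rebuild the table from dit/dah patterns.
# MORSE, FREQ and structure() are names invented for this example.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}

# Occurrences per thousand letters, copied from the table above.
FREQ = {
    "E": 130, "T": 92, "N": 79, "R": 76, "O": 75, "A": 74, "I": 74, "S": 61,
    "D": 42, "L": 36, "H": 34, "C": 31, "F": 28, "P": 27, "U": 26, "M": 25,
    "Y": 19, "G": 16, "W": 16, "V": 15, "B": 10, "X": 5, "Q": 3, "K": 3,
    "J": 2, "Z": 1,
}


def structure(pattern):
    """Dit = 1, dah = 111, one 0 between elements, 000 after the character."""
    elements = ["1" if mark == "." else "111" for mark in pattern]
    return "0".join(elements) + "000"


rows = [(FREQ[c], c, structure(MORSE[c])) for c in FREQ]
total_units = sum(f * len(s) for f, c, s in rows)
weighted_average = total_units / sum(FREQ.values())      # weighted by frequency
plain_average = sum(len(s) for f, c, s in rows) / len(rows)  # every letter equal

for f, c, s in rows:
    print(f"{f:4d}  {c}  {s:>16}  {len(s):2d}  {f * len(s):5d}")
print(total_units, round(weighted_average, 3), round(plain_average, 2))
```

The last line prints the grand total and the two averages, which can be compared against the summary row of the table.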

From the above, if we take five times the frequency-weighted average letter length and add the space required for word spacing
(seven units in all, or 0000000, of which three are already included with the last letter), we arrive at the normal English
word length of 5 x 9.076 + 4 = 49.38 units.  This is about 1.2% shorter than the 50 units of the standard word.  (By contrast,
a random five-letter group averages 5 x 11.23 + 4 = 60.15 units.  This is 20.3% longer than the 50-unit standard word, and 21.8% longer than the normal English word length.)
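
The word-length arithmetic can be checked in a few lines. This is only a sketch: the constants are taken from the table (292 is the sum of its Units column), and the helper name group_length is an assumption made for this example.

```python
# Illustrative arithmetic check; constants come from the table above.
WEIGHTED_LETTER = 9.076   # frequency-weighted letter length, incl. 3-unit letter space
PLAIN_LETTER = 292 / 26   # unweighted average (about 11.23): sum of Units column / 26
STANDARD_WORD = 50        # units in the standard word


def group_length(letter_average, letters=5):
    """Units in a group: the last letter's 3-unit space needs 4 more to make a 7-unit word space."""
    return letters * letter_average + 4


english = group_length(WEIGHTED_LETTER)
random_group = group_length(PLAIN_LETTER)

print(round(english, 2), round(random_group, 2))
# Percent by which the English average falls short of the 50-unit standard word
print(round(100 * (STANDARD_WORD - english) / STANDARD_WORD, 2))
# Percent by which a random group exceeds the 50-unit standard word
print(round(100 * (random_group / STANDARD_WORD - 1), 1))
# Percent by which a random group exceeds the English average
print(round(100 * (random_group / english - 1), 1))
```

Note that only 4 units are added per word, not 7, because each letter length in the table already carries its own three units of trailing silence.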

A similar analysis of the numerals shows that the average length of a numeral is 17 units (minimum 12, maximum 22), so a group of five numerals (5 x 17 + 4 = 89 units) takes about 1.78 times as long to transmit as a standard five-letter word.
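
The same counting can be applied to the ten numerals. The sketch below assumes the standard International Morse numeral patterns (five elements each); the names DIGITS and units are illustrative and do not come from the original text.

```python
# Illustrative sketch: unit lengths of the ten numerals.
DIGITS = {
    "1": ".----", "2": "..---", "3": "...--", "4": "....-", "5": ".....",
    "6": "-....", "7": "--...", "8": "---..", "9": "----.", "0": "-----",
}


def units(pattern):
    """Dit = 1, dah = 3, 1 unit between elements, 3 units after the character."""
    marks = sum(1 if mark == "." else 3 for mark in pattern)
    return marks + (len(pattern) - 1) + 3


lengths = {digit: units(pattern) for digit, pattern in DIGITS.items()}
average = sum(lengths.values()) / len(lengths)
five_group = 5 * average + 4      # 4 extra units complete the 7-unit word space
ratio = five_group / 50           # compared with the 50-unit standard word

print(min(lengths.values()), max(lengths.values()), average, five_group, ratio)
```

The shortest numeral is 5 (five dits) and the longest is 0 (five dahs); the average here is unweighted, treating every numeral as equally likely.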

Comparing these calculations will show some of the reasons why receiving speeds vary with the kind of material being sent.

As a matter of interest, we list here the letters from the shortest to the longest by the number of units (less letter space) -- notice that all lengths are odd numbers: 1 - E;   3 - I, T;   5 - A, N, S;   7 - D, H, M, R, U;   9 - B, F, G, K, L, V, W;  11 - C, O, P, X, Z; 13 - J, Q, Y.
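
This grouping can also be generated rather than counted by hand. The following sketch assumes the standard International Morse letter patterns; MORSE, mark_length and groups are names invented for the example.

```python
# Illustrative sketch: group letters by length, excluding the letter space.
from collections import defaultdict

MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def mark_length(pattern):
    """Units from first key-down to last key-up: dit = 1, dah = 3, 1 between elements."""
    return sum(1 if mark == "." else 3 for mark in pattern) + len(pattern) - 1


groups = defaultdict(list)
for letter, pattern in sorted(MORSE.items()):
    groups[mark_length(pattern)].append(letter)

for length in sorted(groups):
    print(length, "-", ", ".join(groups[length]))

all_odd = all(length % 2 == 1 for length in groups)
print("all lengths odd:", all_odd)
```

The lengths are always odd because a character of n elements is the sum of n odd numbers (1 or 3) plus n - 1 single-unit gaps, which has the same parity as 2n - 1.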

FOREIGN ADAPTATIONS OF THE INTERNATIONAL MORSE CODE:

If the same kind of calculation is carried out for several foreign languages, the following results are obtained for the average character length (frequency data from Secret and Urgent, Fletcher Pratt, 1942, Tables II to IV, p. 253 ff.):
German  8.640,  French 8.694,  Spanish 8.286.  These are roughly 4 to 9% shorter per character than in English.
There seems little doubt that if the code were somewhat redesigned and adjusted to optimize it for English, a reduction of about 5% could be made.
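
As a quick check on the comparison with English, the percentage difference for each language can be computed from the averages quoted above. This is a minimal sketch using only those four figures; the variable names are illustrative.

```python
# Illustrative sketch: how much shorter each average is than the English 9.076.
ENGLISH = 9.076
AVERAGES = {"German": 8.640, "French": 8.694, "Spanish": 8.286}

percent_shorter = {
    language: round(100 * (1 - average / ENGLISH), 1)
    for language, average in AVERAGES.items()
}

for language, percent in percent_shorter.items():
    print(f"{language}: {percent}% shorter per character than English")
```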

For the original American Morse code:
Mr. Ivan Coggeshall made a comparative analysis of American Morse, using the same normal dah lengths and word spacings one unit shorter, and arrived at a frequency-weighted average letter length of 7.978 (as compared with 9.076) and an average number length of 14.  As noted in Chapter 16, American Morse timing is open to considerable variation.


©William G. Pierpont N0HFF