[ TechnoCage | Caskey | numbers ]

Minimum length complete strings

What is the minimum string of numbers that contains, as a sub-string, every number in a range?

It all started with a certain l0pht advisory discussing the inherent insecurity in remote access to answering machines. In this advisory, they made reference to a 2600 Magazine article that claimed that a certain string of numbers was sufficient to fool a certain brand of answering machine into allowing access. This answering machine only used 2 digit access codes and did not care how many tries you made at getting the right two.

The L0pht/2600 string (102 characters):

00112233445566778899135790246803692581471593704948382726-
1605173950628408529630074197531864209876543210

My friend Brad decided to curse me by posing the question as to whether or not this is *the* shortest string containing every two digit number combination and whether or not there was a way to compute *the* shortest string. I wrote a little perl script that attempts to generate these strings. Currently my script does not attempt to be intelligent about finding the shortest string and this shows in the difference between my script's output and the l0pth/2600 string.

For base 10 strings of length 2, my second script version outputted 110 character strings.

numbers.pl v.2

Length: 110
SubString Match Interrupts: 9
00102030405060708090112131415161718191223242526272829233-
435363738393445464748494556575859566768696778797889899

I noticed that this was rather poor performance compared to the l0pht string and wondered as to what the cause could be. Upon reflection over my missives with brad, I realized that the troublemaker was the leading '00'. The algorithm of selecting the first available substring to add onto the working string caused the string to grow like so: 00 + 01 + 10 + 02 + 20 + 03 + 30 + 04 + 40 + 05 + 50 + 06 + 60 + 07 + 70 + 08 + 80 + 09 + 90 As soon as the 90 was reached, there were no remaining numbers that began with a zero. I mused as to what would happen if I started with the 99 instead of the 00.

numbers.pl v.3

Length: 101
SubString Match Interrupts: 0
99001020304050607080911213141516171819223242526272829334-
353637383944546474849556575859667686977879889

The algorithm

...

The Proof

Courtesy of Vlad Gluzman

Considering that the strings version 3 came up with are the shortest
possible strings to contain every combination of digits, and if they *do*
in fact contain every combination of digits, no shorter string will do the
trick.

Why are they the shortest possible? In a string of length n, the number of
distinct sets of k consecutive letters is n - (k - 1). 

For example:

123456789

can be split up the following ways into sets of consecutive 3 letters:

(123)456789
1(234)56789
12(345)6789
123(456)789
1234(567)89
12345(678)9
123456(789)

Notice there are 9 - (3-1) = 7 sets of these.

Notice also that there are 10^k possible sets of consecutive digits of
length k. (so, 00, 01, 02, 03, 04, 05, 06, ... comes out to 10^2 = 100).
From above, the shortest string that contains 10^k sets of consecutive
letters (not necessarily distinct) is of length 10^k + (k-1).

Note that the strings the gentleman's programs came up with are of
lengths:

101 = 10^2 + (2-1) for strings of length 2
1,002 = 10^3 + (3-1) for strings of length 3
10,003 = 10^4 + (4-1) for strings of length 4

The above argument shows that no smaller string can contain all the
possible consecutive sequences of numbers of these lengths. Ergo, if these
strings do indeed contain every possible combination of numbers, then they
are the shortest possible.

-V :-)

De Bruijn sequences

I've been told by Hank Turowski that there is an word for the types of strings described above---De Bruijn Sequences.

Unfortunately, Road Runner feels that all customers of Cable & Wireless (the company that bought, the company that bought my ISP) are not worthy of communicating with the customers of Road Runner. You see it so happens that a company called Hi-Speed Media (completely unrelated to TechnoCage) also happens to be a customer of Cable & Wireless and Road Runner feels that if Cable & Wireless is willing to sell Hi Speed Media service then all of C&W's customers must be of the same ilk and not someone Road Runner's customers want to communicate with. Therefore I cannot reply to his email.

If Mr. Turowski reads this, I would appreciate it if he could contact me again with another email address that I may get in touch with him at. I do appreciate his email and drafted a reply to it which was subsequently bounced by Road Runner's email servers. Of course I encourage all internet users to avoid companies that unilaterally decide who their users may receive mail from, especially by casting entire blocks of IP addresses into the rejection pile.

Raw data

v2

v3



Last updated: 2003-03-20