From alnl-owner Thu Aug  8 19:11:01 1991
Received: by scapa.cs.ualberta.ca id <42467>; Thu, 8 Aug 1991 18:47:15 -0600
From:	Bill Armstrong <arms>
To:	alnl
Message-Id: <91Aug8.184715mdt.42467@scapa.cs.ualberta.ca>
Date:	Thu, 8 Aug 1991 18:47:02 -0600

Dear subscriber to the alnl mailing list:

At last, release 2.0 of the atree ALN software is ready!

But don't just archive it -- use it!  Release 2.0 is definitely not a
toy program -- it is much improved over release 1.0, and can be used
to do some real applications.  There are a lot of changes: memory is
used very economically, there are now separate versions for Unix and
for Windows on the PC (the original DOS version could only deal with
very small lf programs, if at all), there is a new fast_tree feature
that doubles the execution speed of a trained tree, you can now store
learned functions, and much more.

Since you're on the mailing list, you are being informed about it
first.  After some feedback from you about any problems you encounter,
it can be announced to the world on the net.  There is a temporary
problem which suggests that it would be better to limit ftp-ing to the
alnl people for the next few days: menaik is unstable, and must have
some sort of brain surgery this Saturday, August 10, to fix it.  I
hope it will be OK next week, and that the problems will not hamper
you in getting the software.

You can get the software via anonymous ftp from menaik.cs.ualberta.ca
[129.128.4.241] in pub/atree2.tar.Z or pub/atree2.zip.  The zip file
requires pkunzip, which should be easily obtainable if you don't have
it.  If this proves inconvenient, we could also make a self-unzipping
executable file.  Windows 3.0 is required for the PC.  Apologies to
the DOS users, but getting access to more memory was essential to make
a serious tool for researchers and experimenters.

Please, please if you try it out, let us know.  A message to alnl would
also let others know about your experiences.

Many thanks, and good luck with atree release 2.0!

Bill Armstrong


From alnl-owner Thu Aug  8 20:46:05 1991
Received: from media-lab.media.mit.edu ([18.85.0.2]) by scapa.cs.ualberta.ca with SMTP id <42484>; Thu, 8 Aug 1991 20:22:45 -0600
Received: by media-lab.media.mit.edu (5.57/DA1.0.3)
	id AA22455; Thu, 8 Aug 91 22:21:28 EDT
Date:	Thu, 8 Aug 1991 20:21:28 -0600
From:	Marvin Minsky <minsky@media-lab.media.mit.edu>
Message-Id: <9108090221.AA22455@media-lab.media.mit.edu>
To:	arms@cs.ualberta.ca, minsky@media-lab.media.mit.edu
Cc:	alnl@cs.ualberta.ca
In-Reply-To: Bill Armstrong's message of Thu, 8 Aug 1991 18:47:02 -0600 <91Aug8.184715mdt.42467@scapa.cs.ualberta.ca>

Thanks.  I hope to work with it later this fall.  I've long suspected
that something like this will be the bridge between neural nets and
symbolic processing...   I've unsuccessfully been trying to get people
to work on a dual problem: of abstracting more compact descriptions
from the coefficient set of a well trained net.

From alnl-owner Mon Aug 12 20:47:28 1991
Received: from helios ([128.194.15.2]) by scapa.cs.ualberta.ca with SMTP id <42240>; Mon, 12 Aug 1991 20:28:42 -0600
Received: from diamond.tamu.edu by helios (AA29560); Mon, 12 Aug 91 21:25:16 CDT
Received: from topaz.tamu.edu.tamu.edu by diamond.tamu.edu (4.0/SMI-4.0)
	id AA27766; Mon, 12 Aug 91 21:27:15 CDT
Date:	Mon, 12 Aug 1991 20:27:15 -0600
From:	jdm5548@diamond.tamu.edu (Darrell McCauley)
Message-Id: <9108130227.AA27766@diamond.tamu.edu>
Received: by topaz.tamu.edu.tamu.edu (4.0/SMI-4.0)
	id AA08050; Mon, 12 Aug 91 21:27:16 CDT
To:	alnl@cs.ualberta.ca
Cc:	jdm5548@diamond.tamu.edu
Subject: publication of ALN application

Howdy,

I began using ALN's and atree as sort of constructive way to spend my
free time. After having some success, my boss peeked over my shoulder
and became very interested.  As a result, we wrote a paper for an
upcoming conference comparing statistical, unsupervised, and
supervised algorithms for classifying beef marbling levels. (Our input
was frequency parameters of ultrasonic signals. A copy of
this paper, whittaker.meat.ps.Z, is on neuroprose.)

[The idea here is to develop a non-invasive, automated way of grading
beef (perhaps for on-the-hoof, i.e., live animals).  Visual inspection
by USDA graders is not all that accurate, which costs beef producers
lots of money. Besides, a non-invasive approach would be much better
from a food safety point of view.]

Since then, I've used the atree package to get some encouraging results
using image texture parameters of ultrasonic *images* (not signal) to do
the same job (marbling, or % fat, classification).  Here's a peek:

	Class     "Hits" "out of" Accuracy
	<3% fat   17      22      0.772727
	3-7% fat  123     139     0.884892
	>7% fat   5       29      0.172414
	total     145     190     0.763158
 
	Using the network as a predictor (instead of a classifier),
        it mispredicted by an average of 0.973554 % fat.

This was just a rough, late-night, first run.  These may not look
so good to the average person without knowledge of this application, but
76% accuracy is pretty darn good. 

Now I am soliciting suggestions for an appropriate (refereed)
journal to submit some of these results. This would definitely be an
applications-oriented paper. Where have you seen applications of ALNs
published?  I know of journals related to animal science and
agricultural engineering, but I'm looking for something more NN
related. Suggestions?

Thanks,
James Darrell McCauley, Grad Res Asst, Spatial Analysis Lab 
Dept of Ag Engr, Texas A&M Univ, College Station, TX 77843-2117, USA
(jdm5548@diamond.tamu.edu, jdm5548@tamagen.bitnet)

P.S.: I just tried joining this list, so if you send replies there,
      make sure that you CC: jdm5548@diamond.tamu.edu

From alnl-owner Mon Aug 12 21:14:43 1991
Received: by scapa.cs.ualberta.ca id <42242>; Mon, 12 Aug 1991 21:06:25 -0600
Subject: Re: ALN 2.0
From:	Bill Armstrong <arms>
To:	Ian_Clements@vos.stratus.com
Date:	Mon, 12 Aug 1991 21:05:53 -0600
Cc:	alnl
In-Reply-To: <9108130143.AA09644@lectroid.sw.stratus.com>; from "Ian_Clements@vos.stratus.com" at Aug 12, 91 7:43 pm
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <91Aug12.210625mdt.42242@scapa.cs.ualberta.ca>

> 
> I  have  just  completed  a build of the atree 2.0 code on a mac
> running unix (A/UX).
> 
> On  the  first  attempt  I got an unresolved symbol "rint" in the "ld"
> operation referenced in atree.o.
> 
> I found a reference to "rint" in a line in atree_encode:
> 
> index = (int) rint ((x - code -> low) / code ->step);
> 
> I  just  removed  the word "rint" and re-built with no errors.  I have
> re-built the examples and they have run and seemed to be OK.
> 
> I  don't  know  if  this is just some noise that was introduced in the
> download but I thought I should let you know.
> 
> Ian Clements
> 
> 
 

Ian, I hope you don't mind if I pass this reply to your message on to
the alnl mailing list, since the answer may affect a few people out
there.  Thank you for not ignoring it, because your solution has some
(easily rectified) problems associated with it.  I hope others will
direct their comments through alnl.  Anyone not on the mailing list
can subscribe by asking me (or by sending a request to subscribe to
alnl-request@cs.ualberta.ca, which amounts to the same thing).

The "rint" function is meant to Round the floating value to an INTeger.
(This was a last-minute change from "nint", in an attempt to satisfy all
Unix systems.  Too bad it didn't work for the Mac.)  In that way, there
is no ambiguity as to which quantization level the upper and lower
values of the range belong.  In the simplest case, if you want two
quantization levels, the min is 0.0 and the max is 1.0, then rounding will
take all values of

 ((x - code -> low) / code ->step) = ( x - 0.0)/ (1.0) = x

to 0 if they are below 0.5.  Recall that

   code -> step = (high - low) / (count - 1).

Otherwise, we found that to be in the top quantization level, a value of
x actually had to BE at the max.  This has caused us some embarrassment at
the wrong moment already!

In this way, the top quantization level actually extends 1/2 (code ->
step) above max, and the bottom level the same amount below min.  The
top and bottom
levels then have 1/2 as much probability of being used if x ranges
uniformly between min = low and max = high, but this is better than
having one of them used almost none of the time, which is what your
change causes.

If you change the code to read

> index = (int) (0.5 + ((x - code -> low) / code ->step));

then you should get the effect of rounding desired, provided the
cast to an int truncates down in your implementation.

This was our mistake in release 1.0, but I hope it is OK now in
release 2.0 for most implementations.  Others must find how to do
rounding on their compiler.

Thank you for your comments, Ian.  I'm glad to see a report that
essentially everything with our software appears to run -- even on a
Mac!  I hope you will give us some timing information so we can
compare the Mac to an IBM-PC clone and a Sparc.

Best wishes,

Bill


From alnl-owner Tue Aug 13 02:25:51 1991
Received: by scapa.cs.ualberta.ca id <42296>; Tue, 13 Aug 1991 02:16:49 -0600
Date:	Tue, 13 Aug 1991 02:08:00 -0600
From:	eba@computing-maths.cardiff.ac.uk
To:	alnl@cs.ualberta.ca, jdm5548@diamond.tamu.edu
Message-ID: <9108130712.AA07419@v1.cm.cf.ac.uk>
Subject: Re: publication of ALN application

Hi James!
Many thanks for your message.  Unfortunately, I have not seen any paper
on applications of ALNs published (fortunately??).  You should try any
journal involved in vision, like PAMI, Computer Graphics and Image
Processing, and SM&C.  There are special issues of journals dedicated
to neural computing.  I am sending an announcement of the journal
Systems Engineering.  Best wishes, Eduardo.

From alnl-owner Tue Aug 20 16:29:17 1991
Received: by scapa.cs.ualberta.ca id <42561>; Tue, 20 Aug 1991 16:20:23 -0600
From:	Bill Armstrong <arms>
To:	alnl
Message-Id: <91Aug20.162023mdt.42561@scapa.cs.ualberta.ca>
Date:	Tue, 20 Aug 1991 16:19:46 -0600

Release 2.0 of atree has been available now for about two weeks, and
there don't seem to have been any reports of problems (except the
"rint" function, to round to an integer, which wasn't available in one
case).

I was hoping for some comments, either positive or negative, before
putting an announcement in the News.  It seems though that a lot of
people not on the mailing list are copying the software, which is
OK as long as the software doesn't need to be corrected.

Any comments?  Are the Unix and PC versions free of problems so it
is OK to announce them in the News?

Thanks.

Bill

From alnl-owner Thu Aug 22 19:06:48 1991
Received: by scapa.cs.ualberta.ca id <42597>; Thu, 22 Aug 1991 18:57:55 -0600
From:	Bill Armstrong <arms>
To:	alnl
Message-Id: <91Aug22.185755mdt.42597@scapa.cs.ualberta.ca>
Date:	Thu, 22 Aug 1991 18:57:43 -0600

bagchi@vuse.vanderbilt.edu (Sugato Bagchi) asked whether any documents
were available and whether there have been any major modifications to
the adaptation algorithm(s) described in IEEE-SMC'79.

My reply:

Thank you for your interest in the ALN approach.  I will put a
document in a file to be ftp-ed, but there are two figures missing
which are not available in electronic form.  I'll call it atree2.ps.Z.

In answer to your other question about whether there have been changes
in the learning heuristics since the SMC '79 article: yes and no.

There has been a change: the "latest error" and "global search"
heuristics required memory of state from previous
patterns (two flip-flops to remember the latest errors on the left and
right in each element), whereas the new heuristics are embodied in
strictly combinational circuitry.  However, the basic ideas of true
responsibility and of "error" responsibility were already there in the
'79 article; they are now combined (by OR-ing them together) in a way
that improves the learning and simplifies the circuit.

Another advance is to use a double adaptation to patterns which are
incorrectly classified.  This is a simple substitute for a
non-existent "pedagogy" of adaptive logic networks.  There are other
advances that could be incorporated, but it is better to stick with
the basics for now.  That will give other people a chance to
experiment on altering the basic heuristics.

I suppose it may seem irrelevant to have been thinking about hardware
when fundamental issues about learning remain to be resolved, but I
regard the highest possible speed at reasonable cost as essential to
make the quantum leap in capabilities I am hoping for.  If some
company will put up the money to build adaptive hardware, we'll see
that a three-orders-of-magnitude speedup, which grows even more with the
size of the problem, will make a big difference.

Bill



From alnl-owner Fri Aug 23 14:48:17 1991
Received: by scapa.cs.ualberta.ca id <42601>; Fri, 23 Aug 1991 14:39:34 -0600
Subject:  Re: atree2.zip
From:	Bill Armstrong <arms>
To:	alnl
Date:	Fri, 23 Aug 1991 14:39:15 -0600
Cc:	Derek.Holmes@ub.cc.umich.edu, monroe (Monroe Thomas)
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <91Aug23.143934mdt.42601@scapa.cs.ualberta.ca>

Derek Holmes found that the pub/atree2.zip file available by anonymous ftp
from menaik.cs.ualberta.ca did not work.

Due to lack of a good connection between a PC and Unix here in the
department, the zip file was transferred to our Unix network by
telephone.  That could explain why the zip file on Unix ended up about
1000 bytes too short.

The ftp directory pub on menaik now has an updated file that is of the
same length as a file on disk that unzips correctly on a PC, so it
should be OK.
 
 -r--r--r--  1 arms       353894 Aug 23 13:39 atree2.zip
 

Thank you, Derek, for reporting the problem.  I'm very sorry for the
trouble this has caused you and for any problems it may have caused for
others.  I hope the problem is now corrected.

I haven't had any other reports of problems with either the Unix
version (except the problem with rint "round to an integer" on the
Mac) or the Windows version.  I plan to announce atree release 2.0 on
the News today.  If anyone on the alnl mailing list has detected any
problems that mean I should make corrections first, please let me know
immediately.

Thanks.

Bill


From alnl-owner Mon Sep 16 18:03:34 1991
Received: by scapa.cs.ualberta.ca id <42656>; Mon, 16 Sep 1991 17:47:06 -0600
From:	Bill Armstrong <arms>
To:	alnl
Message-Id: <91Sep16.174706mdt.42656@scapa.cs.ualberta.ca>
Date:	Mon, 16 Sep 1991 17:46:43 -0600

Dear alnl subscriber:

We had a bit of trouble with ftp -- we changed the O/S on menaik and the
old ftp software then had a problem.  This has now been fixed.

Today, someone had a problem with the document atree2.ps.Z.  I have
replaced the document on menaik with one made using a newer version of
dvips.  This corrected the problem of only the first four
pages of the document being printed.

The following is a little (slightly edited) script intended to help
people get started using the atree release 2.0 package, starting with
the ftp and running a few examples.  Please ignore if this is all old
hat for you.  (Except: please make your file histogram executable.)

If you have any other problems, I'd be glad to try to help.

Bill (arms@cs.ualberta.ca)


Script started on Mon Sep 16 10:48:24 1991
bash$ mkdir tardir
bash$ cd tardir
bash$ ftp
ftp> open 129.128.4.241
Connected to 129.128.4.241.
220 menaik FTP server (Version 4.217 Wed Sep 4 22:00:59 MDT 1991) ready.
Name (129.128.4.241:arms): anonymous
331 Guest login ok, send ident as password.
Password: <your email address, for example>
230 Guest login ok, access restrictions apply.
ftp> cd pub
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get atree2.tar.Z
200 PORT command successful.
150 Opening data connection for atree2.tar.Z (binary mode) (145697 bytes).
226 Transfer complete.
145697 bytes received in 2.7 seconds (52 Kbytes/s)
ftp> quit
221 Goodbye.
You have mail in /usr/spool/mail/arms
bash$ ls
atree2.tar.Z
bash$ uncompress atree2.tar.Z
bash$ tar -xvf atree2.tar
x atree2/MANIFEST, 266 bytes, 1 tape blocks
x atree2/Makefile, 379 bytes, 1 tape blocks
x atree2/README, 357 bytes, 1 tape blocks
x atree2/BEGINNER.INF, 11732 bytes, 23 tape blocks
x atree2/EXPERT.INF, 12441 bytes, 25 tape blocks
x atree2/atree.c, 50728 bytes, 100 tape blocks
x atree2/atree.h, 5983 bytes, 12 tape blocks
x atree2/atree.tex, 49080 bytes, 96 tape blocks
x atree2/bv.c, 17641 bytes, 35 tape blocks
x atree2/bv.h, 4242 bytes, 9 tape blocks
x atree2/examples/Makefile, 280 bytes, 1 tape blocks
x atree2/examples/mosquito.c, 6883 bytes, 14 tape blocks
x atree2/examples/mux.c, 2211 bytes, 5 tape blocks
x atree2/examples/multiply.lf, 2357 bytes, 5 tape blocks
x atree2/examples/sphere.lf, 60137 bytes, 118 tape blocks
x atree2/examples/sphere.results, 49368 bytes, 97 tape blocks
x atree2/histogram, 3591 bytes, 8 tape blocks
x atree2/lf.c, 16192 bytes, 32 tape blocks
x atree2/lf.h, 4675 bytes, 10 tape blocks
x atree2/neuronnet.idraw, 16681 bytes, 33 tape blocks
x atree2/nowarranty, 2528 bytes, 5 tape blocks
x atree2/synan.y, 26738 bytes, 53 tape blocks
bash$ ls
atree2		atree2.tar
bash$ cd atree2
bash$ ls
BEGINNER.INF	README		bv.c		lf.c		synan.y
EXPERT.INF	atree.c		bv.h		lf.h
MANIFEST	atree.h		examples	neuronnet.idraw
Makefile	atree.tex	histogram	nowarranty
bash$ make
cc -g -c lf.c
yacc  synan.y
mv y.tab.c synan.c
cc -g -c synan.c
cc -g -c atree.c
cc -g -c bv.c
cc -o lf -g lf.o synan.o atree.o bv.o -lm
(cd examples; make)
cc -g -I.. -c mosquito.c
cc -g -I.. -o mosquito mosquito.o ../atree.o ../bv.o -lm
cc -g -I.. -c mux.c
cc -g -I.. -o mux mux.o ../atree.o ../bv.o -lm
bash$ cd examples
bash$ mosquito
Creating training data
There are 68 patients with malaria, and 432 healthy in the training set
Training tree
Epoch : 0
Estimated number correct: 470
Epoch : 1
Estimated number correct: 499
Actual number correct: 500
Testing the tree
500 correct out of 500 in final test
bash$ ls
Makefile	mosquito.c	multiply.lf	mux.c		sphere.lf
mosquito	mosquito.o	mux		mux.o		sphere.results
bash$ mux
Creating training data
Training tree
Epoch : 0
Estimated number correct: 465
Epoch : 1
Estimated number correct: 485
Epoch : 2
Estimated number correct: 490

 tick tock tick tock ...  maybe the tree should have been bigger

Epoch : 97
Estimated number correct: 495
Epoch : 98
Estimated number correct: 497
Epoch : 99
Estimated number correct: 493
Actual number correct: 493
Testing the tree
488 correct out of 500 in final test
bash$ ls
Makefile	mosquito.c	multiply.lf	mux.c		sphere.lf
mosquito	mosquito.o	mux		mux.o		sphere.results
bash$ ../lf sphere.lf
Out of memory

< I guess my old Sun 3/50 couldn't cope.  Rolf says he can improve
things by not keeping trees in memory when they have learned.  He may
make a patch for us.  Trying multiply now.>

bash$ ../lf multiply.lf

  tick tock tick tock .... <over an hour!  slower than a PC!>
1
1.000000 0	1.000000 0	1.000000 0	1.000000 0
1.000000 0	2.000000 1	2.000000 1	2.000000 1
1.000000 0	3.000000 2	3.000000 2	3.000000 2
1.000000 0	4.000000 3	4.000000 3	4.000000 3
1.000000 0	5.000000 4	5.000000 4	5.000000 4
1.000000 0	6.000000 5	6.000000 5	6.000000 5
1.000000 0	7.000000 6	7.000000 6	7.000000 6
1.000000 0	8.000000 7	8.000000 7	8.000000 7
1.000000 0	9.000000 8	9.000000 8	9.000000 8
1.000000 0	10.000000 9	10.000000 9	10.000000 9
1.000000 0	11.000000 10	11.000000 10	11.000000 10
1.000000 0	12.000000 11	12.000000 11	12.000000 11
2.000000 1	1.000000 0	2.000000 1	2.000000 1
2.000000 1	2.000000 1	4.000000 3	4.000000 3

    etc. 145 lines altogether

12.000000 11	4.000000 3	48.000000 47	48.000000 47
12.000000 11	5.000000 4	60.000000 59	60.000000 59
12.000000 11	6.000000 5	72.000000 71	72.000000 71
12.000000 11	7.000000 6	84.000000 83	84.000000 83
12.000000 11	8.000000 7	96.000000 95	96.000000 95
12.000000 11	9.000000 8	108.000000 107	108.000000 107
12.000000 11	10.000000 9	120.000000 119	120.000000 119
12.000000 11	11.000000 10	132.000000 131	132.000000 131
12.000000 11	12.000000 11	144.000000 143	144.000000 143
bash$ exit

script done on Mon Sep 16 14:03:22 1991

Now what I should have done is piped the output of multiply through
the shellscript histogram to show the learning was perfect.  Instead I
took the complete script above and edited out everything that wasn't
the output of multiply to get datafile.  I also had to change
permissions first (oops! the pipe wouldn't have worked anyway):

spedden$ chmod 0755 histogram
spedden$ histogram<$HOME/datafile
        0 errors =      144
Inaccuracy level =      0

From alnl-owner Mon Oct 21 10:54:39 1991
Received: by scapa.cs.ualberta.ca id <42161>; Mon, 21 Oct 1991 10:38:40 -0600
Subject: error fix for atree2 (fwd)
From:	Bill Armstrong <arms>
To:	alnl
Date:	Mon, 21 Oct 1991 10:38:30 -0600
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <91Oct21.103840mdt.42161@scapa.cs.ualberta.ca>

Forwarded message from Monroe Thomas

> Bill, found this small little error in the atree code when testing
> out your boolean function idea... the function atree_read_code sets the
> low and high fields of the encoding data structure to 1.0 and 0.0, when 
> in fact it should set them to the low and high values read in from
> the file.
> 
> The code in atree.c in function atree_read_code is currently:
> 
>     if (code -> width == 1)
>     {
>         atree_set_code(code, 0.0, /* low */
>                              ^^^
>                              1.0, /* high */
>                              ^^^
> 
>                              2,   /* count */
>                              1,   /* width */
>                              1);  /* distance */
>     }
> 
> It should be:
> 
>     if (code -> width == 1)
>     {
>         atree_set_code(code, code -> low,  /* low */
>                              ^^^
>                              code -> high, /* high */
>                              ^^^^
>                              2,    /* count */
>                              1,    /* width */
>                              1);  /* distance */
>     }
	NB.  This is not the original message, which didn't include the
"code ->" part.  Just rewriting history!  WWA

> 
> Apparently we just forgot to change this part when we decided to keep
> original high/low values for boolean functions, as opposed to forcing them
> to be 1.0/0.0.
> 
> 
> Monroe
> 


From alnl-owner Mon Oct 28 11:49:56 1991
Received: from escher.trg.saic.com ([139.121.21.51]) by scapa.cs.ualberta.ca with SMTP id <42406>; Mon, 28 Oct 1991 11:29:41 -0700
Received: by escher.trg.saic.com (5.61/A/UX-2.00)
	id AA00216; Mon, 28 Oct 91 10:30:12 PST
Date:	Mon, 28 Oct 1991 11:30:12 -0700
From:	jcp@escher.trg.saic.com (John C. Peterson)
Message-Id: <9110281830.AA00216@escher.trg.saic.com>
To:	alnl@cs.ualberta.ca
Subject: Gray codes for integer encoding?

Greetings,

   Please forgive me if this topic has been discussed before on the
list; I've only been on it for a little while. The thing that has really
bothered me from the start about the ALN algorithm is the method used to
encode integer or floating point numbers. It just seems too complicated to
me, and I've always tended to equate simplicity with elegance.

   In the course of reading up on a somewhat related topic, namely Genetic
Algorithms, I've serendipitously stumbled across something that seems to
me would be a nice alternative to the encoding scheme currently in atree2.
In Genetic Algorithms (GA), one also wants a continuous mapping from integers
or floating point to {0,1} bit strings. If you take the numbers 2^n and
(2^n)-1, which differ by only 1, they are far apart from one another (in
the Hamming norm) when using the familiar binary encoding. The GA text I have
calls this the avalanche effect. This makes the standard binary encoding
unsuitable for most GA work.

   One solution used in the GA community is to employ Gray codes. They are
constructed such that integers that differ by only one are encoded to bit
strings that differ by exactly one in the Hamming norm. Here is an example
of a four-bit Gray code for the integers 0...15:

  +-- INTEGER
  |
  |     +-- GRAY CODE
  |     |
 00  0000
 01  0001
 02  0011
 03  0010
 04  0110
 05  0111
 06  0101
 07  0100
 08  1100
 09  1101
 10  1111
 11  1110
 12  1010
 13  1011
 14  1001
 15  1000
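For reference, the table above can be generated with the standard reflected-binary construction g(n) = n XOR (n >> 1); a minimal C sketch (the helper names are mine):

```c
/* Reflected binary Gray code: consecutive integers map to codewords
   that differ in exactly one bit position. */
static unsigned gray(unsigned n)
{
    return n ^ (n >> 1);
}

/* Hamming distance between two codewords: count the differing bits. */
static int hamming(unsigned a, unsigned b)
{
    unsigned x = a ^ b;
    int d = 0;
    while (x) {
        d += x & 1u;
        x >>= 1;
    }
    return d;
}
```

For example, gray(2) gives 0011 and gray(15) gives 1000, matching the table, and hamming(gray(i), gray(i+1)) is always 1.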

   This seems like just the ticket, so what I am wondering is: has anybody
tried this? Does it work well? If not, why not? BTW, the GA text I found this in
is the book "Genetic Algorithms in Search, Optimization, and Machine Learning"
by David Goldberg, Addison-Wesley, 1989. It is a well written book, and
gets my personal recommendation.

Regards, John C. Peterson < jcp@trg.saic.com >

From alnl-owner Mon Oct 28 13:54:44 1991
Received: from life.ai.mit.edu ([128.52.32.80]) by scapa.cs.ualberta.ca with SMTP id <42399>; Mon, 28 Oct 1991 13:40:48 -0700
Received: from transit (transit.ai.mit.edu) by life.ai.mit.edu (4.1/AI-4.10) id AA22261; Mon, 28 Oct 91 15:39:56 EST
From:	hqm@ai.mit.edu (Henry Minsky)
Received: by transit (4.1/AI-4.10) id AA00801; Mon, 28 Oct 91 15:41:07 EST
Date:	Mon, 28 Oct 1991 13:41:07 -0700
Message-Id: <9110282041.AA00801@transit>
To:	alnl@cs.ualberta.ca
In-Reply-To: John C. Peterson's message of Mon, 28 Oct 1991 11:30:12 -0700 <9110281830.AA00216@escher.trg.saic.com>
Subject: Gray codes for integer encoding?


(my $0.02)

I think it is clear that you want an error correcting code, such as
those used in computer memories. I think it should also be clear that
you want a code which is optimized for random errors, rather than
burst errors.  I say this because it seems that when you are encoding
the results of an array of trees as a codeword, there is not
likely to be any interaction between neighboring bits, since they come
from different trees.

I looked up some information in a book on digital coding for error
control: there is a class of cyclic codes called BCH codes, a
generalization of Hamming codes, which deal with multiple bit
errors.  These would probably be the thing to use.  These codes also
have the nice property that they can be made "systematic", which
simply means that you can put the original data all in one contiguous
block, and put the parity check bits at the end. This makes
translating back and forth, and storing the encoded data, quite
convenient.

The BCH codes give you the following properties: (From "Error Control
Coding", by Lin and Costello)

For any positive integers m (m >= 3) and t (t < 2^(m-1)), there exists
a binary BCH code with the following parameters:

Block Length:			n = 2^m - 1
Number of Information Digits:	k
Number of Parity Check Digits:	n - k <= mt
Minimum (Hamming) Distance:	dmin >= 2t + 1

You can correct any combination of t or fewer errors in a block of
length 2^m-1 digits.
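The parameter guarantees can be tabulated directly; a small sketch (the struct and function names are mine, not from any coding library):

```c
/* BCH existence bounds as stated by Lin & Costello: for m >= 3 and
   t < 2^(m-1) there is a binary BCH code with block length
   n = 2^m - 1, at most m*t parity-check digits (so k >= n - m*t),
   and minimum distance dmin >= 2t + 1. */
struct bch_bounds {
    int n;        /* block length */
    int min_k;    /* lower bound on information digits */
    int min_dmin; /* lower bound on minimum distance */
};

static struct bch_bounds bch_params(int m, int t)
{
    struct bch_bounds b;
    b.n = (1 << m) - 1;
    b.min_k = b.n - m * t;
    b.min_dmin = 2 * t + 1;
    return b;
}
```

For example, bch_params(4, 1) gives n = 15, k >= 11, dmin >= 3: the single-error-correcting Hamming (15,11) code is the t = 1 special case.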

There are several algorithms, such as Berlekamp's iterative algorithm,
for finding the locations of errors in a received codeword.

Also, if, say, the ALN networks were generalized to deal with
non-binary values, such as 8 bit bytes or something, then you could
use Reed-Solomon codes to do multiple-symbol error detection.

From alnl-owner Mon Oct 28 15:52:27 1991
Received: from wotan.compaq.com ([131.168.249.254]) by scapa.cs.ualberta.ca with SMTP id <42405>; Mon, 28 Oct 1991 15:34:24 -0700
Received: by wotan.compaq.com (Smail3.1.19)
	id <m0kbfHJ-00062yC@wotan.compaq.com>; Mon, 28 Oct 91 16:17 CST
Received: by twisto.eng.hou.compaq.com (Smail3.1.19)
	id <m0kbf28-0001PXC@twisto.eng.hou.compaq.com>; Mon, 28 Oct 91 16:02 CST
Message-Id: <m0kbf28-0001PXC@twisto.eng.hou.compaq.com>
Received: by bangate.compaq.com with VINES ; Mon, 28 Oct 91 16:04:57 CST
Date:	Mon, 28 Oct 1991 14:34:12 -0700
From:	John=Watters%CAE%Eng=Hou@bangate.compaq.com
Subject: coding
To:	alnl@cs.ualberta.ca

I too have vague apprehensions about the encoding and decoding going on in 
atree. I'm a bit confused by John Peterson's request for Hamming codes since 
these are in atree 2 already (maybe I missed his point :) )

I made a modification to the existing encoder to guarantee that successive 
codes get farther and farther apart to avoid codes in one range from coming 
too close to codes in another range. This comes at the expense of bits, of 
course. 

For example, specify a distance of 3 and a 'depth' of 5 and you get a code 
list that has the code for '4' at a distance of 3 from '3', 6 from '2', 9 from
'1' and 12 from '0'.

The results showed some improvement, but nothing to write home about.

Error correcting sounds interesting, but my main problem with output decoding 
is the hardware/time expense in computing. One of the great assets of these 
trees is that they're fast. Input encoding is somewhat expensive if you need 
to deal with real numbers, but find-the-nearest-output decoding is quite 
costly. (Actually, this sounds like a job for another neural net!)

Another idea - since each output bit is calculated independently of the 
others, a major source of info is unavailable to a particular net - what 
are the other outputs doing? If each tree were 'aware' of the other outputs, 
perhaps they could avoid generating unused output codes. Perhaps an iterative 
loop could converge on a legal output? 

As for genetic algorithms, I have added such a creature to the unix version of
atree 2. The genes control the initial node function and leaf connections. 
Normal training may be used on top of the genetic search (probably necessary 
if you want to find a good answer before you retire!). In general, the genetic
algorithm I've added does a fair job in finding smaller trees that do the job 
(I've put tree size in terms of AND/OR gates into the genetic figure of 
merit). It's fun to play with if you have cpu time to burn! 

Some other features I've played with:

- Verilog output of the final trees (a logic simulation netlist).
- Interject noise into test patterns to test noise sensitivity in pattern
  recognition.

Regards,

John Watters


From alnl-owner Mon Oct 28 18:21:13 1991
Received: by scapa.cs.ualberta.ca id <42416>; Mon, 28 Oct 1991 18:07:47 -0700
From:	Bill Armstrong <arms>
To:	alnl
Message-Id: <91Oct28.180747mst.42416@scapa.cs.ualberta.ca>
Date:	Mon, 28 Oct 1991 18:07:36 -0700

On coding for ALNs:

The Gray codes have the property that the codes in the sequence differ
in only one bit.  Unfortunately, two Gray codes that are far apart in the
sequence can be close together in Hamming distance.

0000
0001 <--
0011
0010
0110
0111
0101 <--
0100
1100
...

1000

The goal of coding for representing inputs and outputs of continuous
functions is to approximately have the Euclidean metric on, say,
quantization levels 0,...n-1 correspond to the Hamming metric on boolean
vectors.  To preserve the metric exactly gives a thermometer code
using n-1 bits (unary numbers up to some reordering of bits and
interchange of 1s and 0s).  This is quite inefficient (that is, if we
actually have to write it out; but note that BINARY SEARCH on the bits
of a thermometer code is as efficient as reading a binary number --
something which I think will play a role in the theory of ALNs.)
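The binary-search remark can be made concrete; a sketch (assuming at most 32 quantization levels, with my own helper names):

```c
/* Thermometer (unary) code for levels 0..n-1 using n-1 bits:
   level q sets the q lowest bits, so the distance between levels
   equals the Hamming distance between codewords exactly. */
static unsigned thermometer(int q)
{
    return (q <= 0) ? 0u : ((1u << q) - 1u);
}

/* Reading the level back is a binary search for the 1->0 boundary,
   touching only O(log n) of the n-1 bits -- as cheap as reading a
   binary number, even though the code itself is long. */
static int thermometer_decode(unsigned code, int n)
{
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (code & (1u << mid))   /* bit mid is set, so level > mid */
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```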

So the random walk technique is a compromise -- use a lower
dimensional d bit vector (d << n if p=1) and map 0, 1, 2, ...,n-1 into
this space by taking steps of size p and avoiding coming back close to
where you have been.  Of course, once you take k steps you could be
k*p away from where you started, and if this is close to d/2, you have
a Hamming distance as great as it is likely to get. So it is only
within k steps that Hamming vectors in the encoding are "close".
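One way to realize such a walk with p = 1 is a greedy sketch like the following (this is my own illustration, not the encoder actually used in atree2): flip one bit per step, choosing the flip that stays farthest in Hamming distance from the last few codewords.

```c
/* Hamming distance between two d-bit codewords. */
static int hamming_dist(unsigned a, unsigned b)
{
    unsigned x = a ^ b;
    int h = 0;
    while (x) { h += x & 1u; x >>= 1; }
    return h;
}

/* Greedy random-walk code: map levels 0..n-1 to d-bit vectors,
   flipping exactly one bit per step (p = 1) and choosing the bit
   whose flip maximizes the minimum distance to the previous
   `window` codewords, so the walk avoids doubling back on itself. */
static void walk_code(unsigned codes[], int n, int d, int window)
{
    int i, j, k;
    codes[0] = 0u;
    for (i = 1; i < n; i++) {
        int best_bit = 0, best_score = -1;
        for (j = 0; j < d; j++) {
            unsigned cand = codes[i - 1] ^ (1u << j);
            int score = d + 1;   /* min distance to recent codewords */
            for (k = (i > window) ? i - window : 0; k < i; k++) {
                int h = hamming_dist(cand, codes[k]);
                if (h < score)
                    score = h;
            }
            if (score > best_score) {
                best_score = score;
                best_bit = j;
            }
        }
        codes[i] = codes[i - 1] ^ (1u << best_bit);
    }
}
```

Consecutive codewords then differ in exactly one bit, while codewords k steps apart can drift up to k bits apart, mirroring the description above.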

Let's see if I can propose a methodology for choosing a code for
values x in [a,b].  The goal is to have a forest of trees compute a
function f:[a,b] -> [c,d] with a certain tolerance for error.  Suppose
we can tolerate that f(x) be in error by an amount e.  Suppose f is
differentiable on [a,b] and the magnitude of the derivative never
exceeds d.  There are actually three sources of error when we compute
f(x) given x based on quantization and binary coding: 1. x is
replaced by a quantized value x'; 2. f(x') is computed as y by
the forest with error e'; and 3. that value y can differ from f(x).
Let us take a quantization step size on x, qx, with qx <= 2e/(3d).
Then when we quantize, the effective x, x', is within one-half
quantization step, i.e. |x-x'| <= qx/2 <= e/(3d).  This assures that
|f(x)-f(x')| <= e/3.  The error in f(x'), e', can arise because there
is simply no code of a y-value for f(x'), but we can achieve an error
of at most qy/2 by training the trees to produce the nearest code to
f(x').  Actually, we train not on (x',f(x'))-pairs, but on quantized
(z,f(z))-pairs.  Hence a z giving x' could differ from x' by qx/2, and
f(z) could differ from f(x') by e/3.  Hence even if the training set
is very dense and learning perfect, we still have an error e' <= e/3 +
qy/2.  Setting qy = 2e/3, we get |f(x) - y| <= |f(x) - f(x')| + |f(x')
- y| <= e/3 + e/3 + e/3 = e, as required.

So we now know what qx and qy are.  I also know the range of x values
and y values, so I know how many steps there are to code.  Now we have
to start asking: "If x values are close, what does that tell me about
f(x) values?"  This requires some a priori knowledge about the
function f.  If I code x values using the random walk technique, the
trees can benefit from insensitivity.  Namely, if I can take k steps
of size qx in x, traversing a distance k*qx, and f(x) is still LIKELY
to be within qy of its starting value (the probability computed over
all x in [a,b]), then it benefits me to have k*p < d/4, where d is the
number of bits in my code for x.  (At Hamming distance d/4,
d-component vectors are still "close" in Hamming distance.)  The trees
can then benefit from insensitivity, i.e. they can possibly be smaller
and be defined by smaller training sets, if I take d >= 4kp.  The
bound d > 4kp is possibly better than the d > (b-a)/qx required by a
thermometer code.  (Up to this point, there is no reason for taking p
to be different from 1.)

Suppose now that although I can map x' to a prescribed y close to
f(x') perfectly by large enough trees, I choose to use smaller trees
for cost or speed reasons.  This increases the probability of an
error in a bit of the code for the quantization level for y.  If
the random walk used for coding the output has stepsize p, then I
can allow fewer than p/2 trees to be in error and still decode to the
correct quantization level of the output.  If p trees are in error, I
may still decode to a nearby quantization level.  (This is where it is
important that
the walk not come back close to itself.)

I hope the above gives some insight into the coding of continuous
values.

Unfortunately, the trees do not interpolate as multilayer perceptrons
with sigmoid functions do.  But if MLPs have large weights or sigmoids
with a large value of b in 1/(1 + e^(-b*u)), then the advantage of
smooth interpolation over a quantized space of x values may be lost.
A change of x by qx can cause any size of jump in the output of the
MLP.

Following up Henry Minsky's remark about algebraic codes:

Allen Supynuk has described the potential use of Golay codes (12
information bits + 12 bits redundancy) in predicting motion of an
autonomous robot (a linkage like a worm).  He suggests using Golay
code regions as hash buckets.  If a forest produces 24 bits, that
vector is mapped to one of 2^12 = 4096 buckets, each of which is
associated with a list of the (very few) possible quantization levels
of the output whose codes are to be checked for proximity to the given
24-bit vector.  This is much faster than the linear search done in
atree2 to decode.  (We could also look up the 24-bit vector's closest
neighbor, but that would require much more memory, and would not
illustrate the idea.)

Sorry this is so long, and still doesn't resolve all the complex
problems to do with coding.  Someone else will have to help.

Bill


From alnl-owner Mon Oct 28 20:46:38 1991
Received: from life.ai.mit.edu ([128.52.32.80]) by scapa.cs.ualberta.ca with SMTP id <42412>; Mon, 28 Oct 1991 20:28:23 -0700
Received: from transit (transit.ai.mit.edu) by life.ai.mit.edu (4.1/AI-4.10) id AA02305; Mon, 28 Oct 91 22:27:37 EST
From:	hqm@ai.mit.edu (Henry Minsky)
Received: by transit (4.1/AI-4.10) id AA00968; Mon, 28 Oct 91 22:28:48 EST
Date:	Mon, 28 Oct 1991 20:28:48 -0700
Message-Id: <9110290328.AA00968@transit>
To:	jcp@escher.trg.saic.com, alnl@cs.ualberta.ca
In-Reply-To: John C. Peterson's message of Mon, 28 Oct 91 17:06:44 PST <9110290106.AA00560@escher.trg.saic.com>
Subject: Gray codes for integer encoding?

   Date: Mon, 28 Oct 91 17:06:44 PST
   From: jcp@escher.trg.saic.com (John C. Peterson)

   You write;
   > 
   > I think it is clear that you want an error correcting code, such as
   > those used in computer memories. I think it should also be clear that
   > you want a code which is optimized for random errors, rather than
   > burst errors. I say this because it seems that when you are encoding
   > the results of an array of trees as a codeword, that there is not
   > likely to be any interaction between neighboring bits, since they come
   > from different trees.

   Hi Henry,

       My impression was that the coding used in atree *does* have a crude
   facility for error correction. The code members are formed by a random walk,
   and there is a #define or global variable that specifies the minimum Hamming
   distance between elements of the code. The decoding process picks the closest
   member of the code.

       This certainly gives error detection or correction depending on the
   minimum Hamming distance, but it seems unattractive because the mapping does
   not have any guarantee of continuity.

       I'm in agreement with you that using a Hamming or other code based on
   some nifty abstract algebra would be much more elegant. The last time I sat
   down and thought much about all this a couple months ago, that was one of
   the first things that struck me. Perhaps a good solution is to cascade in
   series a Gray code followed by a Hamming or other ECC on the binary
   representation.

The "systematic" encoding used by the common ECC codes gives you this
property you desire, I think. That is, you can use regular binary
numbers for your numbers, and just tack on the parity bits at the end.
The ECC code doesn't care if the errors occur in the "user data" or
the parity bits. Thus I don't think you need to use the Gray or
Hamming code at all. The ECC correction provides the exact
"continuity" you want.

   > 
   > The BCH codes give you the following properties: (From "Error Control
   > Coding", by Lin and Costello)
   > [...]

      Thanks for the references on this, it's been 10 years+ since I've done
   much with ECC. As I remember, things get a little "hairy" when you look at
   longer codes with multiple-bit error correction as well as error detection. This
   stuff isn't all that hard to program, but getting up to speed on the theory
   would take me awhile.

Yeah, things do get real hairy real fast -- not so much in the
encoding algorithms (for which the theory is plenty hairy, but which
are actually simple in implementation); the decoding is really the
black magic. I wrote some code to do Reed-Solomon encoding/decoding,
based on algorithms in a book by Cain and Clark. But that stuff is
basically a
BCH code on 8-bit symbols, which is only really optimal when you
expect burst errors on the order of the length of a byte, rather than
random bits anywhere in the codeword, as I think you get with these
vectors of trees.


      By chance, have you seen any source code in the public domain for this?

I've got the Reed-Solomon stuff, but I think it is suboptimal for the
totally random bit-error model. I don't recall seeing any binary
multiple-error BCH code packages around, but I may sit down and try to
implement some of the algorithms in one of the books, because I think
the ALN guys deserve an elegant solution to the encoding of numbers to
go along with their elegant ALN algorithms.

	Henry



From alnl-owner Tue Oct 29 10:22:56 1991
Received: by scapa.cs.ualberta.ca id <42244>; Tue, 29 Oct 1991 10:08:14 -0700
Subject: Random walk encoding for ALNs
From:	Bill Armstrong <arms>
To:	alnl
Date:	Tue, 29 Oct 1991 10:08:04 -0700
Cc:	arms (Bill ARMStrong)
In-Reply-To: <9110290410.AA00982@transit>; from "Henry Minsky" at Oct 28, 91 9:10 pm
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <91Oct29.100814mst.42244@scapa.cs.ualberta.ca>

Here is my answer to possibilities suggested by Henry Minsky: 1. that
the errors made by trees (in the forest computing a certain function)
are truly random, and 2. that a judicious choice of coding techniques
could do better than the random walk technique used in atree2.  (I am
posting to the alnl list instead of replying privately, since this
discussion is probably useful to alnl-subscribers.)

I think the errors made by individual trees will not be totally
random; there will be a tendency to make more errors where the
function is harder to learn, e.g. where it has high frequencies or
high values of the gradient (i.e. where it is "sensitive" -- tree
functions naturally tend to be very insensitive).  Some output bits
may even be constant in such sensitive parts of the function, if the
function values, though varying quickly, stay in a certain range.  So
those bits will not be likely to have errors in them.

We have found that taking majority votes often reduces the error
rate.  This suggests that error correcting coding might have even
greater benefits, because then the trees don't simply repeat the same
"message" to decrease the error rate by majority vote.  It is
well known that mere repetition does not lead to optimal encoding for
transmission.

As to whether the usual algebraic codes are appropriate, about all I can
say is:

1. Ordinary coding involves packing as many code words as possible
into the total space {0,1}^n.  Encoding for ALNs involves packing a
"sausage" of a prescribed number of links of radius r (around codes
of quantized values) into the same space, where the dimension of the
space is kept small.

2. Random codes are "good" codes by the random coding theorem, so a
random walk should usually generate segments whose codes form part of
a good random code.

3. Systematic coding techniques are good for making nice round hash
buckets to use for decoding.  Similarly, since ALN trees are
insensitive, there is a tendency for them to form hash buckets that
put neighboring input vectors into the same bucket.  (This could be
useful in many areas.) Associating with each bucket a list of a few
possible responses to the vector coming into the bucket could result
in a fast way of decoding even random codes, say.  I haven't tested
this, though.


Bill


From alnl-owner Thu Nov  7 17:04:38 1991
Received: by scapa.cs.ualberta.ca id <42237>; Thu, 7 Nov 1991 16:44:30 -0700
Subject: archive and a correction
From:	Bill Armstrong <arms>
To:	alnl
Date:	Thu, 7 Nov 1991 16:44:26 -0700
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <91Nov7.164430mst.42237@scapa.cs.ualberta.ca>

I put the archive of the alnl mailing list since atree release 2.0
came out on menaik.cs.ualberta.ca [129.128.4.241] as
pub/atree2.alnl.Z, so anyone can have a look at it via anonymous ftp.

There was a slight correction to the atree release 2.0 package thanks
to Monroe Thomas.  It involved two lines of atree.c.  Most people will
not have been affected, only those needing to read in a stored code.

> The code in atree.c in function atree_read_code was:
>
>     if (code -> width == 1)
>     {
>         atree_set_code(code, 0.0, /* low */
>                              ^^^
>                              1.0, /* high */
>                              ^^^
>                              2,   /* count */
>                              1,   /* width */
>                              1);  /* distance */
>     }
>
> It should be:
>
>     if (code -> width == 1)
>     {
>         atree_set_code(code, code -> low,  /* low */
>                              ^^^^^^^^^^^
>                              code -> high, /* high */
>                              ^^^^^^^^^^^^
>etc.

The original message speaking of this correction was missing the
"code ->" part, and so the result didn't compile, of course.
My apologies to anyone inconvenienced.

This has been corrected in the atree2.tar.Z file, but not in the
atree2.zip file on menaik.

Besides those who have already described adaptive logic network
projects they are doing on the alnl mailing list, is there anyone else
lurking out there?  If so, please email to alnl@cs.ualberta.ca and
tell the mailing list people about it.  Even work in progress would be
interesting, since others may be able to help with suggestions.  I
will try to help your projects along if I can.

Best wishes,

Bill