History of the Perceptron
The artificial neuron evolved through several stages. Its roots are firmly grounded in the neurological work done primarily by Santiago Ramon y Cajal and Sir Charles Scott Sherrington. Ramon y Cajal was a prominent figure in the exploration of the structure of nervous tissue and showed that, despite their ability to communicate with each other, neurons are physically separate from one another. With a greater understanding of the basic elements of the brain, efforts were made to describe how these neurons could give rise to overt behaviors, an effort to which William James was a prominent theoretical contributor.
Working from these beginnings of neuroscience, Warren McCulloch and Walter Pitts, in their 1943 paper "A Logical Calculus of the Ideas Immanent in Nervous Activity," contended that neurons with a binary threshold activation function were analogous to first order logic sentences. The basic McCulloch-Pitts neuron looked something like the following:
The McCulloch-Pitts neuron took either a 1 or a 0 on each of its inputs, where 1 represented true and 0 false. The threshold was given a real value, say 1, and the neuron output 1 if the sum of the inputs met or exceeded the threshold and 0 otherwise. Thus, in order to represent the “and” function, we set the threshold at 2.0 and come up with the following truth table:
Input x_{1}    Input x_{2}    Output
0              0              0
0              1              0
1              0              0
1              1              1
This table shows the basic “and” function: if either x1 or x2 is false, the sum of the inputs falls short of the threshold and the output is false. Only when x1 and x2 are both true (equal to 1) is the threshold of 2 met, and the output is 1.
The same holds for the “or” function if we switch the threshold value to 1. The table for the “or” function is:
Input x_{1}    Input x_{2}    Output
0              0              0
0              1              1
1              0              1
1              1              1
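The two tables above can be reproduced in a few lines of code. The following is a minimal sketch, assuming the neuron simply sums its binary inputs and outputs 1 when the sum meets or exceeds the threshold. The names `mcculloch_pitts` and `truth_table` are illustrative and do not come from the 1943 paper.

```python
def mcculloch_pitts(inputs, threshold):
    """Return 1 if the sum of the binary inputs meets or exceeds the threshold, else 0."""
    return 1 if sum(inputs) >= threshold else 0


def truth_table(threshold):
    """Evaluate a two-input neuron on all four binary input pairs."""
    return [(x1, x2, mcculloch_pitts((x1, x2), threshold))
            for x1 in (0, 1) for x2 in (0, 1)]


# Threshold 2 gives the "and" function, threshold 1 the "or" function.
AND_TABLE = truth_table(threshold=2)
OR_TABLE = truth_table(threshold=1)

for name, table in (("and", AND_TABLE), ("or", OR_TABLE)):
    print(name)
    for x1, x2, out in table:
        print(x1, x2, out)
```

Only the threshold changes between the two functions; the neuron itself is identical.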
This type of artificial neuron could also be used to compute the “not” function, which has only one input, as well as the NOR and NAND functions. The McCulloch-Pitts neuron was therefore instrumental in advancing the artificial neuron, but it had some serious limitations. In particular, it could solve neither the “exclusive or” function (XOR) nor the “exclusive nor” function (XNOR). Although they involve only binary values, the following truth tables could not be reproduced using this early artificial neuron.
XOR
Input x_{1}    Input x_{2}    Output
0              0              0
0              1              1
1              0              1
1              1              0
XNOR
Input x_{1}    Input x_{2}    Output
0              0              1
0              1              0
1              0              0
1              1              1
One of the difficulties with the McCulloch-Pitts neuron was its simplicity. It allowed only binary inputs and outputs, it used only the threshold step activation function, and it did not weight the different inputs.
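The limitation described above can be checked directly. The sketch below assumes the same unweighted neuron (sum the inputs, compare with a threshold) and tries a small grid of candidate thresholds against each target function. XNOR is taken in its standard sense, the complement of XOR. The grid of thresholds and all names are hypothetical choices made for illustration.

```python
def mcculloch_pitts(inputs, threshold):
    """Return 1 if the sum of the binary inputs meets or exceeds the threshold, else 0."""
    return 1 if sum(inputs) >= threshold else 0


PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Desired outputs for each input pair, in the order given by PAIRS.
TARGETS = {
    "and": [0, 0, 0, 1],
    "or": [0, 1, 1, 1],
    "xor": [0, 1, 1, 0],
    "xnor": [1, 0, 0, 1],
}

# Hypothetical grid of thresholds: -1.0 to 3.0 in steps of 0.5.
CANDIDATES = [k / 2 for k in range(-2, 7)]


def solving_thresholds(target):
    """List every candidate threshold that reproduces the target truth table."""
    return [t for t in CANDIDATES
            if [mcculloch_pitts(pair, t) for pair in PAIRS] == target]


RESULTS = {name: solving_thresholds(target) for name, target in TARGETS.items()}

for name, thresholds in RESULTS.items():
    print(name, thresholds)
```

The search finds thresholds for “and” and “or” but none for XOR or XNOR. The reason is that the output of this neuron can only switch from 0 to 1 as the input sum grows, whereas XOR needs the output to fall again at a sum of 2 and XNOR needs it to be 1 at a sum of 0 but 0 at a sum of 1.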
In 1949, Donald Hebb would help to revolutionize the way that artificial neurons were perceived. In his book, The Organization of Behavior, he proposed what has come to be known as Hebb’s rule. He states, “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.” [1] Hebb was proposing not only that the connection between two neurons is strengthened when they fire together, but also that this activity is one of the fundamental operations necessary for learning and memory.
For the artificial neuron, this meant that the McCulloch-Pitts neuron had to be altered to allow for this new biological proposal. The method used was to weight each of the inputs. Thus, an input of 1 may contribute more or less toward meeting the threshold, depending on its weight.
Frank Rosenblatt, using the McCulloch-Pitts neuron and the findings of Hebb, went on to develop the first perceptron. This perceptron, which could learn in the Hebbian sense through the weighting of inputs, was instrumental in the later formation of neural networks. He discussed the perceptron in his 1962 book, Principles of Neurodynamics. A basic perceptron is represented as follows:
This perceptron has a total of five inputs, a1 through a5, with respective weights w1 through w5. [2] Each of the inputs is weighted, and the weighted inputs are summed at the node. If the threshold is reached, an output results. Of great importance is that the inputs need not be given equal weight. The perceptron may have “learned” to weight a1 more than a2 and so on.
The summation formula for determining whether or not the threshold (θ) is met for the artificial neuron with N inputs (a_{1}, a_{2}, …, a_{N}) and their respective weights w_{1}, w_{2}, …, w_{N} is:

$$ b = \left( \sum_{j=1}^{N} w_{j} a_{j} \right) + \theta $$
The activation function then becomes:
x = f(b)
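The two formulas above translate almost line for line into code. The following is a sketch only. Because the text adds θ to the weighted sum, it assumes that the step function compares b with 0, so that θ = -1.0 corresponds to a firing threshold of 1.0. The weights, inputs, and function names are hypothetical values chosen for a five-input example.

```python
def weighted_sum(weights, inputs, theta):
    """Compute b = (sum over j of w_j * a_j) + theta."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * a for w, a in zip(weights, inputs)) + theta


def step(b):
    """Threshold step activation (assumed form): 1 if b >= 0, else 0."""
    return 1 if b >= 0 else 0


def perceptron_output(weights, inputs, theta, activation=step):
    """Compute x = f(b) for the chosen activation function f."""
    return activation(weighted_sum(weights, inputs, theta))


# Hypothetical five-input example: the inputs are not weighted equally.
WEIGHTS = [0.5, -0.25, 0.75, 0.0, 1.0]
INPUTS = [1, 1, 0, 1, 1]
THETA = -1.0

B = weighted_sum(WEIGHTS, INPUTS, THETA)
X = perceptron_output(WEIGHTS, INPUTS, THETA)
print("b =", B, "x =", X)
```

Passing a different function as `activation` changes f without touching the summation step.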
The activation function used by McCulloch and Pitts was the threshold step function. However, other functions that can be used are the Sigmoid, Piecewise Linear and Gaussian activation functions. These functions are shown below. [3] (See the glossary attached to this applet for the corresponding mathematical formulas.)
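For readers without the glossary at hand, the four functions can be sketched in code. The exact formulas used by the glossary are not reproduced in this text, so the definitions below are common textbook forms and should be read as assumptions, in particular the breakpoints of the piecewise linear function and the unit width of the Gaussian.

```python
import math


def threshold_step(b):
    """Assumed form: 1 if b >= 0, else 0."""
    return 1 if b >= 0 else 0


def sigmoid(b):
    """Assumed logistic form: 1 / (1 + e^(-b)).

    For very large negative b, math.exp can overflow; that is acceptable
    for this illustration.
    """
    return 1.0 / (1.0 + math.exp(-b))


def piecewise_linear(b):
    """Assumed form: 0 below -0.5, 1 above 0.5, and b + 0.5 in between."""
    if b <= -0.5:
        return 0.0
    if b >= 0.5:
        return 1.0
    return b + 0.5


def gaussian(b):
    """Assumed form: e^(-b^2), peaking at 1 when b = 0."""
    return math.exp(-b * b)


for b in (-2.0, -0.25, 0.0, 0.25, 2.0):
    print(b, threshold_step(b), sigmoid(b), piecewise_linear(b), gaussian(b))
```

Unlike the step function, the sigmoid, piecewise linear, and Gaussian functions produce graded outputs between 0 and 1 rather than a strict binary value.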
[Figure: graphs of the Threshold Step, Sigmoid, Piecewise Linear, and Gaussian activation functions]
Despite the many changes made to the original McCulloch-Pitts neuron, the perceptron was still limited in the functions it could solve. Unfortunately, Rosenblatt was overly enthusiastic about the perceptron and made the ill-timed proclamation that:
“Given an elementary α-perceptron, a stimulus world W, and any classification C(W) for which a solution exists; let all stimuli in W occur in any sequence, provided that each stimulus must reoccur in finite time; then beginning from an arbitrary initial state, an error correction procedure will always yield a solution to C(W) in finite time…” [4]
With remarks of this kind, Rosenblatt drew a line in the sand between those who supported perceptron-style research and the more traditional symbol manipulation projects being pursued by Marvin Minsky. As a result, in 1969, Minsky coauthored with Seymour Papert the book Perceptrons: An Introduction to Computational Geometry. In this work they attacked the limitations of the perceptron, showing that it could only solve linearly separable functions. Of particular interest was the fact that the perceptron still could not solve the XOR and XNOR functions. Minsky and Papert also stated that the style of research being done on the perceptron was doomed to failure because of these limitations. This was, of course, Minsky’s equally ill-timed remark. As a result, very little research was done in the area until about the 1980s.
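Both sides of this dispute can be illustrated with a small experiment. The sketch below uses the standard textbook perceptron learning rule (add the error times the input to each weight), which is an assumption here and may differ in detail from Rosenblatt's own error correction procedure. The names `predict` and `train` and the limit of 100 passes are likewise hypothetical.

```python
def predict(weights, theta, inputs):
    """Step activation applied to the weighted sum plus theta."""
    b = sum(w * a for w, a in zip(weights, inputs)) + theta
    return 1 if b >= 0 else 0


def train(samples, max_epochs=100):
    """Run the error-correction rule; return (weights, theta, converged)."""
    weights = [0, 0]
    theta = 0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            delta = target - predict(weights, theta, inputs)
            if delta != 0:
                errors += 1
                # Move each weight toward the target in proportion to its input.
                weights = [w + delta * a for w, a in zip(weights, inputs)]
                theta += delta
        if errors == 0:
            # A full pass with no mistakes means a solution was found.
            return weights, theta, True
    return weights, theta, False


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

AND_RESULT = train(AND_SAMPLES)
XOR_RESULT = train(XOR_SAMPLES)
print("and converged:", AND_RESULT[2])
print("xor converged:", XOR_RESULT[2])
```

For the linearly separable “and” function the procedure stops with a correct set of weights, as Rosenblatt's statement promises whenever a solution exists. For XOR no solution exists for a single perceptron, so the procedure never completes an error-free pass, which is the limitation Minsky and Papert emphasized.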
What would come to resolve many of these difficulties was the creation of neural networks. These networks connect the inputs of artificial neurons with the outputs of other artificial neurons. As a result, the networks are able to solve more difficult problems, but they have also grown considerably more complex. Even so, many of the artificial neural networks in use today still stem from the early advances of the McCulloch-Pitts neuron and the Rosenblatt perceptron.
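As a small illustration of how connecting neurons overcomes the earlier limitation, the sketch below wires three threshold neurons into a two-layer network that computes XOR. The weights are set by hand rather than learned, and the particular decomposition (an “or” neuron and a NAND neuron feeding an “and” neuron) is one common choice among several, so it should be read as an example and not as the historical solution.

```python
def neuron(weights, theta, inputs):
    """Weighted sum plus theta, followed by a step activation."""
    b = sum(w * a for w, a in zip(weights, inputs)) + theta
    return 1 if b >= 0 else 0


def xor_network(x1, x2):
    """Two hidden neurons feed one output neuron."""
    h_or = neuron([1, 1], -1, (x1, x2))         # fires if at least one input is 1
    h_nand = neuron([-1, -1], 1, (x1, x2))      # fires unless both inputs are 1
    return neuron([1, 1], -2, (h_or, h_nand))   # fires only if both hidden neurons fire


def xnor_network(x1, x2):
    """XNOR as the negation of XOR, using one extra inverting neuron."""
    return neuron([-1], 0, (xor_network(x1, x2),))


PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR_OUTPUTS = [xor_network(x1, x2) for x1, x2 in PAIRS]
XNOR_OUTPUTS = [xnor_network(x1, x2) for x1, x2 in PAIRS]
print("xor:", XOR_OUTPUTS)
print("xnor:", XNOR_OUTPUTS)
```

Each individual neuron still computes only a linearly separable function; the network succeeds because the output neuron operates on the hidden neurons' outputs instead of on the raw inputs.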
[1] Hebb, Donald O. (1949). The Organization of Behavior. New York: Wiley, p. 62.
[2] The diagram is from http://www.neuroscience.com/Technologies/nn_history.htm
[3] Graph diagrams of the functions are from http://home.cc.umanitoba.ca/~umcorbe9/neuron.html#Theory
[4] Rosenblatt, Frank (1962). Principles of Neurodynamics. New York: Spartan. Cf. Rumelhart, D. E., J. L. McClelland and the PDP Research Group (1986). Parallel Distributed Processing, vols. 1 and 2. Cambridge: MIT.