On a UNIX machine with a single processor there is no true parallelism, only a time-sharing mechanism. Taking this into consideration, the parallel system was simulated as follows.
For the layer design, a layer class was constructed. Its fields are the type of the neurons it contains, the list of neurons belonging to the layer, and their number. I gave the neurons a type so that a neuron can perform kinds of computation other than the classical weighted sum. This extends the concept of the neuron, viewed as an elementary processor that can perform only one simple operation.
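As an illustration of the layer class described above, a minimal Python sketch might look as follows. The names `NeuronType`, `Neuron`, `Layer` and `forward`, as well as the two alternative operations (`PRODUCT`, `MAXIMUM`), are assumptions made for this example and are not taken from the thesis; the neuron count is exposed as a derived property rather than stored as a separate field.

```python
import math
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class NeuronType(Enum):
    """Hypothetical set of elementary operations a neuron may perform."""
    WEIGHTED_SUM = "weighted_sum"  # the classical weighted addition
    PRODUCT = "product"            # illustrative alternative, not from the thesis
    MAXIMUM = "maximum"            # illustrative alternative, not from the thesis


@dataclass
class Neuron:
    weights: List[float]

    def compute(self, kind: NeuronType, inputs: List[float]) -> float:
        # Each neuron performs exactly one simple operation on its weighted inputs.
        terms = [w * x for w, x in zip(self.weights, inputs)]
        if kind is NeuronType.WEIGHTED_SUM:
            return sum(terms)
        if kind is NeuronType.PRODUCT:
            return math.prod(terms)
        if kind is NeuronType.MAXIMUM:
            return max(terms)
        raise ValueError(f"unsupported neuron type: {kind}")


@dataclass
class Layer:
    neuron_type: NeuronType                              # type of the contained neurons
    neurons: List[Neuron] = field(default_factory=list)  # neurons belonging to the layer

    @property
    def size(self) -> int:
        # Number of neurons in the layer.
        return len(self.neurons)

    def forward(self, inputs: List[float]) -> List[float]:
        # Every neuron in the layer applies the layer's operation to the same inputs.
        return [n.compute(self.neuron_type, inputs) for n in self.neurons]
```

Keeping the operation type on the layer, rather than on each neuron, mirrors the description above, in which the type is a field of the layer class.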
Once the neuron class is defined, a neuron becomes an independent item. So that it can function without any influence from other neurons, it can, generally speaking, have a UNIX process (or thread) allocated to it. In the thesis I discuss how restrictions, such as a limited number of processes per user, can slightly change this 1:1 distribution.
For input and output between these independent processes, UNIX offers several tools, of which I selected shared memory, which different processes (here, neurons) can access simultaneously or sequentially. To keep the data traffic correct, I also have to set some rules: for instance, a neuron can read from several locations (all of its inputs), but can write to only one (its own output).
In this way, a neuron is completely independent: it is unaware of the processing done by other neurons and works in parallel with them. Moreover, the input data does not have to be copied locally for each neuron, which saves memory.
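The read-many, write-one rule can be sketched with Python's `multiprocessing` module, under the assumption of one process per neuron and a fork-capable UNIX system. The function names `neuron_process` and `run_layer` are hypothetical, and the shared arrays merely stand in for the shared memory segment used in the thesis, whose actual layout is not described here.

```python
import multiprocessing as mp


def neuron_process(index, weights, inputs, outputs):
    """One neuron: reads every input slot, writes only its own output slot."""
    outputs[index] = sum(w * x for w, x in zip(weights, inputs))


def run_layer(weight_rows, input_values):
    # Assumption: the "fork" start method, so children inherit the shared arrays.
    ctx = mp.get_context("fork")
    # Shared input buffer: all neurons read it, none of them copies it.
    inputs = ctx.Array("d", input_values, lock=False)
    # Shared output buffer: slot i is written by neuron i only, so no lock is needed.
    outputs = ctx.Array("d", len(weight_rows), lock=False)
    workers = [
        ctx.Process(target=neuron_process, args=(i, row, inputs, outputs))
        for i, row in enumerate(weight_rows)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return list(outputs)
```

Because each process writes to a distinct location, the sketch needs no locking; a caller would invoke, for example, `run_layer([[1.0, 2.0], [0.5, 0.5]], [1.0, 1.0])` to evaluate a two-neuron layer.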
Because on a real parallel system the number of neurons in a layer may be very large compared to the number of available processors, the compromise solution of dividing the neurons among the existing processors was considered. I therefore defined the optimum sharing key as the total number of neurons that have to be active at a given moment, divided by the number of available processors.
Because only one layer is active at a time, all the neurons in that layer have to be divided, as uniformly as possible, among the processors. To limit the computational effort, and thereby minimize the amount of data transmitted between the processors, each processor holds only the information about the neurons of each layer that it will process.
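The division of an active layer among the processors can be illustrated with a short Python sketch. The helper names `sharing_key` and `divide_layer` are hypothetical, and rounding the key up to a whole number of neurons is an assumption of this example rather than something stated in the thesis.

```python
import math


def sharing_key(active_neurons, processors):
    """Active neurons divided by processors, rounded up (rounding is an assumption)."""
    return math.ceil(active_neurons / processors)


def divide_layer(neuron_ids, processors):
    """Split the neurons of the active layer as uniformly as possible."""
    base, extra = divmod(len(neuron_ids), processors)
    shares, start = [], 0
    for p in range(processors):
        # The first `extra` processors take one additional neuron each.
        size = base + (1 if p < extra else 0)
        shares.append(neuron_ids[start:start + size])
        start += size
    return shares
```

With this scheme no processor receives more neurons than the sharing key, and the share sizes differ by at most one. Each processor would then store only the neurons in its own share.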
To leave room for further development and to keep the software as hardware-independent as possible, the parallel processors have been simulated on the UNIX machine by a processor class, which contains:
For the division of work among the different processors to give good results, a correct communication mechanism has to be implemented between the component parts.
Figure 5: The intercommunication between master and slaves, through the common resources; the division of the processors on the slaves, the division of the layers between the slaves and the division of the neurons on the layers