
Stay updated with Neuroscience: June 2021 Must-Reads

How do sparse neural connections 🖇️ influence functional behavior? A more biologically plausible backpropagation: error-vector broadcasting…

Image by Dominik Lange from Unsplash


Why should you care about Neuroscience?

Neuroscience is the root of today's artificial intelligence 🧠🤖. Reading and keeping up with the evolution and new insights of neuroscience will not only make you a better "Artificial Intelligence" person 😎, but also a finer creator of neural network architectures 👩‍💻!

Today I bring you three new papers from arxiv.org. The first investigates sparse connectivity and its influence on functional modularity. The second proposes a new alternative to backpropagation that may be more biologically plausible – definitely an area where a lot of research should be devoted. Finally, the third studies attention in deep neural networks, trying to capture how attention works in the brain.

Extreme sparsity gives rise to functional specialization

Gabriel Béna, Dan F.M. Goodman, Paper

Neural networks are modular, namely, they can be decomposed into independent subnetworks. This modularity can be structural, where neurons are partitioned into different modules, and functional, where every single module can perform independent operations. Given these notions, is there a relationship between structural and functional modularity? Can we assess to what extent structural modularity affects the functional one? The authors investigated these questions, proposing a neural network with controllable structural modularity and monitoring functional modularity metrics, to understand to what extent the two kinds of modularity influence each other – and to look for an answer on the neuro-biological side as well.

Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules

The proposed architecture is composed of two sub-networks, which the authors interconnect to different degrees, ranging from sparse connections between nodes in different sub-networks up to no modularity at all, when a fully dense architecture is reached. As an input task, each sub-network receives an MNIST digit, and the sub-networks have to communicate with each other to decide whether the two digits have the same parity. At the same time, each sub-network can specialize in recognizing its own digit. To measure functional modularity, the authors monitored three metrics (a minimal sketch of such a masked two-module network follows the list below):

  • Bottleneck metric: the mean accuracy of a narrow 5-neuron layer after the readout, checking whether it can still recognize the input digits
  • Weight mask metric: a metric for specialization, where only a subset of q% of a sub-network’s parameters is kept, to check performance on a given task
  • Correlation metric: an analysis of the hidden layers, computing the Pearson correlation coefficient between the hidden states of a sub-network for the same digit example
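
To make the setup concrete, here is a minimal PyTorch sketch of a two-module network with a controllable degree of inter-module sparsity. This is an illustrative reconstruction, not the authors' code: the class name, layer sizes and the density parameter `p_inter` are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoModuleNet(nn.Module):
    """Two sub-networks, each seeing one MNIST digit, joined by a fixed
    binary mask whose density p_inter controls structural modularity
    (p_inter = 1.0 -> fully dense, no modularity; p_inter -> 0 -> extreme
    sparsity between modules)."""

    def __init__(self, hidden=64, p_inter=0.05):
        super().__init__()
        self.enc_a = nn.Linear(784, hidden)   # sub-network A encoder
        self.enc_b = nn.Linear(784, hidden)   # sub-network B encoder
        self.cross = nn.Linear(2 * hidden, 2 * hidden)
        # Dense blocks within each module, sparse blocks across modules
        mask = torch.zeros(2 * hidden, 2 * hidden)
        mask[:hidden, :hidden] = 1.0
        mask[hidden:, hidden:] = 1.0
        inter = (torch.rand(hidden, hidden) < p_inter).float()
        mask[:hidden, hidden:] = inter
        mask[hidden:, :hidden] = inter.T
        self.register_buffer("mask", mask)
        self.readout = nn.Linear(2 * hidden, 2)  # same parity or not

    def forward(self, digit_a, digit_b):
        h = torch.cat([F.relu(self.enc_a(digit_a)),
                       F.relu(self.enc_b(digit_b))], dim=1)
        # The mask zeroes out the inactive inter-module connections
        h = F.relu(F.linear(h, self.cross.weight * self.mask, self.cross.bias))
        return self.readout(h)
```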

Fig. 1 shows the final results for each of the three metrics on the investigated sub-networks. Overall, all three metrics vary along with the structural modularity of the network, measured both with sparsity (proportion of active connections) and with the Q modularity. The following conclusions can be drawn:

  • Imposing structural modularity (a high value of Q modularity) induces more functional modularity
  • At extreme levels of sparsity or structural modularity, we get high functional specialization across all three metrics
  • A high level of structural modularity (Q modularity) is required to ensure functional modularity – in other words, functional modularity is achieved only with very sparse connections between sub-networks
Fig. 1: On the left, the degree of sparsity between the two sub-networks; on the right, the Q modularity measure. Lines indicate the mean for each metric, while shaded regions depict one standard deviation across 10 repetitions of the same experiment.
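
For reference, the Q modularity used above can be computed with Newman's classic formula. Here is a minimal sketch for an undirected, unweighted graph (the function name is ours):

```python
import numpy as np

def q_modularity(A, communities):
    """Newman's Q: the fraction of edges that fall within modules minus
    the fraction expected if edges were placed at random while
    preserving node degrees.

    A           : (n, n) symmetric 0/1 adjacency matrix
    communities : length-n array of module labels
    """
    k = A.sum(axis=1)                      # node degrees
    two_m = A.sum()                        # 2m = total degree
    same = np.equal.outer(communities, communities)
    expected = np.outer(k, k) / two_m      # k_i * k_j / (2m)
    return float(((A - expected) * same).sum() / two_m)
```

Two modules that are densely connected internally and barely connected to each other give a Q close to 0.5, while a random graph gives a Q close to 0.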

These conclusions have remarkable implications for biological neural networks. The entire idea of connectomics – that knowing the structural properties of a network is sufficient to give us an understanding of its functional properties – may no longer hold. As the authors write:

We should not conclude any degree of functional modularity simply by observing a moderate degree of structural modularity


Credit Assignment Through Broadcasting a Global Error Vector

David G. Clark, L.F. Abbott, SueYeon Chung, Paper

Neural circuits do not implement the backpropagation (BP) algorithm: evolution has found another algorithm, or route, to make circuit plasticity work and train all the brain's networks. A possible biological alternative to BP is credit assignment – here is a review where we discussed the credit assignment problem – where a mysterious global learning signal is broadcast throughout the network.

From here, the authors propose a new alternative to BP, called global error-vector broadcasting (GEVB), where a global learning signal is broadcast to all hidden units in a neural network to update the weights. In particular, the signal distributed across the network carries information about the output error and acts without any unit-specific feedback. The neural networks implementing GEVB are called vectorized nonnegative networks (VNNs). Furthermore, GEVB recalls biological neural networks, imposing non-negative weights in all layers after the first, echoing the excitatory behaviour of cortical projection neurons. Finally, each weight is updated by an amount given by the inner product of the presynaptic activation and the global error vector. Fig. 2 gives a graphical way to understand the differences between the various alternatives to BP.

Fig. 2: Alternatives to the backpropagation (BP) algorithm: a) BP, where the weights are updated layer by layer by transposing the error step by step; b) feedback alignment (FA), where the error is sent backwards layer by layer; c) direct feedback alignment (DFA), where the error is broadcast directly to each hidden layer; d) the proposed global error-vector broadcasting (GEVB), where the global error vector is broadcast to all hidden units without unit-specific feedback.
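
As a rough numpy sketch of the GEVB update described above, assuming vector-valued hidden activations of dimension K (the output dimension) and a 0/1 ReLU-style gate on each postsynaptic unit – the function name and argument layout are our assumptions, not the paper's notation:

```python
import numpy as np

def gevb_update(W, A_pre, e, gate_post, lr=1e-2):
    """GEVB-style update: each weight W[i, j] changes by the inner
    product <e, a_j> of the global error vector and the presynaptic
    activation vector, gated by whether postsynaptic unit i is active.
    No unit-specific error is propagated backwards.

    W         : (n_post, n_pre) scalar synaptic weights
    A_pre     : (n_pre, K) vector-valued presynaptic activations
    e         : (K,) global error vector, broadcast to all units
    gate_post : (n_post,) 0/1 activity of the postsynaptic units
    """
    corr = A_pre @ e                          # <a_j, e> per presynaptic unit
    return W - lr * np.outer(gate_post, corr)
```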

What results do we have? First, the authors compared BP to GEVB, using vectorized nonnegative networks (networks with non-negative weights) and conventional networks, over different degrees of connectivity, on the MNIST and CIFAR-10 datasets. Our attention will be focused on the vectorized nonnegative networks, as reported in Tables 1 and 2 of the paper. Overall, the error is comparable across the different connectivities and weight-update algorithms, with remarkable overlap between GEVB and BP.

Further encouraging results are obtained when clustering CIFAR-10 images with t-SNE, where the GEVB cluster quality is statistically better than that of the BP methods.

These are remarkable results, which may pave a new road towards making artificial neural networks more similar to biological ones. The GEVB algorithm raises a new question about a biological implementation of vectorization, and about how presynaptic and postsynaptic neural responses could interact to broadcast a "weight-update" signal across the network. Stay tuned here guys!


Object Based Attention Through Internal Gating

Jordan Lei, Ari Benjamin, Konrad Paul Kording, Paper

Attention is the mechanism used by the brain to select a meaningful subset of a given stimulus or features

One mysterious mechanism in the brain is attention. Computational neuroscience and machine learning have been successful in giving us models that use attention mechanisms to detect and recognize objects in simple tasks. More thorough results have been achieved with deep neural networks, which have proved able to understand more complex scenes; however, they are still far from what the human brain can do. Human brain attention can be thought of as a mix of the following elements:

  • Modulation of neural activation: when a subject recognizes an object, visual neurons exhibit a variation in activity, with a modulation of about 5 to 30%, increasing the attention mechanism in the visual cortex
  • Attention-invariant tuning: although the modulation of neural activity is on, attention leaves the neurons’ tuning characteristics untouched
  • Internal gating: attention filters out irrelevant features, returning a clearer signal
  • Hierarchical processing: in the brain there are hierarchically organized cells, which present a wide range of tuning characteristics. This allows the brain to learn complex non-linear features. The output from these cells increases the information gathered by the attention layers
  • Top-down neuromodulation of attention: taking the visual problem as an example, information travels as feedforward, feedback and lateral flows. The feedback and lateral flows enable top-down attention, while feedforward routes define receptive fields in the early layers
  • Inhibition of return: regions activated by a visual input subsequently become inhibited, allowing the subject to move from one detected object to a new one

The authors provide a new neural network model that encompasses all these features, in order to study the nature of attention to visual stimuli, trying to replicate what happens biologically and to give a plausible biological answer.

Fig. 3 shows the implemented neural network, which is subdivided into three main areas, reflecting the brain's visual pathways. Three pathways are present: the feedforward pathway (black lines), which generates feature maps and returns a prediction; the feedback pathway (orange lines), which implements the attention masks; and the horizontal connections (green lines), which project the feedforward pathway onto the feedback pathway. The attention masks act in V2 as internal attention gating and, at the final stage, create an interpretable image in pixel space. The final result gives a possible interpretation of attention and its mechanism (a minimal sketch of such a three-pathway forward pass follows Fig. 3 below).

Fig. 3: On the left, the implemented attention-based neural network; on the right, the visual attention pathway as it could be in the brain. The peculiarity of the implemented neural network is that it preserves feedforward, lateral and feedback connectivity.
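
Here is a minimal sketch of such a three-pathway forward pass, assuming a 28×28 single-channel input; the class name, channel counts and the sigmoid mask head are illustrative choices, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ThreePathwayNet(nn.Module):
    """Feedforward stream -> feature maps and a prediction; horizontal
    connections project those features onto a feedback stream, which
    produces the attention mask used for internal gating."""

    def __init__(self):
        super().__init__()
        self.feedforward = nn.Sequential(               # black lines
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.readout = nn.Linear(16 * 28 * 28, 10)      # prediction head
        self.horizontal = nn.Conv2d(16, 16, 1)          # green lines
        self.feedback = nn.Conv2d(16, 1, 3, padding=1)  # orange lines

    def forward(self, x):
        feats = self.feedforward(x)                 # feature maps
        pred = self.readout(feats.flatten(1))       # prediction
        # The feedback pathway turns laterally projected features into
        # an attention mask that can gate the input on the next pass
        mask = torch.sigmoid(self.feedback(self.horizontal(feats)))
        return pred, mask
```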

Fig. 4 shows results for the MNIST and COCO datasets. The original input is processed at different stages of the attention mechanism. The gated input shows how attention works on the input image, where white regions are inhibited areas and black regions are uninhibited ones. The attention mask drives the gating mechanism, focusing on the most important areas. Finally, the IOR mask (inhibition of return) specifies which regions have already been visited and will be inhibited in future iterations (see the gating sketch after Fig. 4 below).

Fig. 4: Results from the attention layers for the MNIST dataset (on the left) and the COCO dataset (on the right).
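
The interaction between gated input, attention mask and IOR mask can be summarized by two small helpers (again an illustrative sketch: the array conventions and the decay factor are assumptions, not the authors' code):

```python
import numpy as np

def gate_input(image, attention_mask, ior_mask):
    """Internal gating: keep the regions the attention mask highlights,
    and suppress regions already visited according to the IOR mask.
    All arrays share the image's shape, with values in [0, 1]."""
    return image * attention_mask * (1.0 - ior_mask)

def update_ior(ior_mask, attended_region, decay=0.9):
    """Inhibition of return: add the just-attended region to the IOR
    mask, so the model moves on to a new object at the next iteration."""
    return np.clip(decay * ior_mask + attended_region, 0.0, 1.0)
```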

This learning approach has brought advances in understanding the attention mechanisms of the brain as a mix of feedforward, feedback and lateral connections with internal gating and inhibition of return. From the neural network output, it seems that there is a general attention rule for dealing with objects:

  • neurons’ tuning curves do not change
  • there is an inhibition at different neural levels
  • there is a peaked modulation of attention in deep neural layers

This study lays the basis for a deeper understanding of attention, but future work must be carried out. One note, linked to the previous article: what if we implemented a more biologically plausible backpropagation algorithm in this network, such as GEVB?


I hope you liked this review of June 2021 neuroscience arxiv.org papers. Please feel free to send me an email with questions or comments at: [email protected]

