Here’s Why We May Need to Rethink Artificial Neural Networks

A new study proves artificial neurons are too simple

ARTIFICIAL INTELLIGENCE | NEWS | CRITIQUE

Photo by Jr Korpa on Unsplash

What if I told you that a biological neuron is better represented by a whole artificial neural network than by a single artificial neuron?

Deep learning, with the help of large training datasets and huge computing resources, saw a wave of success like no other in the early 2010s. Soon, artificial intelligence, machine learning, and deep learning were at the center of a technological revolution. And together with them was another concept: Artificial neural networks (ANNs).

What most people don’t know is that none of these concepts is remotely new. AI was born as a field of research in 1956. ML and DL were introduced in 1959 and 1986, respectively. And artificial neural networks – a term borrowed from the then-immature field of neuroscience – appeared in 1943, with the pioneering work of Warren S. McCulloch and Walter Pitts.

Their work is now considered the first ever on artificial intelligence. As a subfield of AI, ANNs passed through times of disinterest and obscurity until the deep learning revolution elevated them to the highest peak in the history of AI.

The first artificial neural network

It was in the very early days of computer science – AI didn’t even have a name – that the first ANN was created. Neuroscientists had been modeling biological neurons for decades before, but McCulloch and Pitts’ work was the first to describe a biological neuron in terms of propositional logic.

Those of you familiar with ANNs will recognize this picture:

MCP neuron. Image by author

This is the McCulloch and Pitts neuron model (MCP), conceived almost 80 years ago. It’s, for the most part, the same model that’s taught in every modern introductory course or book on deep learning. That’s not because MCP is an accurate model that needs no refinement, but because deep learning hasn’t changed a bit at this elemental level since 1943.

The MCP neuron was intended to represent only a simplified neurophysiological version of biological neurons: A series of inputs go into the neuron, which processes them and then either generates an output or not. It had a threshold activation function – if the sum of the inputs falls below the threshold, the output is 0. Otherwise, it’s 1.
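As a rough illustration, here’s a minimal sketch of an MCP-style unit in Python (the threshold and inputs are arbitrary choices of mine; the original model also distinguishes inhibitory inputs, which this toy version ignores):

```python
def mcp_neuron(inputs, threshold=2):
    """Toy McCulloch-Pitts unit: fires (returns 1) only if the
    number of active binary inputs reaches the threshold."""
    return 1 if sum(inputs) >= threshold else 0

# With a threshold of 2, the unit behaves like an AND gate on two inputs.
print(mcp_neuron([1, 1]))  # 1
print(mcp_neuron([1, 0]))  # 0
```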

Current ANNs have weighted inputs and more complex nonlinear activation functions to allow for meaningful learning and more precise I/O mapping. But they’re just a slightly improved version of the MCP model. The basis is the same: some weighted inputs are transformed into an output. Whether the neuron learns well or not is irrelevant to the fact that these models don’t – not even loosely – resemble a real, biological neuron.

Current ANN neuron model. Image by author
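For contrast, here’s a minimal sketch of the standard artificial neuron used today – a weighted sum passed through a nonlinearity (a sigmoid here, one common choice) – showing how little the template has changed:

```python
import math

def artificial_neuron(inputs, weights, bias=0.0):
    """Standard ANN unit: a weighted sum of the inputs plus a bias,
    squashed by a sigmoid nonlinearity."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# Same structure as the MCP unit, just with learnable weights
# and a smooth activation instead of a hard threshold.
print(artificial_neuron([0.5, -1.0, 2.0], [0.8, 0.2, -0.5]))
```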

The main simplification – and the one that most hurts the MCP model – is that each neuron in an ANN is collapsed into a single point in space. This is sufficient to simulate the behavior of some neurons, but the biophysical nature of other, more complex neurons is too nuanced and intricate to fit that abstraction.

Electrical signals flow through the dendrites, the soma, and the axon across space and time. Not all dendrites function the same way. Not all inputs participate in generating the outputs – voltage decreases along the dendrites. Dendritic tree morphology, synaptic patterns, and different types of receptors all influence neuronal behavior. And there are many more elemental mechanisms and processes that form the basis of what eventually gives rise to our intelligence.

None of those characteristics are described in MCP or current ANN models.

By the time the MCP neuron was conceived, neuroscience already knew that neurons had features that made them irreducible to a point in space. McCulloch and Pitts simplified away these complexities in order to build a logic-based model.

In the process, they laid the groundwork for a whole field that never deigned to look back at its own premises and contrast them with relevant discoveries in neuroscience.

The complexity of biological neurons

The MCP model was created almost 80 years ago. Since then, neuroscience has developed drastically and our understanding of neurons has come to a point in which we can’t keep calling artificial neurons "neurons" anymore. Let’s recall some of the most relevant research on the topic to illustrate just how far AI has diverged from the cognitive sciences.

Neurons and dendrites – Processors within processors

In the 80s, Christof Koch and others found that dendritic morphology and synaptic patterns could influence how neurons internally processed the inputs. For a long time, scientists thought dendrites behaved uniformly and passively summated the inputs. Koch’s experiments led to the conclusion that neurons are far more complex than that.

More recently, neuroscientists investigated the role of individual dendrites and discovered that they act as processing units themselves: Dendrites have their own threshold for generating spikes (called dendritic spikes), which is different from the threshold of the whole neuron.

That is, neurons aren’t simple "logic gates," as the MCP model suggests. Dendrites seem to be capable of acting as logic gates themselves. A biological neuron is therefore a processing system that, in turn, is composed of independent processing systems.

To represent this in ANNs, connections between neurons would need to have distinct morphologies that affect their role each time the neuron generates an output. And those connections would act internally as processing systems: Each input connection arriving at the neuron would, internally, either generate a spike or not, radically changing the overall output of the neuron.

This means that the artificial counterpart of a biological neuron is better understood as a layered network, in which the layers (dendrites) function as nonlinear intermediate I/O mappings. The resulting intermediate outputs are then summed according to the morphology of the "connection tree" to produce the final output.
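To make the idea concrete, here’s a toy sketch of my own (not a model from the literature) of a neuron whose dendritic branches are themselves nonlinear subunits, with their intermediate outputs combined at the soma:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def dendritic_neuron(inputs_per_branch, branch_weights, soma_weights):
    """Toy two-level neuron: each dendritic branch computes its own
    nonlinear I/O mapping, and the soma combines the branch outputs."""
    branch_outputs = [
        sigmoid(sum(x * w for x, w in zip(xs, ws)))
        for xs, ws in zip(inputs_per_branch, branch_weights)
    ]
    return sigmoid(sum(b * w for b, w in zip(branch_outputs, soma_weights)))

# Two branches with two synapses each, combined at the soma.
print(dendritic_neuron(
    inputs_per_branch=[[1.0, 0.5], [0.2, 0.9]],
    branch_weights=[[0.7, -0.3], [0.4, 0.6]],
    soma_weights=[1.2, 0.8],
))
```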

Building on these striking discoveries, Albert Gidon and colleagues published a groundbreaking paper in Science last year. They found a new I/O characteristic in human pyramidal neurons that wasn’t described by current models. Dendrites from these neurons produced a type of spike whose amplitude was highest for stimuli right at the threshold and decreased as the incoming electrical current increased.

Their discovery proved that some dendrites can act as XOR logic gates – the output is true if and only if exactly one of the inputs is true. In 1969, Minsky and Papert proved that a single-layer perceptron – a basic early type of ANN – couldn’t do this type of computation. Now, it’s clear that a single biological dendrite can. That’s two degrees of complexity higher: from a single dendrite, to a single neuron, to a simple multi-layer ANN.
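For reference, here’s a small sketch (weights hand-picked for illustration) showing the XOR truth table, a brute-force check that no single threshold unit on a coarse weight grid reproduces it, and a two-layer network of the same units that does:

```python
import itertools

def step(z):
    return 1 if z >= 0 else 0

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Brute-force search over a coarse grid of weights and biases:
# no single threshold unit matches XOR on all four input pairs.
grid = [w / 2 for w in range(-6, 7)]
single_unit_can_xor = any(
    all(step(w1 * a + w2 * b + bias) == out for (a, b), out in XOR.items())
    for w1, w2, bias in itertools.product(grid, repeat=3)
)
print(single_unit_can_xor)  # False

# A two-layer network of the same threshold units computes XOR:
# hidden units detect OR and NAND, and the output unit ANDs them.
def xor_net(a, b):
    h1 = step(a + b - 0.5)      # OR
    h2 = step(-a - b + 1.5)     # NAND
    return step(h1 + h2 - 1.5)  # AND

print([xor_net(a, b) for a, b in XOR])  # [0, 1, 1, 0]
```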

If a dendrite can do work at the level of ANNs then, how much more complex are biological neurons compared to artificial neurons?

It takes 1000 artificial neurons to simulate 1 biological neuron

A few days ago, David Beniaguev and his colleagues published a paper in Neuron that proves what’s been suggested all these years: An artificial neuron can’t accurately represent a biological neuron at all.

To prove this, they decided to use modern machine learning techniques to simulate the I/O behavior of a pyramidal human neuron. They wanted to test two things: Whether an ANN can precisely predict neuronal outputs when trained on real I/O pairs, and how big the ANN needs to be to capture the whole complexity of a biological neuron with accuracy.

They found that, at the very least, a 5-layer, 128-unit TCN – temporal convolutional network – is needed to simulate the I/O patterns of a pyramidal neuron at millisecond resolution (single-spike precision). They varied depth and width and found that the best performance was achieved by an 8-layer, 256-unit TCN.
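As a rough sketch of the kind of architecture involved (not the authors’ exact model – layer count, width, kernel size, and input dimensionality here are placeholders), a stack of causal 1D convolutions in PyTorch might look like this:

```python
import torch
import torch.nn as nn

class SimpleTCN(nn.Module):
    """Toy temporal convolutional network: a stack of causal 1D
    convolutions with ReLU nonlinearities. Depth and width are the
    knobs varied in the study (e.g. 5x128 up to 8x256)."""
    def __init__(self, in_channels, hidden=128, layers=5, kernel_size=3):
        super().__init__()
        blocks = []
        channels = in_channels
        for _ in range(layers):
            # Left-padding keeps each convolution causal in time.
            blocks += [nn.ConstantPad1d((kernel_size - 1, 0), 0.0),
                       nn.Conv1d(channels, hidden, kernel_size),
                       nn.ReLU()]
            channels = hidden
        self.body = nn.Sequential(*blocks)
        self.head = nn.Conv1d(hidden, 1, 1)  # per-timestep output (e.g. spike probability)

    def forward(self, x):  # x: (batch, input_channels, time)
        return self.head(self.body(x))

# Placeholder example: 128 input channels over a 100-step window.
model = SimpleTCN(in_channels=128)
print(model(torch.randn(1, 128, 100)).shape)  # torch.Size([1, 1, 100])
```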

To make a gross comparison: This means a single biological neuron needs somewhere between 640 and 2,048 artificial neurons (5 × 128 and 8 × 256 units, respectively) to be simulated adequately. It doesn’t imply a biological neuron has that much more computational power or complexity. However, it’s a clear sign that both types of neurons are further apart in form and substance than previously thought.

The researchers were able to pin down the exact mechanisms that made the biological neuron so difficult to simulate: dendritic morphology and the presence of a specific type of synaptic receptor, the NMDA receptor. Both are structural and functional aspects of neurons that have been well known in neuroscience for a long time but are completely ignored in modern AI and ANNs.


Some questions arise from these results: Why hasn’t the AI community tried to remodel its foundations to adapt better to the reality they’re trying to simulate? Is AI destined to fail in its quest to achieve AGI until those foundations are overthrown and rebuilt from the ground up? What would be the consequences of changing AI at such an elemental level?

Let’s see how all this unfolds.

AI & neuroscience – Diverging pathways

The neurons in our brain – although not all – are way more complex than their artificial counterparts. It’d be reasonable to approach this issue by checking if the assumptions established by AI and deep learning still hold despite the recent – and not so recent – discoveries in neuroscience.

It may be the case that AI could still work perfectly fine without changing anything. It’d carry on with its path towards AGI despite the apparent differences between digital and biological neural structures. However, it seems almost no one in AI cares enough to even check it.

The reason is that, from the very early days, neuroscience and AI parted ways – although both fields are trying to answer tightly related questions. Neuroscience is concerned with intelligence, the brain, and the mind. Neuroscientists decided to look inwards, to the only instance of intelligence we know of: us. In contrast, AI is concerned with replicating intelligence using artificial means. Its practitioners care about designing and building intelligent agents that can perceive the world and act on it accordingly.

Neuroscience is a pure science whose purpose is to find truth. It’s driven by curiosity and a hunger for knowledge. AI – at least short-term AI – is largely driven by money and usefulness. People in the industry aren’t concerned that the very basis of all deep learning could crumble into pieces if we analyzed it carefully. They care that AI keeps attracting funding and that their models seem to work somehow, even if no one can say why.

Neuroscience reviews its foundations time and again, but artificial intelligence has chosen another way: it made its assumptions and went forward without looking back once.

The levels at which both fields are working and developing aren’t the same, but it’s not fair to say that everyone in AI sees the field through a technological, money-driven lens. There are people working very hard to advance the field as a science – those who still see it as a means to solve intelligence and fulfill the original mission of AI’s founding fathers: Artificial general intelligence.

They acknowledge the distinction between useful AI, which works fine for simple, narrow tasks and is being deployed everywhere, and challenging AI, which needs important breakthroughs to get to the next level. In the latter case, there’s an ongoing debate about the best path to follow. Whereas some argue deep learning is the way – it may need some tweaks, but it’ll work eventually – others think it won’t ever be enough by itself.

But is that what they should be debating about?

Is the AI community focused on the wrong problems?

This debate should be happening if and only if all the lower-level debates are closed and agreed upon. Yet nothing could be further from the truth. The lowest cornerstone on which the belief that deep learning is the path to AI’s future rests remains in doubt: Artificial neurons may be too dissimilar from biological neurons to ever give rise to complex cognitive processes and human-like intelligence.

We could compensate for the lack of complexity in artificial neurons with larger models, tons of computing power, and gigantic datasets, but that’s too inefficient to be the eventual last step of this quest.

Yet, those are the priorities of the AI industry. How can they make chips that don’t lose bandwidth while keeping efficiency? Either they stack GPUs or make/buy specialized chips (only within reach of the richest players). How can they extract and curate larger and larger sets of data? Unsupervised learning and auto-labeling. How can they create and train larger models? Either they’re a big-tech company or they’ll need to ask one for funding.

They keep finding solutions, but is this trend sustainable? It doesn’t seem like it. We may need to go back to the basics. Not only because we won’t be able to build AGI like this, but because we’re starting to feel the collateral burden of denying the inefficiency of today’s AI.

But here’s the catch: if they find they really need to make a change, the whole field of AI as we know it would need a complete overhaul. And they’re simply not willing to accept that. AI industry leaders may even know AI’s bottlenecks are impassable. But they may simply prefer to act as if it doesn’t matter so they don’t have to face the cost of having built all this on top of the wrong assumptions.

There’s an important clarification to make here, though. Some AI systems work well and don’t pollute that much. AI is still an incredibly useful technological discipline that’s bringing lots of innovation across many industries. I’m not denying it. But that’s the exception to the rule. There’s an ongoing race to create ever more powerful AIs, and every major player is there, fighting to get a portion of the pie.

As I’ve argued before, progress shouldn’t come at any cost.

ANNs should be more neuroscience-based for two reasons, one that looks at the future and one that looks at the present: First, the difference in complexity between biological and artificial neurons will result in differences in outcome – AGI won’t come without a reform – and second, the inefficiency with which we’re pursuing this goal is damaging our society and the planet.

Is it worth it?

The consequences – For AI and the world

Even if the AI community doesn’t act on the facts I’ve outlined here, AI, as the fertile industry it is, will keep producing a whole lot of new research projects and useful applications each year.

Narrow AI systems will still succeed at the simple tasks they’re made for despite AI not moving closer to neuroscience. Artificial neural networks will still be popular whether or not the AI community accepts that biological neurons are way more complex than artificial ones. The AI industry will still benefit greatly from pursuing the quest for AGI whether or not it’s eventually achieved – near-AGI can also be world-changing, for better or worse. And the desire to keep raising the standard of living for the privileged people of the developed world will remain, too.

But at what costs?

Ethical concerns about AI have never been more pressing, and the models don’t seem to be getting better. Just a few days ago, the New York Times reported that a Facebook AI system had labeled a group of Black men as primates. Another AI made by Google showed the same harmful bias in 2015. Are we going to ignore all this and put a band-aid on the problem, as Google did by simply removing "gorilla" from its labels?

Making AI explainable, interpretable, and accountable is key to solving these issues. Those are hot areas within AI, but they aim at solving the problem a posteriori, instead of at the root. But how could we do that if there are no robust theoretical underpinnings behind ANNs? There aren’t any theoretical models that can explain the behavior of neural nets. We use them to predict and forecast because they work, but we don’t know why.

With half the planet burning up and the other half drowning in unexpected floods, the climate catastrophe is around the corner. And AI isn’t helping. Its overall carbon footprint is unsustainable.

In 2019, researchers from the University of Massachusetts Amherst studied the environmental impact of large language models (LLMs) – increasingly popular nowadays, with GPT-3 as the spearhead – and found that training one of these big models generates around 300,000 kg of CO2 emissions – the same as 125 New York–Beijing round-trip flights, as Payal Dhar reports for Nature. Some big tech companies (Google, Facebook) are now working to mitigate this and gradually shift to renewable energy.

Related to this issue is that ANNs are extremely inefficient. To learn the simplest tasks they need immense amounts of computing power. That’s the reason why these systems generate such a large carbon footprint. Inefficiency leads to higher exploitation of resources, which generates more pollution.

Human brains produce just a fraction of that footprint and don’t consume anywhere near the same amount of energy to learn or do the same things. The brain is an extremely efficient organ. How can we do such complex things when the brain uses so little energy – not to mention that it’s way slower than computers? Could the reason for this extreme difference be that the complexity of sub-neuronal structures is manifold higher than that of ANNs? Yes, it could be.

LLMs, reserved for the biggest players, are the ones that attract the attention of investors and the mass media. The reason is these models always come surrounded by an overhype that’s transmitted to the public: "Robots Can Now Read Better Than Humans," "GPT-3 […] is freakishly good at sounding human," "a robot wrote this entire article. Are you scared yet, human?"

Overpromising and underdelivering is the AI industry’s trademark. But not all of the AI community participates in selling something they don’t have for the sole purpose of generating publicity and attracting money.

In the words of Emily M. Bender, professor of computational linguistics at the University of Washington: "LLMs & associated overpromises suck the oxygen out of the room for all other kinds of research." There’s crucial research being done besides LLMs that’s being neglected by funding institutions and the media.

"[B]ecause these big claims are out there, and because the LLMs succeed in bulldozing the benchmarks by manipulating form, other more carefully scoped work, likely grounded in very specific application contexts that doesn’t make wild overclaims is much harder to publish."

– Emily M. Bender

Maybe some of that research that’s getting lost in oblivion is trying to alert those with eyes only for the shiny LLMs that we’re doing AI all wrong. Maybe some people are working to no avail on these exact problems I’m describing.

If ANNs are ill-founded, only LLMs seem to matter, and only big tech companies can build and deploy them, then there’s a very real risk that the AI industry is effectively an oligopoly focused on the wrong goals – with no one capable of raising their voice loudly enough for those in charge to hear how mistaken they are.


It seems, however you look at it, that AI and particularly ANNs have problems of paramount importance to solve. Even if we rethink them from their very foundations it’s improbable that the issues will be solved right away.

If we want to understand the bigger picture and unveil new paths that may hint we’ve been heading towards a local maximum all this time, looking at neuroscience and acknowledging its discoveries is the first step.

Maybe the solution is to not try to imbue computers with intelligence. Evolution found a better way. We’re trying to replicate digitally what can be done in the physical realm. Why not try to create silicon-based physical artificial neural networks? Using a computer was Turing’s proposal, but we’ve come a long way since the 1950s. Maybe neuromorphic computing has the answer we should be looking for.

The AI community may not rebuild everything from scratch, but at the very least they should take these two actions: First, acknowledge the limitations and shortcomings of current AI paradigms – mainly ANNs – and take them into account for both future research and for the promises they make. Second, work to make the relevant adjustments in theory and practice.

They may decide to continue with AI and ANNs exactly as they are today, but at least it won’t be out of calculated ignorance; it will be out of honest unwillingness.


If you liked this article, consider subscribing to my free weekly newsletter Minds of Tomorrow! News, research, and insights on Artificial Intelligence every week!

You can also support my work directly and get unlimited access by becoming a Medium member using my referral link here! 🙂

