According to the Gangnam Times article “Musk Declares Arrival of ‘Robot Civilization’ at Davos: ‘Within 5 Years, Humans Will Step Down from the Lead Role’”, Elon Musk stated:
“There is a possibility that an AI more intelligent than any human will appear by the end of this year or next year at the latest.” “By around 2030 or 2031, AI will reach a level of intelligence higher than all of humanity combined.”
However, the likelihood of this prediction coming true as an extension of the current path is extremely low. The very structure of current Large Language Models (LLMs) is disconnected from the path to genuine “intelligence.”
Limitations of LLMs and the Absence of “Emergence”
Current LLM foundation models rely, at their core, on a statistical mechanism: predicting the next token from the preceding context. While their ability to parse syntax and track context has improved dramatically, they have no ability to originate new concepts from scratch. Every output is a sequence of tokens drawn from the fixed vocabulary defined by the tokenizer, so an LLM cannot, in principle, generate anything that lies outside that vocabulary.
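As a rough illustration of the mechanism described above, here is a minimal toy sketch in Python. It is not how any production model is implemented: the five-word vocabulary `VOCAB`, the scores in `logits`, and the helper names are all hypothetical, chosen only to show that sampling from a probability distribution over a fixed vocabulary can only ever return members of that vocabulary.

```python
import math
import random

# Hypothetical toy vocabulary (an assumption for illustration).
# Real tokenizers define a much larger, but still fixed, set of tokens.
VOCAB = ["the", "robot", "drives", "car", "."]


def softmax(logits):
    """Turn raw scores into a probability distribution over VOCAB."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, rng):
    """Draw one token from VOCAB according to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


rng = random.Random(0)
logits = [0.1, 2.0, 0.5, 1.2, -1.0]  # made-up scores, one per vocabulary entry
samples = [sample_next_token(logits, rng) for _ in range(1000)]

# However many times we sample, the output never leaves the vocabulary.
print(set(samples) <= set(VOCAB))
```

In this sketch the model's "choice" is nothing more than a weighted draw over predefined symbols, which is the structural point the paragraph above is making.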
In summary, current LLMs lack the following elements:
- Conceptual understanding of time
- Logical grasp of state transitions
- Causality as an internal representation
- Intention, purpose, and value functions
These elements are indispensable to intelligence, yet current AI possesses none of them. In other words, current AI has not even reached the starting point (the trailhead of a low mountain) of mimicking “human perceptual integration,” “embodiment,” or “learning structures.” Dreaming of “Superintelligence” as an extension of this is nothing more than an illusion that ignores the engineering leaps still required.
“Human Exceptionalism” as Decelerationism
The greatest fallacy in Musk’s judgment lies in “deifying humans as special beings.” This is likely a bias rooted in Western religious views that humans are made in the image of God. This bias leads to flawed technical choices, such as the obsession with humanoid robots (Optimus) and autonomous driving that relies solely on vision (Tesla Vision).
This is not accelerationism, but a stagnation that should be called “Decelerationism.” Musk’s premises always follow the same flawed schema:
- Human form = Optimal
- Human senses = Optimal
- Human intelligence = Optimal
- Human movement = Optimal
For example, Musk believes that “humans drive with only their eyes,” but this is a fatal misunderstanding of human perceptual integration.
In reality, humans drive by integrating the following elements:
- Vestibular sense (acceleration and tilt)
- Hearing (engine sound and surrounding traffic noise)
- Touch (road vibrations felt through the steering wheel and seat)
- Prediction and instinct (danger perception based on past experience)
- Dynamic switching of attention
Humans never grasp space through visual information alone. Furthermore, given the high frequency of accidents caused by human drivers, the premise that human driving ability is “optimal” is fundamentally broken. Ignoring the fact that “human driving ability is neither special nor optimal” and attempting to make AI mimic the same flawed structure is an act that distorts the discussion of safety.
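The value of integrating several senses, as in the list above, can be illustrated with a standard textbook technique: inverse-variance weighting of independent measurements. The sketch below is only an illustration under stated assumptions. The two "sensors", their distance readings, and their variances are made-up numbers, and the method assumes each reading is unbiased with independent noise. It is not a description of any real driving system.

```python
def fuse(estimates):
    """Inverse-variance weighted fusion of independent, unbiased estimates.

    estimates: list of (value, variance) pairs, each variance > 0.
    Returns (fused_value, fused_variance).
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    fused_variance = 1.0 / total
    return fused_value, fused_variance


# Hypothetical distance-to-obstacle readings in metres (made-up numbers).
camera = (50.0, 4.0)   # noisier estimate
ranging = (48.0, 1.0)  # more precise estimate from a second, independent sensor

value, variance = fuse([camera, ranging])
print(value, variance)
```

Under these assumptions the fused variance is always smaller than the variance of the best single sensor, which is why discarding an independent information channel cannot improve the estimate.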
Inefficiency of the “Humanoid” Form in Robotics
The obsession with humanoid robots follows the same pattern. From an engineering perspective, the structure of the human body is by no means efficient.
- Instability due to bipedalism
- Joint structures that are prone to wear and tear
- An incomplete upright structure that causes back pain
- Extremely low energy efficiency
In terms of robotics, the humanoid form is one of the “worst designs.” If one aims for true accelerationism, there is no need to be bound by the shape of the “randomly selected species” known as humans.
Why Does Musk Persist in “Human-Centricity”?
The likely explanation is economic rationality rather than engineering merit.
- Piggybacking on existing infrastructure: Roads, factories, and houses are all designed for “humans.” A humanoid robot can be brought to market without rebuilding social infrastructure, allowing costs to be passed on to society.
- Data lock-in: The vast amount of video data held by Tesla is modeled on “human vision.” If LiDAR and multi-sensor integration become mandatory, the advantage of that vision data will be lost.
- “Understandability” as marketing: Investors will put money into a robot that looks, moves, and speaks like them, rather than into an unknown advanced intelligence.
Conclusion: Liberation from the Curse
True accelerationism is nothing other than liberating intelligence from the “curse” of the human form.
| Item | Musk’s Premise (Human-Centric) | Reality/Engineering Perspective |
|---|---|---|
| Perception | Vision alone is sufficient | Vision is a fragment of information. Multi-faceted sensor integration is essential. |
| Form | Humanoid is versatile and optimal | Bipedalism has extremely low efficiency and stability. |
| Intelligence | Consciousness will emerge as an extension of language models | LLMs are merely statistical engines that do not handle “meaning.” |
| Safety | Matching human performance is good enough | Systems require reliability that far surpasses that of humans. |
Treating the current body structure and perceptual system, which were selected “accidentally” in the course of evolution, as the final engineering goal is an act of rejecting true evolution. True breakthroughs exist only beyond the point where we deconstruct the “definition of intelligence” held by humans and discard the human scale.
