Is it possible to destroy a microcontroller with software?


29

Assumptions:

  • No external circuitry is connected (other than the programming circuit, which we assume is correct)
  • The uC itself is not faulty
  • By "destroy" I mean releasing the blue smoke of death, not bricking it in software
  • It is a "normal" uC, not some 1-in-a-million special-purpose one

Has anyone ever seen this happen? How would it even be possible?

Background:

A speaker at a meetup I attended said it was possible (and not even that hard) to do this, and some other people agreed with him. I have never seen this happen, and when I asked how it was possible, I didn't get a real answer. I'm really curious now, and I'd love to get some feedback.


3
The only feasible way for this to happen, IMO, is if a pin is physically connected to VCC/COM, and said pin is configured to be driven opposite to what it's connected to, causing an over-current condition. But that's a combined HW/SW fail.
Shamtam

6
Many controllers have flash which can be written under software control, and which is subject to wear. Would software which wore out the memory in a short period of time count as "destroying" the chip?
supercat

1
Aside from seconding @supercat's observation about EEPROM or flash wear (it's possible to wear out EEPROM in a few minutes), I'll add that from a user's point of view there is often very little difference between a physically destroyed device and a 'bricked' product. If it has to go back to the factory, it looks pretty much the same.
Spehro Pefhany

1
Beware of the nth-complexity infinite binary loop. It has been around for ages...
jippie

3
@Roh I already burnt a chip, because the hardware guy swapped the Vcc and GND pins on the PCB. (I think he thought the chip was a drop-in replacement... It wasn't.) There was smoke and burnt plastic. The chip didn't last long, but the wires can apparently survive this.
Mishyoshi

Answers:


20

Of course you can, with the HCF instruction!

That said, I'd say it's impossible without any external circuitry, apart from power and such.

Even some unintentionally faulty connections probably won't cut it: if you tie all the GPIOs to a power rail and set them as outputs driven to the opposite rail, the chip can dissipate quite a lot of power. But a GPIO pin is probably protected against short circuits and such, so nothing harmful will happen.

Designing an external circuit that destroys the chip at will is not trivial either, in my opinion. The first thing that comes to mind needs a somewhat high-voltage power supply, an NMOS and a resistor:

schematic

simulate this circuit – Schematic created using CircuitLab

Where:

  • VCC is the usual supply for the micro, 3.3V to 5V or whatever is needed
  • HV is a supply whose voltage is well above the absolute maximum ratings of the micro
  • D1 is there to protect your valuable 3V3 voltage source
  • R1 pulls the MOSFET gate high when the micro is not holding it at ground
  • M1 is the designated killer

The operation is simple: if the micro releases GPIOx, M1 turns on, Vcc rises, and your chip catches fire. Note that this is a crappy setup; for example, HV must be turned on only after you are extra sure that GPIOx is firmly held to ground. Some transistors might not like a -5V Vgs, and so on... But you get the picture.


3
love the HCF reference.
placeholder

Hey, thanks for giving me a new TV series to check out!
OJFord

@OllieFord I'm not sure of what you're talking about...
Vladimir Cravero


15

Disclaimer: supercat said that first in a comment.

Actually, it is not possible to physically destroy most MCUs, but it is possible to wear one out enough that it starts malfunctioning to the point of being unusable. I have experience with TI's MSP430, so here it goes:

Those MCUs allow reprogramming the whole flash at any time. Not only is it possible to wear out the flash by rewriting it millions of times until it fails, but the on-chip flash programming generator can cause failure on lower-end parts if it is incorrectly configured. There is an allowed frequency range for flash programming; when you get outside that range (slower), the programming time may become excessively long and cause failure of the flash cells. After only a few hundred cycles, it is possible to "burn" the flash cells, causing permanent failure.

Also, some models allow overclocking the core by increasing the internal core voltage. The MCU runs from a 1.8-3.6V supply, but the core itself is designed to run at 1.8V. If you overclock the core too much on a 3.6V rail, while toggling all I/Os, activating all peripherals and running at a blazing 40MHz (the normal maximum is 25MHz on larger models) in a small closed case, you may end up frying the core from overheating. Actually, some guys said they achieved those frequencies (usually the DCO fails first and the chip is saved, but well... maybe).

Just try it?


nit-pick - I believe most flash is guaranteed to work for no more than 10,000 writes, not "millions". Probably not worth fixing, as you are making a point.
gbulmer

2
Ah... Flash wear. I remember the first time I had a bug that caused constant writes to EEPROM on a pic. All it took was 10 seconds or so of run time. It took me about a minute to realize what happened :-)
slebetman

6

According to stackexchange - "Is it really a bad idea to leave an MCU input pin floating?"

It describes several circumstances in which a chip may be damaged by an open-circuit pin. Edit: as an example, Spansion Analog and Microcontroller Products says:

4.1 Port Input / Unused Digital I/O Pins
It is strongly recommended to do not leave digital I/O pins unconnected, while they are switched to input. In this case those pins can enter a so-called floating state. This can cause a high ICC current, which is adverse to low power modes. Also damage of the MCU can happen.

The condition in this question is exactly open circuit pins.

So, our task is to drive that from may to will damage the pin. I think that is enough to go beyond 'bricking'.

One mechanism identified in that answer is driving an input pin to a mid-value voltage, where the two complementary transistors are both 'on'. Operating in that mode, the pin interface may get hot or fail.

An input pin has a very high impedance, and is also a capacitor. Presumably, there is enough coupling between adjacent pins that toggling neighbouring pins fast enough could drive charge onto the input pin and push it into that 'hot' state. Might half the I/O pins being driven into that state warm the chip up enough to cause damage?

(Is there a mode where the capacitance of an open-circuit pin might be used like a voltage doubler? Hmm.)

I also think damaging flash is enough. I think that is bad enough to make the chip useless.

It doesn't need to be all of flash, but only the page which contains the power-on, RESET, etc. vectors. Hitting the limit on a single page might take only a few tens of seconds.

I had an indication (but no solid evidence) that for some MCUs it may be worse. I attended a presentation a couple of years ago. Someone asked why competitors offered parts with much higher flash write-cycle ratings. The presenter (from a large, unnamed MCU manufacturer) said they took a much more conservative approach in their flash memory specifications: their guarantee was defined at a significantly higher temperature than the industry norm. Someone asked "so what?". The speaker said several manufacturers' products would have significantly lower rewrite lifetimes than their parts at the same temperatures they used. My recollection was that 5x would become <1x. He said it is very non-linear. I took that to mean programming at 80C instead of 25C would be a "bad thing".

So, flash rewriting combined with a very hot chip, might also render it useless in less than 10 seconds.

Edit:
I think "releasing the blue smoke of death" is a harder constraint than required. If any of the RESET pin circuit, brown-out detector, power-up circuitry, or RC or crystal oscillator (and probably a few other circuits) could be damaged, the chip would be rendered useless.

As others have noted, breaking flash would kill it irreparably too.

"Smoke" sounds impressive, but less obvious fatal attacks are still fatal, and much harder to detect.


5

One potential source of such destruction is SCR latchup, where unintended (intrinsic) transistors in a chip get together to form a kind of TRIAC which can then sink a lot of current. This can easily blow bond wires, and I've even seen plastic encased devices visibly warped because of the heat produced.

The typical cause is driving (even momentarily) an input to above or below the supply or ground rails respectively, but I guess you might see it happen if an input was left floating. And it's not then hard to imagine a circuit where the input's floating-ness was software controlled (although that would be a very silly thing to allow).


4

It's POSSIBLE that software intentionally written for the purpose, targeted at a very specific processor, might be able to force overclocking to the point at which the processor would overheat. Provided, of course, that the processor contains software-configurable clock-control registers.

It's NOT possible that ALL processors can be damaged this way, of course. If that were true, there'd've been billions of Z80s and 6800s and 6502s laid by the wayside by wayward software-writing tyros back when we were still typing in machine code manually, making lots of random mistakes.


2
You don't need access to configure the clock. You just need to run software in a manner the CPU designer didn't envision. Here's some code that tries to achieve the theoretical 4 FLOPS per cycle on an Intel Core series processor: stackoverflow.com/questions/8389648/…. This code has been known to overheat CPUs.
slebetman

1
Is this related to "power virus" programs?
davidcary

1
@davidcary, that's a brand-new term for me. I was referring, though, not to a series of clock-hungry instructions but rather to actual alteration of clock rate (some processors support software control over the clock rate through manipulation of control registers) to a higher frequency than the CPU or its heat sink can deal with.
TDHofstetter

3

This is my entry for ruining a microcontroller with as few parts as possible...

Just toggle the output pins at a few kHz!

You still might not see smoke, depending on the internal failure mode though.

schematic

simulate this circuit – Schematic created using CircuitLab

--Edit, added Aug 22--

Now, I don't think you can ruin a microcontroller with the criteria given. But you can EASILY ruin external circuitry with the wrong code. An example that comes to mind is a simple boost converter I designed recently... simply pausing the code while debugging could short an inductor to ground through a MOSFET. POOF


2
I don't want to be "That Guy", but Assumption #1: "No external circuitry connected"
Radian

1
You're being "That Guy". The subtext of this response is "No, you can't ruin a chip like that."
Daniel

2

In terms of regular user mode code I don't think you can write anything that will break the chip.

However, I do remember the days of microprocessors that could be destroyed in less than a minute, or even seconds, if the heat sink fell off. Then they added thermal detection circuits that would turn the clock down if the part got too hot. Now that we're able to put in far more transistors than can be used at once, chips are capable of making more heat than the heat sink can dissipate, and it's the power management and thermal circuits that keep them safe. For example, see Intel Turbo Boost 2.0. Therefore it seems quite possible to melt down a chip if you're able to bypass or raise the limit on the power management and thermal circuit. So, if these are under software control (no idea; maybe it requires a BIOS update?) then you could run a bunch of parallel do-nothing loops, along with integrated GPU work, along with hardware H.264 decoding and encoding, and anything else the chip can do, all at once, until the chip overheats and emits the magic blue smoke.


2

I'm most familiar with the STM32 processors, so these apply most to that family. But similar approaches may be possible with other processors also:

  1. There is a permanent write-protect mode. So if you program that bit, along with some useless program, into the FLASH, the MCU can never be used again. I don't know if this counts as 'bricking', but it does involve a permanent hardware mechanism.

  2. The programming pins can be reconfigured as GPIO. Because the clock pin is driven by the programming device, this could be used to cause a short circuit. Most probably it would break only that single pin, but since it is a programming pin, that would be quite bad.

  3. As mentioned by dirkt, the PLLs can be used to overclock the processor. This could possibly cause it to overheat or otherwise get damaged.


1

Whoever told you that doesn't understand how involved the design process of such chips is. That doesn't mean that slip-ups don't happen, or that the code coverage of the regressions and corner cases never misses things, but to claim that ALL or even most processors have this flaw is logically dubious.

Just ask yourself what happens when an overclocker exceeds timing requirements (assuming it doesn't overheat): the chip fails, and perhaps corrupts memory and even HDD accesses, but fundamentally the processor will fire back up again and even run the OS again once the corruption is fixed. So what sort of properly designed microcode could possibly cause MORE disruption than this scenario? Answer: very likely none.

TLDR; All processors have this fault - NOT


I believe some/most microcontroller CPUs (by volume, not value) are not microcoded. So does that invalidate your assumption?
gbulmer

No, whether you're designing a sequencer or a fixed purpose cell the regressions and constraints/tests on the design will be tight.
placeholder

For a blue puff of smoke to occur, the CPU must have overheated one way or another: from very high voltage, very high current, reverse polarity, or too many transistors switching at too high a frequency. Only the last method is doable in software. CPUs that run below around 500MHz are unlikely to die from software-caused overheating, but I've seen CPUs die of exactly that. The assumption you made is exactly the one you shouldn't make.
slebetman

@slebetman you are conflating far, far too many things here. How does one get "reverse polarity" through software instructions? How does one get "too many switching at too high a frequency"? Is there perhaps a magical flaw in all chips that turns them into massively parallel execution pipelines?
placeholder

@placeholder: I said you can't get reverse polarity through software instructions. Did you read my comment?
slebetman

1

I believe that it is certainly possible to physically destroy a microcontroller (MC) with software. All that is required is the combination of the MC executing a "tight" loop of instructions that causes 100% utilization, and a "defective" heat sink which allows heat to build up inside the chip. Whether the failure takes seconds, minutes or hours will depend on how fast the heat builds up.

I have a laptop computer that I can only use at 50% continuous utilization. If I exceed this, the computer shuts itself down. This means that at 50% usage the MC temperature is below the set trigger point. As usage increases, the temperature of the MC increases until the trigger point is reached. If the thermal shutdown circuit did not work (or did not exist), the temperature of the MC would keep increasing until the chip was destroyed.


0

schematic

simulate this circuit – Schematic created using CircuitLab

#include <avr/io.h>

int main(void)
{
    DDRB |= _BV(2) | _BV(4);  /* configure PB2 and PB4 as outputs */
    PORTB |= _BV(2);          /* drive PB2 high */
    PORTB &= ~_BV(4);         /* drive PB4 low */
    for (;;);                 /* hold the short indefinitely */
}

The code above causes the MCU to drive PB2 high while pulling PB4 low; with the two pins connected together, this creates a short circuit from VDD through PB2 and PB4 to GND, and the port drivers of PB2 and/or PB4 will quickly fry. The short circuit could be an innocent error like an accidental solder bridge.


I'm skeptical that this would work. IO pins usually can't source or sink large amounts of current. The IO driver transistors would limit the current.
Adam Haun

@AdamHaun The problem is that there is no current limiting here. What happens is that this circuit can burn out those transistors.
Maxthon Chan

The current limiting is from the size and gate voltage of the output drive transistors. Maybe a 5V AVR could burn out the drivers, but looking at ATMega typical driver strength charts, with 3V Vcc shorting two pins together might not even exceed the absolute max pin current. And the current goes down at high temp! Lower-power MCUs would probably be fine.
Adam Haun
Licensed under cc by-sa 3.0 with attribution required.