Meltdown and Spectre bugs

Media coverage of the Meltdown and Spectre bugs has so far been pretty awful, so here is my recap.

It is written in Q&A style.

Note

This text is not really meant for hackers. There is a very nice explanation of these bugs on the Google Project Zero blog; go ahead and read it.

This page is for people who want more of a layman's description: one that is close to the truth, without being either too pessimistic or too optimistic about the impact of these vulnerabilities.

What is all the fuss about?

Here I describe how Meltdown and Spectre work; a "What should I do?" section follows.

What is process isolation?

Let's say you have a computer, and you are running two apps: a browser and a password manager.

One of the more important roles of your operating system (Windows/Linux/whatnot) is to make sure that these programs cannot read each other's memory.

Let's now say you are logging in to Facebook, and your password manager contains the password for your online banking system. You wouldn't want your browser to be able to learn the password to your bank account unless you paste it in yourself. The operating system makes sure that this is not possible, or at least it did until last week.

You also don't want the Facebook tab to know the password you are entering in your bank's tab (this is enforced by the browser, although in the case of Google Chrome the OS is still involved).

What is kernel / userland isolation?

There are two kinds of programs in your computer:

  • programs that are part of the kernel (the operating system itself)
  • the rest: so-called userspace (or userland) programs.

Kernel code has many more privileges than userspace programs, and we don't want userland programs to be able to access kernel memory. The operating system makes sure that this is not possible, or at least it did until last week.

How these attacks work (background knowledge)

These attacks combine so-called 'side channel attacks' with 'speculative execution' to break both kernel and process isolation.

CPU Caches

CPU is another name for your processor.

You might think that you have two kinds of memory in your computer, one is "slow" (your drive), and one is "fast" (RAM memory).

This is not exactly right. You actually have at least three kinds of memory:

  • CPU Cache (it is "fast")
  • RAM Memory (it is "slow")
  • Disks ("very slow")

Your processor (CPU) copies parts of RAM into the cache to speed up execution.

What is speculative execution?

Imagine you are a railroad switch operator in, say, the 1800s. You can hear a train coming, but you have no idea which track the train will take. You can do two things:

  1. Stop the train. But trains in the 1800s took a lot of time to brake or speed up! (That's in fact one of the things that haven't changed much since then :) )
  2. Assume that the train will go, let's say, right. If the train was indeed meant to go right, you've just saved everybody a lot of time! If the train wanted to go left, it now needs to stop, go into reverse and back up, and then you flip the switch so it can restart down the correct track.

Note: I have paraphrased this (very informative) Stack Overflow answer.

A CPU works somewhat like that train: it also takes a long time to get up to speed (i.e. to fill the instruction pipeline), and it also has switching points where the processor needs to make a decision. Sometimes this decision depends on data that is not yet available (it is not in the CPU cache, since it might still be in RAM or on disk).

CPUs have specialized hardware called a "branch predictor" that tries to guess which "track" the program will take. If the guess is wrong, the CPU is supposed to erase all traces of the wrong prediction and then restart the pipeline.
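
To make the guessing a bit more concrete, here is a minimal sketch of the simplest predictor I can think of: "the train will go the same way it went last time". This is an illustration only; real branch predictors are far more elaborate, and the function name and the `first_guess` parameter are made up for this example.

```python
def count_mispredictions(outcomes, first_guess=True):
    """Count wrong guesses of a toy 'same as last time' branch predictor.

    `outcomes` is a list of booleans: True means the branch was taken
    (the train went right), False means it was not (the train went left).
    """
    prediction = first_guess
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1          # wrong guess: the pipeline must be restarted
        prediction = taken       # next time, guess whatever happened this time
    return misses


if __name__ == "__main__":
    steady = [True] * 10             # the train always goes right
    alternating = [True, False] * 5  # the train keeps changing its mind
    print("steady:", count_mispredictions(steady))
    print("alternating:", count_mispredictions(alternating))
```

A predictable sequence costs almost nothing, while an unpredictable one forces a restart nearly every time. Every one of those wrong guesses is a piece of work the CPU started and then had to throw away.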

Side channel attacks

This is a broad category of attacks on computer systems that use some kind of extra information leaked by the system (a "side channel").

In most cases the side channel is timing information; in other cases it might be, for example, the power consumption of the chip embedded in your credit card.

Let's have a simple example: a website has a login page, and a user has the password psswrd. The password checking algorithm works as follows:

  • If the first letter of the password and user input do not match, return that passwords don't match.
  • If the second letter of the password and user input do not match, return that passwords don't match.
  • ... you get it.

Note

No sane webpage would use this algorithm for a variety of reasons!

Now, this password scheme is very easy to break. First, the attacker tries passwords that look like aa, ba, ca, ..., pa, ..., za.

When checking the aa password, the website makes one check: whether a equals p (the first letters of the attacker's input and of the real password). When it checks pa, it needs to make two checks: first it compares p with p, and then a with s. Two checks take more time than one.

The website will take slightly longer to check the pa password than any of the others, as the server needs to check two letters, while for all other passwords it only needs to check a single letter.

In this way the attacker has established that the first letter of the correct password is p, and can move on to the second letter.

Breaking the password this way is much, much easier than trying to guess all possible passwords.
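
Here is a minimal sketch of this attack. To keep it reproducible, it counts letter comparisons instead of measuring real time; the count stands in for the timing an attacker would observe. The function names are made up for this example, and I assume the attacker already knows the password length, which the text above does not say.

```python
import string

SECRET = "psswrd"  # the example password from the text


def naive_check(guess, secret=SECRET):
    """Compare letter by letter and stop at the first mismatch.

    Returns (passwords_match, number_of_letter_comparisons).
    """
    comparisons = 0
    for g, s in zip(guess, secret):
        comparisons += 1
        if g != s:
            return False, comparisons
    return len(guess) == len(secret), comparisons


def recover_password(length, alphabet=string.ascii_lowercase):
    """Recover the secret one letter at a time, using only the 'timing'."""
    known = ""
    for position in range(length):
        padding = "?" * (length - position - 1)

        def score(letter):
            ok, comparisons = naive_check(known + letter + padding)
            # Prefer the slowest check; a successful login breaks ties
            # on the very last letter.
            return comparisons, ok

        known += max(alphabet, key=score)
    return known


if __name__ == "__main__":
    print(recover_password(len(SECRET)))
    print("guesses needed: at most", 26 * len(SECRET), "instead of", 26 ** len(SECRET))
```

With a six-letter lowercase password, this needs at most 156 guesses instead of up to 308,915,776.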

Note

These kinds of attacks actually happen in the wild. Usually the attacker needs a slightly more complicated setup: for example, trying each password a thousand times and recovering the time the webpage takes to compare passwords with some not-so-complicated statistics.

How these attacks work

I'll explain just one example of these attacks, the one codenamed "Meltdown".

Suppose you want to learn whether a single bit of system memory is one or zero. Let's say that this bit is named unknown_bit.

You craft a program that works as follows:

  1. You create a variable.

  2. You make sure that the CPU cache of your processor is empty.

  3. You read unknown_bit. You have no right to read this memory, but the CPU does not yet know whether you can or can't read it. It will go ahead with 'speculative execution', assuming that you have this right.

  4. You perform some operation on unknown_bit which results in reading variable if this bit is zero, and reading something else if it is one.

    Now, depending on whether unknown_bit is zero or one, variable either is in the CPU's cache or is not.

  5. Some time later, the CPU discovers that you were not allowed to read unknown_bit and restores the state of the CPU from step 2. (Just like reversing that train :) ).

    The problem is (and this is the bug) that the CPU does not undo the changes to its cache.

  6. Now you measure how long it takes to read variable. If the read is fast, variable is in the cache; if it is slow, variable was not in the cache. This way, you can tell whether unknown_bit was zero or one.
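
The steps above can be sketched as a toy simulation. This is not a working exploit and it does not touch real hardware: the ToyCPU class, its method names and the access costs are all made up to mirror the six steps, and the "speculation" is just an ordinary function call.

```python
class ToyCPU:
    """A toy model: protected memory, a cache, and a buggy rollback."""

    FAST, SLOW = 1, 100  # made-up access costs in "ticks"

    def __init__(self, secret_bit):
        self._secret_bit = secret_bit  # memory the program may not read
        self.cache = set()             # names of variables currently cached

    def flush_cache(self):
        self.cache.clear()

    def timed_read(self, name):
        """Read an ordinary variable and return how long the read took."""
        cost = self.FAST if name in self.cache else self.SLOW
        self.cache.add(name)
        return cost

    def speculative_forbidden_read(self, if_zero, if_one):
        """Steps 3 to 5: run ahead using the secret, then get rolled back."""
        touched = if_zero if self._secret_bit == 0 else if_one
        self.cache.add(touched)  # side effect of the speculative read
        # The permission check fails here and the result is thrown away,
        # but the cache is left as it is: this is the bug.
        return None


def leak_bit(cpu):
    cpu.flush_cache()                                     # step 2
    cpu.speculative_forbidden_read("variable", "other")   # steps 3 to 5
    elapsed = cpu.timed_read("variable")                  # step 6
    return 0 if elapsed == cpu.FAST else 1


if __name__ == "__main__":
    for bit in (0, 1):
        print("secret bit", bit, "-> leaked", leak_bit(ToyCPU(bit)))
```

Note that leak_bit never receives the secret directly: the forbidden read returns nothing, and the bit is recovered purely from how long the last read took.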

Spectre works in a similar way (kinda sorta).

What should I do?

What is broken?

These bugs break two guarantees:

  • one program can now read the memory of other programs;
  • a user program can now read the memory of the kernel.

Do these vulnerabilities have a patch?

  1. There is a patch that prevents user programs from reading kernel memory.

  2. There is no patch that prevents one program from reading the memory of another program.

    And there probably never will be such a patch.

  3. Most probably, CPU manufacturers won't be able to fix this with a microcode update.

What is microcode?

A CPU is an electronic device, that is, most of its logic is hardwired on a piece of silicon (this piece of silicon is the processor). However, in modern CPUs some operations are implemented in microcode: small programs that execute on the CPU. CPU manufacturers can fix some bugs in their CPUs remotely by issuing a "microcode update", that is, by fixing these small programs.

What do I do?

Ultimately, you wait for AMD or Intel to release a CPU without this bug, and buy one.

What do I do in the meantime?

You keep your computer up to date by installing all updates for every program you own! Especially:

  • You keep your Windows/Linux up to date!
  • You keep your web browser up to date!

There are some mitigation techniques that might be employed by each of the programs to make these bugs harder to exploit.

The same exploits are possible on Android/iPhone devices, so keep your device up to date. If it no longer receives updates, you are out of luck and will probably need to replace it.

Is my computer hackable right now?

No, ... and somewhat yes. It's complicated.

I believe the following things are true right now (2018-01-05):

  • No one is actively using these exploits to target personal computers.

  • These exploits require the attacker to run programs on your computer, which for a typical home user usually already means "game over" for security.

  • These attacks only allow the attacker to read memory. They can't alter it.

    However, reading the memory of other programs can, for example, allow the attacker to:

    • read your passwords stored in a password manager;
    • read the private keys of your certificates stored on the hard drive;

A worst case scenario, right now, is:

  • You visit a malicious webpage.
  • This webpage is able to read memory of your computer and send it to some nasty people.

Can they easily read passwords from a password manager? Well, again: yes, and somewhat no.

  1. The attacker can read all memory, but can't easily single out the memory of the password manager.
  2. This exploit is slow: you can read memory at a speed of about 2 kB per second. Dumping the memory of a typical PC can take up to a million seconds.

Browser manufacturers are trying to harden their browsers against these exploits. This is why you should keep your browser up to date.
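
To put that speed into perspective, here is the back-of-the-envelope arithmetic. I am assuming that "2kb" means 2,000 bytes per second, and the RAM sizes below are just examples I picked; the function name is made up.

```python
def dump_time_seconds(ram_bytes, rate_bytes_per_second=2_000):
    """How long reading `ram_bytes` takes at the given (assumed) leak rate."""
    return ram_bytes / rate_bytes_per_second


if __name__ == "__main__":
    for gigabytes in (2, 8, 16):
        seconds = dump_time_seconds(gigabytes * 10**9)
        days = seconds / 86_400  # seconds in a day
        print(f"{gigabytes} GB: {seconds:,.0f} s (about {days:.1f} days)")
```

At this rate a million seconds corresponds to about 2 GB of memory, which is more than eleven days of uninterrupted reading.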

Are some programs especially vulnerable?

Yes, and unfortunately, a browser is especially vulnerable.

To pull off this attack, the attacker needs to run a program on your computer. All modern browsers contain a JavaScript engine, and most sites use JavaScript: scripts are small programs that web pages send to your computer (and that are executed on your computer) to e.g. do nice animations.

Right now, it is possible to exploit these bugs using JavaScript code.

Browser manufacturers are trying to harden their browsers against these exploits, which is why you should keep your browser up to date.

Will browser mitigation be effective?

I doubt it.

To be frank: it is terribly easy to fix this issue, you'd just need to disable JIT. I doubt that this exploit is doable without JIT. However, this would throw JavaScript performance out of the window, to the point that most websites would be unusable.

Another fix would be to break high-precision timers: if your code can't measure execution time, you can't carry out a timing side channel attack.

However, there are many ways to create a timer, and each fix will target only a single way to do it. Firefox has disabled the JavaScript feature used by the Google Project Zero team to create a timer in their exploit: https://www.mozilla.org/en-US/security/advisories/mfsa2018-01/ .
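
Here is a minimal sketch of why a coarse timer gets in the way. All the numbers (a 10 ns cache hit, a 200 ns cache miss, a 20,000 ns timer resolution) are assumptions chosen for illustration, and the function names are made up.

```python
def coarse_clock(true_time_ns, resolution_ns):
    """What a timer reports if it only ticks every `resolution_ns`."""
    return (true_time_ns // resolution_ns) * resolution_ns


def measured_duration(start_ns, duration_ns, resolution_ns):
    """Duration of an operation as seen through the coarse timer."""
    end = coarse_clock(start_ns + duration_ns, resolution_ns)
    return end - coarse_clock(start_ns, resolution_ns)


if __name__ == "__main__":
    CACHE_HIT_NS, CACHE_MISS_NS = 10, 200  # assumed access times
    for resolution in (1, 20_000):
        hit = measured_duration(1_000, CACHE_HIT_NS, resolution)
        miss = measured_duration(1_000, CACHE_MISS_NS, resolution)
        print(f"resolution {resolution} ns: hit={hit} ns, miss={miss} ns")
```

With the precise timer the two cases are easy to tell apart; with the coarse one both look identical in this run. A single coarse measurement is useless, but that alone does not stop an attacker who repeats the measurement many times or builds a timer some other way.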

What is JIT?

Long story short:

  • Most programs are "compiled", that is translated to a format directly executable by the CPU.
  • Some programs are "interpreted", that is: these programs do not run on CPU but are executed using another program: an "interpreter".

Interpreted programs are usually slower, but have some nice properties:

  • It is much easier to create interpreted programs that run in different environments (e.g. operating systems).
  • The interpreter can implement some extra security checks to provide a safe "sandbox". Usually, when you run a malicious compiled program on your computer, it is "game over" for security; with sandboxed interpreted languages this is (was) usually safe.

JIT stands for "Just In Time" compilation. That is, the interpreter compiles parts of the interpreted program to speed up execution. Speed increases from JIT are dramatic: programs can run up to hundreds of times faster.

Is this Intel's fault?

No, and somewhat yes.

  1. The most important bug was not found on AMD's CPUs, and there is a patch for that one anyway.
  2. Less important bugs exist on all CPUs, and there is no patch.
  3. There are some rumors that Intel made cuts in their verification department in 2013. These are only rumors, and it's not entirely clear that they would otherwise have found this bug.