Benchmarks from M2 Pro to M4 Pro

Long story short, I picked up a new MacBook Pro this week. I got the M4 Pro version with the higher core count (14 core CPU, 20 core GPU, 24GB RAM) and 1TB of internal storage. It sits in the same spot in the lineup as the M2 Pro I've been using for the last 2.5 years. The M2 Pro remains a beast for most things, but I do have some higher end workflows that demand as much power as I can throw at them, so I run into situations where I'm just waiting for my computer to do something, and I could do more if my computer were faster.

All that is to say, I traded in my old MacBook Pro and an old iPhone to get this, and I have the MacBook trade-in box in my office. I need to choose which laptop to send back to Apple. I'm out about $900 total if I keep the new computer, or I can return it if I think the old one can hold up until at least the M5 generation, which is likely happening later this year.

So you know what that means: benchmark time! These are all real-world tasks I do myself, and are things where I find myself wishing my computer could go faster.

All measurements in this post are in seconds. Also, I describe the M4 as "X% faster" in each section. That number is not how much time it saved; it's how much more work the M4 can do in the same amount of time. So if I say the M4 is 25% faster at rendering video, that means in the time it would take the M2 to render 100 seconds of video, the M4 would have rendered 125 seconds.
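If you want to apply the same definition to your own timings, here's a minimal sketch. The function names are mine (hypothetical, not from any tool used in this post), and the 125 and 100 second figures are just the illustrative numbers from the example above, not measurements.

```python
def percent_faster(old_seconds: float, new_seconds: float) -> float:
    """How much more work the new machine does in the same time, in percent."""
    if old_seconds <= 0 or new_seconds <= 0:
        raise ValueError("timings must be positive")
    # Throughput is work / time, so the throughput ratio is old_time / new_time.
    return (old_seconds / new_seconds - 1) * 100


def percent_time_saved(old_seconds: float, new_seconds: float) -> float:
    """How much shorter the wait is, in percent. This is a different number."""
    if old_seconds <= 0 or new_seconds <= 0:
        raise ValueError("timings must be positive")
    return (1 - new_seconds / old_seconds) * 100


# Illustrative only: a job that takes 125s on the old machine and 100s on the new one.
print(percent_faster(125, 100))             # 25.0
print(round(percent_time_saved(125, 100)))  # 20
```

The two numbers diverge as the gap grows: "25% faster" corresponds to waiting about 20% less, which is why the post is explicit about which one it reports.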

Exporting YouTube videos

My first test was exporting a recent YouTube video from Final Cut Pro. This was my video about Zen, which is a 15 minute, 4k, 60fps video. I exported using Final Cut's "Apple Devices" format with the "H.264 Multi-pass (Better)" codec, which delivers very good quality in a format that YouTube will process in just a few minutes.

This is a pretty tough task, taking over 15 minutes on each device, and the M4 Pro is only marginally faster here. All saved time is good, but whether an export takes 15 minutes or 17 minutes isn't a huge deal. Of all my tests, this was the one that showed the least difference.

ScreenFlow export

Next up was testing an export of a 9 minute, 4k, 60fps project from ScreenFlow, the app I use to record my screen in most videos (which is then fed into Final Cut to put the finishing touches on).

This was a quicker job, but still the M4 Pro only went about 13% faster than the M2 Pro. Not a great start for the M4 Pro. The gaps are there, but even amortized over another year, I don't think saving one minute per render is going to be worth it.

Screen Studio export

I promise this is the last video export test, but I do a lot of that, sue me! This was exporting a 13 minute, 4k, 60fps video at maximum quality.

Finally, the M4 Pro makes a meaningful jump! A 32% improvement is pretty significant, and amounted to me getting almost 5 minutes of my life back on an export of this size. That's not bad.

MacWhisper transcription

Next up was transcribing our latest podcast episode of Comfort Zone in MacWhisper with the WhisperKit Large v3 Turbo model.

An even bigger result! 41% is a real gain here, and it's a benefit only amplified when processing larger jobs such as transcribing multiple tracks at once.

MP3 conversion in Fission

Another task I do every week is converting our Comfort Zone episode to the proper MP3 format to optimize for the right mix of quality and file size.

This is the quickest task on this list, and while it is a 41% improvement on an 80-minute file, it's just 14 seconds of time saved, so not a big deal in the grand scheme of things.
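To see why a 41% speedup only buys 14 seconds, you can work backwards from those two numbers to the run times they imply. This is a rough sketch with a hypothetical helper name; both inputs are rounded figures from the paragraph above, so the outputs are estimates rather than the measured times.

```python
from typing import Tuple


def implied_times(percent_faster: float, seconds_saved: float) -> Tuple[float, float]:
    """Return (old_seconds, new_seconds) implied by a speedup and a time saving.

    Uses the post's definition: percent_faster = (old / new - 1) * 100,
    together with seconds_saved = old - new.
    """
    if percent_faster <= 0 or seconds_saved <= 0:
        raise ValueError("both inputs must be positive")
    # old = new * (1 + p) and old - new = saved, so new = saved / p.
    new_seconds = seconds_saved / (percent_faster / 100)
    old_seconds = new_seconds + seconds_saved
    return old_seconds, new_seconds


old_s, new_s = implied_times(41, 14)
print(f"roughly {old_s:.0f}s on the old machine, {new_s:.0f}s on the new one")
```

With these inputs the job comes out at well under a minute on either machine, which fits the point that a large percentage on a short task is not much time in absolute terms.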

Converting video files

This test took 4 minutes of 4K 60fps video I recorded on my PC and converted it to that same "Apple Devices" video format I used in my Final Cut exports.

Maybe not surprisingly, this test showed similarly minor gains in performance. I have to wonder if this one is highly tied to the media engine component in the SoC, which may not have changed much from the M2 generation. The fact that the exports from ScreenFlow and Screen Studio show larger gains probably backs that up.

Running AI models

Now let's get into the weird improvements… I used Ollama to run the Gemma 3 model (Google's open source, lighter weight model based on Google Gemini's tech). I asked it "I want you to give me a list of the 25 most popular baby names for boys in the united states as well as what each one means" and timed how long it took to complete the response. The exact response length varied a little bit, but I ran the query a few times to get an average.
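For anyone who wants to repeat this kind of test, here's one way to script the "run it a few times and average" part. It's a sketch under assumptions, not how the numbers in this post were captured: it assumes Ollama's local REST API is listening on its default port (11434), that the model tag is `gemma3:12b`, and the helper names `time_runs` and `ask_ollama` are made up for this example.

```python
import json
import time
import urllib.request
from typing import Callable, List

PROMPT = (
    "I want you to give me a list of the 25 most popular baby names for boys "
    "in the united states as well as what each one means"
)


def time_runs(task: Callable[[], object], runs: int = 3) -> List[float]:
    """Call task() `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def ask_ollama(model: str = "gemma3:12b", host: str = "http://localhost:11434") -> str:
    """Send PROMPT to a local Ollama server and wait for the full response.

    Assumes Ollama is running locally and the model tag is already pulled.
    """
    payload = json.dumps({"model": model, "prompt": PROMPT, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With Ollama running, `durations = time_runs(ask_ollama, runs=3)` followed by `sum(durations) / len(durations)` gives the average in seconds. Since response length varies between runs, keeping the individual durations around (rather than only the mean) makes an outlier like the one described below easier to spot.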

This one is massive, with the M4 Pro rendering the response 11x faster than the M2 Pro, which is…well, it's quite a result to say the least. Of course this was the 12 billion parameter model, which was the biggest one I could get to run on both machines, and the performance gap was so large that I wondered if this was a RAM constraint, as the M4 has 24GB RAM and my M2 has 16GB RAM. Inspecting Activity Monitor when both were running the model, I saw the same ~10GB RAM usage and each device had RAM to spare, but I wanted to eliminate that variable if I could, so I ran the smallest Gemma model I could, the 4 billion parameter option.

The gap was roughly halved, down to a mere 6x advantage (😅) for the M4 Pro, but the performance is still massively in favor of the M4 Pro.

But here's where it gets really weird: a day after running these tests, I just looked at these graphs and something felt off, so I ran the same tests again.

This time the M4 Pro was similar to last time, but the M2 Pro was in a whole different league, almost 10x-ing its performance from last time. It's still taking close to 2x the time, but it's at least looking more like the other benchmarks I've been running now.

Thankfully, I brought receipts on these tests. Here's a side-by-side showing the two machines performing the same task in the updated test.

[Video, 1:05: side-by-side of the two machines running the updated test]

And while I don't have a recording of the exact tests that generated the 6x and 11x results in the charts above, I did record a comparison generating 100 names, which gives you an idea of the performance gap I was seeing. The M2 Pro seemed to get off to a decent start, but it quickly ran into some bottleneck that brought it to a crawl.

[Video, 1:59: side-by-side of the two machines generating 100 names]

I'm guessing this was an anomaly, but I closed all other apps and was just using the LLM and CleanShot on both computers in both cases.

Status

One thing not mentioned here is more "real time" performance, which is how well it does at rendering the UI as I work. Of course, most things are instant on all Apple silicon devices (seriously, if someone bought an M1 today I'd tell them they'll be happy for years, and I say this as someone who uses an M1 Mac as my work computer every day and isn't even thinking about when my upgrade is coming), but I do some tougher tasks in apps like Photoshop, Lightroom, and Final Cut Pro that push the M2 Pro to its limits, and those do feel meaningfully quicker on the M4 Pro. Could I survive on the M2 Pro for another year or two? Absolutely. Does the after-trade-in cost of having an M4 Pro for the next couple of years feel worth it to me? Kind of, yeah.

I haven't made up my mind 100% yet, but I have a pretty good feeling that I'll be sending back the M2 Pro.