MySQL Query Optimization: A Practical Guide for Production Databases​ | SQLFlash

MySQL Query Optimization: A Practical Guide for Production Databases​

Rebooter.S
8 min read
MySQL Query Optimization: A Practical Guide for Production Databases​

The pager went off at 2 AM on a Tuesday. Not ideal. Our checkout database had basically stopped responding. I SSH’d in, ran SHOWLIST, and just PROCESS stared at the screen for a second. There it was—a query running for 847 seconds,hammering a 5 million row table with a full scan.

The developer who wrote it? He swore up and down it worked perfectly in staging. And it did. Staging had 500 orders. Production had 4.7 million. Different universe.

That was probably my biggest wake-up call about what “correct” actually means in MySQL. Your query can return exactly the right data and still absolutely destroy your application. The database couldn’t care less about correctness—it only cares about how you ask for that data.

The Index Thing Everybody Gets Wrong

Here’s the thing about composite indexes. The column order matters, like, a lot. Most people just dump columns in and hope for the best.

Take this simple query:

1
2
3
4
SELECT * FROM orders
WHERE customer_id = 1234
  AND status = 'shipped'
  AND created_at > '2025-01-01';

The index needs to match your WHERE clause:

1
2
3
4
5
6
7
-- Does the job
CREATE INDEX idx_orders_customer_status_date
ON orders (customer_id, status, created_at);

-- This one? Basically useless for this query
CREATE INDEX idx_orders_bad
ON orders (status, created_at, customer_id);

The first one lets MySQL do equality lookups on customer_id and status, then a range scan on created_at. The second one can’t really use any of that efficiently because MySQL reads indexes left to right.

I see this mistake constantly. Literally had a client last month who had an index on (status, user_id, created_at) trying to support WHERE user_id = X AND status = Y. Took me 5 minutes to fix, and their page load times dropped from 3 seconds to 80 milliseconds.

Covering Indexes Are Magic

When your index contains every column your query needs, MySQL never touches the actual table. That’s the magic:

1
2
3
4
5
6
7
8
-- Regular query hits the table
SELECT order_id, customer_id, total
FROM orders
WHERE customer_id = 1234;

-- Covering index - pure index read, no table access
CREATE INDEX idx_orders_covering
ON orders (customer_id, order_id, total, status);

On really busy systems, I’ve seen this trick deliver 10x speedups. Maybe more. The SELECT comes straight from the B-tree index pages. No disk I/O for the table itself. It’s beautiful when it works.

Partial Indexes - Only 8.0+

If you’re lucky enough to be on 8.0 or later, partial indexes are pretty cool:

1
2
3
4
-- Index first 10 chars of email
-- Way smaller, faster writes, still useful for prefix searches
CREATE INDEX idx_users_email_prefix
ON users (email(10));

We cut index size by about 60% on one of our user tables doing this. Trade-off is it only helps with LIKE ‘something%’ queries, not exact matches. But when it applies, it really applies.

The Query Patterns That Will Ruin Your Day

SELECT * Is Killing You

I know, I know. It’s easier to write. Nobody wants to type out all those column names. But let me tell you what happens on a busy production database with a wide table:

1
2
3
4
5
6
-- Pulls ALL 47 columns
SELECT * FROM orders WHERE order_id = 5678;

-- Pulls just what you need
SELECT order_id, status, total, created_at
FROM orders WHERE order_id = 5678;

Those extra 43 columns? They’re going across your network, into your application memory, getting serialized somewhere. On a system handling thousands of queries per second, that overhead adds up. It really does.

Functions On Columns Break Indexes

This one’s sneaky. The query looks innocent, but it’s absolutely destroying your performance:

1
2
3
4
5
6
7
8
-- Can't use index because of DATE()
SELECT * FROM users
WHERE DATE(created_at) = '2025-01-15';

-- Same results, index-friendly
SELECT * FROM users
WHERE created_at >= '2025-01-15 00:00:00'
  AND created_at < '2025-01-16 00:00:00';

The DATE() function forces MySQL to evaluate every single row in the table. Every single one. There’s no way around it. This single mistake has probably caused more production outages than anything else I’ve seen.

EXISTS vs IN For Subqueries

When your subquery returns a lot of rows, this matters a lot:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
-- Runs the whole subquery first
SELECT * FROM products p
WHERE p.category_id IN (
    SELECT category_id FROM categories WHERE active = 1
);

-- Stops at first match
SELECT * FROM products p
WHERE EXISTS (
    SELECT 1 FROM categories c
    WHERE c.category_id = p.category_id
    AND c.active = 1
);

With millions of products, the first one can take seconds. The second one? Milliseconds. MySQL is smart enough to short-circuit EXISTS once it finds a match. It is not smart enough to do that with IN.

OFFSET Pagination Sucks

You’ve probably noticed this already. Page 1 loads fine. Page 50? Might want to go make coffee:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- Reads 500 rows to skip, then gives you 10
SELECT * FROM orders
ORDER BY created_at DESC
LIMIT 10 OFFSET 500;

-- Uses the index, consistent speed
SELECT * FROM orders
WHERE created_at < '2025-01-10 14:30:00'
ORDER BY created_at DESC
LIMIT 10;

This is called keyset pagination or cursor-based pagination. Once you start using it, you never go back to OFFSET. The speed is exactly the same whether you’re on page 1 or page 10 million. That’s not an exaggeration.

How To Actually Read EXPLAIN Output

Everyone says “run EXPLAIN.” Here’s what you’re actually looking for:

1
2
3
4
5
6
EXPLAIN FORMAT=JSON
SELECT o.*, u.name, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
  AND o.created_at > '2025-01-01';

The stuff that should make you nervous:

  • type: ALL → full table scan, bad news
  • Using filesort → MySQL sorting in memory, could be slow
  • Using temporary → temp table created, usually not great
  • rows: 5000000 → yeah that’s a lot of rows

Ideally you want your index in the “key” column, type should say “ref” or “const”, and rows should be in the hundreds, not millions. That’s the sweet spot.

The Version Thing - 5.7 vs 8.0 vs 8.4

Still On 5.7?

You’re missing some nice stuff, but you can still optimize:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
-- Force index when optimizer picks wrong
SELECT * FROM orders
FORCE INDEX (idx_orders_status)
WHERE status = 'cancelled';

-- Virtual columns exist in 5.7 too
ALTER TABLE orders
ADD COLUMN order_month DATE
GENERATED ALWAYS AS (DATE_FORMAT(created_at, '%Y-%m-01')) STORED;

CREATE INDEX idx_order_month ON orders(order_month);

No CTEs. No window functions. You’re writing subqueries like it’s 2015. Which, I mean, it is, so that’s fine.

MySQL 8.0 Changed Everything

This version is when MySQL finally got good:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- CTEs are so much cleaner
WITH recent_orders AS (
    SELECT * FROM orders WHERE created_at > '2025-01-01'
),
high_value AS (
    SELECT * FROM recent_orders WHERE total > 1000
)
SELECT * FROM high_value h
JOIN users u ON h.user_id = u.id;

-- Window functions replace ugly self-joins
SELECT
    order_id,
    user_id,
    total,
    SUM(total) OVER (PARTITION BY user_id ORDER BY created_at) as running_total
FROM orders;

-- Invisible indexes let you test safely
ALTER TABLE orders ALTER INDEX idx_orders_status INVISIBLE;
-- Run your slow query...
-- Still slow? DROP INDEX. Fast? Great.

The skip scan optimization in 8.0 also helps a ton with composite indexes where the first column doesn’t have many unique values. It’s one of those things you don’t notice until you need it.

MySQL 8.4 - Current LTS

If you’re setting up new stuff, go with 8.4:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
-- This survives restarts!
SET PERSIST max_connections = 500;

-- JSON_TABLE is way better now
SELECT j.*
FROM JSON_TABLE(
    '[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]',
    '$[*]' COLUMNS (
        id INT PATH '$.id',
        name VARCHAR(100) PATH '$.name'
    )
) AS j;

The optimizer keeps getting smarter in each point release. 8.4.0 was solid, 8.4.3 is even better. If you’re on 8.0, at least stay current on patches.

What Actually Matters - My Priority List

Fifteen years of this. Here’s what actually makes a difference:

The important stuff (fixes most problems):

  1. Add indexes for WHERE clause columns
  2. Stop using SELECT *
  3. Make covering indexes for your hot queries
  4. Fix whatever EXPLAIN shows as a full table scan

The nice-to-haves: 5. Switch to keyset pagination 6. UNION ALL instead of UNION when you don’t need dedup 7. Composite indexes for multi-column filters

The nuclear options (rarely needed): 8. Table partitioning 9. Denormalization 10. Query rewrite plugins

I’ve worked with databases way bigger than most companies will ever see, and honestly? I almost never go past step 4. The basics work. They always have.

Quick Things To Check Right Now

You can find easy wins immediately:

1
2
3
4
5
6
7
8
9
-- Turn on slow query log if you haven't
SHOW VARIABLES LIKE 'slow_query_log';
SHOW VARIABLES LIKE 'long_query_time';

-- What indexes do you actually have?
SHOW INDEX FROM your_table_name;

-- How many table scans happening?
SHOW GLOBAL STATUS LIKE 'Handler_read%';

If Handler_read_rnd_next is huge compared to Handler_read_first, you’re scanning too much. Simple as that.

Look

Here’s the thing. MySQL optimization isn’t about memorizing every feature or trick in the book. It’s about being able to look at a slow query, understand what MySQL is actually doing, and apply the right fix. That’s it.

Start with EXPLAIN. Look at your indexes. Fix the full table scans. That gets you probably 90% of the way to where you need to be.

Your database will be faster. Your users will stop complaining. And maybe, just maybe, you won’t get that 2 AM call.

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.