« Previous Article Next Article »

Getting the read/write ratio for a database

Tags: DMV, Performance, Disks, DBA

EDIT: Please also see in depth comment from Brent Ozar on my main site which extends the query provided below by adding extra columns to give the data more context and general advice when interpreting this type of data.

Just a quick post this time. Recently I needed to find out the read/write ratio for some of our customer databases as part of a bigger project of understanding the workload and performance of the databases. The project actually through up a few surprises, where databases that I thought had fairly low reads or writes actually turned out to be some of the most intensive on the system and conversely databases that I thought were packing a punch weren’t read/writing that much at all in comparison.

At the heart of this project is the DMV sys.dm_io_virtual_file_stats which I am using to gather latency and data throughput metrics and this data can easily be used to find out the read/write balance. There have been some great recent posts on this DMV such as this one by Erin Stellato.

The following is a simple query against this DMV and uses the bytes read and bytes written to calculate the percentage. In this query, the higher the number then the more read heavy it is.

SELECT m.Name
    , m.type_desc AS 'FileType'
    , CEILING(num_of_bytes_read*1.0 / (num_of_bytes_read*1.0 + num_of_bytes_written*1.0) * 100) AS 'Read/Write %'
FROM sys.dm_io_virtual_file_stats(NULL, NULL) v
INNER JOIN sys.master_files m on m.database_id = v.database_id AND v.file_id = m.file_id

Returns

image

Note that the figures in this DMV are cumulative and are reset with an instance restart or database being taken offline/closed etc so to get an accurate numbers you should run this after the databases have been running for a few days under normal working conditions.

Enjoy!

1 Comment

  • Brent Ozar said

    Nice! Below is a tweaked version of this:

    SELECT DB_NAME(m.database_id) AS DatabaseName
    , ds.name AS FileGroupName
    , m.Name AS FileName
    , m.type_desc AS 'FileType'
    , CEILING(num_of_bytes_read*1.0 / (num_of_bytes_read*1.0 + num_of_bytes_written*1.0) * 100) AS 'Read %'
    , CAST((v.size_on_disk_bytes / 1024.0 / 1024 / 1024) AS MONEY) AS 'FileSizeGB'
    , CAST((v.num_of_bytes_read / 1024.0 / 1024 / 1024) AS MONEY) AS 'ReadGB'
    , CAST((v.num_of_bytes_written / 1024.0 / 1024 / 1024) AS MONEY) AS 'WrittenGB'
    , m.physical_name
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) v
    INNER JOIN sys.master_files m on m.database_id = v.database_id AND v.file_id = m.file_id
    INNER JOIN sys.data_spaces ds ON m.data_space_id = ds.data_space_id
    ORDER BY DatabaseName, FileGroupName, FileName

    Here's what I changed:

    I added the database name and file group name, and I sort by those by default. (I've got some databases that have nearly identical file names, so it's tougher to interpret those without the db & fg names.)

    I can't really take any actions based on the read/write % of the log file, so I excluded those with an inner join to sys.data_spaces. If you wanted to include the log files, you could make that a left outer join, and change the select on ds.name to use a coalesce instead.

    I also don't want anybody taking actions on the read/write % alone - you need to know whether the database is small or large, and how much you've read and written to it. For example, you could have a database that looks like 100% reads, but it's just because nobody ever accesses it. I added a few columns for the file size in GB, plus the read and write volumes in GBs.

    I also threw in the file path because sometimes I move files around based on the amount of writes they get, like moving heavily active files to different storage.

Add a Comment

  1. Reload Image