In the previous “black box, white box” blog, I wrote about how the effect of reducing the number of cores and the clock speed in a HANA system can be measured.  That information is useful if you are creating a cost optimised HANA system for non-production use.  In this final instalment I have tried to address the question “what storage do I need?”.

I performed a series of tests that I hoped would shed some light on how much storage is enough.  But first I’ll describe the storage requirements for HANA and how they differ for cost optimised, non-production systems.

In a production HANA system there are usually three discrete mount points that HANA requires.  These are:

  • Installation directory
  • Log directory
  • Data directory

The installation directory is where HANA is installed.  This directory needs to be big enough to store the binaries, traces and a full core dump of HANA.  It’s normally sized as memory + 1TiB.  On scale-out systems the installation directory should be mounted read/write on all nodes, typically via NFS or GPFS.  A scale-out system must also provide enough space for core dumps from every node.

The log directory is where HANA’s transaction logs are stored.  This is usually the fastest storage available to the system.  Any inserts or updates must be committed to disk in the log directory.  SQL statements that insert or update data do not return until the data is written to disk.  For smaller systems the log directory is at least equal to the size of memory.  For systems with 1TiB or more of memory, the log directory is at least 1TiB.

The data directory is where all data is eventually written.  Data written to the log directory is eventually merged with the main data in the data directory.  This is an online background process that the database handles automatically.  At the time of the merge the data is already committed, so the data directory is often on slower storage than the log directory.  However, some systems use the same tier of storage for both.  In production systems the data directory should be three times the size of the memory installed on the node.

These directories added together typically create a storage footprint of 4 to 5 times that of the memory.
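
To make the sizing rules concrete, here is a minimal sketch of the arithmetic, using the 3TiB server from the tests below as the example.  It only illustrates the rules above; it is not an SAP sizing tool.

#!/bin/bash
# Rough sizing sketch for a production HANA node, following the rules above.
# MEM_GIB=3072 is an example value (the 3TiB server used in the tests below).

MEM_GIB=3072

INSTALL_GIB=$(( MEM_GIB + 1024 ))     # installation directory: memory + 1TiB

if (( MEM_GIB >= 1024 )); then        # log directory: at least memory for small
    LOG_GIB=1024                      # systems, at least 1TiB for 1TiB+ systems
else                                  # (the minimum values are used here)
    LOG_GIB=$MEM_GIB
fi

DATA_GIB=$(( MEM_GIB * 3 ))           # data directory: 3x memory

echo "Install: ${INSTALL_GIB} GiB"
echo "Log:     ${LOG_GIB} GiB"
echo "Data:    ${DATA_GIB} GiB"
echo "Total:   $(( INSTALL_GIB + LOG_GIB + DATA_GIB )) GiB"   # roughly 4.7x memory here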

For non-production workloads both the performance and the size of the storage can be reduced.  Lower speed disks can be used and the size of the storage can be as low as 2x memory.  However, the system architect needs to weigh up the costs.  How little performance and capacity could you get away with, and would you really want to?

To find a good middle ground I tested three discrete disk configurations.  All three were based on a Lenovo x3950 X6 8-socket server with 8 x Intel Xeon E7-8850 v2 @ 2.30GHz processors and 3TiB of memory (96 x 32GiB).  This server had SLES for SAP Applications 11.3 installed on 2 x 300GB internal disks.  The storage I tested for the HANA persistence layer was a Lenovo EXP2524 external disk enclosure attached to a ServeRAID M5120 SAS RAID controller with 1GiB of cache.

I tested the same RAID configuration three times, with a different number of disks each time.

                    Config 1                         Config 2                         Config 3
RAID Level          5                                5                                5
Stripe Size         64                               64                               64
Cache Setting       cached, write back, read ahead   cached, write back, read ahead   cached, write back, read ahead
Disk Type           1.2TB, 10K SAS                   1.2TB, 10K SAS                   1.2TB, 10K SAS
Number of Disks     3                                6                                9

For each configuration I created a single volume group on the RAID device and two logical volumes, data and log.  An XFS file system was created on each volume.  Finally, the data volume was mounted on /hana/data and the log volume on /hana/log.  As on many appliances, there was no distinction between the tier of storage used for the log and data volumes.
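
For reference, each configuration was laid out along these lines.  The device name /dev/sdb, the volume group name and the logical volume sizes are assumed values for illustration; the real sizes depend on the number of disks in the array.

#!/bin/bash
# Sketch of the volume layout used for each test configuration.
# /dev/sdb, hanavg and the logical volume sizes are assumed values, not the exact ones used.

pvcreate /dev/sdb                  # the RAID 5 virtual drive presented by the M5120
vgcreate hanavg /dev/sdb           # a single volume group on that device

lvcreate -n lvdata -L 1T hanavg    # data logical volume (example size)
lvcreate -n lvlog  -L 512G hanavg  # log logical volume (example size)

mkfs.xfs /dev/hanavg/lvdata        # XFS on each logical volume
mkfs.xfs /dev/hanavg/lvlog

mkdir -p /hana/data /hana/log
mount /dev/hanavg/lvdata /hana/data
mount /dev/hanavg/lvlog  /hana/log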

I then used the SAP HANA HW Configuration Check Tool to benchmark the file systems using the following JSON configuration file.

{
   "report_id": "RAIDTEST",
   "use_hdb": false,
   "blades": ["shanner"],
   "tests": [
      {
         "package": "LandscapeTest",
         "test_timeout": 0,
         "id": 1,
         "config": { },
         "class": "EvalOs"
      },
      {
         "package": "FilesystemTest",
         "test_timeout": 0,
         "id": 2,
         "config": {"mount": {"shanner": ["/hana/data/"]}, "duration": "short"},
         "class": "DataVolumeIO"
      },
      {
         "package": "FilesystemTest",
         "test_timeout": 0,
         "id": 3,
         "config": {"mount": {"shanner": ["/hana/log/"]}, "duration": "short"},
         "class": "LogVolumeIO"
      }
   ]
}

For a full test the test duration should be set to “long”, but the short test (which gives a maximum test file size of 5GiB) was good enough for this benchmark.
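
For completeness, the tool was run against that configuration file roughly as shown below.  The paths, binary name and options are from memory and vary between HWCCT versions, so treat them as assumptions and check SAP Note 1943937 for the exact syntax.

#!/bin/bash
# Assumed HWCCT invocation; verify the binary name and options for your version
# against SAP Note 1943937 before relying on this.

cd /hana/shared/hwcct        # directory where HWCCT.SAR was extracted (assumed path)
source envprofile.sh         # environment script shipped with the tool
./hwcct -f raidtest.json     # raidtest.json contains the configuration shown above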

These are the results for the 4KiB block size test against a 5GiB file.

[Chart: data volume I/O results for the three configurations]

[Chart: log volume I/O results for the three configurations]

Initially these results surprised me.  Despite the largest configuration having three times the number of spindles, the results are roughly the same across all three configurations.  However, the test file size is only 5GiB and the controller has 1GiB of cache.  With efficient caching and a relatively small file size, it is easy to see how the results could end up roughly in line with each other.  (It’s also not clear whether the test forces direct I/O or not.)
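
One quick way to see how much caching flatters a small write test is to compare buffered and direct I/O on the same volume with dd.  This is not part of HWCCT, just a sanity-check sketch; the 1GiB file size and the target path are arbitrary choices.

#!/bin/bash
# Sanity check (not part of HWCCT): compare buffered vs direct 4KiB writes on the
# data volume to see how much the page cache and controller cache are helping.

# Buffered writes: the caches absorb most of the work; conv=fdatasync forces a
# flush at the end so the reported rate is not pure memory speed.
dd if=/dev/zero of=/hana/data/ddtest bs=4k count=262144 conv=fdatasync

# Direct writes: oflag=direct bypasses the page cache, so the figure is closer
# to what the RAID controller and spindles can actually sustain.
dd if=/dev/zero of=/hana/data/ddtest bs=4k count=262144 oflag=direct

rm -f /hana/data/ddtest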

The results themselves were not the biggest surprise.  After running the tests I looked up the official KPIs set by SAP, which can be found in the SAP HANA Administration Guide.  The KPIs that storage must meet are surprisingly low.

[Image: SAP HANA storage KPIs from the SAP HANA Administration Guide]

Then it hit me.  Yes, the KPIs seem low, and it is easy to fulfil them with a small number of disks in an external drive enclosure.  But these KPIs are for appliances running productive workloads.  The storage potentially needs to guarantee this performance whilst rebuilding an array and with a path down.  For a cost optimised non-production workload the KPIs can be met with some ease on a single host-based RAID card, but if that card fails, or the single SAS path to the disks goes down, then the data is offline.  That is the opposite of what you’d expect in the enterprise.

In conclusion, the KPIs for the persistence layer are not terribly hard to achieve with directly attached storage, which is often how these systems get configured.  A large cache on the RAID controller and a handful of disks is enough.  But directly attached storage won’t give you the resiliency you’d expect from an appliance.  Cost optimised systems become less cost optimised when you add dual Fibre Channel adapters and provision LUNs from an enterprise SAN, but it makes the persistence layer more reliable.  Which is best?  Can I give you a consultant’s answer?  It depends.