Time flies, and RAMspeed, a free, open source command line utility for
measuring the cache and memory performance of computer systems, changes with
it. It started as v1.00, released in November 2002 for Rhett's personal
amusement: about 100 lines of C code producing one simple benchmark. The
latest versions are written mostly in assembly language. Three hardware
platforms are supported (i386, amd64, alpha), along with several of the most
popular UNIX-like operating systems. A quite popular DOS (and Windows) version
exists as well. Nowadays, the software offers 18 cache and memory benchmarks
for i386 and amd64 machines, but only 6 for alpha ones.
So far, RAMspeed has been tested to compile and run with assembly-level
optimisations on:
- Linux (i386, amd64, alpha)
- FreeBSD (i386, amd64, alpha)
- NetBSD (i386, amd64, alpha)
- Digital UNIX (alpha)
There is no need to explain in depth here all the benchmarking algorithms
implemented in RAMspeed; the supplied documentation and the source code cover
them better. In general, there are *mark benchmarks such as INTmark,
FLOATmark, MMXmark and SSEmark. They operate on linear (sequential) data
streams passed through the ALU, FPU, MMX and SSE units respectively. They
allocate a certain amount of memory and then either write to it or read from
it in continuous blocks whose sizes are powers of 2, from 1 KB up to the array
boundary. This simple algorithm shows how fast both the cache and memory
subsystems are. There are also *mem benchmarks such as INTmem, FLOATmem,
MMXmem and SSEmem. These are meant to illustrate actual read/write memory
performance. Each of them includes four subtests called Copy, Scale, Add and
Triad. They are synthetic simulations, but they correlate with many real world
applications. You may have seen them already in STREAM and SiSoft Sandra. All
*mem benchmarks support the BatchRun mode, which enables high-precision memory
performance measurement through multiple passes, with averages calculated per
pass and per run.
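As a rough illustration of the two benchmark families described above, here
is a minimal C sketch. It is not taken from the RAMspeed source, which is
written mostly in assembly. The helper names (mark_block_sizes, mem_copy,
mem_scale, mem_add, mem_triad) are made up for this example, and the four
kernels follow the definitions used by STREAM, on the assumption that the
RAMspeed subtests compute the same thing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: list the block sizes a *mark-style sweep would visit,
   i.e. 1 KB, 2 KB, 4 KB, ... up to and including array_bytes.
   Writes at most `cap` entries to `out` and returns how many were written. */
size_t mark_block_sizes(size_t array_bytes, size_t *out, size_t cap)
{
    assert(out != NULL);
    size_t count = 0;
    size_t block = 1024;
    while (block <= array_bytes && count < cap) {
        out[count++] = block;
        if (block > array_bytes / 2)
            break;              /* the next doubling would pass the boundary */
        block *= 2;
    }
    return count;
}

/* The four *mem-style subtests, using STREAM's definitions (an assumption). */

/* Copy: a[i] = b[i] */
void mem_copy(double *a, const double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i];
}

/* Scale: a[i] = k * b[i] */
void mem_scale(double *a, const double *b, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = k * b[i];
}

/* Add: a[i] = b[i] + c[i] */
void mem_add(double *a, const double *b, const double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Triad: a[i] = b[i] + k * c[i] */
void mem_triad(double *a, const double *b, const double *c, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + k * c[i];
}
```

A real benchmark would time a read or write pass for each block size, and
each kernel over arrays much larger than the caches, then convert the elapsed
time into throughput. Blocks that fit into a cache level can be expected to
run faster than those that spill over into main memory.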
There are also non-temporal versions of the MMX and SSE benchmarks. They
are coded with special instructions that minimise cache pollution on memory
reads and eliminate it completely on memory writes. In addition, they use a
built-in aggressive data prefetching algorithm, though the actual behaviour
depends heavily on the hardware. As a matter of fact, non-temporal code
allows for significant performance improvements over the regular MMX and SSE
benchmarks. In some cases, non-temporal MMXmark and SSEmark can deliver almost
100% of the theoretical bandwidth while reading.
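To give an idea of what non-temporal code looks like, here is a minimal C
sketch built on the SSE2 compiler intrinsics. It is only an illustration and
not RAMspeed's own code, which implements this in assembly. The function name
nt_copy and the prefetch distance of 8 vectors are assumptions made for the
example, and the code falls back to memcpy where SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_load_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper: copy `bytes` from src to dst with non-temporal
   stores. Both pointers must be 16-byte aligned and `bytes` must be a
   multiple of 16. */
void nt_copy(void *dst, const void *src, size_t bytes)
{
    assert((uintptr_t)dst % 16 == 0 && (uintptr_t)src % 16 == 0);
    assert(bytes % 16 == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    size_t n = bytes / 16;
    for (size_t i = 0; i < n; i++) {
        /* Hint that upcoming source data is needed only once
           (the distance of 8 vectors is an arbitrary choice). */
        if (i + 8 < n)
            _mm_prefetch((const char *)(s + i + 8), _MM_HINT_NTA);
        __m128i v = _mm_load_si128(s + i);  /* ordinary aligned load */
        _mm_stream_si128(d + i, v);         /* store that bypasses the cache */
    }
    _mm_sfence();  /* order the streamed stores before later stores */
#else
    memcpy(dst, src, bytes);  /* portable fallback without non-temporal hints */
#endif
}
```

Whether such code is actually faster than regular loads and stores depends
on the processor and the memory subsystem, so it has to be measured on each
machine.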
There is also RAMspeed/SMP for multiprocessor machines running UNIX-like
operating systems. To be absolutely correct, there are two distinct branches:
2.x.x supports POSIX threads, while 3.x.x uses System V shared memory for IPC
(Inter-Process Communication) and operates with multiple processes.
RAMspeed/SMP v2.x.x is no longer developed due to numerous compatibility and
performance issues.
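The multi-process approach of the 3.x.x branch can be pictured with the
following minimal C sketch. It is an assumption about the general technique
and not the actual RAMspeed/SMP code: the names run_workers and sample_work
are invented here, and a real worker would run and time a benchmark pass
instead of returning a fixed number.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for one benchmark pass (hypothetical): returns a fake result. */
long sample_work(int id)
{
    return (long)(id + 1) * 100;
}

/* Hypothetical helper: fork `nproc` worker processes. Each one stores its
   result in its own slot of a System V shared memory segment, and the
   parent adds the slots up. Returns the total, or -1 on failure. */
long run_workers(int nproc, long (*work)(int id))
{
    assert(nproc > 0 && work != NULL);

    int shmid = shmget(IPC_PRIVATE, (size_t)nproc * sizeof(long),
                       IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    long *slots = shmat(shmid, NULL, 0);
    if (slots == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    int started = 0;
    int failed = 0;
    for (int id = 0; id < nproc; id++) {
        pid_t pid = fork();
        if (pid == 0) {             /* child: do the work, publish, exit */
            slots[id] = work(id);
            shmdt(slots);
            _exit(0);
        }
        if (pid < 0) {              /* fork failed: stop starting workers */
            failed = 1;
            break;
        }
        started++;
    }

    for (int i = 0; i < started; i++)
        wait(NULL);                 /* wait for every child that started */

    long total = 0;
    for (int id = 0; id < nproc; id++)
        total += slots[id];

    shmdt(slots);
    shmctl(shmid, IPC_RMID, NULL);  /* remove the segment */
    return failed ? -1 : total;
}
```

Calling run_workers(4, sample_work) starts four processes and adds up their
four results. Separate processes share nothing by default, which is why the
results have to travel through the shared segment.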
Here are several screenshots illustrating RAMspeed (FreeBSD) v2.5.0 in
action (from 6 KB to 11 KB each):
- Introduction
- Additional Information (CPUinfo)
- INTmark [writing]
- INTmark [reading]
- INTmem
RAMspeed is more accurate than many other benchmarking tools, more
customisable, open source, compact, and gives you much more information to
analyse. Some people may say that the lack of a graphical interface is a
large drawback, but it may be considered an advantage as well.
- RAMspeed (UNIX) v2.6.0 (August, 2009): for uniprocessor machines running
  UNIX-like operating systems. The source code is available for download
  (76 KB).
- RAMspeed/SMP (UNIX) v3.5.0 (August, 2009): for multiprocessor machines
  running UNIX-like operating systems and supporting System V IPC extensions.
  The source code is available for download (78 KB).
- RAMspeed (DOS) v2.5.0 (August, 2009): for DOS as well as 32-bit Windows
  operating systems (95 to 2003; i386 only). Both the source code and a
  pre-compiled executable are available for download (109 KB).
- RAMspeed (Win32) v1.1.1 (August, 2009): for 32-bit as well as 64-bit
  Windows operating systems (95 to 7; i386 or amd64). Both the source code
  and a pre-compiled executable are available for download (71 KB).
All of them are distributed under the terms of The Alasir Licence (TAL), a
fairly liberal one.