Discussion:
[dm-devel] Announcement: STEC EnhanceIO SSD caching software for Linux kernel
Amit Kale
2013-01-17 09:52:00 UTC
Permalink
Hi Joe, Kent,

[Adding Kent as well since bcache is mentioned below as one of the contenders for being integrated into mainline kernel.]

My understanding is that these three caching solutions all have four principal blocks.
1. Cache block lookup - finding out whether a block is cached and, if so, its location on the SSD (a minimal sketch follows this list).
2. Block replacement policy - the algorithm for replacing a block when a new free block can't be found.
3. IO handling - issuing IO requests to the SSD and HDD.
4. Dirty data clean-up algorithm (write-back only) - deciding when to write a dirty block on the SSD back to its original location on the HDD, and executing the copy.
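
To make the first block concrete, here is the minimal userspace sketch mentioned above: a hash table mapping an HDD block number to an SSD slot. It is purely illustrative; the names, table size and hash are made up and don't correspond to any of the three implementations.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_HASH_BUCKETS 4096      /* illustrative table size */
#define NOT_CACHED UINT64_MAX

struct map_entry {
    uint64_t hdd_block;           /* block number on the origin HDD */
    uint64_t ssd_slot;            /* where the cached copy lives on the SSD */
    struct map_entry *next;       /* collision chain */
};

static struct map_entry *table[NR_HASH_BUCKETS];

static unsigned hash_block(uint64_t hdd_block)
{
    return (unsigned)((hdd_block * 2654435761u) % NR_HASH_BUCKETS);
}

/* Cache block lookup: returns the SSD slot if cached, NOT_CACHED otherwise. */
static uint64_t lookup(uint64_t hdd_block)
{
    struct map_entry *e;

    for (e = table[hash_block(hdd_block)]; e; e = e->next)
        if (e->hdd_block == hdd_block)
            return e->ssd_slot;
    return NOT_CACHED;
}

static void insert(uint64_t hdd_block, uint64_t ssd_slot)
{
    unsigned h = hash_block(hdd_block);
    struct map_entry *e = malloc(sizeof(*e));

    e->hdd_block = hdd_block;
    e->ssd_slot = ssd_slot;
    e->next = table[h];
    table[h] = e;
}

int main(void)
{
    insert(42, 7);
    printf("block 42 -> ssd slot %llu\n", (unsigned long long)lookup(42));
    printf("block 43 cached? %s\n", lookup(43) == NOT_CACHED ? "no" : "yes");
    return 0;
}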

When comparing the three solutions we need to consider these aspects.
1. User interface - This consists of commands used by users for creating, deleting, editing properties and recovering from error conditions.
2. Software interface - Where it interfaces to Linux kernel and applications.
3. Availability - What's the downtime when adding, deleting caches, making changes to cache configuration, conversion between cache modes, recovering after a crash, recovering from an error condition.
4. Security - Security holes, if any.
5. Portability - Which HDDs, SSDs, partitions, other block devices it works with.
6. Persistence of cache configuration - Once created, does the cache configuration persist across reboots? How are changes in device sequence or numbering handled?
7. Persistence of cached data - Does cached data remain across reboots/crashes/intermittent failures? Is the "sticky"ness of data configurable?
8. SSD life - Projected SSD life. Does the caching solution cause excessive write amplification, leading to early SSD failure?
9. Performance - Throughput is generally most important. Latency is another comparison point. Performance under different load classes can be measured.
10. ACID properties - Atomicity, Consistency, Isolation, Durability. Does the caching solution have these typical transactional database or filesystem properties? This includes avoiding the torn-page problem in crash and failure scenarios.
11. Error conditions - Handling power failures, intermittent and permanent device failures.
12. Configuration parameters for tuning according to applications.

We'll soon document EnhanceIO's behavior in the context of these aspects. We would appreciate it if dm-cache and bcache were documented similarly.

When comparing performance there are three levels at which it can be measured:
1. Architectural elements
1.1. Throughput for 100% cache hit case (in absence of dirty data clean-up)
1.2. Throughput for 0% cache hit case (in absence of dirty data clean-up)
1.3. Dirty data clean-up rate (in absence of IO)
2. Performance of architectural elements combined
2.1. Varying mix of read/write, sustained performance.
3. Application level testing - The more realistic the benchmark we work with, the better.

Thanks.
-Amit
-----Original Message-----
Sent: Wednesday, January 16, 2013 4:16 PM
To: device-mapper development
Cc: Mike Snitzer; LKML
Subject: Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching
software for Linux kernel
Hi Amit,
I'll look through EnhanceIO this week.
There are several cache solutions out there; bcache, my dm-cache and
EnhanceIO seem to be the favourites. I suspect none of them are
without drawbacks, so I'd like to see if we can maybe work together.
I think the first thing we need to do is make it easy to compare the
performance of these implementations.
I'll create a branch in my github tree with all three caches in. So
it's easy to build a kernel with them. (Mike's already combined
dm-cache and bcache and done some preliminary testing).
We've got some small test scenarios in our test suite that we run [1].
They certainly flatter dm-cache since it was developed using these.
It would be really nice if you could describe and provide scripts for
your test scenarios. I'll integrate them with the test suite, and then
I can have some confidence that I'm seeing EnhanceIO in its best light.
The 'transparent' cache issue is a valid one, but to be honest a bit
orthogonal to cache. Integrating dm more closely with the block layer
such that a dm stack can replace any device has been discussed for
years and I know Alasdair has done some preliminary design work on
this. Perhaps we can use your requirement to bump up the priority on
this work.
5. We have designed our writeback architecture from scratch.
Coalescing/bunching together of metadata writes and cleanup is much
improved after redesigning of the EnhanceIO-SSD interface. The DM
interface would have been too restrictive for this. EnhanceIO uses
set
level locking, which improves parallelism of IO, particularly for
writeback.
I sympathise with this; dm-cache would also like to see a higher level
view of the io, rather than being given the ios to remap one by one.
Let's start by working out how much of a benefit you've gained from
this and then go from there.
PROPRIETARY-CONFIDENTIAL INFORMATION INCLUDED
This electronic transmission, and any documents attached hereto, may
contain confidential, proprietary and/or legally privileged
information. The information is intended only for use by the
recipient
named above. If you received this electronic message in error, please
notify the sender and delete the electronic message. Any disclosure,
copying, distribution, or use of the contents of information received
in error is strictly prohibited, and violators will be pursued
legally.
Please do not use this signature when sending to dm-devel. If there's
proprietary information in the email you need to tell people up front
so they can choose not to read it.
- Joe
[1] https://github.com/jthornber/thinp-test-suite/tree/master/tests/cache
Kent Overstreet
2013-01-17 11:39:40 UTC
Permalink
Suppose I could fill out the bcache version...
Post by Amit Kale
Hi Joe, Kent,
[Adding Kent as well since bcache is mentioned below as one of the contenders for being integrated into mainline kernel.]
My understanding is that these three caching solutions all have three principle blocks.
1. A cache block lookup - This refers to finding out whether a block was cached or not and the location on SSD, if it was.
2. Block replacement policy - This refers to the algorithm for replacing a block when a new free block can't be found.
3. IO handling - This is about issuing IO requests to SSD and HDD.
4. Dirty data clean-up algorithm (for write-back only) - The dirty data clean-up algorithm decides when to write a dirty block in an SSD to its original location on HDD and executes the copy.
When comparing the three solutions we need to consider these aspects.
1. User interface - This consists of commands used by users for creating, deleting, editing properties and recovering from error conditions.
2. Software interface - Where it interfaces to Linux kernel and applications.
Both done with sysfs, at least for now.
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches, making changes to cache configuration, conversion between cache modes, recovering after a crash, recovering from an error condition.
All of that is done at runtime, without any interruption. bcache doesn't
distinguish between clean and unclean shutdown, which is nice because it
means the recovery code gets tested. Registering a cache device takes on
the order of half a second, for a large (half terabyte) cache.
Post by Amit Kale
4. Security - Security holes, if any.
Hope there aren't any!
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it works with.
Any block device.
Post by Amit Kale
6. Persistence of cache configuration - Once created does the cache configuration stay persistent across reboots. How are changes in device sequence or numbering handled.
Persistent. Device nodes are not stable across reboots, same as say scsi
devices if they get probed in a different order. It does persist a label
in the backing device superblock which can be used to implement stable
device nodes.
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across reboots/crashes/intermittent failures. Is the "sticky"ness of data configurable.
Persists across reboots. Can't be switched off, though it could be if
there was any demand.
Post by Amit Kale
8. SSD life - Projected SSD life. Does the caching solution cause too much of write amplification leading to an early SSD failure.
With LRU, there's only so much you can do to work around the SSD's FTL,
though bcache does try; allocation is done in terms of buckets, which
are on the order of a megabyte (configured when you format the cache
device). Buckets are written to sequentially, then rewritten later all
at once (and it'll issue a discard before rewriting a bucket if you flip
it on, it's not on by default because TRIM = slow).

Bcache also implements fifo cache replacement, and with that write
amplification should never be an issue.
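
As a rough illustration of the bucket scheme described above (filled sequentially, reclaimed FIFO), here is a toy userspace sketch; the constants and names are invented for the example and are not bcache's.

#include <stdint.h>
#include <stdio.h>

#define NR_BUCKETS     8          /* tiny for illustration */
#define BUCKET_SECTORS 2048       /* ~1MB buckets at 512-byte sectors */

struct bucket {
    uint64_t start;               /* first sector of the bucket on the SSD */
    unsigned fill;                /* sectors already written in this bucket */
};

static struct bucket buckets[NR_BUCKETS];
static unsigned active;           /* bucket currently being filled */

/* Allocate space for a write: buckets are filled strictly sequentially, and
 * the oldest bucket is reused (FIFO) when we wrap around, which keeps the
 * write pattern friendly to the SSD's FTL. */
static uint64_t alloc_sectors(unsigned nr)
{
    struct bucket *b = &buckets[active];

    if (b->fill + nr > BUCKET_SECTORS) {
        /* Bucket full: move to the next one, which would be invalidated
         * (and optionally discarded) before being rewritten. */
        active = (active + 1) % NR_BUCKETS;
        b = &buckets[active];
        b->fill = 0;
    }
    uint64_t sector = b->start + b->fill;
    b->fill += nr;
    return sector;
}

int main(void)
{
    for (unsigned i = 0; i < NR_BUCKETS; i++)
        buckets[i].start = (uint64_t)i * BUCKET_SECTORS;

    for (int i = 0; i < 5; i++)
        printf("write %d -> sector %llu\n", i,
               (unsigned long long)alloc_sectors(1000));
    return 0;
}
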
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is also one more performance comparison point. Performance under different load classes can be measured.
10. ACID properties - Atomicity, Concurrency, Idempotent, Durability. Does the caching solution have these typical transactional database or filesystem properties. This includes avoiding torn-page problem amongst crash and failure scenarios.
Yes.
Post by Amit Kale
11. Error conditions - Handling power failures, intermittent and permanent device failures.
Power failures and device failures yes, intermittent failures are not
explicitly handled.
Post by Amit Kale
12. Configuration parameters for tuning according to applications.
Lots. The most important one is probably sequential bypass - you don't
typically want to cache your big sequential IO, because rotating disks
do fine at that. So bcache detects sequential IO and bypasses it with a
configurable threshold.

There's also stuff for bypassing more data if the SSD is overloaded - if
you're caching many disks with a single SSD, you don't want the SSD to
be the bottleneck. So it tracks latency to the SSD and cranks down the
sequential bypass threshold if it gets too high.
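
A minimal sketch of the sequential bypass idea (not bcache's actual code; the cutoff value and names are illustrative): track where the last IO in a stream ended and accumulate the contiguous run; once the run exceeds the cutoff, send the IO straight to the backing device. The congestion logic described above would then lower the cutoff when measured SSD latency climbs too high.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative threshold: bypass the cache once a contiguous run of IO
 * exceeds this many bytes (the real cutoff is configurable). */
static uint64_t seq_cutoff = 4 * 1024 * 1024;

struct stream {
    uint64_t next_offset;   /* where we expect the next IO to start */
    uint64_t run_bytes;     /* contiguous bytes seen so far */
};

static bool should_bypass(struct stream *s, uint64_t offset, uint64_t len)
{
    if (offset == s->next_offset)
        s->run_bytes += len;      /* continues the sequential run */
    else
        s->run_bytes = len;       /* random access: start a new run */

    s->next_offset = offset + len;
    return s->run_bytes > seq_cutoff;
}

int main(void)
{
    struct stream s = { 0, 0 };
    uint64_t off = 0;

    for (int i = 0; i < 12; i++) {
        printf("io at %llu: %s\n", (unsigned long long)off,
               should_bypass(&s, off, 512 * 1024) ? "bypass" : "cache");
        off += 512 * 1024;
    }
    return 0;
}
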
Post by Amit Kale
We'll soon document EnhanceIO behavior in context of these aspects. We'll appreciate if dm-cache and bcache is also documented.
When comparing performance there are three levels at which it can be measured
1. Architectural elements
1.1. Throughput for 100% cache hit case (in absence of dirty data clean-up)
North of a million iops.
Post by Amit Kale
1.2. Throughput for 0% cache hit case (in absence of dirty data clean-up)
Also relevant is whether you're adding the data to the cache. I'm sure
bcache is slightly slower than the raw backing device here, but if it's
noticeable it's a bug (I haven't benchmarked that specifically in ages).
Post by Amit Kale
1.3. Dirty data clean-up rate (in absence of IO)
Background writeback is done by scanning the btree in the background for
dirty data, and then writing it out in lba order - so the writes are as
sequential as they're going to get. It's fast.
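
The gist of that writeback pass as a toy userspace sketch (illustrative only, not the actual bcache writeback code): collect the dirty blocks, sort them by backing-device LBA, and issue the copies in that order.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct dirty_block {
    uint64_t hdd_lba;    /* original location on the backing device */
    uint64_t ssd_slot;   /* where the dirty copy lives in the cache */
};

static int cmp_lba(const void *a, const void *b)
{
    const struct dirty_block *x = a, *y = b;
    return (x->hdd_lba > y->hdd_lba) - (x->hdd_lba < y->hdd_lba);
}

/* Writeback pass: gather dirty blocks (in bcache by scanning the btree),
 * sort them by backing-device LBA and issue the copies in that order, so
 * the HDD sees writes that are as sequential as possible. */
static void writeback_pass(struct dirty_block *blocks, size_t n)
{
    qsort(blocks, n, sizeof(*blocks), cmp_lba);
    for (size_t i = 0; i < n; i++)
        printf("copy ssd slot %llu -> hdd lba %llu\n",
               (unsigned long long)blocks[i].ssd_slot,
               (unsigned long long)blocks[i].hdd_lba);
}

int main(void)
{
    struct dirty_block dirty[] = {
        { 900, 3 }, { 12, 0 }, { 512, 7 }, { 13, 1 },
    };
    writeback_pass(dirty, sizeof(dirty) / sizeof(dirty[0]));
    return 0;
}
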
Post by Amit Kale
2. Performance of architectural elements combined
2.1. Varying mix of read/write, sustained performance.
Random write performance is definitely important, as there you've got to
keep an index up to date on stable storage (if you want to handle
unclean shutdown, anyways). Making that fast is non trivial. Bcache is
about as efficient as you're going to get w.r.t. metadata writes,
though.
Post by Amit Kale
3. Application level testing - The more real-life like benchmark we work with, the better it is.
Amit Kale
2013-01-17 17:17:17 UTC
Permalink
Thanks for a prompt reply.
Post by Kent Overstreet
Suppose I could fill out the bcache version...
Post by Amit Kale
Hi Joe, Kent,
[Adding Kent as well since bcache is mentioned below as one of the
contenders for being integrated into mainline kernel.]
My understanding is that these three caching solutions all have three
principle blocks.
Post by Amit Kale
1. A cache block lookup - This refers to finding out whether a block
was cached or not and the location on SSD, if it was.
Post by Amit Kale
2. Block replacement policy - This refers to the algorithm for
replacing a block when a new free block can't be found.
Post by Amit Kale
3. IO handling - This is about issuing IO requests to SSD and HDD.
4. Dirty data clean-up algorithm (for write-back only) - The dirty
data clean-up algorithm decides when to write a dirty block in an SSD
to its original location on HDD and executes the copy.
Post by Amit Kale
When comparing the three solutions we need to consider these aspects.
1. User interface - This consists of commands used by users for
creating, deleting, editing properties and recovering from error
conditions.
Post by Amit Kale
2. Software interface - Where it interfaces to Linux kernel and
applications.
Both done with sysfs, at least for now.
sysfs is the user interface. Bcache creates a new block device, so it interfaces to the Linux kernel at the block device layer. The HDD and SSD interfaces would be via submit_bio (please correct me if this is wrong).
Post by Kent Overstreet
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches,
making changes to cache configuration, conversion between cache modes,
recovering after a crash, recovering from an error condition.
All of that is done at runtime, without any interruption. bcache
doesn't distinguish between clean and unclean shutdown, which is nice
because it means the recovery code gets tested. Registering a cache
device takes on the order of half a second, for a large (half terabyte)
cache.
Since a new device is created, you need to bring down applications the first time a cache is created. From then on it is online. Similarly, applications need to be brought down when deleting a cache. fstab changes etc. also need to be made. My guess is all this requires some effort and understanding by a system administrator. Does fstab work without any manual editing if it contains labels instead of device paths?
Post by Kent Overstreet
Post by Amit Kale
4. Security - Security holes, if any.
Hope there aren't any!
All three caches can be operated only by root, so as long as there are no bugs there is no need to worry about security loopholes.
Post by Kent Overstreet
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it
works with.
Any block device.
Post by Amit Kale
6. Persistence of cache configuration - Once created does the cache
configuration stay persistent across reboots. How are changes in device
sequence or numbering handled.
Persistent. Device nodes are not stable across reboots, same as say
scsi devices if they get probed in a different order. It does persist a
label in the backing device superblock which can be used to implement
stable device nodes.
Can this be embedded in a udev script so that the configuration becomes persistent regardless of probing order? What happens if either the SSD or the HDD is absent when a system comes up? Does it work with iSCSI HDDs? iSCSI HDDs can be tricky during shutdown, specifically if the iSCSI device goes offline before the cache saves its metadata.
Post by Kent Overstreet
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across
reboots/crashes/intermittent failures. Is the "sticky"ness of data
configurable.
Persists across reboots. Can't be switched off, though it could be if
there was any demand.
Believe me, enterprise customers do require a cache to be non-persistent. This is because of a paranoia that the HDD and SSD may go out of sync after a shutdown and before a reboot. This is primarily in environments with a large number of HDDs accessed through a complicated iSCSI-based setup, perhaps with software RAID.
Post by Kent Overstreet
Post by Amit Kale
8. SSD life - Projected SSD life. Does the caching solution cause too
much of write amplification leading to an early SSD failure.
With LRU, there's only so much you can do to work around the SSD's FTL,
though bcache does try; allocation is done in terms of buckets, which
are on the order of a megabyte (configured when you format the cache
device). Buckets are written to sequentially, then rewritten later all
at once (and it'll issue a discard before rewriting a bucket if you
flip it on, it's not on by default because TRIM = slow).
Bcache also implements fifo cache replacement, and with that write
amplification should never be an issue.
Most SSDs contain a fairly sophisticated FTL doing wear-leveling. Wear-leveling only helps by evenly balancing over-writes across an entire SSD. Do you have statistics on how many SSD writes are generated per block read from or written to the HDD? Metadata writes should be done only for the affected sectors, or else they contribute to more SSD-internal writes. There is also a common debate on whether writing a single sector is more beneficial compared to writing the whole block containing that sector.
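
One way to phrase the statistic being asked for, as a small illustrative sketch (the structure and field names are invented for the example): report the ratio of all SSD writes, data plus metadata, to the user IO that caused them.

#include <stdio.h>

/* Illustrative bookkeeping only: one possible number a cache could report. */
struct ssd_write_stats {
    unsigned long long user_bytes;      /* data the application wrote or read */
    unsigned long long data_bytes;      /* cache data actually written to SSD */
    unsigned long long metadata_bytes;  /* metadata written to SSD */
};

static double write_amplification(const struct ssd_write_stats *s)
{
    if (!s->user_bytes)
        return 0.0;
    return (double)(s->data_bytes + s->metadata_bytes) / (double)s->user_bytes;
}

int main(void)
{
    /* e.g. a 4k user write that also forced a full-sector metadata update */
    struct ssd_write_stats s = { 4096, 4096, 512 };
    printf("write amplification: %.2f\n", write_amplification(&s));
    return 0;
}
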
Post by Kent Overstreet
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under different
load classes can be measured.
Post by Amit Kale
10. ACID properties - Atomicity, Concurrency, Idempotent, Durability.
Does the caching solution have these typical transactional database or
filesystem properties. This includes avoiding torn-page problem amongst
crash and failure scenarios.
Yes.
Post by Amit Kale
11. Error conditions - Handling power failures, intermittent and
permanent device failures.
Power failures and device failures yes, intermittent failures are not
explicitly handled.
The IO completion guarantee offered on intermittent failures should be as good as HDD.
Post by Kent Overstreet
Post by Amit Kale
12. Configuration parameters for tuning according to applications.
Lots. The most important one is probably sequential bypass - you don't
typically want to cache your big sequential IO, because rotating disks
do fine at that. So bcache detects sequential IO and bypasses it with a
configurable threshold.
There's also stuff for bypassing more data if the SSD is overloaded -
if you're caching many disks with a single SSD, you don't want the SSD
to be the bottleneck. So it tracks latency to the SSD and cranks down
the sequential bypass threshold if it gets too high.
That's interesting. I'll definitely want to read this part of the source code.
Post by Kent Overstreet
Post by Amit Kale
We'll soon document EnhanceIO behavior in context of these aspects.
We'll appreciate if dm-cache and bcache is also documented.
Post by Amit Kale
When comparing performance there are three levels at which it can be
measured 1. Architectural elements 1.1. Throughput for 100% cache hit
case (in absence of dirty data clean-up)
North of a million iops.
Post by Amit Kale
1.2. Throughput for 0% cache hit case (in absence of dirty data clean-up)
Also relevant whether you're adding the data to the cache. I'm sure
bcache is slightly slower than the raw backing device here, but if it's
noticable it's a bug (I haven't benchmarked that specifically in ages).
Post by Amit Kale
1.3. Dirty data clean-up rate (in absence of IO)
Background writeback is done by scanning the btree in the background
for dirty data, and then writing it out in lba order - so the writes
are as sequential as they're going to get. It's fast.
Great.

Thanks.
-Amit
Post by Kent Overstreet
Post by Amit Kale
2. Performance of architectural elements combined 2.1. Varying mix of
read/write, sustained performance.
Random write performance is definitely important, as there you've got
to keep an index up to date on stable storage (if you want to handle
unclean shutdown, anyways). Making that fast is non trivial. Bcache is
about as efficient as you're going to get w.r.t. metadata writes,
though.
Post by Amit Kale
3. Application level testing - The more real-life like benchmark we
work with, the better it is.
Kent Overstreet
2013-01-24 23:45:24 UTC
Permalink
Post by Kent Overstreet
Suppose I could fill out the bcache version...
Post by Amit Kale
11. Error conditions - Handling power failures, intermittent and permanent device failures.
Power failures and device failures yes, intermittent failures are not
explicitly handled.
A coworker pointed out that bcache actually does handle some intermittent IO errors. I
just added error handling to the documentation:
http://atlas.evilpiepirate.org/git/linux-bcache.git/tree/Documentation/bcache.txt?h=bcache-dev

To cut and paste,

Bcache tries to transparently handle IO errors to/from the cache device without
affecting normal operation; if it sees too many errors (the threshold is
configurable, and defaults to 0) it shuts down the cache device and switches all
the backing devices to passthrough mode.

- For reads from the cache, if they error we just retry the read from the
backing device.

- For writethrough writes, if the write to the cache errors we just switch to
invalidating the data at that lba in the cache (i.e. the same thing we do for
a write that bypasses the cache)

- For writeback writes, we currently pass that error back up to the
filesystem/userspace. This could be improved - we could retry it as a write
that skips the cache so we don't have to error the write.

- When we detach, we first try to flush any dirty data (if we were running in
writeback mode). It currently doesn't do anything intelligent if it fails to
read some of the dirty data, though.
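
The policy quoted above boils down to something like the following toy sketch (illustrative only, not the actual bcache error paths; names are made up): count cache-device errors, retry failed cache reads against the backing device, and fall back to passthrough once the configurable threshold is exceeded.

#include <stdbool.h>
#include <stdio.h>

enum io_target { CACHE_DEV, BACKING_DEV };

static unsigned error_threshold;       /* "defaults to 0" per the doc above */
static unsigned cache_errors;
static bool cache_enabled = true;

/* Called whenever an IO to the cache device fails. */
static void note_cache_error(void)
{
    if (++cache_errors > error_threshold) {
        cache_enabled = false;         /* shut the cache down ... */
        printf("too many cache errors: switching to passthrough\n");
    }
}

/* A cached read that fails is simply retried against the backing device. */
static enum io_target handle_read_error(void)
{
    note_cache_error();
    return BACKING_DEV;
}

int main(void)
{
    printf("retrying read from %s\n",
           handle_read_error() == BACKING_DEV ? "backing device" : "cache");
    printf("cache enabled: %s\n", cache_enabled ? "yes" : "no");
    return 0;
}
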
thornber-H+wXaHxf7aLQT0dZR+
2013-01-17 13:26:21 UTC
Permalink
Post by Amit Kale
Hi Joe, Kent,
[Adding Kent as well since bcache is mentioned below as one of the contenders for being integrated into mainline kernel.]
My understanding is that these three caching solutions all have three principle blocks.
Let me try and explain how dm-cache works.
Post by Amit Kale
1. A cache block lookup - This refers to finding out whether a block was cached or not and the location on SSD, if it was.
Of course we have this, but it's part of the policy plug-in. I've
done this because the policy nearly always needs to do some book
keeping (eg, update a hit count when accessed).
Post by Amit Kale
2. Block replacement policy - This refers to the algorithm for replacing a block when a new free block can't be found.
I think there's more than just this. These are the tasks that I hand
over to the policy:

a) _Which_ blocks should be promoted to the cache. This seems to be
the key decision in terms of performance. Blindly trying to
promote every io or even just every write will lead to some very
bad performance in certain situations.

The mq policy uses a multiqueue (effectively a partially sorted
lru list) to keep track of candidate block hit counts. When
candidates get enough hits they're promoted. The promotion
threshold is periodically recalculated by looking at the hit
counts for the blocks already in the cache. (See the sketch
after this list.)

The hit counts should degrade over time (for some definition of
time; eg. io volume). I've experimented with this, but not yet
come up with a satisfactory method.

I read through EnhanceIO yesterday, and think this is where
you're lacking.

b) When should a block be promoted. If you're swamped with io, then
adding copy io is probably not a good idea. Current dm-cache
just has a configurable threshold for the promotion/demotion io
volume. If you or Kent have some ideas for how to approximate
the bandwidth of the devices I'd really like to hear about it.

c) Which blocks should be demoted?

This is the bit that people commonly think of when they say
'caching algorithm'. Examples are lru, arc, etc. Such
descriptions are fine when describing a cache where elements
_have_ to be promoted before they can be accessed, for example a
cpu memory cache. But we should be aware that 'lru' for example
really doesn't tell us much in the context of our policies.

The mq policy uses a blend of lru and lfu for eviction, it seems
to work well.
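
Here is the toy sketch referred to in (a), purely illustrative and not the real mq policy (the table size and threshold are invented): hit counts are kept for candidate blocks, and a block is promoted once its count crosses a threshold that would periodically be recomputed from the hit counts of blocks already in the cache.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_CANDIDATES 8

struct candidate {
    uint64_t block;
    unsigned hits;
};

static struct candidate candidates[NR_CANDIDATES];
static unsigned promote_threshold = 4;   /* recomputed periodically in dm-cache */

/* Record a miss on 'block'; return true when its hit count says it should
 * now be promoted (copied) into the cache. */
static bool note_miss(uint64_t block)
{
    struct candidate *victim = &candidates[0];

    for (unsigned i = 0; i < NR_CANDIDATES; i++) {
        if (candidates[i].hits && candidates[i].block == block)
            return ++candidates[i].hits >= promote_threshold;
        if (candidates[i].hits < victim->hits)
            victim = &candidates[i];
    }
    /* Not tracked yet: reuse the coldest candidate slot and start counting. */
    victim->block = block;
    victim->hits = 1;
    return promote_threshold <= 1;
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        printf("access %d to block 99: promote=%d\n", i, note_miss(99));
    return 0;
}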

A couple of other things I should mention; dm-cache uses a large block
size compared to eio. eg, 64k - 1m. This is a mixed blessing;

- our copy io is more efficient (we don't have to worry about
batching migrations together so much. Something eio is careful to
do).

- we have fewer blocks to hold stats about, so can keep more info per
block in the same amount of memory.

- We trigger more copying. For example, if an incoming write triggers
a promotion from the origin to the cache and the io covers a whole
block, we can avoid any copy from the origin to the cache. With a
bigger block size this optimisation happens less frequently.

- We waste SSD space. eg, a 4k hotspot could trigger a whole block
to be moved to the cache.


We do not keep the dirty state of cache blocks up to date on the
metadata device. Instead we have a 'mounted' flag that's set in the
metadata when opened. When a clean shutdown occurs (eg, dmsetup
suspend my-cache) the dirty bits are written out and the mounted flag
cleared. On a crash the mounted flag will still be set on reopen and
all dirty flags degrade to 'dirty'. Correct me if I'm wrong, but I
think eio is holding io completion until the dirty bits have been
committed to disk?
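
For reference, the clean/unclean shutdown scheme described in the previous paragraph, as a minimal illustrative sketch (not dm-cache's metadata code; the structures are made up): set a flag on open, write the dirty bits and clear the flag on clean shutdown, and degrade everything to dirty if the flag is still set at open time.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NR_CACHE_BLOCKS 16

struct cache_metadata {
    bool mounted;                       /* set while the cache is open */
    bool dirty[NR_CACHE_BLOCKS];        /* persisted only on clean shutdown */
};

static void open_cache(struct cache_metadata *md)
{
    if (md->mounted)
        /* Crash: the on-disk dirty bits are stale, so degrade every block
         * to dirty and let writeback sort it out. */
        memset(md->dirty, true, sizeof(md->dirty));
    md->mounted = true;                 /* ... then persist the flag */
}

static void clean_shutdown(struct cache_metadata *md)
{
    /* Write out the in-core dirty bits, then clear the flag. */
    md->mounted = false;
}

int main(void)
{
    struct cache_metadata md = { 0 };

    open_cache(&md);                    /* first, clean open */
    md.dirty[3] = true;                 /* some write-back activity */

    /* Reopening without clean_shutdown() models a crash. */
    open_cache(&md);
    printf("block 0 dirty after crash: %d\n", md.dirty[0]);

    clean_shutdown(&md);
    return 0;
}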

I really view dm-cache as a slow-moving hotspot optimiser, whereas I
think eio and bcache take much more of a hierarchical storage approach,
where writes go through the cache if possible?
Post by Amit Kale
3. IO handling - This is about issuing IO requests to SSD and HDD.
I get most of this for free via dm and kcopyd. I'm really keen to
see how bcache does; it's more invasive of the block layer, so I'm
expecting it to show far better performance than dm-cache.
Post by Amit Kale
4. Dirty data clean-up algorithm (for write-back only) - The dirty
data clean-up algorithm decides when to write a dirty block in an
SSD to its original location on HDD and executes the copy.

Yep.
Post by Amit Kale
When comparing the three solutions we need to consider these aspects.
1. User interface - This consists of commands used by users for
creating, deleting, editing properties and recovering from error
conditions.

I was impressed how easy eio was to use yesterday when I was playing
with it. Well done.

Driving dm-cache through dmsetup isn't much more of a hassle,
though. We've decided to pass policy-specific params on the
target line, and tweak via a dm message (again simple via dmsetup).
I don't think this is as simple as exposing them through something
like sysfs, but it is more in keeping with the device-mapper way.
Post by Amit Kale
2. Software interface - Where it interfaces to Linux kernel and applications.
See above.
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches,
making changes to cache configuration, conversion between cache
modes, recovering after a crash, recovering from an error condition.

Normal dm suspend, alter table, resume cycle. The LVM tools do this
all the time.
Post by Amit Kale
4. Security - Security holes, if any.
Well I saw the comment in your code describing the security flaw you
think you've got. I hope we don't have any, I'd like to understand
your case more.
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it works with.
I think we all work with any block device. But eio and bcache can
overlay any device node, not just a dm one. As mentioned in earlier
email I really think this is a dm issue, not specific to dm-cache.
Post by Amit Kale
6. Persistence of cache configuration - Once created does the cache
configuration stay persistent across reboots. How are changes in
device sequence or numbering handled.

We've gone for no persistence of policy parameters. Instead
everything is handed into the kernel when the target is setup. This
decision was made by the LVM team who wanted to store this
information themselves (we certainly shouldn't store it in two
places at once). I don't feel strongly either way, and could
persist the policy params v. easily (eg, 1 day's work).

One thing I do provide is a 'hint' array for the policy to use and
persist. The policy specifies how much data it would like to store
per cache block, and then writes it on clean shutdown (hence 'hint',
it has to cope without this, possibly with temporarily degraded
performance). The mq policy uses the hints to store hit counts.
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across
reboots/crashes/intermittent failures. Is the "sticky"ness of data
configurable.

Surely this is a given? A cache would be trivial to write if it
didn't need to be crash proof.
Post by Amit Kale
8. SSD life - Projected SSD life. Does the caching solution cause
too much of write amplification leading to an early SSD failure.

No, I decided years ago that life was too short to start optimising
for specific block devices. By the time you get it right the
hardware characteristics will have moved on. Doesn't the firmware
on SSDs try and even out io wear these days?

That said I think we evenly use the SSD. Except for the superblock
on the metadata device.
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under
different load classes can be measured.

I think latency is more important than throughput. Spindles are
pretty good at throughput. In fact the mq policy tries to spot when
we're doing large linear ios and stops hit counting; best leave this
stuff on the spindle.
Post by Amit Kale
10. ACID properties - Atomicity, Concurrency, Idempotent,
Durability. Does the caching solution have these typical
transactional database or filesystem properties. This includes
avoiding torn-page problem amongst crash and failure scenarios.

Could you expand on the torn-page issue please?
Post by Amit Kale
11. Error conditions - Handling power failures, intermittent and permanent device failures.
I think the area where dm-cache is currently lacking is intermittent
failures. For example if a cache read fails we just pass that error
up, whereas eio sees if the block is clean and if so tries to read
off the origin. I'm not sure which behaviour is correct; I like to
know about disk failure early.
Post by Amit Kale
12. Configuration parameters for tuning according to applications.
Discussed above.
Post by Amit Kale
We'll soon document EnhanceIO behavior in context of these
aspects. We'll appreciate if dm-cache and bcache is also documented.

I hope the above helps. Please ask away if you're unsure about
something.
Post by Amit Kale
When comparing performance there are three levels at which it can be measured
Developing these caches is tedious. Test runs take time, and really
slow the dev cycle down. So I suspect we've all been using
microbenchmarks that run in a few minutes.

Let's get our pool of microbenchmarks together, then work on some
application level ones (we're happy to put some time into developing
these).

- Joe
Amit Kale
2013-01-17 17:53:11 UTC
Permalink
Post by Amit Kale
Post by Amit Kale
Hi Joe, Kent,
[Adding Kent as well since bcache is mentioned below as one of the
contenders for being integrated into mainline kernel.]
My understanding is that these three caching solutions all have three
principle blocks.
Let me try and explain how dm-cache works.
Post by Amit Kale
1. A cache block lookup - This refers to finding out whether a block
was cached or not and the location on SSD, if it was.
Of course we have this, but it's part of the policy plug-in. I've done
this because the policy nearly always needs to do some book keeping
(eg, update a hit count when accessed).
Post by Amit Kale
2. Block replacement policy - This refers to the algorithm for
replacing a block when a new free block can't be found.
I think there's more than just this. These are the tasks that I hand
a) _Which_ blocks should be promoted to the cache. This seems to be
the key decision in terms of performance. Blindly trying to
promote every io or even just every write will lead to some very
bad performance in certain situations.
The mq policy uses a multiqueue (effectively a partially sorted
lru list) to keep track of candidate block hit counts. When
candidates get enough hits they're promoted. The promotion
threshold his periodically recalculated by looking at the hit
counts for the blocks already in the cache.
A multi-queue algorithm typically results in significant metadata overhead. What percentage overhead does that imply here?
Post by Amit Kale
The hit counts should degrade over time (for some definition of
time; eg. io volume). I've experimented with this, but not yet
come up with a satisfactory method.
I read through EnhanceIO yesterday, and think this is where
you're lacking.
We have an LRU policy at the cache set level. The effectiveness of the LRU policy depends on the average duration of a block in the working dataset. If that duration is short enough that a block is usually "hit" before it's evicted, LRU works better than other policies.
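
As an illustration of what set-level LRU means here (a toy sketch, not EnhanceIO's code; the associativity and names are invented): each cache set keeps its own recency information, a hit only touches that set, and the victim on a miss is chosen within the same set. Locking at the same granularity is what lets independent sets proceed in parallel.

#include <stdio.h>

#define SET_SIZE 4                 /* illustrative associativity */

struct cache_set {
    unsigned long block[SET_SIZE]; /* HDD blocks cached in this set */
    unsigned age[SET_SIZE];        /* larger = more recently used */
    unsigned clock;
};

/* On a hit, just bump the block's age; on a miss, evict the set's LRU
 * entry.  Metadata updates would also be confined to this set. */
static int access_set(struct cache_set *s, unsigned long blk)
{
    unsigned lru = 0;

    for (unsigned i = 0; i < SET_SIZE; i++) {
        if (s->block[i] == blk) {
            s->age[i] = ++s->clock;
            return 1;              /* hit */
        }
        if (s->age[i] < s->age[lru])
            lru = i;
    }
    s->block[lru] = blk;           /* miss: replace the LRU entry */
    s->age[lru] = ++s->clock;
    return 0;
}

int main(void)
{
    struct cache_set set = { 0 };
    unsigned long trace[] = { 10, 11, 12, 13, 10, 14, 10 };

    for (unsigned i = 0; i < sizeof(trace) / sizeof(trace[0]); i++)
        printf("block %lu: %s\n", trace[i],
               access_set(&set, trace[i]) ? "hit" : "miss");
    return 0;
}
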
Post by Amit Kale
b) When should a block be promoted. If you're swamped with io, then
adding copy io is probably not a good idea. Current dm-cache
just has a configurable threshold for the promotion/demotion io
volume. If you or Kent have some ideas for how to approximate
the bandwidth of the devices I'd really like to hear about it.
c) Which blocks should be demoted?
This is the bit that people commonly think of when they say
'caching algorithm'. Examples are lru, arc, etc. Such
descriptions are fine when describing a cache where elements
_have_ to be promoted before they can be accessed, for example a
cpu memory cache. But we should be aware that 'lru' for example
really doesn't tell us much in the context of our policies.
The mq policy uses a blend of lru and lfu for eviction, it seems
to work well.
A couple of other things I should mention; dm-cache uses a large block
size compared to eio. eg, 64k - 1m. This is a mixed blessing;
Yes. We had a lot of debate internally on the block size. For now we have restricted it to 2k, 4k and 8k. We found that larger block sizes result in too much internal fragmentation, in spite of a significant reduction in metadata size. 8k is adequate for Oracle and MySQL.
Post by Amit Kale
- our copy io is more efficient (we don't have to worry about
batching migrations together so much. Something eio is careful to
do).
- we have fewer blocks to hold stats about, so can keep more info per
block in the same amount of memory.
- We trigger more copying. For example if an incoming write triggers
a promotion from the origin to the cache, and the io covers a block
we can avoid any copy from the origin to cache. With a bigger
block size this optmisation happens less frequently.
- We waste SSD space. eg, a 4k hotspot could trigger a whole block
to be moved to the cache.
We do not keep the dirty state of cache blocks up to date on the
metadata device. Instead we have a 'mounted' flag that's set in the
metadata when opened. When a clean shutdown occurs (eg, dmsetup
suspend my-cache) the dirty bits are written out and the mounted flag
cleared. On a crash the mounted flag will still be set on reopen and
all dirty flags degrade to 'dirty'.
Not sure I understand this. Is there a guarantee that once an IO is reported as "done" to the upstream layer (filesystem/database/application), it is persistent? The persistence should be guaranteed even if there is an OS crash immediately after the status is reported. Persistence should be guaranteed for the entire IO range. The next time the application tries to read it, it should get the updated data, not stale data.
Post by Amit Kale
Correct me if I'm wrong, but I
think eio is holding io completion until the dirty bits have been
committed to disk?
That's correct. In addition to this, we try to batch metadata updates if multiple IOs occur in the same cache set.
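
The ordering being described, as a minimal illustrative sketch (not EnhanceIO's actual code; function names are invented): the cached data for a batch of IOs in one cache set is written first, one metadata commit covers the batch, and only then are the IOs acknowledged upstream.

#include <stdio.h>

/* Illustrative only: the ordering, not the real code paths. */
static void ssd_write_data(unsigned long block)   { printf("data -> SSD (block %lu)\n", block); }
static void ssd_write_metadata(unsigned long set) { printf("metadata commit for set %lu\n", set); }
static void complete_io(unsigned long block)      { printf("ack block %lu upstream\n", block); }

/* Write-back write path: the application is only told the IO is done after
 * both the cached data and the metadata describing it are durable, so a
 * crash can never leave an acknowledged write unrecorded or torn. */
static void writeback_write(unsigned long set, unsigned long *blocks, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        ssd_write_data(blocks[i]);

    ssd_write_metadata(set);           /* one commit covers the whole batch */

    for (unsigned i = 0; i < n; i++)
        complete_io(blocks[i]);        /* only now is completion reported */
}

int main(void)
{
    unsigned long batch[] = { 100, 101, 102 };
    writeback_write(7, batch, 3);
    return 0;
}
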
Post by Amit Kale
I really view dm-cache as a slow moving hotspot optimiser. Whereas I
think eio and bcache are much more of a heirarchical storage approach,
where writes go through the cache if possible?
Generally speaking, yes. EIO enforces dirty data limits to avoid the situation where too much of the SSD is used for storing dirty data, reducing the effectiveness of the cache for reads.
Post by Amit Kale
Post by Amit Kale
3. IO handling - This is about issuing IO requests to SSD and HDD.
I get most of this for free via dm and kcopyd. I'm really keen to
see how bcache does; it's more invasive of the block layer, so I'm
expecting it to show far better performance than dm-cache.
Post by Amit Kale
4. Dirty data clean-up algorithm (for write-back only) - The dirty
data clean-up algorithm decides when to write a dirty block in an
SSD to its original location on HDD and executes the copy.
Yep.
Post by Amit Kale
When comparing the three solutions we need to consider these aspects.
1. User interface - This consists of commands used by users for
creating, deleting, editing properties and recovering from error
conditions.
I was impressed how easy eio was to use yesterday when I was playing
with it. Well done.
Driving dm-cache through dm-setup isn't much more of a hassle
though. Though we've decided to pass policy specific params on the
target line, and tweak via a dm message (again simple via dmsetup).
I don't think this is as simple as exposing them through something
like sysfs, but it is more in keeping with the device-mapper way.
You have the benefit of using a well-known dm interface.
Post by Amit Kale
Post by Amit Kale
2. Software interface - Where it interfaces to Linux kernel and
applications.
See above.
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches,
making changes to cache configuration, conversion between cache
modes, recovering after a crash, recovering from an error condition.
Normal dm suspend, alter table, resume cycle. The LVM tools do this
all the time.
Cache creation and deletion will require stopping applications, unmounting filesystems and then remounting and restarting the applications. In addition, a sysadmin will need to update fstab entries. Do fstab entries work automatically if they use labels instead of full device paths?

Same with changes to cache configuration.
Post by Amit Kale
Post by Amit Kale
4. Security - Security holes, if any.
Well I saw the comment in your code describing the security flaw you
think you've got. I hope we don't have any, I'd like to understand
your case more.
Could you elaborate on which comment you are referring to? Since all three caching solutions allow access only to the root user, my belief is that there are no security holes. I have listed it here as it's an important consideration for enterprise users.
Post by Amit Kale
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it
works with.
I think we all work with any block device. But eio and bcache can
overlay any device node, not just a dm one. As mentioned in earlier
email I really think this is a dm issue, not specific to dm-cache.
DM was never meant to be cascaded. So it's ok for DM.

We recommend that our customers use RAID for the SSD when running write-back. This is because an SSD failure leads to catastrophic data loss (the dirty data). We support using an md device as an SSD. There are some issues with md devices in the code published on github. I'll get back with a code fix next week.
Post by Amit Kale
Post by Amit Kale
6. Persistence of cache configuration - Once created does the cache
configuration stay persistent across reboots. How are changes in
device sequence or numbering handled.
We've gone for no persistence of policy parameters. Instead
everything is handed into the kernel when the target is setup. This
decision was made by the LVM team who wanted to store this
information themselves (we certainly shouldn't store it in two
places at once). I don't feel strongly either way, and could
persist the policy params v. easily (eg, 1 days work).
Storing persistence information in a single place makes sense.
Post by Amit Kale
One thing I do provide is a 'hint' array for the policy to use and
persist. The policy specifies how much data it would like to store
per cache block, and then writes it on clean shutdown (hence 'hint',
it has to cope without this, possibly with temporarily degraded
performance). The mq policy uses the hints to store hit counts.
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across
reboots/crashes/intermittent failures. Is the "sticky"ness of data
configurable.
Surely this is a given? A cache would be trivial to write if it
didn't need to be crash proof.
There has to be a way to make it either persistent or volatile depending on how users want it. Enterprise users are sometimes paranoid about HDD and SSD going out of sync after a system shutdown and before a bootup. This is typically for large complicated iSCSI based shared HDD setups.
Post by Amit Kale
Post by Amit Kale
8. SSD life - Projected SSD life. Does the caching solution cause
too much of write amplification leading to an early SSD failure.
No, I decided years ago that life was too short to start optimising
for specific block devices. By the time you get it right the
hardware characteristics will have moved on. Doesn't the firmware
on SSDs try and even out io wear these days?
That's correct. We don't have to worry about wear leveling. All of the competent SSDs around do that.

What I wanted to bring up was how many SSD writes a cache read/write results in. Write-back mode is especially taxing on SSDs in this respect.
Post by Amit Kale
That said I think we evenly use the SSD. Except for the superblock
on the metadata device.
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under
different load classes can be measured.
I think latency is more important than throughput. Spindles are
pretty good at throughput. In fact the mq policy tries to spot when
we're doing large linear ios and stops hit counting; best leave this
stuff on the spindle.
I disagree. Latency is taken care of automatically when the number of application threads rises.
Post by Amit Kale
Post by Amit Kale
10. ACID properties - Atomicity, Concurrency, Idempotent,
Durability. Does the caching solution have these typical
transactional database or filesystem properties. This includes
avoiding torn-page problem amongst crash and failure scenarios.
Could you expand on the torn-page issue please?
Databases run into a torn-page error when an IO is found to be only partially written when it was supposed to be fully written. This matters particularly when the IO was already reported as "done". The original flashcache code we started with over a year ago showed the torn-page problem in extremely rare crashes in write-back mode. Our present code contains specific design elements to avoid it.
Post by Amit Kale
Post by Amit Kale
11. Error conditions - Handling power failures, intermittent and
permanent device failures.
I think the area where dm-cache is currently lacking is intermittent
failures. For example if a cache read fails we just pass that error
up, whereas eio sees if the block is clean and if so tries to read
off the origin. I'm not sure which behaviour is correct; I like to
know about disk failure early.
Our read-only and write-through modes guarantee that no IO errors are introduced regardless of the state the SSD is in, so not retrying an IO error doesn't cause any future problems. The worst case is a performance hit when an SSD shows an IO error or goes completely bad.

It's a different story for write-back. We advise our customers to use RAID on SSD when using write-back as explained above.
Post by Amit Kale
Post by Amit Kale
12. Configuration parameters for tuning according to applications.
Discussed above.
Post by Amit Kale
We'll soon document EnhanceIO behavior in context of these
aspects. We'll appreciate if dm-cache and bcache is also documented.
I hope the above helps. Please ask away if you're unsure about
something.
Post by Amit Kale
When comparing performance there are three levels at which it can be measured
Developing these caches is tedious. Test runs take time, and really
slow the dev cycle down. So I suspect we've all been using
microbenchmarks that run in a few minutes.
Let's get our pool of microbenchmarks together, then work on some
application level ones (we're happy to put some time into developing
these).
We do run micro-benchmarks all the time. There are free database benchmarks, so we can try those. Running a full-fledged Oracle-based benchmark takes hours, so I am not sure whether I can post that kind of comparison. We will try to do the best possible.

Thanks.
-Amit

Jason Warr
2013-01-17 18:36:19 UTC
Permalink
Post by Amit Kale
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under
different load classes can be measured.
I think latency is more important than throughput. Spindles are
pretty good at throughput. In fact the mq policy tries to spot when
we're doing large linear ios and stops hit counting; best leave this
stuff on the spindle.
I disagree. Latency is taken care of automatically when the number of application threads rises.
Can you explain what you mean by that in a little more detail?

As an enterprise level user I see both as important overall. However,
the biggest driving factor in wanting a cache device in front of any
sort of target in my use cases is to hide latency as the number of
threads reading and writing to the backing device go up. So for me the
cache is basically a tier stage where your ability to keep dirty blocks
on it is determined by the specific use case.
Amit Kale
2013-01-18 09:08:37 UTC
Permalink
Post by Amit Kale
Post by Amit Kale
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under
different load classes can be measured.
I think latency is more important than throughput. Spindles are
pretty good at throughput. In fact the mq policy tries to spot
when
Post by Amit Kale
Post by Amit Kale
we're doing large linear ios and stops hit counting; best leave
this
Post by Amit Kale
Post by Amit Kale
stuff on the spindle.
I disagree. Latency is taken care of automatically when the number of
application threads rises.
Can you explain what you mean by that in a little more detail?
Let's say the latency of a block device is 10ms for 4kB requests. With single-threaded IO, the throughput will be 4kB/10ms = 400kB/s. If the device is capable of more throughput, multithreaded IO will generate more. So with 2 threads the throughput will be roughly 800kB/s. We can keep increasing the number of threads, resulting in approximately linear throughput scaling. It'll saturate at the maximum capacity the device has, perhaps at 8MB/s. Increasing the number of threads beyond this will not increase throughput.

This is a simplistic computation. Throughput, latency and number of threads are related in a more complex relationship. Latency is still important, but throughput is more important.

The way all this matters for SSD caching is, caching will typically show a higher latency compared to the base SSD, even for a 100% hit ratio. It may be possible to reach the maximum throughput achievable with the base SSD using a high number of threads. Let's say an SSD shows 450MB/s with 4 threads. A cache may show 440MB/s with 8 threads.

A practical difficulty in measuring latency is that the latency seen by an application is the sum of the device latency and the time spent in the request queue (and caching layer, when present). Increasing the number of threads shows a latency increase, but only because requests stay in the request queue longer. Latency measurement in a multithreaded environment is very challenging, whereas measuring throughput is fairly straightforward.
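
The arithmetic above, as a small illustrative program (the 10ms latency, 4kB request size and 8MB/s saturation point are just the example numbers from this email):

#include <stdio.h>

int main(void)
{
    const double latency_ms = 10.0;       /* per-request service time */
    const double req_kb = 4.0;            /* request size */
    const double dev_limit_kb_s = 8192.0; /* device saturates around 8MB/s */

    for (int threads = 1; threads <= 32; threads *= 2) {
        /* Each thread sustains req_kb every latency_ms; the total is
         * capped by what the device can actually deliver. */
        double per_thread = req_kb / (latency_ms / 1000.0);   /* 400 kB/s */
        double total = threads * per_thread;
        if (total > dev_limit_kb_s)
            total = dev_limit_kb_s;
        printf("%2d threads: %7.0f kB/s\n", threads, total);
    }
    return 0;
}
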
Post by Amit Kale
As an enterprise level user I see both as important overall. However,
the biggest driving factor in wanting a cache device in front of any
sort of target in my use cases is to hide latency as the number of
threads reading and writing to the backing device go up. So for me the
cache is basically a tier stage where your ability to keep dirty blocks
on it is determined by the specific use case.
SSD caching will help in this case since SSD's latency remains almost constant regardless of location of data. HDD latency for sequential and random IO could vary by a factor of 5 or even much more.

Throughput with caching could even be 100 times the HDD throughput when using multiple threaded non-sequential IO.
-Amit


Jason Warr
2013-01-18 15:56:19 UTC
Permalink
Post by Amit Kale
Post by Jason Warr
Can you explain what you mean by that in a little more detail?
Let's say latency of a block device is 10ms for 4kB requests. With single threaded IO, the throughput will be 4kB/10ms = 400kB/s. If the device is capable of more throughput, a multithreaded IO will generate more throughput. So with 2 threads the throughput will be roughly 800kB/s. We can keep increasing the number of threads resulting in an approximately linear throughput. It'll saturate at the maximum capacity the device has. So it could saturate at perhaps at 8MB/s. Increasing the number of threads beyond this will not increase throughput.
This is a simplistic computation. Throughput, latency and number of threads are related in a more complex relationship. Latency is still important, but throughput is more important.
The way all this matters for SSD caching is, caching will typically show a higher latency compared to the base SSD, even for a 100% hit ratio. It may be possible to reach the maximum throughput achievable with the base SSD using a high number of threads. Let's say an SSD shows 450MB/s with 4 threads. A cache may show 440MB/s with 8 threads.
A practical difficulty in measuring latency is that the latency seen by an application is a sum of the device latency plus the time spent in request queue (and caching layer, when present). Increasing number of threads shows latency increase, although it's only because the requests stay in request queue for a longer duration. Latency measurement in a multithreaded environment is very challenging. Measurement of throughput is fairly straightforward.
Post by Jason Warr
As an enterprise level user I see both as important overall. However,
the biggest driving factor in wanting a cache device in front of any
sort of target in my use cases is to hide latency as the number of
threads reading and writing to the backing device go up. So for me the
cache is basically a tier stage where your ability to keep dirty blocks
on it is determined by the specific use case.
SSD caching will help in this case since SSD's latency remains almost constant regardless of location of data. HDD latency for sequential and random IO could vary by a factor of 5 or even much more.
Throughput with caching could even be 100 times the HDD throughput when using multiple threaded non-sequential IO.
-Amit
Thank you for the explanation. In context your reasoning makes more
sense to me.

If I am understanding you correctly, when you refer to throughput you're
speaking more in terms of IOPS than what most people would think of as
just bit rate.

I would expect a small increase in minimum and average latency when
adding in another layer that the blocks have to traverse. If my minimum
and average increase by 20% on most of my workloads, that is very
acceptable as long as there is a decrease in 95th and 99th percentile
maximums. I would hope that absolute maximum would decrease as well but
that is going to be much harder to achieve.

If I can help test and benchmark all three of these solutions, please
ask. I have a lot of hardware resources available to me and perhaps I
can add value from an outsider's perspective.

Jason
thornber-H+wXaHxf7aLQT0dZR+
2013-01-18 16:11:36 UTC
Permalink
Post by Jason Warr
If I can help test and benchmark all three of these solutions please
ask. I have allot of hardware resources available to me and perhaps I
can add value from an outsiders perspective.
We'd love your help. Perhaps you could devise a test that represents
how you'd use it?

- Joe
Jason Warr
2013-01-18 16:45:03 UTC
Permalink
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Jason Warr
If I can help test and benchmark all three of these solutions please
ask. I have allot of hardware resources available to me and perhaps I
can add value from an outsiders perspective.
We'd love your help. Perhaps you could devise a test that represents
how you'd use it?
- Joe
As much as I dislike Oracle that is one of my primary applications. I
am attempting to get one of my customers to set up an Oracle instance
that is modular in that I can move the storage around to fit a
particular hardware setup and have a consistent benchmark that they use
in the real world to gauge performance. One of them is a debit card
transaction clearing entity on multi-TB databases so latency REALLY
matters there. Hopefully I'll have a couple of them setup within a
week. At that point I may need help in getting the proper kernel trees
and patch sets munged into a working kernel. That seems to be the spot
where I fall over most of the time.

Unfortunately I probably could not share this specific setup but it is
likely that I can derive a version from it that can be opened.
thornber-H+wXaHxf7aLQT0dZR+
2013-01-18 17:42:19 UTC
Permalink
Post by Jason Warr
As much as I dislike Oracle that is one of my primary applications. I
am attempting to get one of my customers to setup an Oracle instance
that is modular in that I can move the storage around to fit a
particular hardware setup and have a consistent benchmark that they use
in the real world to gauge performance. One of them is a debit card
transaction clearing entity on multi-TB databases so latency REALLY
matters there. Hopefully I'll have a couple of them setup within a
week. At that point I may need help in getting the proper kernel trees
and patch sets munged into a working kernel. That seems to be the spot
where I fall over most of the time.
Unfortunately I probably could not share this specific setup but it is
likely that I can derive a version from it that can be opened.
That would be perfect. Please ask for any help you need.

- Joe
Amit Kale
2013-01-18 17:44:30 UTC
Permalink
-----Original Message-----
Sent: Friday, January 18, 2013 10:15 PM
Subject: Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching
software for Linux kernel
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Jason Warr
If I can help test and benchmark all three of these solutions please
ask. I have allot of hardware resources available to me and perhaps
I can add value from an outsiders perspective.
We'd love your help. Perhaps you could devise a test that represents
how you'd use it?
- Joe
As much as I dislike Oracle that is one of my primary applications. I
am attempting to get one of my customers to setup an Oracle instance
that is modular in that I can move the storage around to fit a
particular hardware setup and have a consistent benchmark that they use
in the real world to gauge performance. One of them is a debit card
transaction clearing entity on multi-TB databases so latency REALLY
matters there.
I am curious as to how SSD latency matters so much in the overall transaction times.

We do a lot of performance measurements using SQL database benchmarks. Transaction times vary a lot depending on location of data, complexity of the transaction etc. Typically TPM (transactions per minute) is of primary interest for TPC-C.
Hopefully I'll have a couple of them setup within a
week. At that point I may need help in getting the proper kernel trees
and patch sets munged into a working kernel. That seems to be the spot
where I fall over most of the time.
Unfortunately I probably could not share this specific setup but it is
likely that I can derive a version from it that can be opened.
That'll be good. I'll check with our testing team whether they can run TPC-C comparisons for these three caching solutions.

-Amit

Jason Warr
2013-01-18 18:36:42 UTC
Permalink
Post by Amit Kale
Post by Jason Warr
As much as I dislike Oracle that is one of my primary applications. I
Post by Jason Warr
am attempting to get one of my customers to setup an Oracle instance
that is modular in that I can move the storage around to fit a
particular hardware setup and have a consistent benchmark that they use
in the real world to gauge performance. One of them is a debit card
transaction clearing entity on multi-TB databases so latency REALLY
matters there.
I am curious as to how SSD latency matters so much in the overall transaction times.
We do a lot of performance measurements using SQL database benchmarks. Transaction times vary a lot depending on location of data, complexity of the transaction etc. Typically TPM (transactions per minute) is of primary interest for TPC-C.
It's not specifically SSD latency. It's I/O transaction latency that
matters. This particular application is very sensitive to that because
it is literally someone standing at a POS terminal swiping a
debit/credit card. You only have a couple of seconds after the PIN is
entered for the transaction to go through your network and application
server, authorize against a DB, and get back to the POS.

The entire I/O stack on the DB is only a small time-slice of that round
trip. Your 99th percentile needs to be under 20ms on the DB storage
side. If your worst case DB I/O goes beyond 300ms it is considered an
outage because the POS transaction fails. So it obviously takes a lot
of planning and optimization work on the DB itself to get a good
tablespace layout, even with multi-million dollar FC storage frames, to
get into the realm where latency is that predictable.
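As an illustration only, the check described above boils down to something like the Python sketch below; the latency trace is hypothetical, not data from that system, and the thresholds are just the numbers quoted in this message.

def check_slo(trace_ms, p99_limit_ms=20.0, outage_limit_ms=300.0):
    # Returns the 99th percentile and worst-case latency plus pass/fail flags.
    s = sorted(trace_ms)
    p99 = s[min(len(s) - 1, int(round(0.99 * (len(s) - 1))))]
    worst = s[-1]
    return p99, worst, p99 <= p99_limit_ms, worst <= outage_limit_ms

# Hypothetical per-IO completion times in milliseconds.
trace_ms = [2.0] * 980 + [15.0] * 15 + [19.0, 22.0, 40.0, 90.0, 250.0]
p99, worst, ok_p99, ok_max = check_slo(trace_ms)
print(f"p99={p99}ms (limit 20ms, {'ok' if ok_p99 else 'violated'}), "
      f"max={worst}ms (limit 300ms, {'ok' if ok_max else 'outage'})")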

One of my goals is to be able to offer this level of I/O service on
commodity hardware. Simplify the scope of hardware, reduce the number
of points of failure, make the systems more portable, reduce or
eliminate dependence on any specific vendor below the application and
save money. Not to mention reduce the number of fingers that can point
away from themselves saying it is someone else's problem to find fault.

A lot of the pieces are already out there. A good block caching target
is one of the missing pieces to help fill the ever-growing canyon
between non-block-device system performance and storage. What they have
done with L2ARC and SLOG in ZFS/Solaris is good, but it has some serious
shortcomings in other areas that DM/MD/LVM handle extremely well.

I appreciate all of the brilliant work all of you guys do and hopefully
I can contribute a little bit of usefulness to this effort.

Thank you,

Jason
Darrick J. Wong
2013-01-18 21:25:43 UTC
Permalink
Since Joe is putting together a testing tree to compare the three caching
things, what do you all think of having a(nother) session about ssd caching at
this year's LSFMM Summit?

[Apologies for hijacking the thread.]
[Adding lsf-pc to the cc list.]

--D
Post by Jason Warr
Post by Amit Kale
Post by Jason Warr
As much as I dislike Oracle that is one of my primary applications. I
Post by Jason Warr
am attempting to get one of my customers to setup an Oracle instance
that is modular in that I can move the storage around to fit a
particular hardware setup and have a consistent benchmark that they use
in the real world to gauge performance. One of them is a debit card
transaction clearing entity on multi-TB databases so latency REALLY
matters there.
I am curious as to how SSD latency matters so much in the overall transaction times.
We do a lot of performance measurements using SQL database benchmarks. Transaction times vary a lot depending on location of data, complexity of the transaction etc. Typically TPM (transactions per minute) is of primary interest for TPC-C.
It's not specifically SSD latency. It's I/O transaction latency that
matters. This particular application is very sensitive to that because
it is literally someone standing at a POS terminal swiping a
debit/credit card. You only have a couple of seconds after the PIN is
entered for the transaction to go through your network, application
server to authorize against a DB and back to the POS.
The entire I/O stack on the DB is only a small time-slice of that round
trip. Your 99th percentile needs to be under 20ms on the DB storage
side. If your worst case DB I/O goes beyond 300ms it is considered an
outage because the POS transaction fails. So it obviously takes allot
of planning and optimization work on the DB itself to get good
tablespace layout to even get into the realm where you can have that
predictable of latency with multi-million dollar FC storage frames.
One of my goals is to be able to offer this level of I/O service on
commodity hardware. Simplify the scope of hardware, reduce the number
of points of failure, make the systems more portable, reduce or
eliminate dependence on any specific vendor below the application and
save money. Not to mention reduce the number of fingers that can point
away from themselves saying it is someone elses problem to find fault.
Allot of the pieces are already out there. A good block caching target
is one of the missing pieces to help fill the ever growing canyon
between non-block device system performance and storage. What they have
done with L2ARC and SLOG in ZFS/Solaris is good but it has some serious
short comings in other areas that DM/MD/LVM do extremely well.
I appreciate all of the brilliant work all of you guys do and hopefully
I can contribute a little bit of usefulness to this effort.
Thank you,
Jason
Mike Snitzer
2013-01-18 21:37:59 UTC
Permalink
On Fri, Jan 18 2013 at 4:25pm -0500,
Post by Darrick J. Wong
Since Joe is putting together a testing tree to compare the three caching
things, what do you all think of having a(nother) session about ssd caching at
this year's LSFMM Summit?
[Apologies for hijacking the thread.]
[Adding lsf-pc to the cc list.]
Hopefully we'll have some findings on the comparisons well before LSF
(since we currently have some momentum). But yes it may be worthwhile
to discuss things further and/or report findings.

Mike
Amit Kale
2013-01-21 05:26:08 UTC
Permalink
-----Original Message-----
Sent: Saturday, January 19, 2013 3:08 AM
To: Darrick J. Wong
Thornber
Subject: Re: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC EnhanceIO
SSD caching software for Linux kernel
On Fri, Jan 18 2013 at 4:25pm -0500,
Post by Darrick J. Wong
Since Joe is putting together a testing tree to compare the three
caching things, what do you all think of having a(nother) session
about ssd caching at this year's LSFMM Summit?
[Apologies for hijacking the thread.]
[Adding lsf-pc to the cc list.]
Hopefully we'll have some findings on the comparisons well before LSF
(since we currently have some momentum). But yes it may be worthwhile
to discuss things further and/or report findings.
We should have performance comparisons presented well before the summit. It'll be good to have an SSD caching session in any case. The likelihood that one of them will be included in the Linux kernel before April is very low.

-Amit

Mike Snitzer
2013-01-21 13:09:51 UTC
Permalink
On Mon, Jan 21 2013 at 12:26am -0500,
Post by Amit Kale
-----Original Message-----
Sent: Saturday, January 19, 2013 3:08 AM
To: Darrick J. Wong
Thornber
Subject: Re: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC EnhanceIO
SSD caching software for Linux kernel
On Fri, Jan 18 2013 at 4:25pm -0500,
Post by Darrick J. Wong
Since Joe is putting together a testing tree to compare the three
caching things, what do you all think of having a(nother) session
about ssd caching at this year's LSFMM Summit?
[Apologies for hijacking the thread.]
[Adding lsf-pc to the cc list.]
Hopefully we'll have some findings on the comparisons well before LSF
(since we currently have some momentum). But yes it may be worthwhile
to discuss things further and/or report findings.
We should have performance comparisons presented well before the
summit. It'll be good to have ssd caching session in any case. The
likelihood that one of them will be included in Linux kernel before
April is very low.
dm-cache is under active review for upstream inclusion. I wouldn't
categorize the chances of dm-cache going upstream when the v3.9 merge
window opens as "very low". But even if dm-cache does go upstream it
doesn't preclude bcache and/or enhanceio from going upstream too.
thornber-H+wXaHxf7aLQT0dZR+
2013-01-21 13:58:21 UTC
Permalink
Post by Mike Snitzer
dm-cache is under active review for upstream inclusion. I wouldn't
categorize the chances of dm-cache going upstream when the v3.9 merge
window opens as "very low". But even if dm-cache does go upstream it
doesn't preclude bcache and/or enhanceio from going upstream too.
As I understand it bcache is being reviewed too.
Amit Kale
2013-01-22 05:00:35 UTC
Permalink
-----Original Message-----
Sent: Monday, January 21, 2013 6:40 PM
To: Amit Kale
Cc: Darrick J. Wong; device-mapper development; linux-
Subject: Re: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC EnhanceIO
SSD caching software for Linux kernel
On Mon, Jan 21 2013 at 12:26am -0500,
Post by Amit Kale
-----Original Message-----
Sent: Saturday, January 19, 2013 3:08 AM
To: Darrick J. Wong
Cc: device-mapper development; Amit Kale;
Subject: Re: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC
EnhanceIO SSD caching software for Linux kernel
On Fri, Jan 18 2013 at 4:25pm -0500, Darrick J. Wong
Post by Darrick J. Wong
Since Joe is putting together a testing tree to compare the three
caching things, what do you all think of having a(nother) session
about ssd caching at this year's LSFMM Summit?
[Apologies for hijacking the thread.] [Adding lsf-pc to the cc
list.]
Hopefully we'll have some findings on the comparisons well before
LSF (since we currently have some momentum). But yes it may be
worthwhile to discuss things further and/or report findings.
We should have performance comparisons presented well before the
summit. It'll be good to have ssd caching session in any case. The
likelihood that one of them will be included in Linux kernel before
April is very low.
dm-cache is under active review for upstream inclusion. I wouldn't
categorize the chances of dm-cache going upstream when the v3.9 merge
window opens as "very low". But even if dm-cache does go upstream it
doesn't preclude bcache and/or enhanceio from going upstream too.
I agree. We haven't seen a full comparison yet, IMHO. If different solutions offer mutually exclusive benefits, it'll be worthwhile including them all.

We haven't submitted EnhanceIO for inclusion yet. It needs more testing from the community before we can mark it Beta.
-Amit

Kent Overstreet
2013-02-04 20:33:26 UTC
Permalink
Post by Mike Snitzer
On Fri, Jan 18 2013 at 4:25pm -0500,
Post by Darrick J. Wong
Since Joe is putting together a testing tree to compare the three caching
things, what do you all think of having a(nother) session about ssd caching at
this year's LSFMM Summit?
[Apologies for hijacking the thread.]
[Adding lsf-pc to the cc list.]
Hopefully we'll have some findings on the comparisons well before LSF
(since we currently have some momentum). But yes it may be worthwhile
to discuss things further and/or report findings.
I'd be willing to go and talk a bit about bcache. Curious to hear more
about the dm caching stuff, too.

Amit Kale
2013-01-18 16:12:31 UTC
Permalink
-----Original Message-----
Sent: Friday, January 18, 2013 9:26 PM
To: Amit Kale
Subject: Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching
software for Linux kernel
Post by Amit Kale
Post by Jason Warr
Can you explain what you mean by that in a little more detail?
Let's say latency of a block device is 10ms for 4kB requests. With
single threaded IO, the throughput will be 4kB/10ms = 400kB/s. If the
device is capable of more throughput, a multithreaded IO will generate
more throughput. So with 2 threads the throughput will be roughly
800kB/s. We can keep increasing the number of threads resulting in an
approximately linear throughput. It'll saturate at the maximum capacity
the device has. So it could saturate at perhaps at 8MB/s. Increasing
the number of threads beyond this will not increase throughput.
Post by Amit Kale
This is a simplistic computation. Throughput, latency and number of
threads are related in a more complex relationship. Latency is still
important, but throughput is more important.
Post by Amit Kale
The way all this matters for SSD caching is, caching will typically
show a higher latency compared to the base SSD, even for a 100% hit
ratio. It may be possible to reach the maximum throughput achievable
with the base SSD using a high number of threads. Let's say an SSD
shows 450MB/s with 4 threads. A cache may show 440MB/s with 8 threads.
Post by Amit Kale
A practical difficulty in measuring latency is that the latency seen
by an application is a sum of the device latency plus the time spent in
request queue (and caching layer, when present). Increasing number of
threads shows latency increase, although it's only because the requests
stay in request queue for a longer duration. Latency measurement in a
multithreaded environment is very challenging. Measurement of
throughput is fairly straightforward.
Post by Amit Kale
Post by Jason Warr
As an enterprise level user I see both as important overall.
However, the biggest driving factor in wanting a cache device in
front of any sort of target in my use cases is to hide latency as
the number of threads reading and writing to the backing device go
up. So for me the cache is basically a tier stage where your
ability to keep dirty blocks on it is determined by the specific
use case.
Post by Amit Kale
SSD caching will help in this case since SSD's latency remains almost
constant regardless of location of data. HDD latency for sequential and
random IO could vary by a factor of 5 or even much more.
Post by Amit Kale
Throughput with caching could even be 100 times the HDD throughput
when using multiple threaded non-sequential IO.
Post by Amit Kale
-Amit
Thank you for the explanation. In context your reasoning makes more
sense to me.
If I am understanding you correctly when you refer to throughput your
speaking more in terms of IOPS than what most people would think of as
referencing only bit rate.
I would expect a small increase in minimum and average latency when
adding in another layer that the blocks have to traverse. If my
minimum and average increase by 20% on most of my workloads, that is
very acceptable as long as there is a decrease in 95th and 99th
percentile maximums. I would hope that absolute maximum would decrease
as well but that is going to be much harder to achieve.
If I can help test and benchmark all three of these solutions please
ask. I have allot of hardware resources available to me and perhaps I
can add value from an outsiders perspective.
That'll be great. I have so far marked EIO's status as alpha. It will require a little more functionality testing before performance testing. Perhaps in a week or so.

-Amit

Kent Overstreet
2013-01-24 23:55:41 UTC
Permalink
Post by Amit Kale
9. Performance - Throughput is generally most important. Latency is
also one more performance comparison point. Performance under
different load classes can be measured.
I think latency is more important than throughput. Spindles are
pretty good at throughput. In fact the mq policy tries to spot when
we're doing large linear ios and stops hit counting; best leave this
stuff on the spindle.
I disagree. Latency is taken care of automatically when the number of
application threads rises.
Can you explain what you mean by that in a little more detail?
Let's say latency of a block device is 10ms for 4kB requests. With single threaded IO, the throughput will be 4kB/10ms = 400kB/s. If the device is capable of more throughput, a multithreaded IO will generate more throughput. So with 2 threads the throughput will be roughly 800kB/s. We can keep increasing the number of threads resulting in an approximately linear throughput. It'll saturate at the maximum capacity the device has. So it could saturate at perhaps at 8MB/s. Increasing the number of threads beyond this will not increase throughput.
This is a simplistic computation. Throughput, latency and number of threads are related in a more complex relationship. Latency is still important, but throughput is more important.
The way all this matters for SSD caching is, caching will typically show a higher latency compared to the base SSD, even for a 100% hit ratio. It may be possible to reach the maximum throughput achievable with the base SSD using a high number of threads. Let's say an SSD shows 450MB/s with 4 threads. A cache may show 440MB/s with 8 threads.
Going through the cache should only (measurably) increase latency for
writes, not reads (assuming they're cache hits, not misses). It sounds
like you're talking about the overhead for keeping the index up to date,
which is only a factor for writes, but I'm not quite sure since you talk
about hit rate.

I don't know of any reason why throughput or latency should be noticeably
worse than raw for reads from cache.

But for writes, yeah - as the number of concurrent IOs goes up, you can
amortize the metadata writes more and more, so throughput compared to raw
goes up. I don't think latency would change much vs. raw; you're always
going to have an extra metadata write to wait on... though there are
tricks you can do so the metadata write and data write can go down in
parallel. Bcache doesn't do those yet.

_But_, you only have to pay the metadata write penalty when you see a
cache flush/FUA write. In the absence of cache flushes/FUA, for
metadata purposes you can basically treat a stream of sequential writes
as going down in parallel.
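A toy model of that amortization can make the point; this is purely illustrative accounting, not bcache's actual code or on-disk format, and the class name is invented for the example.

class ToyWritebackCache:
    # Data writes only dirty an in-memory metadata buffer; the buffer is
    # written to the SSD when a flush/FUA arrives, so many data writes can
    # share one metadata write.
    def __init__(self):
        self.pending_metadata = False
        self.data_writes = 0
        self.metadata_writes = 0

    def write(self, block):
        self.data_writes += 1
        self.pending_metadata = True   # dirty bit / index update kept in memory

    def flush(self):                   # flush / FUA boundary
        if self.pending_metadata:
            self.metadata_writes += 1  # one metadata write covers the whole batch
            self.pending_metadata = False

cache = ToyWritebackCache()
for block in range(64):
    cache.write(block)
cache.flush()
print(f"{cache.data_writes} data writes, {cache.metadata_writes} metadata write(s)")

With 64 buffered data writes and a single flush only one metadata write hits the SSD; issue a flush after every write and the ratio drops to 1:1, which is the penalty being described.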
thornber-H+wXaHxf7aLQT0dZR+
2013-01-17 18:50:17 UTC
Permalink
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
The mq policy uses a multiqueue (effectively a partially sorted
lru list) to keep track of candidate block hit counts. When
candidates get enough hits they're promoted. The promotion
threshold is periodically recalculated by looking at the hit
counts for the blocks already in the cache.
Multi-queue algorithm typically results in a significant metadata
overhead. How much percentage overhead does that imply?
It is a drawback; at the moment we have a list head, hit count and
some flags per block. I can compress this; it's on my todo list.
Looking at the code I see you have doubly linked list fields per block
too, albeit 16 bit ones. We use much bigger blocks than you, so I'm
happy to get the benefit of the extra space.
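For a rough feel for why block size dominates this cost, here is a small sketch; the 24 bytes per block is an assumption for illustration, not the actual dm-cache or EnhanceIO structure size.

def overhead_percent(block_size_bytes, metadata_bytes_per_block):
    return 100.0 * metadata_bytes_per_block / block_size_bytes

# Assume ~24 bytes per block (two list pointers, a hit count, some flags).
for block_kb in (4, 8, 64, 1024):
    pct = overhead_percent(block_kb * 1024, metadata_bytes_per_block=24)
    print(f"{block_kb:5d} KB blocks: ~{pct:.3f}% in-core metadata overhead")

Per-block bookkeeping that is negligible at 64 KB - 1 MB blocks becomes a much larger fraction at 4 KB - 8 KB blocks, which is the trade-off being discussed.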
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
I read through EnhanceIO yesterday, and think this is where
you're lacking.
We have an LRU policy at a cache set level. Effectiveness of the LRU
policy depends on the average duration of a block in a working
dataset. If the average duration is small enough so a block is most
of the times "hit" before it's chucked out, LRU works better than
any other policies.
Yes, in some situations lru is best, in others lfu is best. That's
why people try to blend in something like arc. Now my real point was
that although you're using lru to choose what to evict, you're not using
anything to choose what to put _in_ the cache - or have I got this
totally wrong?
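A greatly simplified sketch of promotion-by-hit-count follows; it uses a fixed threshold and has no multiqueue or aging, so it is only the shape of the idea described above, not the real mq policy, and the class name is made up.

from collections import defaultdict

class PromoteOnHits:
    def __init__(self, threshold=3, capacity=2):
        self.threshold = threshold     # fixed here; the real policy recalculates it
        self.capacity = capacity
        self.hits = defaultdict(int)   # hit counts for candidate (uncached) blocks
        self.cached = set()            # blocks currently on the SSD

    def access(self, block):
        if block in self.cached:
            return "hit"
        self.hits[block] += 1
        if self.hits[block] >= self.threshold and len(self.cached) < self.capacity:
            self.cached.add(block)     # promote a sufficiently hot candidate
            return "promoted"
        return "miss"

policy = PromoteOnHits()
for block in [1, 2, 1, 3, 1, 2, 2]:
    print(block, policy.access(block))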
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
A couple of other things I should mention; dm-cache uses a large block
size compared to eio. eg, 64k - 1m. This is a mixed blessing;
Yes. We had a lot of debate internally on the block size. For now we
have restricted to 2k, 4k and 8k. We found that larger block sizes
result in too much of internal fragmentation, in-spite of a
significant reduction in metadata size. 8k is adequate for Oracle
and mysql.
Right, you need to describe these scenarios so you can show off eio in
the best light.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
We do not keep the dirty state of cache blocks up to date on the
metadata device. Instead we have a 'mounted' flag that's set in the
metadata when opened. When a clean shutdown occurs (eg, dmsetup
suspend my-cache) the dirty bits are written out and the mounted flag
cleared. On a crash the mounted flag will still be set on reopen and
all dirty flags degrade to 'dirty'.
Not sure I understand this. Is there a guarantee that once an IO is
reported as "done" to upstream layer
(filesystem/database/application), it is persistent. The persistence
should be guaranteed even if there is an OS crash immediately after
status is reported. Persistence should be guaranteed for the entire
IO range. The next time the application tries to read it, it should
get updated data, not stale data.
Yes, we're careful to persist all changes in the mapping before
completing io. However the dirty bits are just used to ascertain what
blocks need writing back to the origin. In the event of a crash it's
safe to assume they all do. dm-cache is a slow moving cache; changes
of dirty status occur far, far more frequently than changes of
mapping, so avoiding those updates is a big win.
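A toy model of that recovery rule is below; it is illustrative only and does not reflect dm-cache's real metadata format, and the class and method names are invented.

class ToyMetadata:
    # Mapping changes are assumed to be persisted as they happen; per-block
    # dirty bits are persisted only on a clean shutdown.
    def __init__(self):
        self.mounted = False
        self.persisted_dirty = {}   # block -> dirty bit, valid only after clean shutdown

    def open(self, cached_blocks):
        crashed = self.mounted      # flag still set => previous run never shut down cleanly
        self.mounted = True
        if crashed:
            # Degrade: assume every cached block needs writing back to the origin.
            return {blk: True for blk in cached_blocks}
        return dict(self.persisted_dirty)

    def clean_shutdown(self, dirty_bits):
        self.persisted_dirty = dict(dirty_bits)
        self.mounted = False

md = ToyMetadata()
print(md.open({"A", "B"}))                  # first open: nothing recorded as dirty
md.clean_shutdown({"A": False, "B": True})
print(md.open({"A", "B"}))                  # clean restart: only B is dirty
print(md.open({"A", "B"}))                  # reopened after a "crash": all dirty

In this scheme the only cost of a crash is that clean blocks get written back unnecessarily; correctness never depends on the per-block dirty bits being current on disk.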
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Correct me if I'm wrong, but I
think eio is holding io completion until the dirty bits have been
committed to disk?
That's correct. In addition to this, we try to batch metadata updates if multiple IOs occur in the same cache set.
Yes, I batch updates too.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches,
making changes to cache configuration, conversion between cache
modes, recovering after a crash, recovering from an error condition.
Normal dm suspend, alter table, resume cycle. The LVM tools do this
all the time.
Cache creation and deletion will require stopping applications,
unmounting filesystems and then remounting and starting the
applications. A sysad in addition to this will require updating
fstab entries. Do fstab entries work automatically in case they use
labels instead of full device paths.
The common case will be someone using a volume manager like LVM, so
the device nodes are already dm ones. In this case there's no need
for unmounting or stopping applications. Changing the stack of dm
targets around on a live system is a key feature. For example this is
how we implement the pvmove functionality.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Well I saw the comment in your code describing the security flaw you
think you've got. I hope we don't have any, I'd like to understand
your case more.
Could you elaborate on which comment you are referring to?
Top of eio_main.c

* 5) Fix a security hole : A malicious process with 'ro' access to a
* file can potentially corrupt file data. This can be fixed by
* copying the data on a cache read miss.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it
works with.
I think we all work with any block device. But eio and bcache can
overlay any device node, not just a dm one. As mentioned in earlier
email I really think this is a dm issue, not specific to dm-cache.
DM was never meant to be cascaded. So it's ok for DM.
Not sure what you mean here? I wrote dm specifically with stacking
scenarios in mind.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across
reboots/crashes/intermittent failures. Is the "sticky"ness of data
configurable.
Surely this is a given? A cache would be trivial to write if it
didn't need to be crash proof.
There has to be a way to make it either persistent or volatile
depending on how users want it. Enterprise users are sometimes
paranoid about HDD and SSD going out of sync after a system shutdown
and before a bootup. This is typically for large complicated iSCSI
based shared HDD setups.
Well, in those cases Enterprise users can just use dm-cache in writethrough
mode and throw it away when they finish. Writing our metadata is not
the bottleneck (copying for migrations is), and it's definitely worth
keeping so there are up-to-date hit counts for the policy to work off
after reboot.
Post by Amit Kale
That's correct. We don't have to worry about wear leveling. All of the competent SSDs around do that.
What I wanted to bring up was how many SSD writes does a cache
read/write result. Write back cache mode is specifically taxing on
SSDs in this aspect.
No more than read/writes to a plain SSD. Are you getting hit by extra
io because you persist dirty flags?
Post by Amit Kale
Databases run into torn-page error when an IO is found to be only
partially written when it was supposed to be fully written. This is
particularly important when an IO was reported to be "done". The
original flashcache code we started with over a year ago showed
the torn-page problem in extremely rare crashes with writeback mode. Our
present code contains specific design elements to avoid it.
We get this for free in core dm.

- Joe
Amit Kale
2013-01-18 07:03:54 UTC
Permalink
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
The mq policy uses a multiqueue (effectively a partially sorted
lru list) to keep track of candidate block hit counts. When
candidates get enough hits they're promoted. The promotion
threshold is periodically recalculated by looking at the hit
counts for the blocks already in the cache.
Multi-queue algorithm typically results in a significant metadata
overhead. How much percentage overhead does that imply?
It is a drawback, at the moment we have a list head, hit count and some
flags per block. I can compress this, it's on my todo list.
Looking at the code I see you have doubly linked list fields per block
too, albeit 16 bit ones. We use much bigger blocks than you, so I'm
happy to get the benefit of the extra space.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
I read through EnhanceIO yesterday, and think this is where
you're lacking.
We have an LRU policy at a cache set level. Effectiveness of the LRU
policy depends on the average duration of a block in a working
dataset. If the average duration is small enough so a block is most of
the times "hit" before it's chucked out, LRU works better than any
other policies.
Yes, in some situations lru is best, in others lfu is best. That's why
people try and blend in something like arc. Now my real point was
although you're using lru to choose what to evict, you're not using
anything to choose what to put _in_ the cache, or have I got this
totally wrong?
We simply put any read or written blocks into the cache (subject to availability and controlled limits).
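In other words, admission is unconditional and eviction is per-set LRU. A minimal sketch of that combination follows; it is only an illustration under those assumptions, not EnhanceIO's code, and the class name is made up.

from collections import OrderedDict

class SetAssocLRU:
    def __init__(self, num_sets=4, ways=2):
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block):
        cache_set = self.sets[hash(block) % len(self.sets)]
        if block in cache_set:
            cache_set.move_to_end(block)      # refresh LRU position on a hit
            return "hit"
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)     # evict the least recently used way
        cache_set[block] = True               # admit the block unconditionally
        return "miss"

cache = SetAssocLRU()
for block in [10, 20, 10, 30, 40, 10]:
    print(block, cache.access(block))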
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
A couple of other things I should mention; dm-cache uses a large
block size compared to eio. eg, 64k - 1m. This is a mixed
blessing;
Yes. We had a lot of debate internally on the block size. For now we
have restricted to 2k, 4k and 8k. We found that larger block sizes
result in too much of internal fragmentation, in-spite of a
significant reduction in metadata size. 8k is adequate for Oracle and
mysql.
Right, you need to describe these scenarios so you can show off eio in
the best light.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
We do not keep the dirty state of cache blocks up to date on the
metadata device. Instead we have a 'mounted' flag that's set in
the
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
metadata when opened. When a clean shutdown occurs (eg, dmsetup
suspend my-cache) the dirty bits are written out and the mounted
flag cleared. On a crash the mounted flag will still be set on
reopen and all dirty flags degrade to 'dirty'.
Not sure I understand this. Is there a guarantee that once an IO is
reported as "done" to upstream layer
(filesystem/database/application), it is persistent. The persistence
should be guaranteed even if there is an OS crash immediately after
status is reported. Persistence should be guaranteed for the entire IO
range. The next time the application tries to read it, it should get
updated data, not stale data.
Yes, we're careful to persist all changes in the mapping before
completing io. However the dirty bits are just used to ascertain what
blocks need writing back to the origin. In the event of a crash it's
safe to assume they all do. dm-cache is a slow moving cache, change of
dirty status occurs far, far more frequently than change of mapping.
So avoiding these updates is a big win.
That's great.
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Correct me if I'm wrong, but I
think eio is holding io completion until the dirty bits have been
committed to disk?
That's correct. In addition to this, we try to batch metadata updates
if multiple IOs occur in the same cache set.
y, I batch updates too.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
3. Availability - What's the downtime when adding, deleting caches,
making changes to cache configuration, conversion between cache
modes, recovering after a crash, recovering from an error
condition.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Normal dm suspend, alter table, resume cycle. The LVM tools do this
all the time.
Cache creation and deletion will require stopping applications,
unmounting filesystems and then remounting and starting the
applications. A sysad in addition to this will require updating fstab
entries. Do fstab entries work automatically in case they use labels
instead of full device paths.
The common case will be someone using a volume manager like LVM, so the
device nodes are already dm ones. In this case there's no need for
unmounting or stopping applications. Changing the stack of dm targets
around on a live system is a key feature. For example this is how we
implement the pvmove functionality.
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Well I saw the comment in your code describing the security flaw you
think you've got. I hope we don't have any, I'd like to understand
your case more.
Could you elaborate on which comment you are referring to?
Top of eio_main.c
* 5) Fix a security hole : A malicious process with 'ro' access to a
* file can potentially corrupt file data. This can be fixed by
* copying the data on a cache read miss.
That's stale; it slipped through our cleanup. We will remove it.

It's still possible for an ordinary user to "consume" a significant portion of a cache by perpetually reading all permissible data. Caches as of now don't have user-based controls.
-Amit
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
5. Portability - Which HDDs, SSDs, partitions, other block devices it
works with.
I think we all work with any block device. But eio and bcache can
overlay any device node, not just a dm one. As mentioned in earlier
email I really think this is a dm issue, not specific to dm-cache.
Post by Amit Kale
DM was never meant to be cascaded. So it's ok for DM.
Not sure what you mean here? I wrote dm specifically with stacking
scenarios in mind.
DM can't use a device containing partitions, by design. It works on individual partitions, though.
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
7. Persistence of cached data - Does cached data remain across
reboots/crashes/intermittent failures. Is the "sticky"ness of data
configurable.
Surely this is a given? A cache would be trivial to write if it
didn't need to be crash proof.
There has to be a way to make it either persistent or volatile
depending on how users want it. Enterprise users are sometimes
paranoid about HDD and SSD going out of sync after a system shutdown
and before a bootup. This is typically for large complicated iSCSI
based shared HDD setups.
Well in those Enterprise users can just use dm-cache in writethrough
mode and throw it away when they finish. Writing our metadata is not
the bottle neck (copy for migrations is), and it's definitely worth
keeping so there are up to date hit counts for the policy to work off
after reboot.
Agreed. However there are arguments both ways. The need to start afresh is valid, although not frequent.
Post by thornber-H+wXaHxf7aLQT0dZR+
Post by Amit Kale
That's correct. We don't have to worry about wear leveling. All of
the competent SSDs around do that.
Post by Amit Kale
What I wanted to bring up was how many SSD writes does a cache
read/write result. Write back cache mode is specifically taxing on
SSDs in this aspect.
No more than read/writes to a plain SSD. Are you getting hit by extra
io because you persist dirty flags?
It's a price users pay for metadata updates. Our three caching modes have different levels of SSD writes. Read-only < write-through < write-back. Users can look at the benefits versus SSD life and choose accordingly.
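A back-of-the-envelope way to see that ordering is sketched below, with deliberately simplified accounting: real caches batch metadata and treat hits and misses differently, and the function name and batch size are assumptions for the example.

def ssd_writes(mode, reads, writes, batch=16):
    # Very rough counts; not the actual EnhanceIO accounting.
    if mode == "read-only":
        return reads                  # only read misses are filled into the SSD
    if mode == "write-through":
        return reads + writes         # fills plus mirrored application writes
    if mode == "write-back":
        metadata = (writes + batch - 1) // batch
        return reads + writes + metadata   # plus periodic dirty-bit metadata updates
    raise ValueError(mode)

for mode in ("read-only", "write-through", "write-back"):
    print(mode, ssd_writes(mode, reads=1000, writes=1000))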
-Amit

