Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 1:40 am
Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
Archived from groups: linux>kernel
Dan Williams wrote:
> Neil,
>
> The following patches implement hardware accelerated raid5 for the Intel
> XScale® series of I/O Processors. The MD changes allow stripe
> operations to run outside the spin lock in a work queue. Hardware
> acceleration is achieved by using a dma-engine-aware work queue routine
> instead of the default software only routine.
>
> Since the last release of the raid5 changes many bug fixes and other
> improvements have been made as a result of stress testing. See the per
> patch change logs for more information about what was fixed. This
> release is the first release of the full dma implementation.
>
> The patches touch 3 areas, the md-raid5 driver, the generic dmaengine
> interface, and a platform device driver for IOPs. The raid5 changes
> follow your comments concerning making the acceleration implementation
> similar to how the stripe cache handles I/O requests. The dmaengine
> changes are the second release of this code. They expand the interface
> to handle more than memcpy operations, and add a generic raid5-dma
> client. The iop-adma driver supports dma memcpy, xor, xor zero sum, and
> memset across all IOP architectures (32x, 33x, and 13xx).
>
> Concerning the context switching performance concerns raised at the
> previous release, I have observed the following. For the hardware
> accelerated case it appears that performance is always better with the
> work queue than without since it allows multiple stripes to be operated
> on simultaneously. I expect the same for an SMP platform, but so far my
> testing has been limited to IOPs. For a single-processor
> non-accelerated configuration I have not observed performance
> degradation with work queue support enabled, but in the Kconfig option
> help text I recommend disabling it (CONFIG_MD_RAID456_WORKQUEUE).
>
> Please consider the patches for -mm.
>
> -Dan
>
> [PATCH 01/19] raid5: raid5_do_soft_block_ops
> [PATCH 02/19] raid5: move write operations to a workqueue
> [PATCH 03/19] raid5: move check parity operations to a workqueue
> [PATCH 04/19] raid5: move compute block operations to a workqueue
> [PATCH 05/19] raid5: move read completion copies to a workqueue
> [PATCH 06/19] raid5: move the reconstruct write expansion operation to a workqueue
> [PATCH 07/19] raid5: remove compute_block and compute_parity5
> [PATCH 08/19] dmaengine: enable multiple clients and operations
> [PATCH 09/19] dmaengine: reduce backend address permutations
> [PATCH 10/19] dmaengine: expose per channel dma mapping characteristics to clients
> [PATCH 11/19] dmaengine: add memset as an asynchronous dma operation
> [PATCH 12/19] dmaengine: dma_async_memcpy_err for DMA engines that do not support memcpy
> [PATCH 13/19] dmaengine: add support for dma xor zero sum operations
> [PATCH 14/19] dmaengine: add dma_sync_wait
> [PATCH 15/19] dmaengine: raid5 dma client
> [PATCH 16/19] dmaengine: Driver for the Intel IOP 32x, 33x, and 13xx RAID engines
> [PATCH 17/19] iop3xx: define IOP3XX_REG_ADDR[32|16|8] and clean up DMA/AAU defs
> [PATCH 18/19] iop3xx: Give Linux control over PCI (ATU) initialization
> [PATCH 19/19] iop3xx: IOP 32x and 33x support for the iop-adma driver
Can devices like drivers/scsi/sata_sx4.c or drivers/scsi/sata_promise.c
take advantage of this? Promise silicon supports RAID5 XOR offload.

If so, how? If not, why not?

	Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 1:40 am
Post subject: Re: [PATCH 02/19] raid5: move write operations to a workqueue
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Enable handle_stripe5 to pass off write operations to
> raid5_do_soft_block_ops (which can be run as a workqueue). The operations
> moved are reconstruct-writes and read-modify-writes formerly handled by
> compute_parity5.
>
> Changelog:
> * moved raid5_do_soft_block_ops changes into a separate patch
> * changed handle_write_operations5 to only initiate write operations, which
> prevents new writes from being requested while the current one is in flight
> * all blocks undergoing a write are now marked locked and !uptodate at the
> beginning of the write operation
> * blocks undergoing a read-modify-write need a request flag to distinguish
> them from blocks that are locked for reading. Reconstruct-writes still use
> the R5_LOCKED bit to select blocks for the operation
> * integrated the work queue Kconfig option
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> drivers/md/Kconfig | 21 +++++
> drivers/md/raid5.c | 192 ++++++++++++++++++++++++++++++++++++++------
> include/linux/raid/raid5.h | 3 +
> 3 files changed, 190 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
> index bf869ed..2a16b3b 100644
> --- a/drivers/md/Kconfig
> +++ b/drivers/md/Kconfig
> @@ -162,6 +162,27 @@ config MD_RAID5_RESHAPE
> There should be enough spares already present to make the new
> array workable.
>
> +config MD_RAID456_WORKQUEUE
> + depends on MD_RAID456
> + bool "Offload raid work to a workqueue from raid5d"
> + ---help---
> + This option enables raid work (block copy and xor operations)
> + to run in a workqueue. If your platform has a high context
> + switch penalty say N. If you are using hardware offload or
> + are running on an SMP platform say Y.
> +
> + If unsure say, Y.
> +
> +config MD_RAID456_WORKQUEUE_MULTITHREAD
> + depends on MD_RAID456_WORKQUEUE && SMP
> + bool "Enable multi-threaded raid processing"
> + default y
> + ---help---
> + This option controls whether the raid workqueue will be multi-
> + threaded or single threaded.
> +
> + If unsure say, Y.
In the final patch that gets merged, these configuration options should
go away. We are very anti-#ifdef in Linux, for a variety of reasons.
In this particular instance, code complexity increases and
maintainability decreases as the #ifdef forest grows.
Jeff
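
As an illustration of the #ifdef concern (this is not from the thread): one common way to keep a build-time choice without spreading #ifdef through the logic is to test the option in exactly one place and let ordinary C control flow do the rest. The following is only a minimal sketch under stated assumptions; SKETCH_RAID_WORKQUEUE and every sketch_* name are hypothetical and do not appear in the patch.

```c
/* Hypothetical build-time switch; a real kernel option would come from Kconfig. */
#ifndef SKETCH_RAID_WORKQUEUE
#define SKETCH_RAID_WORKQUEUE 1
#endif

/* The only place the preprocessor value is consumed. */
enum { sketch_use_workqueue = SKETCH_RAID_WORKQUEUE };

static int sketch_queued;  /* stripes handed to the (pretend) workqueue */
static int sketch_inline;  /* stripes processed directly by the caller */

static void sketch_queue_stripe(int stripe)
{
	(void)stripe;
	sketch_queued++;
}

static void sketch_handle_stripe(int stripe)
{
	(void)stripe;
	sketch_inline++;
}

/*
 * A plain C branch on a compile-time constant: both arms are always
 * parsed and type-checked, and the compiler is free to drop the dead one.
 */
static void sketch_submit_stripe(int stripe)
{
	if (sketch_use_workqueue)
		sketch_queue_stripe(stripe);
	else
		sketch_handle_stripe(stripe);
}
```

Because both arms are compiled in every configuration, a change that breaks the rarely used path is caught immediately rather than when someone flips the option.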
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 1:50 am
Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
Dan Williams wrote:
> @@ -759,8 +755,10 @@ #endif
> device->common.device_memcpy_buf_to_buf = ioat_dma_memcpy_buf_to_buf;
> device->common.device_memcpy_buf_to_pg = ioat_dma_memcpy_buf_to_pg;
> device->common.device_memcpy_pg_to_pg = ioat_dma_memcpy_pg_to_pg;
> - device->common.device_memcpy_complete = ioat_dma_is_complete;
> - device->common.device_memcpy_issue_pending = ioat_dma_memcpy_issue_pending;
> + device->common.device_operation_complete = ioat_dma_is_complete;
> + device->common.device_xor_pgs_to_pg = dma_async_xor_pgs_to_pg_err;
> + device->common.device_issue_pending = ioat_dma_memcpy_issue_pending;
> + device->common.capabilities = DMA_MEMCPY;
Are we really going to add a set of hooks for each DMA engine whizbang
feature?

That will get ugly when DMA engines support memcpy, xor, crc32, sha1,
aes, and a dozen other transforms.
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index c94d8f1..3599472 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -20,7 +20,7 @@
> */
> #ifndef DMAENGINE_H
> #define DMAENGINE_H
> -
> +#include <linux/config.h>
> #ifdef CONFIG_DMA_ENGINE
>
> #include <linux/device.h>
> @@ -65,6 +65,27 @@ enum dma_status {
> };
>
> /**
> + * enum dma_capabilities - DMA operational capabilities
> + * @DMA_MEMCPY: src to dest copy
> + * @DMA_XOR: src*n to dest xor
> + * @DMA_DUAL_XOR: src*n to dest_diag and dest_horiz xor
> + * @DMA_PQ_XOR: src*n to dest_q and dest_p gf/xor
> + * @DMA_MEMCPY_CRC32C: src to dest copy and crc-32c sum
> + * @DMA_SHARE: multiple clients can use this channel
> + */
> +enum dma_capabilities {
> + DMA_MEMCPY = 0x1,
> + DMA_XOR = 0x2,
> + DMA_PQ_XOR = 0x4,
> + DMA_DUAL_XOR = 0x8,
> + DMA_PQ_UPDATE = 0x10,
> + DMA_ZERO_SUM = 0x20,
> + DMA_PQ_ZERO_SUM = 0x40,
> + DMA_MEMSET = 0x80,
> + DMA_MEMCPY_CRC32C = 0x100,
Please use the more readable style that explicitly lists bits:

	DMA_MEMCPY = (1 << 0),
	DMA_XOR = (1 << 1),
	...
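
Spelled out for the constants visible in the quoted hunk, the shift style could look like the sketch below. The enumerator names and their values are taken from the patch text above (0x1 through 0x100); the helper sketch_has_caps() is a hypothetical addition for illustration and is not part of the posted code.

```c
/* Capability bits written as shifts; values match the hex constants in the patch. */
enum dma_capabilities {
	DMA_MEMCPY        = (1 << 0),
	DMA_XOR           = (1 << 1),
	DMA_PQ_XOR        = (1 << 2),
	DMA_DUAL_XOR      = (1 << 3),
	DMA_PQ_UPDATE     = (1 << 4),
	DMA_ZERO_SUM      = (1 << 5),
	DMA_PQ_ZERO_SUM   = (1 << 6),
	DMA_MEMSET        = (1 << 7),
	DMA_MEMCPY_CRC32C = (1 << 8),
};

/* Hypothetical helper: nonzero when every bit in 'wanted' is also set in 'caps'. */
static inline int sketch_has_caps(unsigned int caps, unsigned int wanted)
{
	return (caps & wanted) == wanted;
}
```

With the bit index visible, a gap or a duplicated bit is easy to spot in review, which is harder with a column of hex literals.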
> +/**
> * struct dma_chan_percpu - the per-CPU part of struct dma_chan
> * @refcount: local_t used for open-coded "bigref" counting
> * @memcpy_count: transaction counter
> @@ -75,27 +96,32 @@ struct dma_chan_percpu {
> local_t refcount;
> /* stats */
> unsigned long memcpy_count;
> + unsigned long xor_count;
> unsigned long bytes_transferred;
> + unsigned long bytes_xor;
Clearly, each operation needs to be more compartmentalized.
This just isn't scalable, when you consider all the possible transforms.
Jeff
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 11/19] dmaengine: add memset as an asynchronous dma operation
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Changelog:
> * make the dmaengine api EXPORT_SYMBOL_GPL
> * zero sum support should be standalone, not integrated into xor
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> drivers/dma/dmaengine.c | 15 ++++++++++
> drivers/dma/ioatdma.c | 5 +++
> include/linux/dmaengine.h | 68 +++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 88 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index e78ce89..fe62237 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -604,6 +604,17 @@ dma_cookie_t dma_async_do_xor_err(struct
> return -ENXIO;
> }
>
> +/**
> + * dma_async_do_memset_err - default function for dma devices that
> + * do not support memset
> + */
> +dma_cookie_t dma_async_do_memset_err(struct dma_chan *chan,
> + union dmaengine_addr dest, unsigned int dest_off,
> + int val, size_t len, unsigned long flags)
> +{
> + return -ENXIO;
> +}
> +
> static int __init dma_bus_init(void)
> {
> mutex_init(&dma_list_mutex);
> @@ -621,6 +632,9 @@ EXPORT_SYMBOL_GPL(dma_async_memcpy_pg_to
> EXPORT_SYMBOL_GPL(dma_async_memcpy_dma_to_dma);
> EXPORT_SYMBOL_GPL(dma_async_memcpy_pg_to_dma);
> EXPORT_SYMBOL_GPL(dma_async_memcpy_dma_to_pg);
> +EXPORT_SYMBOL_GPL(dma_async_memset_buf);
> +EXPORT_SYMBOL_GPL(dma_async_memset_page);
> +EXPORT_SYMBOL_GPL(dma_async_memset_dma);
> EXPORT_SYMBOL_GPL(dma_async_xor_pgs_to_pg);
> EXPORT_SYMBOL_GPL(dma_async_xor_dma_list_to_dma);
> EXPORT_SYMBOL_GPL(dma_async_operation_complete);
> @@ -629,6 +643,7 @@ EXPORT_SYMBOL_GPL(dma_async_device_regis
> EXPORT_SYMBOL_GPL(dma_async_device_unregister);
> EXPORT_SYMBOL_GPL(dma_chan_cleanup);
> EXPORT_SYMBOL_GPL(dma_async_do_xor_err);
> +EXPORT_SYMBOL_GPL(dma_async_do_memset_err);
> EXPORT_SYMBOL_GPL(dma_async_chan_init);
> EXPORT_SYMBOL_GPL(dma_async_map_page);
> EXPORT_SYMBOL_GPL(dma_async_map_single);
> diff --git a/drivers/dma/ioatdma.c b/drivers/dma/ioatdma.c
> index 0159d14..231247c 100644
> --- a/drivers/dma/ioatdma.c
> +++ b/drivers/dma/ioatdma.c
> @@ -637,6 +637,10 @@ extern dma_cookie_t dma_async_do_xor_err
> union dmaengine_addr src, unsigned int src_cnt,
> unsigned int src_off, size_t len, unsigned long flags);
>
> +extern dma_cookie_t dma_async_do_memset_err(struct dma_chan *chan,
> + union dmaengine_addr dest, unsigned int dest_off,
> + int val, size_t size, unsigned long flags);
> +
> static dma_addr_t ioat_map_page(struct dma_chan *chan, struct page *page,
> unsigned long offset, size_t size,
> int direction)
> @@ -748,6 +752,7 @@ #endif
> device->common.capabilities = DMA_MEMCPY;
> device->common.device_do_dma_memcpy = do_ioat_dma_memcpy;
> device->common.device_do_dma_xor = dma_async_do_xor_err;
> + device->common.device_do_dma_memset = dma_async_do_memset_err;
> device->common.map_page = ioat_map_page;
> device->common.map_single = ioat_map_single;
> device->common.unmap_page = ioat_unmap_page;
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index cb4cfcf..8d53b08 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -260,6 +260,7 @@ struct dma_chan_client_ref {
> * @device_issue_pending: push appended descriptors to hardware
> * @device_do_dma_memcpy: perform memcpy with a dma engine
> * @device_do_dma_xor: perform block xor with a dma engine
> + * @device_do_dma_memset: perform block fill with a dma engine
> */
> struct dma_device {
>
> @@ -284,6 +285,9 @@ struct dma_device {
> union dmaengine_addr src, unsigned int src_cnt,
> unsigned int src_off, size_t len,
> unsigned long flags);
> + dma_cookie_t (*device_do_dma_memset)(struct dma_chan *chan,
> + union dmaengine_addr dest, unsigned int dest_off,
> + int value, size_t len, unsigned long flags);
Same comment as for XOR: adding operations in this way just isn't scalable.
Operations need to be more compartmentalized.

Maybe a client could do:

	struct adma_transaction adma_xact;

	/* fill in hooks with XOR-specific info */
	init_XScale_xor(adma_device, &adma_xact, my_completion_func);

	/* initiate transaction */
	adma_go(&adma_xact);

	/* callback signals completion asynchronously */
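
To make the shape of that suggestion concrete, here is a user-space sketch of a transaction object that carries its own operation hooks. Only the names struct adma_transaction and adma_go come from the message above; the field layout, struct adma_ops, and the sketch_* helpers are assumptions made for illustration, and the "engine" is a plain software loop that completes synchronously, where real hardware would signal the callback later.

```c
#include <stddef.h>
#include <stdint.h>

struct adma_transaction;

/* Per-transform hooks; each transform supplies its own table (assumed layout). */
struct adma_ops {
	int (*run)(struct adma_transaction *xact);
};

struct adma_transaction {
	const struct adma_ops *ops;  /* set by an init helper, never by the core */
	void (*done)(struct adma_transaction *xact, int status);
	uint8_t *dest;
	const uint8_t *const *srcs;
	size_t src_cnt;
	size_t len;
};

/* Software stand-in for an XOR engine: dest = srcs[0] ^ srcs[1] ^ ... */
static int sketch_xor_run(struct adma_transaction *xact)
{
	for (size_t i = 0; i < xact->len; i++) {
		uint8_t acc = 0;

		for (size_t s = 0; s < xact->src_cnt; s++)
			acc ^= xact->srcs[s][i];
		xact->dest[i] = acc;
	}
	return 0;
}

static const struct adma_ops sketch_xor_ops = { .run = sketch_xor_run };

/* Fill in the hooks with XOR-specific information. */
static void sketch_init_xor(struct adma_transaction *xact, uint8_t *dest,
			    const uint8_t *const *srcs, size_t src_cnt,
			    size_t len,
			    void (*done)(struct adma_transaction *, int))
{
	xact->ops = &sketch_xor_ops;
	xact->done = done;
	xact->dest = dest;
	xact->srcs = srcs;
	xact->src_cnt = src_cnt;
	xact->len = len;
}

/* Generic entry point: it knows nothing about XOR, memcpy, or any other transform. */
static int adma_go(struct adma_transaction *xact)
{
	int status = xact->ops->run(xact);

	if (xact->done)
		xact->done(xact, status);
	return status;
}

/* Example completion callback that just counts invocations. */
static int sketch_completions;

static void sketch_count_done(struct adma_transaction *xact, int status)
{
	(void)xact;
	(void)status;
	sketch_completions++;
}
```

Adding a new transform in this arrangement means adding one ops table and one init helper; the core structure and adma_go() stay untouched.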
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 14/19] dmaengine: add dma_sync_wait
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> dma_sync_wait is a common routine to live wait for a dma operation to
> complete.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> include/linux/dmaengine.h | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 9fd6cbd..0a70c9e 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -750,6 +750,18 @@ static inline void dma_async_unmap_singl
> chan->device->unmap_single(chan, handle, size, direction);
> }
>
> +static inline enum dma_status dma_sync_wait(struct dma_chan *chan,
> + dma_cookie_t cookie)
> +{
> + enum dma_status status;
> + dma_async_issue_pending(chan);
> + do {
> + status = dma_async_operation_complete(chan, cookie, NULL, NULL);
> + } while (status == DMA_IN_PROGRESS);
> +
> + return status;
Where are the timeouts, etc.? Looks like an infinite loop to me, in the
worst case.
Jeff
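
One way to picture the missing timeout (not from the thread) is a polling loop with a fixed budget that gives up instead of spinning forever. This user-space sketch uses hypothetical sketch_* names and counts iterations; an in-kernel version would more likely bound the wait by elapsed time, for example with jiffies and time_after(), which is an assumption about how it might be done rather than anything the patch contains.

```c
enum sketch_dma_status {
	SKETCH_DMA_SUCCESS,
	SKETCH_DMA_IN_PROGRESS,
	SKETCH_DMA_ERROR,
};

/* Poll callback: reports the current state of one operation. */
typedef enum sketch_dma_status (*sketch_poll_fn)(void *ctx);

/*
 * Poll until the operation leaves IN_PROGRESS or the budget runs out.
 * Exhausting the budget is reported as an error rather than looping on.
 */
static enum sketch_dma_status
sketch_sync_wait(sketch_poll_fn poll, void *ctx, unsigned long max_polls)
{
	while (max_polls-- > 0) {
		enum sketch_dma_status status = poll(ctx);

		if (status != SKETCH_DMA_IN_PROGRESS)
			return status;
	}
	return SKETCH_DMA_ERROR;
}

/* Example poller: stays in progress for *ctx polls, then succeeds. */
static enum sketch_dma_status sketch_countdown_poll(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return SKETCH_DMA_SUCCESS;
	(*remaining)--;
	return SKETCH_DMA_IN_PROGRESS;
}
```

The caller then has to decide what a timeout means for the stripe being processed, which is exactly the question the unbounded loop avoids answering.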
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 18/19] iop3xx: Give Linux control over PCI (ATU) initialization
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Currently the iop3xx platform support code assumes that RedBoot is the
> bootloader and has already initialized the ATU. Linux should handle this
> initialization for three reasons:
>
> 1/ The memory map that RedBoot sets up is not optimal (page_to_dma and
> virt_to_phys return different addresses). The effect of this is that using
> the dma mapping API for the internal bus dma units generates pci bus
> addresses that are incorrect for the internal bus.
>
> 2/ Not all iop platforms use RedBoot
>
> 3/ If the ATU is already initialized it indicates that the iop is an add-in
> card in another host, it does not own the PCI bus, and should not be
> re-initialized.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> arch/arm/mach-iop32x/Kconfig | 8 ++
> arch/arm/mach-iop32x/ep80219.c | 4 +
> arch/arm/mach-iop32x/iq31244.c | 5 +
> arch/arm/mach-iop32x/iq80321.c | 5 +
> arch/arm/mach-iop33x/Kconfig | 8 ++
> arch/arm/mach-iop33x/iq80331.c | 5 +
> arch/arm/mach-iop33x/iq80332.c | 4 +
> arch/arm/plat-iop/pci.c | 140 ++++++++++++++++++++++++++++++++++
> include/asm-arm/arch-iop32x/iop32x.h | 9 ++
> include/asm-arm/arch-iop32x/memory.h | 4 -
> include/asm-arm/arch-iop33x/iop33x.h | 10 ++
> include/asm-arm/arch-iop33x/memory.h | 4 -
> include/asm-arm/hardware/iop3xx.h | 20 ++++-
> 13 files changed, 214 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm/mach-iop32x/Kconfig b/arch/arm/mach-iop32x/Kconfig
> index 05549a5..b2788e3 100644
> --- a/arch/arm/mach-iop32x/Kconfig
> +++ b/arch/arm/mach-iop32x/Kconfig
> @@ -22,6 +22,14 @@ config ARCH_IQ80321
> Say Y here if you want to run your kernel on the Intel IQ80321
> evaluation kit for the IOP321 processor.
>
> +config IOP3XX_ATU
> + bool "Enable the PCI Controller"
> + default y
> + help
> + Say Y here if you want the IOP to initialize its PCI Controller.
> + Say N if the IOP is an add in card, the host system owns the PCI
> + bus in this case.
> +
> endmenu
>
> endif
> diff --git a/arch/arm/mach-iop32x/ep80219.c b/arch/arm/mach-iop32x/ep80219.c
> index f616d3e..1a5c586 100644
> --- a/arch/arm/mach-iop32x/ep80219.c
> +++ b/arch/arm/mach-iop32x/ep80219.c
> @@ -100,7 +100,7 @@ ep80219_pci_map_irq(struct pci_dev *dev,
>
> static struct hw_pci ep80219_pci __initdata = {
> .swizzle = pci_std_swizzle,
> - .nr_controllers = 1,
> + .nr_controllers = 0,
> .setup = iop3xx_pci_setup,
> .preinit = iop3xx_pci_preinit,
> .scan = iop3xx_pci_scan_bus,
> @@ -109,6 +109,8 @@ static struct hw_pci ep80219_pci __initd
>
> static int __init ep80219_pci_init(void)
> {
> + if (iop3xx_get_init_atu() == IOP3XX_INIT_ATU_ENABLE)
> + ep80219_pci.nr_controllers = 1;
> #if 0
> if (machine_is_ep80219())
> pci_common_init(&ep80219_pci);
> diff --git a/arch/arm/mach-iop32x/iq31244.c b/arch/arm/mach-iop32x/iq31244.c
> index 967a696..25d5d62 100644
> --- a/arch/arm/mach-iop32x/iq31244.c
> +++ b/arch/arm/mach-iop32x/iq31244.c
> @@ -97,7 +97,7 @@ iq31244_pci_map_irq(struct pci_dev *dev,
>
> static struct hw_pci iq31244_pci __initdata = {
> .swizzle = pci_std_swizzle,
> - .nr_controllers = 1,
> + .nr_controllers = 0,
> .setup = iop3xx_pci_setup,
> .preinit = iop3xx_pci_preinit,
> .scan = iop3xx_pci_scan_bus,
> @@ -106,6 +106,9 @@ static struct hw_pci iq31244_pci __initd
>
> static int __init iq31244_pci_init(void)
> {
> + if (iop3xx_get_init_atu() == IOP3XX_INIT_ATU_ENABLE)
> + iq31244_pci.nr_controllers = 1;
> +
> if (machine_is_iq31244())
> pci_common_init(&iq31244_pci);
>
> diff --git a/arch/arm/mach-iop32x/iq80321.c b/arch/arm/mach-iop32x/iq80321.c
> index ef4388c..cdd2265 100644
> --- a/arch/arm/mach-iop32x/iq80321.c
> +++ b/arch/arm/mach-iop32x/iq80321.c
> @@ -97,7 +97,7 @@ iq80321_pci_map_irq(struct pci_dev *dev,
>
> static struct hw_pci iq80321_pci __initdata = {
> .swizzle = pci_std_swizzle,
> - .nr_controllers = 1,
> + .nr_controllers = 0,
> .setup = iop3xx_pci_setup,
> .preinit = iop3xx_pci_preinit,
> .scan = iop3xx_pci_scan_bus,
> @@ -106,6 +106,9 @@ static struct hw_pci iq80321_pci __initd
>
> static int __init iq80321_pci_init(void)
> {
> + if (iop3xx_get_init_atu() == IOP3XX_INIT_ATU_ENABLE)
> + iq80321_pci.nr_controllers = 1;
> +
> if (machine_is_iq80321())
> pci_common_init(&iq80321_pci);
>
> diff --git a/arch/arm/mach-iop33x/Kconfig b/arch/arm/mach-iop33x/Kconfig
> index 9aa016b..45598e0 100644
> --- a/arch/arm/mach-iop33x/Kconfig
> +++ b/arch/arm/mach-iop33x/Kconfig
> @@ -16,6 +16,14 @@ config MACH_IQ80332
> Say Y here if you want to run your kernel on the Intel IQ80332
> evaluation kit for the IOP332 chipset.
>
> +config IOP3XX_ATU
> + bool "Enable the PCI Controller"
> + default y
> + help
> + Say Y here if you want the IOP to initialize its PCI Controller.
> + Say N if the IOP is an add in card, the host system owns the PCI
> + bus in this case.
> +
> endmenu
>
> endif
> diff --git a/arch/arm/mach-iop33x/iq80331.c b/arch/arm/mach-iop33x/iq80331.c
> index 7714c94..3807000 100644
> --- a/arch/arm/mach-iop33x/iq80331.c
> +++ b/arch/arm/mach-iop33x/iq80331.c
> @@ -78,7 +78,7 @@ iq80331_pci_map_irq(struct pci_dev *dev,
>
> static struct hw_pci iq80331_pci __initdata = {
> .swizzle = pci_std_swizzle,
> - .nr_controllers = 1,
> + .nr_controllers = 0,
> .setup = iop3xx_pci_setup,
> .preinit = iop3xx_pci_preinit,
> .scan = iop3xx_pci_scan_bus,
> @@ -87,6 +87,9 @@ static struct hw_pci iq80331_pci __initd
>
> static int __init iq80331_pci_init(void)
> {
> + if (iop3xx_get_init_atu() == IOP3XX_INIT_ATU_ENABLE)
> + iq80331_pci.nr_controllers = 1;
> +
> if (machine_is_iq80331())
> pci_common_init(&iq80331_pci);
>
> diff --git a/arch/arm/mach-iop33x/iq80332.c b/arch/arm/mach-iop33x/iq80332.c
> index a3fa7f8..8780d55 100644
> --- a/arch/arm/mach-iop33x/iq80332.c
> +++ b/arch/arm/mach-iop33x/iq80332.c
> @@ -93,6 +93,10 @@ static struct hw_pci iq80332_pci __initd
>
> static int __init iq80332_pci_init(void)
> {
> +
> + if (iop3xx_get_init_atu() == IOP3XX_INIT_ATU_ENABLE)
> + iq80332_pci.nr_controllers = 1;
> +
> if (machine_is_iq80332())
> pci_common_init(&iq80332_pci);
>
> diff --git a/arch/arm/plat-iop/pci.c b/arch/arm/plat-iop/pci.c
> index e647812..19aace9 100644
> --- a/arch/arm/plat-iop/pci.c
> +++ b/arch/arm/plat-iop/pci.c
> @@ -55,7 +55,7 @@ static u32 iop3xx_cfg_address(struct pci
> * This routine checks the status of the last configuration cycle. If an error
> * was detected it returns a 1, else it returns a 0. The errors being checked
> * are parity, master abort, target abort (master and target). These types of
> - * errors occure during a config cycle where there is no device, like during
> + * errors occur during a config cycle where there is no device, like during
> * the discovery stage.
> */
> static int iop3xx_pci_status(void)
> @@ -223,8 +223,111 @@ struct pci_bus *iop3xx_pci_scan_bus(int
> return pci_scan_bus(sys->busnr, &iop3xx_ops, sys);
> }
>
> +void __init iop3xx_atu_setup(void)
> +{
> + /* BAR 0 ( Disabled ) */
> + *IOP3XX_IAUBAR0 = 0x0;
> + *IOP3XX_IABAR0 = 0x0;
> + *IOP3XX_IATVR0 = 0x0;
> + *IOP3XX_IALR0 = 0x0;
> +
> + /* BAR 1 ( Disabled ) */
> + *IOP3XX_IAUBAR1 = 0x0;
> + *IOP3XX_IABAR1 = 0x0;
> + *IOP3XX_IALR1 = 0x0;
> +
> + /* BAR 2 (1:1 mapping with Physical RAM) */
> + /* Set limit and enable */
> + *IOP3XX_IALR2 = ~((u32)IOP3XX_MAX_RAM_SIZE - 1) & ~0x1;
> + *IOP3XX_IAUBAR2 = 0x0;
> +
> + /* Align the inbound bar with the base of memory */
> + *IOP3XX_IABAR2 = PHYS_OFFSET |
> + PCI_BASE_ADDRESS_MEM_TYPE_64 |
> + PCI_BASE_ADDRESS_MEM_PREFETCH;
> +
> + *IOP3XX_IATVR2 = PHYS_OFFSET;
> +
> + /* Outbound window 0 */
> + *IOP3XX_OMWTVR0 = IOP3XX_PCI_LOWER_MEM_PA;
> + *IOP3XX_OUMWTVR0 = 0;
> +
> + /* Outbound window 1 */
> + *IOP3XX_OMWTVR1 = IOP3XX_PCI_LOWER_MEM_PA + IOP3XX_PCI_MEM_WINDOW_SIZE;
> + *IOP3XX_OUMWTVR1 = 0;
> +
> + /* BAR 3 ( Disabled ) */
> + *IOP3XX_IAUBAR3 = 0x0;
> + *IOP3XX_IABAR3 = 0x0;
> + *IOP3XX_IATVR3 = 0x0;
> + *IOP3XX_IALR3 = 0x0;
> +
> + /* Setup the I/O Bar
> + */
> + *IOP3XX_OIOWTVR = IOP3XX_PCI_LOWER_IO_PA;;
> +
> + /* Enable inbound and outbound cycles
> + */
> + *IOP3XX_ATUCMD |= PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
> + PCI_COMMAND_PARITY | PCI_COMMAND_SERR;
> + *IOP3XX_ATUCR |= IOP3XX_ATUCR_OUT_EN;
> +}
> +
> +void __init iop3xx_atu_disable(void)
> +{
> + *IOP3XX_ATUCMD = 0;
> + *IOP3XX_ATUCR = 0;
> +
> + /* wait for cycles to quiesce */
> + while (*IOP3XX_PCSR & (IOP3XX_PCSR_OUT_Q_BUSY |
> + IOP3XX_PCSR_IN_Q_BUSY))
> + cpu_relax();
> +
> + /* BAR 0 ( Disabled ) */
> + *IOP3XX_IAUBAR0 = 0x0;
> + *IOP3XX_IABAR0 = 0x0;
> + *IOP3XX_IATVR0 = 0x0;
> + *IOP3XX_IALR0 = 0x0;
> +
> + /* BAR 1 ( Disabled ) */
> + *IOP3XX_IAUBAR1 = 0x0;
> + *IOP3XX_IABAR1 = 0x0;
> + *IOP3XX_IALR1 = 0x0;
> +
> + /* BAR 2 ( Disabled ) */
> + *IOP3XX_IAUBAR2 = 0x0;
> + *IOP3XX_IABAR2 = 0x0;
> + *IOP3XX_IATVR2 = 0x0;
> + *IOP3XX_IALR2 = 0x0;
> +
> + /* BAR 3 ( Disabled ) */
> + *IOP3XX_IAUBAR3 = 0x0;
> + *IOP3XX_IABAR3 = 0x0;
> + *IOP3XX_IATVR3 = 0x0;
> + *IOP3XX_IALR3 = 0x0;
> +
> + /* Clear the outbound windows */
> + *IOP3XX_OIOWTVR = 0;
> +
> + /* Outbound window 0 */
> + *IOP3XX_OMWTVR0 = 0;
> + *IOP3XX_OUMWTVR0 = 0;
> +
> + /* Outbound window 1 */
> + *IOP3XX_OMWTVR1 = 0;
> + *IOP3XX_OUMWTVR1 = 0;
You should be using readl(), writel() variants rather than writing C
code that appears to be normal, but in reality has hardware side-effects.
Jeff
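
For readers unfamiliar with the accessor style being asked for, the sketch below (not from the thread) shows the idea in user space: every register access goes through a small helper, so hardware side effects are visible at the call site and the volatile cast lives in one place. sketch_readl() and sketch_writel() are stand-ins for the kernel's readl() and writel(); the register block, its fields, and the bit position of SKETCH_ATUCR_OUT_EN are invented for the example.

```c
#include <stdint.h>

/* Stand-ins for the kernel accessors; the volatile cast is confined to these two helpers. */
static inline void sketch_writel(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

static inline uint32_t sketch_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Fake register block standing in for memory-mapped ATU registers. */
struct sketch_atu_regs {
	uint32_t cmd;
	uint32_t cr;
};

#define SKETCH_ATUCR_OUT_EN (1u << 1)  /* bit position chosen arbitrarily */

/* Read-modify-write of the control register, spelled out as two explicit accesses. */
static void sketch_atu_enable(struct sketch_atu_regs *regs)
{
	uint32_t cr = sketch_readl(&regs->cr);

	sketch_writel(cr | SKETCH_ATUCR_OUT_EN, &regs->cr);
}

static void sketch_atu_disable(struct sketch_atu_regs *regs)
{
	sketch_writel(0, &regs->cmd);
	sketch_writel(0, &regs->cr);
}
```

Compared with `*IOP3XX_ATUCR |= ...`, the read and the write are separate, named steps, so a reviewer can see how many bus accesses occur and in what order.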
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 17/19] iop3xx: define IOP3XX_REG_ADDR[32|16|8] and clean up DMA/AAU defs
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Also brings the iop3xx registers in line with the format of the iop13xx
> register definitions.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> include/asm-arm/arch-iop32x/entry-macro.S | 2
> include/asm-arm/arch-iop32x/iop32x.h | 14 +
> include/asm-arm/arch-iop33x/entry-macro.S | 2
> include/asm-arm/arch-iop33x/iop33x.h | 38 ++-
> include/asm-arm/hardware/iop3xx.h | 347 +++++++++++++----------------
> 5 files changed, 188 insertions(+), 215 deletions(-)
Another Linux mantra: "volatile" == hiding a bug. Avoid, please.
Jeff
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 12/19] dmaengine: dma_async_memcpy_err for DMA engines that do not support memcpy
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Default virtual function that returns an error if the user attempts a
> memcpy operation. An XOR engine is an example of a DMA engine that does
> not support memcpy.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> drivers/dma/dmaengine.c | 13 +++++++++++++
> 1 files changed, 13 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index fe62237..33ad690 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -593,6 +593,18 @@ void dma_async_device_unregister(struct
> }
>
> /**
> + * dma_async_do_memcpy_err - default function for dma devices that
> + * do not support memcpy
> + */
> +dma_cookie_t dma_async_do_memcpy_err(struct dma_chan *chan,
> + union dmaengine_addr dest, unsigned int dest_off,
> + union dmaengine_addr src, unsigned int src_off,
> + size_t len, unsigned long flags)
> +{
> + return -ENXIO;
> +}
Further illustration of how this API growth is going wrong. You should
create an API such that it is impossible for an XOR transform to ever
call non-XOR-transform hooks.
Jeff
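
One possible reading of "impossible to call non-XOR hooks" (an interpretation, not something stated in the thread) is to hand out a distinct handle type per operation class, obtainable only when the channel advertises that capability. Then there is no memcpy entry point reachable from an XOR-only channel, and no error stub is needed. Every name in this sketch is hypothetical, and the operations are plain software loops.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_MEMCPY (1u << 0)
#define SKETCH_CAP_XOR    (1u << 1)

struct sketch_chan {
	unsigned int caps;
};

/* Distinct wrapper types, one per operation class. */
struct sketch_copy_chan { struct sketch_chan *chan; };
struct sketch_xor_chan  { struct sketch_chan *chan; };

/* Returns 0 and fills *out only if the channel can copy; -1 otherwise. */
static int sketch_get_copy_chan(struct sketch_chan *c, struct sketch_copy_chan *out)
{
	if (!(c->caps & SKETCH_CAP_MEMCPY))
		return -1;
	out->chan = c;
	return 0;
}

/* Returns 0 and fills *out only if the channel can XOR; -1 otherwise. */
static int sketch_get_xor_chan(struct sketch_chan *c, struct sketch_xor_chan *out)
{
	if (!(c->caps & SKETCH_CAP_XOR))
		return -1;
	out->chan = c;
	return 0;
}

/* Accepts only a copy handle; a sketch_xor_chan pointer here is diagnosed by the compiler. */
static void sketch_copy(struct sketch_copy_chan *cc, uint8_t *dest,
			const uint8_t *src, size_t len)
{
	(void)cc;  /* a real driver would submit to cc->chan */
	for (size_t i = 0; i < len; i++)
		dest[i] = src[i];
}

/* Accepts only an XOR handle: dest ^= src. */
static void sketch_xor(struct sketch_xor_chan *xc, uint8_t *dest,
		       const uint8_t *src, size_t len)
{
	(void)xc;  /* a real driver would submit to xc->chan */
	for (size_t i = 0; i < len; i++)
		dest[i] ^= src[i];
}
```

The capability check happens once, when the handle is requested, instead of on every submission through a stub that returns -ENXIO.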
-
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
Posted: Tue Sep 12, 2006 2:00 am
Post subject: Re: [PATCH 15/19] dmaengine: raid5 dma client
Dan Williams wrote:
> From: Dan Williams <dan.j.williams@intel.com>
>
> Adds a dmaengine client that is the hardware accelerated version of
> raid5_do_soft_block_ops. It utilizes the raid5 workqueue implementation to
> operate on multiple stripes simultaneously. See the iop-adma.c driver for
> an example of a driver that enables hardware accelerated raid5.
>
> Changelog:
> * mark operations as _Dma rather than _Done until all outstanding
> operations have completed. Once all operations have completed update the
> state and return it to the handle list
> * add a helper routine to retrieve the last used cookie
> * use dma_async_zero_sum_dma_list for checking parity which optionally
> allows parity check operations to not dirty the parity block in the cache
> (if 'disks' is less than 'MAX_ADMA_XOR_SOURCES')
> * remove dependencies on iop13xx
> * take into account the fact that dma engines have a staging buffer so we
> can perform 1 less block operation compared to software xor
> * added __arch_raid5_dma_chan_request __arch_raid5_dma_next_channel and
> __arch_raid5_dma_check_channel to make the driver architecture independent
> * added channel switching capability for architectures that implement
> different operations (i.e. copy & xor) on individual channels
> * added initial support for "non-blocking" channel switching
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> drivers/dma/Kconfig | 9 +
> drivers/dma/Makefile | 1
> drivers/dma/raid5-dma.c | 730 ++++++++++++++++++++++++++++++++++++++++++++
> drivers/md/Kconfig | 11 +
> drivers/md/raid5.c | 66 ++++
> include/linux/dmaengine.h | 5
> include/linux/raid/raid5.h | 24 +
> 7 files changed, 839 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index 30d021d..fced8c3 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -22,6 +22,15 @@ config NET_DMA
> Since this is the main user of the DMA engine, it should be enabled;
> say Y here.
>
> +config RAID5_DMA
> + tristate "MD raid5: block operations offload"
> + depends on INTEL_IOP_ADMA && MD_RAID456
> + default y
> + ---help---
> + This enables the use of DMA engines in the MD-RAID5 driver to
> + offload stripe cache operations, freeing CPU cycles.
> + say Y here
> +
> comment "DMA Devices"
>
> config INTEL_IOATDMA
> diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
> index bdcfdbd..4e36d6e 100644
> --- a/drivers/dma/Makefile
> +++ b/drivers/dma/Makefile
> @@ -1,3 +1,4 @@
> obj-$(CONFIG_DMA_ENGINE) += dmaengine.o
> obj-$(CONFIG_NET_DMA) += iovlock.o
> +obj-$(CONFIG_RAID5_DMA) += raid5-dma.o
> obj-$(CONFIG_INTEL_IOATDMA) += ioatdma.o
> diff --git a/drivers/dma/raid5-dma.c b/drivers/dma/raid5-dma.c
> new file mode 100644
> index 0000000..04a1790
> --- /dev/null
> +++ b/drivers/dma/raid5-dma.c
> @@ -0,0 +1,730 @@
> +/*
> + * Offload raid5 operations to hardware RAID engines
> + * Copyright(c) 2006 Intel Corporation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the Free
> + * Software Foundation; either version 2 of the License, or (at your option)
> + * any later version.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59
> + * Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> + *
> + * The full GNU General Public License is included in this distribution in the
> + * file called COPYING.
> + */
> +
> +#include <linux/raid/raid5.h>
> +#include <linux/dmaengine.h>
> +
> +static struct dma_client *raid5_dma_client;
> +static atomic_t raid5_count;
> +extern void release_stripe(struct stripe_head *sh);
> +extern void __arch_raid5_dma_chan_request(struct dma_client *client);
> +extern struct dma_chan *__arch_raid5_dma_next_channel(struct dma_client *client);
> +
> +#define MAX_HW_XOR_SRCS 16
> +
> +#ifndef STRIPE_SIZE
> +#define STRIPE_SIZE PAGE_SIZE
> +#endif
> +
> +#ifndef STRIPE_SECTORS
> +#define STRIPE_SECTORS (STRIPE_SIZE>>9)
> +#endif
> +
> +#ifndef r5_next_bio
> +#define r5_next_bio(bio, sect) ( ( (bio)->bi_sector + ((bio)->bi_size>>9) < sect + STRIPE_SECTORS) ? (bio)->bi_next : NULL)
> +#endif
> +
> +#define DMA_RAID5_DEBUG 0
> +#define PRINTK(x...) ((void)(DMA_RAID5_DEBUG && printk(x)))
> +
> +/*
> + * Copy data between a page in the stripe cache, and one or more bion
> + * The page could align with the middle of the bio, or there could be
> + * several bion, each with several bio_vecs, which cover part of the page
> + * Multiple bion are linked together on bi_next. There may be extras
> + * at the end of this list. We ignore them.
> + */
> +static dma_cookie_t dma_raid_copy_data(int frombio, struct bio *bio,
> + dma_addr_t dma, sector_t sector, struct dma_chan *chan,
> + dma_cookie_t cookie)
> +{
> + struct bio_vec *bvl;
> + struct page *bio_page;
> + int i;
> + int dma_offset;
> + dma_cookie_t last_cookie = cookie;
> +
> + if (bio->bi_sector >= sector)
> + dma_offset = (signed)(bio->bi_sector - sector) * 512;
> + else
> + dma_offset = (signed)(sector - bio->bi_sector) * -512;
> + bio_for_each_segment(bvl, bio, i) {
> + int len = bio_iovec_idx(bio,i)->bv_len;
> + int clen;
> + int b_offset = 0;
> +
> + if (dma_offset < 0) {
> + b_offset = -dma_offset;
> + dma_offset += b_offset;
> + len -= b_offset;
> + }
> +
> + if (len > 0 && dma_offset + len > STRIPE_SIZE)
> + clen = STRIPE_SIZE - dma_offset;
> + else clen = len;
> +
> + if (clen > 0) {
> + b_offset += bio_iovec_idx(bio,i)->bv_offset;
> + bio_page = bio_iovec_idx(bio,i)->bv_page;
> + if (frombio)
> + do {
> + cookie = dma_async_memcpy_pg_to_dma(chan,
> + dma + dma_offset,
> + bio_page,
> + b_offset,
> + clen);
> + if (cookie == -ENOMEM)
> + dma_sync_wait(chan, last_cookie);
> + else
> + WARN_ON(cookie <= 0);
> + } while (cookie == -ENOMEM);
> + else
> + do {
> + cookie = dma_async_memcpy_dma_to_pg(chan,
> + bio_page,
> + b_offset,
> + dma + dma_offset,
> + clen);
> + if (cookie == -ENOMEM)
> + dma_sync_wait(chan, last_cookie);
> + else
> + WARN_ON(cookie <= 0);
> + } while (cookie == -ENOMEM);
> + }
> + last_cookie = cookie;
> + if (clen < len) /* hit end of page */
> + break;
> + dma_offset += len;
> + }
> +
> + return last_cookie;
> +}
> +
> +#define issue_xor() do { \
> + do { \
> + cookie = dma_async_xor_dma_list_to_dma( \
> + sh->ops.dma_chan, \
> + xor_destination_addr, \
> + dma, \
> + count, \
> + STRIPE_SIZE); \
> + if (cookie == -ENOMEM) \
> + dma_sync_wait(sh->ops.dma_chan, \
> + sh->ops.dma_cookie); \
> + else \
> + WARN_ON(cookie <= 0); \
> + } while (cookie == -ENOMEM); \
> + sh->ops.dma_cookie = cookie; \
> + dma[0] = xor_destination_addr; \
> + count = 1; \
> + } while(0)
> +#define check_xor() do { \
> + if (count == MAX_HW_XOR_SRCS) \
> + issue_xor(); \
> + } while (0)
> +
> +#ifdef CONFIG_RAID5_DMA_ARCH_NEEDS_CHAN_SWITCH
> +extern struct dma_chan *__arch_raid5_dma_check_channel(struct dma_chan *chan,
> + dma_cookie_t cookie,
> + struct dma_client *client,
> + unsigned long capabilities);
> +
> +#ifdef CONFIG_RAID5_DMA_WAIT_VIA_REQUEUE
> +#define check_channel(cap, bookmark) do { \
> +bookmark: \
> + next_chan = __arch_raid5_dma_check_channel(sh->ops.dma_chan, \
> + sh->ops.dma_cookie, \
> + raid5_dma_client, \
> + (cap)); \
> + if (!next_chan) { \
> + BUG_ON(sh->ops.ops_bookmark); \
> + sh->ops.ops_bookmark = &&bookmark; \
> + goto raid5_dma_retry; \
> + } else { \
> + sh->ops.dma_chan = next_chan; \
> + sh->ops.dma_cookie = dma_async_get_last_cookie( \
> + next_chan); \
> + sh->ops.ops_bookmark = NULL; \
> + } \
> +} while (0)
> +#else
> +#define check_channel(cap, bookmark) do { \
> +bookmark: \
> + next_chan = __arch_raid5_dma_check_channel(sh->ops.dma_chan, \
> + sh->ops.dma_cookie, \
> + raid5_dma_client, \
> + (cap)); \
> + if (!next_chan) { \
> + dma_sync_wait(sh->ops.dma_chan, sh->ops.dma_cookie); \
> + goto bookmark; \
> + } else { \
> + sh->ops.dma_chan = next_chan; \
> + sh->ops.dma_cookie = dma_async_get_last_cookie( \
> + next_chan); \
> + } \
> +} while (0)
> +#endif /* CONFIG_RAID5_DMA_WAIT_VIA_REQUEUE */
> +#else
> +#define check_channel(cap, bookmark) do { } while (0)
> +#endif /* CONFIG_RAID5_DMA_ARCH_NEEDS_CHAN_SWITCH */
The above seems a bit questionable and overengineered.
Linux mantra: Do What You Must, And No More.
In this case, just code it and note that it's IOP-specific. Don't bother
to support cases that don't exist yet.
> + * dma_do_raid5_block_ops - perform block memory operations on stripe data
> + * outside the spin lock with dma engines
> + *
> + * A note about the need for __arch_raid5_dma_check_channel:
> + * This function is only needed to support architectures where a single raid
> + * operation spans multiple hardware channels. For example on a reconstruct
> + * write, memory copy operations are submitted to a memcpy channel and then
> + * the routine must switch to the xor channel to complete the raid operation.
> + * __arch_raid5_dma_check_channel makes sure the previous operation has
> + * completed before returning the new channel.
> + * Some efficiency can be gained by putting the stripe back on the work
> + * queue rather than spin waiting. This code is a work in progress and is
> + * available via the 'broken' option CONFIG_RAID5_DMA_WAIT_VIA_REQUEUE.
> + * If 'wait via requeue' is not defined the check_channel macro live waits
> + * for the next channel.
> + */
> +static void dma_do_raid5_block_ops(void *stripe_head_ref)
> +{
Another way-too-big function that should be split up.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/ |
 |
Dan Williams External

Since: Sep 12, 2006 Posts: 26
|
Posted: Tue Sep 12, 2006 2:00 am Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
|
|
On 9/11/06, Jeff Garzik <jeff@garzik.org> wrote:
> Dan Williams wrote:
> > Neil,
> >
> > The following patches implement hardware accelerated raid5 for the Intel
> > Xscale(r) series of I/O Processors. The MD changes allow stripe
> > operations to run outside the spin lock in a work queue. Hardware
> > acceleration is achieved by using a dma-engine-aware work queue routine
> > instead of the default software only routine.
> >
> > Since the last release of the raid5 changes many bug fixes and other
> > improvements have been made as a result of stress testing. See the per
> > patch change logs for more information about what was fixed. This
> > release is the first release of the full dma implementation.
> >
> > The patches touch 3 areas, the md-raid5 driver, the generic dmaengine
> > interface, and a platform device driver for IOPs. The raid5 changes
> > follow your comments concerning making the acceleration implementation
> > similar to how the stripe cache handles I/O requests. The dmaengine
> > changes are the second release of this code. They expand the interface
> > to handle more than memcpy operations, and add a generic raid5-dma
> > client. The iop-adma driver supports dma memcpy, xor, xor zero sum, and
> > memset across all IOP architectures (32x, 33x, and 13xx).
> >
> > Concerning the context switching performance concerns raised at the
> > previous release, I have observed the following. For the hardware
> > accelerated case it appears that performance is always better with the
> > work queue than without since it allows multiple stripes to be operated
> > on simultaneously. I expect the same for an SMP platform, but so far my
> > testing has been limited to IOPs. For a single-processor
> > non-accelerated configuration I have not observed performance
> > degradation with work queue support enabled, but in the Kconfig option
> > help text I recommend disabling it (CONFIG_MD_RAID456_WORKQUEUE).
> >
> > Please consider the patches for -mm.
> >
> > -Dan
> >
> > [PATCH 01/19] raid5: raid5_do_soft_block_ops
> > [PATCH 02/19] raid5: move write operations to a workqueue
> > [PATCH 03/19] raid5: move check parity operations to a workqueue
> > [PATCH 04/19] raid5: move compute block operations to a workqueue
> > [PATCH 05/19] raid5: move read completion copies to a workqueue
> > [PATCH 06/19] raid5: move the reconstruct write expansion operation to a workqueue
> > [PATCH 07/19] raid5: remove compute_block and compute_parity5
> > [PATCH 08/19] dmaengine: enable multiple clients and operations
> > [PATCH 09/19] dmaengine: reduce backend address permutations
> > [PATCH 10/19] dmaengine: expose per channel dma mapping characteristics to clients
> > [PATCH 11/19] dmaengine: add memset as an asynchronous dma operation
> > [PATCH 12/19] dmaengine: dma_async_memcpy_err for DMA engines that do not support memcpy
> > [PATCH 13/19] dmaengine: add support for dma xor zero sum operations
> > [PATCH 14/19] dmaengine: add dma_sync_wait
> > [PATCH 15/19] dmaengine: raid5 dma client
> > [PATCH 16/19] dmaengine: Driver for the Intel IOP 32x, 33x, and 13xx RAID engines
> > [PATCH 17/19] iop3xx: define IOP3XX_REG_ADDR[32|16|8] and clean up DMA/AAU defs
> > [PATCH 18/19] iop3xx: Give Linux control over PCI (ATU) initialization
> > [PATCH 19/19] iop3xx: IOP 32x and 33x support for the iop-adma driver
>
> Can devices like drivers/scsi/sata_sx4.c or drivers/scsi/sata_promise.c
> take advantage of this? Promise silicon supports RAID5 XOR offload.
>
> If so, how? If not, why not?
This is a frequently asked question; Alan Cox had the same one at OLS.
The answer is "probably." The only complication I currently see is
where/how the stripe cache is maintained. With the IOPs it's easy
because the DMA engines operate directly on kernel memory. With the
Promise card I believe they have memory on the card and it's not clear
to me if the XOR engines on the card can deal with host memory. Also,
MD would need to be modified to handle a stripe cache located on a
device, or somehow synchronize its local cache with the card in a manner
that is still able to beat software-only MD.
> Jeff
Dan
 |
Dan Williams External

Since: Sep 12, 2006 Posts: 26
|
Posted: Tue Sep 12, 2006 2:20 am Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
|
|
On 9/11/06, Jeff Garzik <jeff@garzik.org> wrote:
> Dan Williams wrote:
> > @@ -759,8 +755,10 @@ #endif
> > device->common.device_memcpy_buf_to_buf = ioat_dma_memcpy_buf_to_buf;
> > device->common.device_memcpy_buf_to_pg = ioat_dma_memcpy_buf_to_pg;
> > device->common.device_memcpy_pg_to_pg = ioat_dma_memcpy_pg_to_pg;
> > - device->common.device_memcpy_complete = ioat_dma_is_complete;
> > - device->common.device_memcpy_issue_pending = ioat_dma_memcpy_issue_pending;
> > + device->common.device_operation_complete = ioat_dma_is_complete;
> > + device->common.device_xor_pgs_to_pg = dma_async_xor_pgs_to_pg_err;
> > + device->common.device_issue_pending = ioat_dma_memcpy_issue_pending;
> > + device->common.capabilities = DMA_MEMCPY;
>
>
> Are we really going to add a set of hooks for each DMA engine whizbang
> feature?
What's the alternative? But also see patch 9, "dmaengine: reduce
backend address permutations"; it relieves some of this pain.
>
> That will get ugly when DMA engines support memcpy, xor, crc32, sha1,
> aes, and a dozen other transforms.
>
>
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index c94d8f1..3599472 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -20,7 +20,7 @@
> > */
> > #ifndef DMAENGINE_H
> > #define DMAENGINE_H
> > -
> > +#include <linux/config.h>
> > #ifdef CONFIG_DMA_ENGINE
> >
> > #include <linux/device.h>
> > @@ -65,6 +65,27 @@ enum dma_status {
> > };
> >
> > /**
> > + * enum dma_capabilities - DMA operational capabilities
> > + * @DMA_MEMCPY: src to dest copy
> > + * @DMA_XOR: src*n to dest xor
> > + * @DMA_DUAL_XOR: src*n to dest_diag and dest_horiz xor
> > + * @DMA_PQ_XOR: src*n to dest_q and dest_p gf/xor
> > + * @DMA_MEMCPY_CRC32C: src to dest copy and crc-32c sum
> > + * @DMA_SHARE: multiple clients can use this channel
> > + */
> > +enum dma_capabilities {
> > + DMA_MEMCPY = 0x1,
> > + DMA_XOR = 0x2,
> > + DMA_PQ_XOR = 0x4,
> > + DMA_DUAL_XOR = 0x8,
> > + DMA_PQ_UPDATE = 0x10,
> > + DMA_ZERO_SUM = 0x20,
> > + DMA_PQ_ZERO_SUM = 0x40,
> > + DMA_MEMSET = 0x80,
> > + DMA_MEMCPY_CRC32C = 0x100,
>
> Please use the more readable style that explicitly lists bits:
>
> DMA_MEMCPY = (1 << 0),
> DMA_XOR = (1 << 1),
> ...
I prefer this as well, although at one point I was told (not by you)
the absolute number was preferred when I was making changes to
drivers/scsi/sata_vsc.c. In any event I'll change it...
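For illustration, the shift-based style Jeff asks for could look roughly like the sketch below. It lists only the enumerators visible in the quoted hunk and keeps their numeric values unchanged; the helper `dma_caps_supported` is hypothetical and is not part of the patch.

```c
#include <stdbool.h>

/*
 * Capability bits written as explicit shifts. Names and values follow the
 * enumerators visible in the quoted hunk (0x1 ... 0x100); anything cut off
 * by the quote is intentionally not reproduced here.
 */
enum dma_capabilities {
	DMA_MEMCPY        = (1 << 0),
	DMA_XOR           = (1 << 1),
	DMA_PQ_XOR        = (1 << 2),
	DMA_DUAL_XOR      = (1 << 3),
	DMA_PQ_UPDATE     = (1 << 4),
	DMA_ZERO_SUM      = (1 << 5),
	DMA_PQ_ZERO_SUM   = (1 << 6),
	DMA_MEMSET        = (1 << 7),
	DMA_MEMCPY_CRC32C = (1 << 8),
};

/* Hypothetical helper: true when every bit in 'wanted' is set in 'caps'. */
static bool dma_caps_supported(unsigned long caps, unsigned long wanted)
{
	return (caps & wanted) == wanted;
}
```

With this layout the bit position of each capability is readable at a glance, and adding a new one means picking the next free shift.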
>
> > +/**
> > * struct dma_chan_percpu - the per-CPU part of struct dma_chan
> > * @refcount: local_t used for open-coded "bigref" counting
> > * @memcpy_count: transaction counter
> > @@ -75,27 +96,32 @@ struct dma_chan_percpu {
> > local_t refcount;
> > /* stats */
> > unsigned long memcpy_count;
> > + unsigned long xor_count;
> > unsigned long bytes_transferred;
> > + unsigned long bytes_xor;
>
> Clearly, each operation needs to be more compartmentalized.
>
> This just isn't scalable, when you consider all the possible transforms.
OK, one set of counters per op is probably overkill. What about lumping
operations into groups and just tracking at the group level? For example:
memcpy, memset -> string_count, string_bytes_transferred
crc, sha1, aes -> hash_count, hash_transferred
xor, pq_xor -> sum_count, sum_transferred
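A minimal sketch of what group-level accounting might look like follows. The type and function names (`dma_op`, `dma_op_group`, `dma_group_stats`, `dma_account`) are assumptions made for illustration; only the three groupings themselves come from the message above.

```c
#include <stddef.h>

/* Hypothetical operation and group identifiers for this sketch. */
enum dma_op { OP_MEMCPY, OP_MEMSET, OP_CRC, OP_SHA1, OP_AES, OP_XOR, OP_PQ_XOR };
enum dma_op_group { GROUP_STRING, GROUP_HASH, GROUP_SUM, GROUP_MAX };

/* One pair of counters per group instead of one pair per operation. */
struct dma_group_stats {
	unsigned long count;
	unsigned long bytes;
};

/* Map an operation onto the group proposed in the message. */
static enum dma_op_group dma_op_to_group(enum dma_op op)
{
	switch (op) {
	case OP_MEMCPY:
	case OP_MEMSET:
		return GROUP_STRING;
	case OP_CRC:
	case OP_SHA1:
	case OP_AES:
		return GROUP_HASH;
	case OP_XOR:
	case OP_PQ_XOR:
		return GROUP_SUM;
	}
	return GROUP_SUM; /* not reached for valid enum values */
}

/* Record one completed operation of 'len' bytes against its group. */
static void dma_account(struct dma_group_stats stats[GROUP_MAX],
			enum dma_op op, size_t len)
{
	enum dma_op_group group = dma_op_to_group(op);

	stats[group].count++;
	stats[group].bytes += len;
}
```

Adding a new transform then only touches the mapping function, not the statistics structure.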
>
> Jeff
Dan
 |
Roland Dreier External

Since: Sep 05, 2006 Posts: 460
|
Posted: Tue Sep 12, 2006 3:00 am Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
|
|
Jeff> Are we really going to add a set of hooks for each DMA
Jeff> engine whizbang feature?
Dan> What's the alternative? But, also see patch 9 "dmaengine:
Dan> reduce backend address permutations" it relieves some of this
Dan> pain.
I guess you can pass an opcode into a common "start operation" function.
With all the memcpy / xor / crypto / etc. hardware out there already,
we definitely have to get this interface right.
- R.
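One way to picture the opcode suggestion is the sketch below: a single entry point validates the opcode against the channel's capability mask and then calls one driver hook. All names here (`dma_opcode`, `dma_request`, `dma_chan_sketch`, `dma_start_operation`, `soft_memcpy_start`) are hypothetical and do not reflect the dmaengine API in the patch set.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes; bit n of a channel's capability mask means opcode n. */
enum dma_opcode { DMA_OP_MEMCPY, DMA_OP_XOR, DMA_OP_MEMSET, DMA_OP_MAX };

struct dma_request {
	enum dma_opcode op;
	void *dest;
	const void *src;
	size_t len;
};

struct dma_chan_sketch {
	unsigned long capabilities;
	/* Single driver hook; the driver switches on req->op internally. */
	int (*start)(struct dma_chan_sketch *chan, const struct dma_request *req);
};

/* Common "start operation" entry point: reject what the channel cannot do. */
static int dma_start_operation(struct dma_chan_sketch *chan,
			       const struct dma_request *req)
{
	if (req->op >= DMA_OP_MAX)
		return -EINVAL;
	if (!(chan->capabilities & (1UL << req->op)))
		return -EOPNOTSUPP;
	return chan->start(chan, req);
}

/* Stand-in for a driver hook: a software memcpy-only "engine". */
static int soft_memcpy_start(struct dma_chan_sketch *chan,
			     const struct dma_request *req)
{
	(void)chan;
	memcpy(req->dest, req->src, req->len);
	return 0;
}
```

The capability check in the common function is what keeps a copy request from ever reaching a channel that only understands xor.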
 |
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
|
Posted: Tue Sep 12, 2006 4:50 am Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
|
|
Dan Williams wrote:
> This is a frequently asked question, Alan Cox had the same one at OLS.
> The answer is "probably." The only complication I currently see is
> where/how the stripe cache is maintained. With the IOPs its easy
> because the DMA engines operate directly on kernel memory. With the
> Promise card I believe they have memory on the card and it's not clear
> to me if the XOR engines on the card can deal with host memory. Also,
> MD would need to be modified to handle a stripe cache located on a
> device, or somehow synchronize its local cache with card in a manner
> that is still able to beat software only MD.
sata_sx4 operates through [standard PC] memory on the card, and you use
a DMA engine to copy memory to/from the card.
[select chipsets supported by] sata_promise operates directly on host
memory.
So, while sata_sx4 is farther away from your direct-host-memory model,
it also has much more potential for RAID acceleration: ideally, RAID1
just copies data to the card once, then copies the data to multiple
drives from there. Similarly with RAID5, you can eliminate copies and
offload XOR, presuming the drives are all connected to the same card.
Jeff
 |
Dan Williams External

Since: Aug 26, 2006 Posts: 11
|
Posted: Tue Sep 12, 2006 7:50 am Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
|
|
On 9/11/06, Jeff Garzik <jeff@garzik.org> wrote:
> Dan Williams wrote:
> > This is a frequently asked question, Alan Cox had the same one at OLS.
> > The answer is "probably." The only complication I currently see is
> > where/how the stripe cache is maintained. With the IOPs its easy
> > because the DMA engines operate directly on kernel memory. With the
> > Promise card I believe they have memory on the card and it's not clear
> > to me if the XOR engines on the card can deal with host memory. Also,
> > MD would need to be modified to handle a stripe cache located on a
> > device, or somehow synchronize its local cache with card in a manner
> > that is still able to beat software only MD.
>
> sata_sx4 operates through [standard PC] memory on the card, and you use
> a DMA engine to copy memory to/from the card.
>
> [select chipsets supported by] sata_promise operates directly on host
> memory.
>
> So, while sata_sx4 is farther away from your direct-host-memory model,
> it also has much more potential for RAID acceleration: ideally, RAID1
> just copies data to the card once, then copies the data to multiple
> drives from there. Similarly with RAID5, you can eliminate copies and
> offload XOR, presuming the drives are all connected to the same card.
In the sata_promise case it's straightforward: all that is needed is
dmaengine drivers for the xor and memcpy engines. This would be
similar to the current I/OAT model where dma resources are provided by
a PCI function. The sata_sx4 case would need a different flavor of
the dma_do_raid5_block_ops routine, one that understands where the
cache is located. MD would also need the capability to bypass the
block layer since the data will have already been transferred to the
card by a stripe cache operation.
The RAID1 case gives me pause because it seems any work along these
lines requires that the implementation work for both MD and DM, which
then eventually leads to being tasked with merging the two.
> Jeff
Dan
 |
Dan Williams External

Since: Aug 26, 2006 Posts: 11
|
Posted: Tue Sep 12, 2006 8:20 am Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
|
|
On 9/11/06, Roland Dreier <rdreier@cisco.com> wrote:
> Jeff> Are we really going to add a set of hooks for each DMA
> Jeff> engine whizbang feature?
...ok, but at some level we are going to need a file that has:
EXPORT_SYMBOL_GPL(dma_whizbang_op1)
. . .
EXPORT_SYMBOL_GPL(dma_whizbang_opX)
correct?
> Dan> What's the alternative? But, also see patch 9 "dmaengine:
> Dan> reduce backend address permutations" it relieves some of this
> Dan> pain.
>
> I guess you can pass an opcode into a common "start operation" function.
But then we still have the problem of being able to request a memory
copy operation of a channel that only understands xor, a la Jeff's
comment to patch 12:
"Further illustration of how this API growth is going wrong. You should
create an API such that it is impossible for an XOR transform to ever
call non-XOR-transform hooks."
> With all the memcpy / xor / crypto / etc. hardware out there already,
> we definitely have to get this interface right.
>
> - R.
I understand what you are saying, Jeff: the implementation can be made
better. But something I think is valuable is the ability to write
clients once, like NET_DMA and RAID5_DMA, and have them run without
modification on any platform that can provide the engine interface,
rather than needing a client per architecture
(IOP_RAID5_DMA...FOO_X_RAID5_DMA).
Or is this an example of where "Do What You Must, And No More"
comes in, i.e. don't worry about making a generic RAID5_DMA while
there is only one implementation in existence?
I also want to pose the question of whether the dmaengine interface
should handle cryptographic transforms. We already have Acrypto:
http://tservice.net.ru/~s0mbre/blog/devel/acrypto/index.html. At the
same time, since IOPs can do Galois Field multiplication and XOR, it
would be nice to take advantage of that for crypto acceleration, but
this does not fit the model of a device that Acrypto supports.
Dan
 |
Evgeniy Polyakov External

Since: Oct 31, 2006 Posts: 326
|
Posted: Tue Sep 12, 2006 11:20 am Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
|
|
On Mon, Sep 11, 2006 at 11:18:59PM -0700, Dan Williams (dan.j.williams@gmail.com) wrote:
> Or is this an example of the where "Do What You Must, And No More"
> comes in, i.e. don't worry about making a generic RAID5_DMA while
> there is only one implementation existence?
>
> I also want to pose the question of whether the dmaengine interface
> should handle cryptographic transforms? We already have Acrypto:
> http://tservice.net.ru/~s0mbre/blog/devel/acrypto/index.html. At the
> same time since IOPs can do Galois Field multiplication and XOR it
> would be nice to take advantage of that for crypto acceleration, but
> this does not fit the model of a device that Acrypto supports.
Each acrypto crypto device provides a set of capabilities it supports, and
when a user requests some operation, the acrypto core selects the device
with the maximum speed for the given capabilities, so one can easily add
GF multiplication devices there. Acrypto supports a "sync" mode too, in
case your hardware is synchronous (i.e. it does not provide an interrupt
or other async event when an operation is completed).
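The selection rule described above (among devices that support the requested capabilities, pick the fastest) can be sketched as follows. This is not acrypto's actual code or API; `accel_device`, the `CAP_*` bits, and `select_device` are hypothetical names used only to illustrate the idea.

```c
#include <stddef.h>

/* Hypothetical capability bits for the sketch. */
#define CAP_AES    (1UL << 0)
#define CAP_XOR    (1UL << 1)
#define CAP_GF_MUL (1UL << 2)

struct accel_device {
	const char *name;
	unsigned long caps;  /* bitmask of supported operations */
	unsigned int speed;  /* relative throughput; higher is faster */
};

/*
 * Return the fastest device whose capability mask covers 'wanted',
 * or NULL when no registered device can handle the request.
 */
static const struct accel_device *
select_device(const struct accel_device *devs, size_t ndevs, unsigned long wanted)
{
	const struct accel_device *best = NULL;

	for (size_t i = 0; i < ndevs; i++) {
		if ((devs[i].caps & wanted) != wanted)
			continue;
		if (!best || devs[i].speed > best->speed)
			best = &devs[i];
	}
	return best;
}
```

Under this model, a GF multiplication engine would simply register with its own capability bit and be chosen whenever a request needs it.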
P.S. acrypto homepage with some design notes and supported features
can be found here:
http://tservice.net.ru/~s0mbre/old/?section=projects&item=acrypto
> Dan
--
Evgeniy Polyakov
 |
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
|
Posted: Wed Sep 13, 2006 6:10 am Post subject: Re: [PATCH 08/19] dmaengine: enable multiple clients and operations
|
|
Dan Williams wrote:
> On 9/11/06, Roland Dreier <rdreier@cisco.com> wrote:
>> Jeff> Are we really going to add a set of hooks for each DMA
>> Jeff> engine whizbang feature?
> ...ok, but at some level we are going to need a file that has:
> EXPORT_SYMBOL_GPL(dma_whizbang_op1)
> . . .
> EXPORT_SYMBOL_GPL(dma_whizbang_opX)
> correct?
If properly modularized, you'll have multiple files with such exports.
Or perhaps you won't have such exports at all, if it is hidden inside a
module-specific struct-of-hooks.
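As a rough illustration of the struct-of-hooks idea, each transform can get its own ops structure, so a client holding an XOR engine has no path to a non-XOR hook. The names below (`xor_ops`, `xor_engine`, `soft_xor_into`, `xor_engine_run`) are hypothetical and are not taken from the patches.

```c
#include <stddef.h>

/* Hooks for XOR-capable engines only; there is deliberately no memcpy hook. */
struct xor_ops {
	void (*xor_into)(unsigned char *dest, const unsigned char *src, size_t len);
};

struct xor_engine {
	const char *name;
	const struct xor_ops *ops;
};

/* Software stand-in for a hardware XOR engine: dest ^= src, byte by byte. */
static void soft_xor_into(unsigned char *dest, const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dest[i] ^= src[i];
}

static const struct xor_ops soft_xor_ops = { .xor_into = soft_xor_into };
static const struct xor_engine soft_xor_engine = {
	.name = "soft-xor",
	.ops = &soft_xor_ops,
};

/* Clients call through the engine's own hook table, nothing is exported per op. */
static void xor_engine_run(const struct xor_engine *eng, unsigned char *dest,
			   const unsigned char *src, size_t len)
{
	eng->ops->xor_into(dest, src, len);
}
```

The type system does the policing here: a memcpy engine would carry a different ops type, so mixing the two fails at compile time rather than at run time.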
> I understand what you are saying Jeff, the implementation can be made
> better, but something I think is valuable is the ability to write
> clients once like NET_DMA and RAID5_DMA and have them run without
> modification on any platform that can provide the engine interface
> rather than needing a client per architecture
> IOP_RAID5_DMA...FOO_X_RAID5_DMA.
It depends on the situation.
The hardware capabilities exported by each platform [or device] vary
greatly, not only in the raw capabilities provided, but also in the
level of offload.
In general, we don't want to see hardware-specific stuff in generic
code, though...
> Or is this an example of the where "Do What You Must, And No More"
> comes in, i.e. don't worry about making a generic RAID5_DMA while
> there is only one implementation existence?
> I also want to pose the question of whether the dmaengine interface
> should handle cryptographic transforms? We already have Acrypto:
> http://tservice.net.ru/~s0mbre/blog/devel/acrypto/index.html. At the
> same time since IOPs can do Galois Field multiplication and XOR it
> would be nice to take advantage of that for crypto acceleration, but
> this does not fit the model of a device that Acrypto supports.
It would be quite interesting to see where the synergies are between the
two, at the very least. "async [transform|sum]" is a superset of "async
crypto" after all.
Jeff
 |
Jeff Garzik External

Since: Mar 05, 2006 Posts: 1465
|
Posted: Wed Sep 13, 2006 6:10 am Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
|
|
Dan Williams wrote:
> On 9/11/06, Jeff Garzik <jeff@garzik.org> wrote:
>> Dan Williams wrote:
>> > This is a frequently asked question, Alan Cox had the same one at OLS.
>> > The answer is "probably." The only complication I currently see is
>> > where/how the stripe cache is maintained. With the IOPs its easy
>> > because the DMA engines operate directly on kernel memory. With the
>> > Promise card I believe they have memory on the card and it's not clear
>> > to me if the XOR engines on the card can deal with host memory. Also,
>> > MD would need to be modified to handle a stripe cache located on a
>> > device, or somehow synchronize its local cache with card in a manner
>> > that is still able to beat software only MD.
>>
>> sata_sx4 operates through [standard PC] memory on the card, and you use
>> a DMA engine to copy memory to/from the card.
>>
>> [select chipsets supported by] sata_promise operates directly on host
>> memory.
>>
>> So, while sata_sx4 is farther away from your direct-host-memory model,
>> it also has much more potential for RAID acceleration: ideally, RAID1
>> just copies data to the card once, then copies the data to multiple
>> drives from there. Similarly with RAID5, you can eliminate copies and
>> offload XOR, presuming the drives are all connected to the same card.
> In the sata_promise case its straight forward, all that is needed is
> dmaengine drivers for the xor and memcpy engines. This would be
> similar to the current I/OAT model where dma resources are provided by
> a PCI function. The sata_sx4 case would need a different flavor of
> the dma_do_raid5_block_ops routine, one that understands where the
> cache is located. MD would also need the capability to bypass the
> block layer since the data will have already been transferred to the
> card by a stripe cache operation
>
> The RAID1 case give me pause because it seems any work along these
> lines requires that the implementation work for both MD and DM, which
> then eventually leads to being tasked with merging the two.
RAID5 has similar properties. If all devices in a RAID5 array are
attached to a single SX4 card, then a high level write to the RAID5
array is passed directly to the card, which then performs XOR, striping,
etc.
Jeff
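For readers less familiar with the XOR step being offloaded in this thread, here is a plain software sketch of RAID5 parity. It is a generic illustration, not the SX4 firmware or the MD implementation, and the function name `raid5_xor_blocks` is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * XOR 'nblocks' equally sized blocks into 'out'.
 * Fed the data blocks of a stripe, 'out' is the parity block.
 * Fed the surviving data blocks plus the parity block, 'out' is the
 * content of the one missing data block.
 */
static void raid5_xor_blocks(unsigned char *out, unsigned char *const blocks[],
			     size_t nblocks, size_t len)
{
	memset(out, 0, len);
	for (size_t b = 0; b < nblocks; b++)
		for (size_t i = 0; i < len; i++)
			out[i] ^= blocks[b][i];
}
```

The same routine serves both the write path (compute parity) and the degraded read path (rebuild a lost block), which is why a single XOR engine covers most of the RAID5 arithmetic.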
 |
Jakob Oestergaard External

Since: May 16, 2006 Posts: 16
|
Posted: Wed Sep 13, 2006 9:20 am Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
|
|
On Mon, Sep 11, 2006 at 04:00:32PM -0700, Dan Williams wrote:
> Neil,
>
....
>
> Concerning the context switching performance concerns raised at the
> previous release, I have observed the following. For the hardware
> accelerated case it appears that performance is always better with the
> work queue than without since it allows multiple stripes to be operated
> on simultaneously. I expect the same for an SMP platform, but so far my
> testing has been limited to IOPs. For a single-processor
> non-accelerated configuration I have not observed performance
> degradation with work queue support enabled, but in the Kconfig option
> help text I recommend disabling it (CONFIG_MD_RAID456_WORKQUEUE).
Out of curiosity, how does accelerated compare to non-accelerated?
--
/ jakob
 |
Dan Williams External

Since: Aug 26, 2006 Posts: 11
Posted: Wed Sep 13, 2006 9:20 pm Post subject: Re: [PATCH 00/19] Hardware Accelerated MD RAID5: Introduction
On 9/13/06, Jakob Oestergaard <jakob@unthought.net> wrote:
> On Mon, Sep 11, 2006 at 04:00:32PM -0700, Dan Williams wrote:
> > Neil,
> >
> ...
> >
> > Concerning the context switching performance concerns raised at the
> > previous release, I have observed the following. For the hardware
> > accelerated case it appears that performance is always better with the
> > work queue than without since it allows multiple stripes to be operated
> > on simultaneously. I expect the same for an SMP platform, but so far my
> > testing has been limited to IOPs. For a single-processor
> > non-accelerated configuration I have not observed performance
> > degradation with work queue support enabled, but in the Kconfig option
> > help text I recommend disabling it (CONFIG_MD_RAID456_WORKQUEUE).
>
> Out of curiosity; how does accelerated compare to non-accelerated?
One quick example:

4-disk SATA array rebuild on iop321 without acceleration: 'top'
reports md0_resync and md0_raid5 dueling for the CPU, each at ~50%
utilization.

With acceleration: 'top' reports md0_resync CPU utilization at ~90%,
with the rest split between md0_raid5 and md0_raid5_ops.

The sync speed reported by /proc/mdstat is ~40% higher in the
accelerated case.

That being said, array resync is a special case, so your mileage may
vary with other applications.

I will put together some data from bonnie++, iozone, maybe contest,
and post it on SourceForge.
> / jakob
Dan
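
To compare resync rates the way described above, one could pull the speed= field out of /proc/mdstat. The following is a rough sketch in Python, assuming the usual "speed=<n>K/sec" field on the resync progress line. The helper names and the sample text, including its numbers, are hypothetical placeholders and not measurements from this thread.

```python
import re

# Matches the rate field of a resync/recovery progress line, e.g. "speed=10000K/sec".
SPEED_RE = re.compile(r"speed=(\d+)K/sec")


def sync_speed_k(mdstat_text):
    """Return the first reported sync speed in K/sec, or None if no sync is running."""
    match = SPEED_RE.search(mdstat_text)
    return int(match.group(1)) if match else None


def speedup_percent(baseline, accelerated):
    """Relative change of the accelerated rate over the baseline, in percent."""
    return (accelerated - baseline) * 100.0 / baseline


# Hypothetical /proc/mdstat excerpts; the numbers are invented for illustration.
SAMPLE_SOFTWARE = """\
md0 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
      [==>..................]  resync = 12.5% (1000/8000) finish=10.0min speed=10000K/sec
"""
SAMPLE_ACCELERATED = """\
md0 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
      [==>..................]  resync = 12.5% (1000/8000) finish=7.1min speed=14000K/sec
"""

base = sync_speed_k(SAMPLE_SOFTWARE)
accel = sync_speed_k(SAMPLE_ACCELERATED)
gain = speedup_percent(base, accel)
```

On a live system the text would come from reading /proc/mdstat once per configuration while a resync is in progress; the reported speed fluctuates, so averaging several samples is likely more meaningful than a single reading.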