[dpdk-dev] [PATCH 00/28] introduce I/O device memory read/write operations
Jerin Jacob
2016-12-14 01:55:30 UTC
Based on the discussion in the thread below,
http://dev.dpdk.narkive.com/DpIRqDuy/dpdk-dev-patch-v2-i40e-fix-eth-i40e-dev-init-sequence-on-thunderx

This patchset introduces 8-bit, 16-bit, 32-bit, and 64-bit I/O device
memory read/write operations, along with relaxed versions.

Weakly ordered machines such as ARM need an additional I/O barrier for
device memory read/write access over the PCI bus.
By introducing an eal abstraction for I/O device memory read/write access,
drivers can access I/O device memory in an architecture-agnostic manner.

The relaxed versions do not include the additional I/O memory barrier, which is
useful when accessing the device registers of integrated controllers that are
implicitly strongly ordered with respect to memory access.
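
As a rough illustration of the difference between the two flavours, the sketch below models an ordered write as "barrier, then relaxed write" and counts barriers instead of issuing real ones. Every `sketch_` name is hypothetical and not part of this patchset, and a plain array stands in for device registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_io_wmb(): counts calls instead of
 * emitting a real barrier, so the cost of each pattern is visible. */
static unsigned int sketch_barrier_count;

static inline void sketch_io_wmb(void)
{
	sketch_barrier_count++;
}

/* Relaxed write: plain volatile store, no barrier. */
static inline void sketch_writel_relaxed(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Ordered write: barrier first, then the store. */
static inline void sketch_writel(uint32_t value, volatile void *addr)
{
	sketch_io_wmb();
	sketch_writel_relaxed(value, addr);
}

/* Program n registers with one barrier per write; returns barriers used. */
static unsigned int
sketch_program_ordered(volatile uint32_t *regs, unsigned int n)
{
	unsigned int i;

	sketch_barrier_count = 0;
	for (i = 0; i < n; i++)
		sketch_writel(i, &regs[i]);
	return sketch_barrier_count;
}

/* Program n registers with one barrier up front, then relaxed writes. */
static unsigned int
sketch_program_relaxed(volatile uint32_t *regs, unsigned int n)
{
	unsigned int i;

	sketch_barrier_count = 0;
	sketch_io_wmb();
	for (i = 0; i < n; i++)
		sketch_writel_relaxed(i, &regs[i]);
	return sketch_barrier_count;
}
```

Whether the second pattern is safe depends on the device: it assumes the register writes themselves need no ordering against each other beyond what the hardware already provides, which is the integrated-controller case described above.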

This patchset is split into three functional sets:

patches 1-9: Introduce the I/O device memory barrier eal abstraction and
implement it for all the architectures.

patches 10-13: Introduce the I/O device memory read/write operations eal abstraction
and implement it for all the architectures using the previous I/O device memory
barriers.

patches 14-28: Replace the raw readl/writel in the drivers with the
new rte_read[b/w/l/q], rte_write[b/w/l/q] eal abstraction.

Note:

1) We couldn't test the patches on all hardware because some of it was unavailable to us.
Feedback from ARCH and PMD maintainers is appreciated.

2) patch 13/28 has a false-positive checkpatch error on asm syntax:

ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#92: FILE: lib/librte_eal/common/include/arch/arm/rte_io_64.h:54:
+ : [val] "=r" (val)

Jerin Jacob (14):
eal: introduce I/O device memory barriers
eal/x86: define I/O device memory barriers for IA
eal/tile: define I/O device memory barriers for tile
eal/ppc64: define I/O device memory barriers for ppc64
eal/arm: separate smp barrier definition for ARMv7 and ARMv8
eal/armv7: define I/O device memory barriers for ARMv7
eal/arm64: fix memory barrier definition for arm64
eal/arm64: define smp barrier definition for arm64
eal/arm64: define I/O device memory barriers for arm64
eal: introduce I/O device memory read/write operations
eal: generic implementation for I/O device read/write access
eal: let all architectures use generic I/O implementation
eal/arm64: override I/O device read/write access for arm64
net/thunderx: use eal I/O device memory read/write API

Santosh Shukla (14):
crypto/qat: use eal I/O device memory read/write API
net/bnx2x: use eal I/O device memory read/write API
net/bnxt: use eal I/O device memory read/write API
net/cxgbe: use eal I/O device memory read/write API
net/e1000: use eal I/O device memory read/write API
net/ena: use eal I/O device memory read/write API
net/enic: use eal I/O device memory read/write API
net/fm10k: use eal I/O device memory read/write API
net/i40e: use eal I/O device memory read/write API
net/ixgbe: use eal I/O device memory read/write API
net/nfp: use eal I/O device memory read/write API
net/qede: use eal I/O device memory read/write API
net/virtio: use eal I/O device memory read/write API
net/vmxnet3: use eal I/O device memory read/write API

doc/api/doxy-api-index.md | 3 +-
.../qat/qat_adf/adf_transport_access_macros.h | 15 +-
drivers/net/bnx2x/bnx2x.h | 32 +--
drivers/net/bnxt/bnxt_hwrm.c | 8 +-
drivers/net/cxgbe/base/adapter.h | 13 +-
drivers/net/cxgbe/cxgbe_compat.h | 3 +-
drivers/net/e1000/base/e1000_osdep.h | 25 +-
drivers/net/ena/base/ena_plat_dpdk.h | 5 +-
drivers/net/enic/enic_compat.h | 17 +-
drivers/net/fm10k/base/fm10k_osdep.h | 27 +-
drivers/net/i40e/base/i40e_osdep.h | 14 +-
drivers/net/ixgbe/base/ixgbe_osdep.h | 13 +-
drivers/net/nfp/nfp_net_pmd.h | 9 +-
drivers/net/qede/base/bcm_osal.h | 18 +-
drivers/net/thunderx/base/nicvf_plat.h | 45 +--
drivers/net/virtio/virtio_pci.c | 14 +-
drivers/net/vmxnet3/vmxnet3_ethdev.h | 14 +-
lib/librte_eal/common/Makefile | 3 +-
.../common/include/arch/arm/rte_atomic.h | 6 -
.../common/include/arch/arm/rte_atomic_32.h | 12 +
.../common/include/arch/arm/rte_atomic_64.h | 21 +-
lib/librte_eal/common/include/arch/arm/rte_io.h | 51 ++++
lib/librte_eal/common/include/arch/arm/rte_io_64.h | 183 ++++++++++++
.../common/include/arch/ppc_64/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/ppc_64/rte_io.h | 47 +++
.../common/include/arch/tile/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/tile/rte_io.h | 47 +++
.../common/include/arch/x86/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/x86/rte_io.h | 47 +++
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++
lib/librte_eal/common/include/generic/rte_io.h | 317 +++++++++++++++++++++
31 files changed, 928 insertions(+), 126 deletions(-)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io_64.h
create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/tile/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_io.h
create mode 100644 lib/librte_eal/common/include/generic/rte_io.h
--
2.5.5
Jerin Jacob
2016-12-14 01:55:31 UTC
This commit introduces rte_io_mb(), rte_io_wmb() and rte_io_rmb() in
order to enable memory barriers between I/O devices and the CPU.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
index 43a704e..7b81705 100644
--- a/lib/librte_eal/common/include/generic/rte_atomic.h
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -100,6 +100,33 @@ static inline void rte_smp_wmb(void);
*/
static inline void rte_smp_rmb(void);

+/**
+ * General memory barrier for I/O device
+ *
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_io_mb() call are visible to I/O device or CPU before the
+ * LOAD and STORE operations that follow it.
+ */
+static inline void rte_io_mb(void);
+
+/**
+ * Write memory barrier for I/O device
+ *
+ * Guarantees that the STORE operations that precede the
+ * rte_io_wmb() call are visible to I/O device before the STORE
+ * operations that follow it.
+ */
+static inline void rte_io_wmb(void);
+
+/**
+ * Read memory barrier for IO device
+ *
+ * Guarantees that the LOAD operations on I/O device that precede the
+ * rte_io_rmb() call are visible to CPU before the LOAD
+ * operations that follow it.
+ */
+static inline void rte_io_rmb(void);
+
#endif /* __DOXYGEN__ */

/**
--
2.5.5
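
A minimal sketch of how a driver might use the write barrier before ringing a doorbell. The `sketch_` names, the descriptor layout, and the use of `__sync_synchronize()` as the barrier are assumptions made for illustration; the real per-architecture mappings are the ones defined in the architecture-specific patches of this series.

```c
#include <stdint.h>

/* Stand-in for rte_io_wmb(). __sync_synchronize() is a GCC/Clang builtin
 * that issues a full barrier; a real port would pick an instruction
 * suited to device memory on the target. */
#define sketch_io_wmb() __sync_synchronize()

/* Hypothetical descriptor that a device would fetch by DMA. */
struct sketch_desc {
	uint64_t buf_addr;
	uint32_t len;
	uint32_t flags;
};

/*
 * Fill a descriptor in host memory, then publish it by writing the new
 * tail index to a doorbell register. The barrier keeps the descriptor
 * stores ahead of the doorbell store.
 */
static inline void
sketch_post(struct sketch_desc *ring, uint32_t idx, uint64_t buf,
	    uint32_t len, volatile uint32_t *doorbell)
{
	ring[idx].buf_addr = buf;
	ring[idx].len = len;
	ring[idx].flags = 1;

	sketch_io_wmb();
	*doorbell = idx + 1;
}
```

Without the barrier, a weakly ordered CPU could let the doorbell store become visible to the device before the descriptor contents, so the device might fetch a stale descriptor.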
Jerin Jacob
2016-12-14 01:55:33 UTC
The patch does not provide any functional change for tile.
I/O barriers are mapped to existing smp barriers.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Zhigang Lu <***@ezchip.com>
---
lib/librte_eal/common/include/arch/tile/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/tile/rte_atomic.h b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
index 28825ff..1f332ee 100644
--- a/lib/librte_eal/common/include/arch/tile/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
@@ -85,6 +85,12 @@ static inline void rte_rmb(void)

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-14 01:55:34 UTC
The patch does not provide any functional change for ppc_64.
I/O barriers are mapped to existing smp barriers.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Chao Zhu <***@linux.vnet.ibm.com>
---
lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
index fb4fccb..150810c 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -87,6 +87,12 @@ extern "C" {

#define rte_smp_rmb() rte_rmb()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
/*------------------------- 16 bit atomic operations -------------------------*/
/* To be compatible with Power7, use GCC built-in functions for 16 bit
* operations */
--
2.5.5
Jerin Jacob
2016-12-14 01:55:35 UTC
Separate the smp barrier definitions for arm and arm64 to allow
fine-grained control over the smp barrier definition on each architecture.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic.h | 6 ------
lib/librte_eal/common/include/arch/arm/rte_atomic_32.h | 6 ++++++
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
index 454a12b..f3f3b6e 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -39,10 +39,4 @@
#include <rte_atomic_32.h>
#endif

-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#endif /* _RTE_ATOMIC_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
index 9ae1e78..dd627a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
@@ -67,6 +67,12 @@ extern "C" {
*/
#define rte_rmb() __sync_synchronize()

+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_wmb()
+
+#define rte_smp_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 671caa7..d854aac 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -81,6 +81,12 @@ static inline void rte_rmb(void)
dmb(ishld);
}

+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_wmb()
+
+#define rte_smp_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-14 01:55:32 UTC
The patch does not provide any functional change for IA.
I/O barriers are mapped to existing smp barriers.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Bruce Richardson <***@intel.com>
CC: Konstantin Ananyev <***@intel.com>
---
lib/librte_eal/common/include/arch/x86/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index 00b1cdf..4eac666 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -61,6 +61,12 @@ extern "C" {

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
/*------------------------- 16 bit atomic operations -------------------------*/

#ifndef RTE_FORCE_INTRINSICS
--
2.5.5
Jerin Jacob
2016-12-14 01:55:36 UTC
The patch does not provide any functional change for ARMv7.
I/O barriers are mapped to existing smp barriers.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Jan Viktorin <***@rehivetech.com>
CC: Jianbo Liu <***@linaro.org>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_32.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
index dd627a0..14c0486 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
@@ -73,6 +73,12 @@ extern "C" {

#define rte_smp_rmb() rte_rmb()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-14 01:55:37 UTC
dsb instruction based barrier is used for non smp
version of memory barrier.

Fixes: d708f01b7102 ("eal/arm: add atomic operations for ARMv8")

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Jianbo Liu <***@linaro.org>
CC: ***@dpdk.org
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index d854aac..bc7de64 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -43,7 +43,8 @@ extern "C" {

#include "generic/rte_atomic.h"

-#define dmb(opt) do { asm volatile("dmb " #opt : : : "memory"); } while (0)
+#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
+#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }

/**
* General memory barrier.
@@ -54,7 +55,7 @@ extern "C" {
*/
static inline void rte_mb(void)
{
- dmb(ish);
+ dsb(sy);
}

/**
@@ -66,7 +67,7 @@ static inline void rte_mb(void)
*/
static inline void rte_wmb(void)
{
- dmb(ishst);
+ dsb(st);
}

/**
@@ -78,7 +79,7 @@ static inline void rte_wmb(void)
*/
static inline void rte_rmb(void)
{
- dmb(ishld);
+ dsb(ld);
}

#define rte_smp_mb() rte_mb()
--
2.5.5
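
One general C detail about function-like macros that may be worth keeping in mind when reading the `dsb()`/`dmb()` definitions above: a macro body wrapped only in braces behaves differently from one wrapped in `do { } while (0)` once it is used in an unbraced if/else. The sketch below uses hypothetical `sketch_` names and a compiler-only barrier in place of the ARM instructions, so it builds on any GCC or Clang target.

```c
/* Compiler-only barrier; stands in for the dsb/dmb instructions. */
#define sketch_barrier_braces() \
	{ __asm__ volatile("" : : : "memory"); }
#define sketch_barrier_dowhile() \
	do { __asm__ volatile("" : : : "memory"); } while (0)

/* Returns 1 when the barrier was skipped, 0 when it was issued. */
static inline int sketch_flush(int need_barrier)
{
	int skipped = 0;

	/*
	 * With the do/while form, the trailing semicolon completes a single
	 * statement, so an unbraced if/else works. Using
	 * sketch_barrier_braces(); here instead would expand to "{ ... };"
	 * and the extra ';' would detach the else, which fails to compile.
	 */
	if (need_barrier)
		sketch_barrier_dowhile();
	else
		skipped = 1;

	/* The brace form is fine when used as a standalone statement. */
	sketch_barrier_braces();

	return skipped;
}
```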
Jerin Jacob
2016-12-14 01:55:38 UTC
dmb instruction based barrier is used for smp version of memory barrier.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index bc7de64..78ebea2 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -82,11 +82,11 @@ static inline void rte_rmb(void)
dsb(ld);
}

-#define rte_smp_mb() rte_mb()
+#define rte_smp_mb() dmb(ish)

-#define rte_smp_wmb() rte_wmb()
+#define rte_smp_wmb() dmb(ishst)

-#define rte_smp_rmb() rte_rmb()
+#define rte_smp_rmb() dmb(ishld)

#ifdef __cplusplus
}
--
2.5.5
Jianbo Liu
2016-12-15 08:13:33 UTC
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
dmb instruction based barrier is used for smp version of memory barrier.
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index bc7de64..78ebea2 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -82,11 +82,11 @@ static inline void rte_rmb(void)
dsb(ld);
}
-#define rte_smp_mb() rte_mb()
+#define rte_smp_mb() dmb(ish)
-#define rte_smp_wmb() rte_wmb()
+#define rte_smp_wmb() dmb(ishst)
-#define rte_smp_rmb() rte_rmb()
+#define rte_smp_rmb() dmb(ishld)
rte_*mb are inline functions, while rte_smp_*mb are macros. As they are
all derived from dsb/dmb, can you keep them consistent?
Post by Jerin Jacob
#ifdef __cplusplus
}
--
2.5.5
Jerin Jacob
2016-12-15 08:20:43 UTC
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
dmb instruction based barrier is used for smp version of memory barrier.
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index bc7de64..78ebea2 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -82,11 +82,11 @@ static inline void rte_rmb(void)
dsb(ld);
}
-#define rte_smp_mb() rte_mb()
+#define rte_smp_mb() dmb(ish)
-#define rte_smp_wmb() rte_wmb()
+#define rte_smp_wmb() dmb(ishst)
-#define rte_smp_rmb() rte_rmb()
+#define rte_smp_rmb() dmb(ishld)
rte_*mb are inline functions, while rte_smp_*mb are macros. As they are
all derived from dsb/dmb, can you keep them consistent?
OK. I will add a separate patch in the v2 series to change the existing inline
functions to macros, to keep them consistent.
Post by Jianbo Liu
Post by Jerin Jacob
#ifdef __cplusplus
}
--
2.5.5
Jerin Jacob
2016-12-14 01:55:39 UTC
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Jianbo Liu <***@linaro.org>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)

#define rte_smp_rmb() dmb(ishld)

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-14 01:55:40 UTC
This commit introduces 8-bit, 16-bit, 32-bit, and 64-bit I/O device
memory read/write operations, along with relaxed versions.

Weakly ordered machines such as ARM need an additional I/O barrier for
device memory read/write access over the PCI bus.
By introducing an eal abstraction for I/O device memory read/write access,
drivers can access I/O device memory in an architecture-agnostic manner.

The relaxed versions do not include the additional I/O memory barrier, which is
useful when accessing the device registers of integrated controllers that are
implicitly strongly ordered with respect to memory access.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
doc/api/doxy-api-index.md | 3 +-
lib/librte_eal/common/Makefile | 3 +-
lib/librte_eal/common/include/generic/rte_io.h | 263 +++++++++++++++++++++++++
3 files changed, 267 insertions(+), 2 deletions(-)
create mode 100644 lib/librte_eal/common/include/generic/rte_io.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 02d3a46..0ad3367 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -68,7 +68,8 @@ There are many libraries, so their headers may be grouped by topics:
[branch prediction] (@ref rte_branch_prediction.h),
[cache prefetch] (@ref rte_prefetch.h),
[byte order] (@ref rte_byteorder.h),
- [CPU flags] (@ref rte_cpuflags.h)
+ [CPU flags] (@ref rte_cpuflags.h),
+ [I/O access] (@ref rte_io.h)

- **CPU multicore**:
[interrupts] (@ref rte_interrupts.h),
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index a92c984..6498c15 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -43,7 +43,8 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
INC += rte_malloc.h rte_keepalive.h rte_time.h

GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
-GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
+GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h rte_io.h
+
# defined in mk/arch/$(RTE_ARCH)/rte.vars.mk
ARCH_DIR ?= $(RTE_ARCH)
ARCH_INC := $(notdir $(wildcard $(RTE_SDK)/lib/librte_eal/common/include/arch/$(ARCH_DIR)/*.h))
diff --git a/lib/librte_eal/common/include/generic/rte_io.h b/lib/librte_eal/common/include/generic/rte_io.h
new file mode 100644
index 0000000..d7ffbcd
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_io.h
@@ -0,0 +1,263 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_H_
+#define _RTE_IO_H_
+
+/**
+ * @file
+ * I/O device memory operations
+ *
+ * This file defines the generic API for I/O device memory read/write operations
+ */
+
+#include <stdint.h>
+#include <rte_common.h>
+#include <rte_atomic.h>
+
+#ifdef __DOXYGEN__
+
+/**
+ * Read a 8-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint8_t
+rte_readb_relaxed(const volatile void *addr);
+
+/**
+ * Read a 16-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint16_t
+rte_readw_relaxed(const volatile void *addr);
+
+/**
+ * Read a 32-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint32_t
+rte_readl_relaxed(const volatile void *addr);
+
+/**
+ * Read a 64-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint64_t
+rte_readq_relaxed(const volatile void *addr);
+
+/**
+ * Write a 8-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+
+static inline void
+rte_writeb_relaxed(uint8_t value, volatile void *addr);
+
+/**
+ * Write a 16-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writew_relaxed(uint16_t value, volatile void *addr);
+
+/**
+ * Write a 32-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writel_relaxed(uint32_t value, volatile void *addr);
+
+/**
+ * Write a 64-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writeq_relaxed(uint64_t value, volatile void *addr);
+
+/**
+ * Read a 8-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint8_t
+rte_readb(const volatile void *addr);
+
+/**
+ * Read a 16-bit value from I/O device memory address *addr*.
+ *
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint16_t
+rte_readw(const volatile void *addr);
+
+/**
+ * Read a 32-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint32_t
+rte_readl(const volatile void *addr);
+
+/**
+ * Read a 64-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint64_t
+rte_readq(const volatile void *addr);
+
+/**
+ * Write a 8-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+
+static inline void
+rte_writeb(uint8_t value, volatile void *addr);
+
+/**
+ * Write a 16-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writew(uint16_t value, volatile void *addr);
+
+/**
+ * Write a 32-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writel(uint32_t value, volatile void *addr);
+
+/**
+ * Write a 64-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_writeq(uint64_t value, volatile void *addr);
+
+#endif /* __DOXYGEN__ */
+
+#endif /* _RTE_IO_H_ */
--
2.5.5
Jerin Jacob
2016-12-14 01:55:41 UTC
This patch implements the generic version of rte_read[b/w/l/q]_[relaxed]
and rte_write[b/w/l/q]_[relaxed] using rte_io_wmb() and rte_io_rmb().

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/generic/rte_io.h | 54 ++++++++++++++++++++++++++
1 file changed, 54 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_io.h b/lib/librte_eal/common/include/generic/rte_io.h
index d7ffbcd..f34c131 100644
--- a/lib/librte_eal/common/include/generic/rte_io.h
+++ b/lib/librte_eal/common/include/generic/rte_io.h
@@ -34,6 +34,8 @@
#ifndef _RTE_IO_H_
#define _RTE_IO_H_

+#include <rte_atomic.h>
+
/**
* @file
* I/O device memory operations
@@ -260,4 +262,56 @@ rte_writeq(uint64_t value, volatile void *addr);

#endif /* __DOXYGEN__ */

+#ifndef RTE_OVERRIDE_IO_H
+
+#define rte_readb_relaxed(addr) \
+ ({ uint8_t __v = *(const volatile uint8_t *)addr; __v; })
+
+#define rte_readw_relaxed(addr) \
+ ({ uint16_t __v = *(const volatile uint16_t *)addr; __v; })
+
+#define rte_readl_relaxed(addr) \
+ ({ uint32_t __v = *(const volatile uint32_t *)addr; __v; })
+
+#define rte_readq_relaxed(addr) \
+ ({ uint64_t __v = *(const volatile uint64_t *)addr; __v; })
+
+#define rte_writeb_relaxed(value, addr) \
+ ({ *(volatile uint8_t *)addr = value; })
+
+#define rte_writew_relaxed(value, addr) \
+ ({ *(volatile uint16_t *)addr = value; })
+
+#define rte_writel_relaxed(value, addr) \
+ ({ *(volatile uint32_t *)addr = value; })
+
+#define rte_writeq_relaxed(value, addr) \
+ ({ *(volatile uint64_t *)addr = value; })
+
+#define rte_readb(addr) \
+ ({ uint8_t __v = *(const volatile uint8_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_readw(addr) \
+ ({uint16_t __v = *(const volatile uint16_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_readl(addr) \
+ ({uint32_t __v = *(const volatile uint32_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_readq(addr) \
+ ({uint64_t __v = *(const volatile uint64_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_writeb(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint8_t *)addr = value; })
+
+#define rte_writew(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint16_t *)addr = value; })
+
+#define rte_writel(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint32_t *)addr = value; })
+
+#define rte_writeq(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint64_t *)addr = value; })
+
+#endif /* RTE_OVERRIDE_IO_H */
+
#endif /* _RTE_IO_H_ */
--
2.5.5
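
A sketch of the same accessors written as inline functions, next to a macro in the shape used above. It is meant to show one property of the macro form: `addr` is not parenthesized, so an argument such as `base + 4` is cast and dereferenced before the addition is applied. All `sketch_` names are hypothetical, and `__sync_synchronize()` is only a stand-in for rte_io_rmb()/rte_io_wmb().

```c
#include <stdint.h>

/* Stand-ins for rte_io_rmb()/rte_io_wmb(), using a GCC/Clang builtin. */
#define sketch_io_rmb() __sync_synchronize()
#define sketch_io_wmb() __sync_synchronize()

/* Same shape as the generic macro (GNU statement expression);
 * addr is deliberately left unparenthesized. */
#define sketch_readl_macro(addr) \
	({ uint32_t v_ = *(const volatile uint32_t *)addr; sketch_io_rmb(); v_; })

/* Inline equivalent: the argument is fully evaluated before the cast. */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t v = *(const volatile uint32_t *)addr;

	sketch_io_rmb();
	return v;
}

static inline void sketch_writel(uint32_t value, volatile void *addr)
{
	sketch_io_wmb();
	*(volatile uint32_t *)addr = value;
}

/* Read the 32-bit register at byte offset 4 both ways. */
static inline void
sketch_compare(uint32_t *regs, uint32_t *via_macro, uint32_t *via_inline)
{
	uint8_t *base = (uint8_t *)regs;

	/* Expands to *(const volatile uint32_t *)base + 4: regs[0] plus 4. */
	*via_macro = sketch_readl_macro(base + 4);
	/* Reads regs[1], as intended. */
	*via_inline = sketch_readl(base + 4);
}
```

With the macro form, callers have to wrap such arguments in parentheses themselves. The equivalent write macro would not compile at all with `base + 4`, because the expansion is no longer an lvalue.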
Jerin Jacob
2016-12-14 01:55:43 UTC
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_io.h | 4 +
lib/librte_eal/common/include/arch/arm/rte_io_64.h | 183 +++++++++++++++++++++
2 files changed, 187 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_io.h b/lib/librte_eal/common/include/arch/arm/rte_io.h
index 74c1f2c..9593b42 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_io.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_io.h
@@ -38,7 +38,11 @@
extern "C" {
#endif

+#ifdef RTE_ARCH_64
+#include "rte_io_64.h"
+#else
#include "generic/rte_io.h"
+#endif

#ifdef __cplusplus
}
diff --git a/lib/librte_eal/common/include/arch/arm/rte_io_64.h b/lib/librte_eal/common/include/arch/arm/rte_io_64.h
new file mode 100644
index 0000000..09e7a89
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_io_64.h
@@ -0,0 +1,183 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2016.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_ARM64_H_
+#define _RTE_IO_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#define RTE_OVERRIDE_IO_H
+
+#include "generic/rte_io.h"
+#include "rte_atomic_64.h"
+
+static inline __attribute__((always_inline)) uint8_t
+__rte_arm64_readb(const volatile void *addr)
+{
+ uint8_t val;
+
+ asm volatile(
+ "ldrb %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint16_t
+__rte_arm64_readw(const volatile void *addr)
+{
+ uint16_t val;
+
+ asm volatile(
+ "ldrh %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint32_t
+__rte_arm64_readl(const volatile void *addr)
+{
+ uint32_t val;
+
+ asm volatile(
+ "ldr %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint64_t
+__rte_arm64_readq(const volatile void *addr)
+{
+ uint64_t val;
+
+ asm volatile(
+ "ldr %x[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writeb(uint8_t val, volatile void *addr)
+{
+ asm volatile(
+ "strb %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writew(uint16_t val, volatile void *addr)
+{
+ asm volatile(
+ "strh %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writel(uint32_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writeq(uint64_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %x[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+#define rte_readb_relaxed(addr) \
+ ({ uint8_t __v = __rte_arm64_readb(addr); __v; })
+
+#define rte_readw_relaxed(addr) \
+ ({ uint16_t __v = __rte_arm64_readw(addr); __v; })
+
+#define rte_readl_relaxed(addr) \
+ ({ uint32_t __v = __rte_arm64_readl(addr); __v; })
+
+#define rte_readq_relaxed(addr) \
+ ({ uint64_t __v = __rte_arm64_readq(addr); __v; })
+
+#define rte_writeb_relaxed(value, addr) \
+ ({ __rte_arm64_writeb(value, addr); })
+
+#define rte_writew_relaxed(value, addr) \
+ ({ __rte_arm64_writew(value, addr); })
+
+#define rte_writel_relaxed(value, addr) \
+ ({ __rte_arm64_writel(value, addr); })
+
+#define rte_writeq_relaxed(value, addr) \
+ ({ __rte_arm64_writeq(value, addr); })
+
+#define rte_readb(addr) \
+ ({ uint8_t __v = __rte_arm64_readb(addr); rte_io_rmb(); __v; })
+
+#define rte_readw(addr) \
+ ({ uint16_t __v = __rte_arm64_readw(addr); rte_io_rmb(); __v; })
+
+#define rte_readl(addr) \
+ ({ uint32_t __v = __rte_arm64_readl(addr); rte_io_rmb(); __v; })
+
+#define rte_readq(addr) \
+ ({ uint64_t __v = __rte_arm64_readq(addr); rte_io_rmb(); __v; })
+
+#define rte_writeb(value, addr) \
+ ({ rte_io_wmb(); rte_writeb_relaxed(value, addr); })
+
+#define rte_writew(value, addr) \
+ ({ rte_io_wmb(); rte_writew_relaxed(value, addr); })
+
+#define rte_writel(value, addr) \
+ ({ rte_io_wmb(); rte_writel_relaxed(value, addr); })
+
+#define rte_writeq(value, addr) \
+ ({ rte_io_wmb(); rte_writeq_relaxed(value, addr); })
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_ARM64_H_ */
--
2.5.5
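For readers following the macro layering at the end of the patch, here is a minimal portable sketch of how the ordered accessors compose a barrier with a relaxed access. It is an illustration only, not the patch's implementation: every `sketch_` name is hypothetical, the C11 fences merely stand in for rte_io_rmb()/rte_io_wmb() (whose real definitions are architecture-specific and live in other patches of this series), and a plain volatile access stands in for the arm64 inline assembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_io_rmb()/rte_io_wmb(). The real barriers
 * are architecture-specific; C11 fences are used here only so the sketch
 * builds anywhere. */
static inline void sketch_io_rmb(void) { atomic_thread_fence(memory_order_acquire); }
static inline void sketch_io_wmb(void) { atomic_thread_fence(memory_order_release); }

/* Relaxed accessors: exactly one access, no barrier. */
static inline uint32_t sketch_readl_relaxed(const volatile void *addr)
{
	assert(((uintptr_t)addr & 3) == 0); /* registers are naturally aligned */
	return *(const volatile uint32_t *)addr;
}

static inline void sketch_writel_relaxed(uint32_t value, volatile void *addr)
{
	assert(((uintptr_t)addr & 3) == 0);
	*(volatile uint32_t *)addr = value;
}

/* Ordered read: load first, then the read barrier (same order as rte_readl). */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t v = sketch_readl_relaxed(addr);

	sketch_io_rmb();
	return v;
}

/* Ordered write: write barrier first, then the store (same order as rte_writel). */
static inline void sketch_writel(uint32_t value, volatile void *addr)
{
	sketch_io_wmb();
	sketch_writel_relaxed(value, addr);
}

/* Typical driver pattern: fill a descriptor in normal memory, then ring a
 * doorbell register. The barrier inside sketch_writel() keeps the descriptor
 * store ahead of the doorbell store. */
static uint32_t sketch_ring_doorbell(uint32_t *desc, volatile uint32_t *doorbell,
				     uint32_t tail)
{
	desc[0] = 0xabcd0000u | tail;
	sketch_writel(tail, doorbell);
	return sketch_readl(doorbell);
}
```

The relaxed variants are the ones a driver would pick when the ordering is already guaranteed some other way; the ordered variants pay for a barrier on every access.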
Jianbo Liu
2016-12-15 09:53:05 UTC
Permalink
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
Jerin Jacob
2016-12-15 10:04:24 UTC
Permalink
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
It is better to be native if there is an option. That's all. Do you see any issue?
Or what is the real concern?
Jianbo Liu
2016-12-15 10:17:32 UTC
Permalink
On 15 December 2016 at 18:04, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
It is better to be native if there is an option. That's all. Do you see any issue?
Or what is the real concern?
I think it's the same as the generic C version after compiling. Am I right?
If there is no apparent benefit, I don't think we need the override.
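One way to check the claim above is to put the two forms side by side and compare the generated code (for example with the compiler's -S option). The sketch below is only an illustration under stated assumptions: the `sketch_` names are hypothetical, the generic form is a plain volatile load rather than a copy of generic/rte_io.h, and the "memory" clobber on the asm is an addition that is not in the patch, included so the function can also be exercised against ordinary memory.

```c
#include <assert.h>
#include <stdint.h>

/* Generic form: a volatile load; the compiler picks the instruction. */
static inline uint16_t sketch_readw_generic(const volatile void *addr)
{
	assert(((uintptr_t)addr & 1) == 0); /* 16-bit registers are 2-byte aligned */
	return *(const volatile uint16_t *)addr;
}

/* Native form: on AArch64 the load instruction is spelled out, following
 * the shape of __rte_arm64_readw() in the patch. On other targets the
 * sketch falls back to the generic form so it still builds. */
static inline uint16_t sketch_readw_native(const volatile void *addr)
{
#if defined(__aarch64__)
	uint16_t val;

	__asm__ volatile(
		"ldrh %w[val], [%x[addr]]"
		: [val] "=r" (val)
		: [addr] "r" (addr)
		: "memory"); /* not in the patch; see the note above */
	return val;
#else
	return sketch_readw_generic(addr);
#endif
}

/* Both forms must return the same value for the same location. */
static int sketch_same_result(const volatile void *addr)
{
	return sketch_readw_generic(addr) == sketch_readw_native(addr);
}
```

If the disassembly of both functions shows the same single ldrh on a given compiler and optimization level, the two are equivalent there; the inline assembly only removes the dependence on what the compiler chooses.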
Jerin Jacob
2016-12-15 11:08:09 UTC
Permalink
Post by Jianbo Liu
On 15 December 2016 at 18:04, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
It is better to be native if there is an option. That's all. Do you see any issue?
Or what is the real concern?
I think it's the same as the generic C version after compiling. Am I right?
I really don't think that is the case for all scenarios; for example, the
compiler may combine two 16-bit reads into one 32-bit read, which will affect
I/O register access.

But I am sure the proposed scheme generates the correct instruction in all cases.
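The kind of access combining mentioned above can be pictured with two adjacent 16-bit registers. This is a hedged sketch, not taken from the patch: the register layout and all `sketch_` names are hypothetical. With plain (non-volatile) accesses the compiler is allowed to merge the two loads into one 32-bit load; with volatile accesses compilers such as GCC are generally expected to keep two separate 16-bit loads, although what counts as a volatile access is implementation-defined in C. Inline assembly goes one step further and fixes the exact instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device layout: two adjacent 16-bit registers. */
struct sketch_regs {
	uint16_t status;
	uint16_t count;
};

static_assert(sizeof(struct sketch_regs) == 4, "two packed 16-bit registers");

/* Plain accesses: nothing stops the compiler from fetching both fields
 * with a single 32-bit load, which may be wrong for a real device. */
static uint32_t sketch_read_pair_plain(const struct sketch_regs *r)
{
	return ((uint32_t)r->count << 16) | r->status;
}

/* Volatile accesses: each field is read by its own access, in program
 * order, on the compilers this sketch was written with in mind. */
static uint32_t sketch_read_pair_volatile(const volatile struct sketch_regs *r)
{
	uint16_t status = r->status;
	uint16_t count = r->count;

	return ((uint32_t)count << 16) | status;
}
```

Both functions return the same value on ordinary memory; the difference only shows up in the generated loads, which is what matters for device registers.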
Jianbo Liu
2016-12-16 10:12:13 UTC
Permalink
On 15 December 2016 at 19:08, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 15 December 2016 at 18:04, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
It is better to be native if there is an option. That's all. Do you see any issue?
Or what is the real concern?
I think it's the same as the generic C version after compiling. Am I right?
I really don't think that is the case for all scenarios; for example, the
compiler may combine two 16-bit reads into one 32-bit read, which will affect I/O
I wonder which compiler would do that, given that ARMv8 is a 32/64-bit system?
Post by Jerin Jacob
register access.
But I am sure the proposed scheme generates the correct instruction in all cases.
Jerin Jacob
2016-12-16 10:25:53 UTC
Permalink
Post by Jianbo Liu
On 15 December 2016 at 19:08, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 15 December 2016 at 18:04, Jerin Jacob
Post by Jerin Jacob
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.
+static inline __attribute__((always_inline)) uint8_t
+__rte_arm64_readb(const volatile void *addr)
+{
+ uint8_t val;
+
+ asm volatile(
+ "ldrb %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint16_t
+__rte_arm64_readw(const volatile void *addr)
+{
+ uint16_t val;
+
+ asm volatile(
+ "ldrh %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint32_t
+__rte_arm64_readl(const volatile void *addr)
+{
+ uint32_t val;
+
+ asm volatile(
+ "ldr %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint64_t
+__rte_arm64_readq(const volatile void *addr)
+{
+ uint64_t val;
+
+ asm volatile(
+ "ldr %x[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writeb(uint8_t val, volatile void *addr)
+{
+ asm volatile(
+ "strb %w[val], [%x[addr]]"
+ : : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writew(uint16_t val, volatile void *addr)
+{
+ asm volatile(
+ "strh %w[val], [%x[addr]]"
+ : : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writel(uint32_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %w[val], [%x[addr]]"
+ : : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+__rte_arm64_writeq(uint64_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %x[val], [%x[addr]]"
+ : : [val] "r" (val), [addr] "r" (addr));
+}
I'm not quite sure about these overrides. Can you explain the
benefit of doing so?
Better to be native if there is an option. That's all. Do you see any issue,
or what is the real concern?
I think it's the same as the generic C version after compiling. Am I right?
I really don't think that is the case for all scenarios. For example, the compiler may
combine two 16-bit reads into one 32-bit read, which will impact I/O
I wonder which compiler would do that, as armv8 is a 32/64-bit system?
Not specific to armv8.
A compiler may merge two consecutive, contiguous 16-bit reads into one 32-bit read as an optimization.
Any idea why the Linux kernel uses explicit instructions for readl/writel?
Obviously not for fun.
Post by Jianbo Liu
Post by Jerin Jacob
register access.
But I am sure the proposed scheme generates the correct instruction in all cases.
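
To make the two positions in this exchange concrete, here is a minimal sketch of a 16-bit read in both styles. It is not the code from the patch: the function names are hypothetical, the "memory" clobber is an addition of this sketch, and on anything other than arm64 the native variant simply falls back to the generic one. The generic form relies on the compiler emitting one access of the requested width for a volatile dereference; the asm form names the load instruction directly, which is the property Jerin is arguing for.

```c
#include <assert.h>
#include <stdint.h>

/* Generic form: a volatile access of exactly the requested width. */
static inline uint16_t
sketch_read16_generic(const volatile void *addr)
{
	return *(const volatile uint16_t *)addr;
}

/*
 * Native form: on arm64 the load instruction is written out explicitly.
 * The "memory" clobber is an addition of this sketch, not part of the patch.
 * Other architectures fall back to the generic form.
 */
static inline uint16_t
sketch_read16_native(const volatile void *addr)
{
#if defined(__aarch64__)
	uint16_t val;

	__asm__ __volatile__("ldrh %w[val], [%x[addr]]"
			     : [val] "=r" (val)
			     : [addr] "r" (addr)
			     : "memory");
	return val;
#else
	return sketch_read16_generic(addr);
#endif
}

/* Exercise both forms against ordinary memory standing in for a device. */
static inline int
sketch_width_selfcheck(void)
{
	volatile uint16_t regs[2] = { 0x1234, 0xabcd };

	assert(sketch_read16_generic(&regs[0]) == 0x1234);
	assert(sketch_read16_native(&regs[1]) == 0xabcd);
	return 0;
}
```

Both forms return the same value here; the difference under discussion is only in how strongly the emitted instruction is pinned down.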
Jerin Jacob
2016-12-14 01:55:42 UTC
Permalink
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/ppc_64/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/tile/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/x86/rte_io.h | 47 ++++++++++++++++++++++
4 files changed, 188 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/tile/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_io.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_io.h b/lib/librte_eal/common/include/arch/arm/rte_io.h
new file mode 100644
index 0000000..74c1f2c
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_ARM_H_
+#define _RTE_IO_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_io.h b/lib/librte_eal/common/include/arch/ppc_64/rte_io.h
new file mode 100644
index 0000000..be192da
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_PPC_64_H_
+#define _RTE_IO_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_PPC_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/tile/rte_io.h b/lib/librte_eal/common/include/arch/tile/rte_io.h
new file mode 100644
index 0000000..9c8588f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/tile/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_TILE_H_
+#define _RTE_IO_TILE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_TILE_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_io.h b/lib/librte_eal/common/include/arch/x86/rte_io.h
new file mode 100644
index 0000000..c8d1404
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_X86_H_
+#define _RTE_IO_X86_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_X86_H_ */
--
2.5.5
Jerin Jacob
2016-12-14 01:55:44 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: John Griffin <***@intel.com>
CC: Fiona Trahe <***@intel.com>
CC: Deepak Kumar Jain <***@intel.com>
---
drivers/crypto/qat/qat_adf/adf_transport_access_macros.h | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h b/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
index 47f1c91..a6e407d 100644
--- a/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
+++ b/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
@@ -47,14 +47,19 @@
#ifndef ADF_TRANSPORT_ACCESS_MACROS_H
#define ADF_TRANSPORT_ACCESS_MACROS_H

+#include <rte_io.h>
+
/* CSR write macro */
-#define ADF_CSR_WR(csrAddr, csrOffset, val) \
- (void)((*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset)) \
- = (val)))
+#define ADF_CSR_WR(csrAddr, csrOffset, val) ({ \
+ rte_writel(val, (((uint8_t *)csrAddr) + csrOffset)); \
+})

/* CSR read macro */
-#define ADF_CSR_RD(csrAddr, csrOffset) \
- (*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset)))
+#define ADF_CSR_RD(csrAddr, csrOffset) ({ \
+ uint32_t __val; \
+ __val = rte_readl(((uint8_t *)csrAddr) + csrOffset); \
+ __val; \
+})

#define ADF_BANK_INT_SRC_SEL_MASK_0 0x4444444CUL
#define ADF_BANK_INT_SRC_SEL_MASK_X 0x44444444UL
--
2.5.5
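
The CSR macros in this patch add a byte offset to a base address and then perform one 32-bit access. As a rough illustration, the same shape can be written with inline functions instead of GNU statement-expression macros. Everything named sketch_* below is hypothetical: sketch_writel and sketch_readl are plain volatile accesses standing in for the proposed rte_writel and rte_readl, with no barrier, and the fake BAR is ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the proposed accessors: plain volatile 32-bit accesses. */
static inline void
sketch_writel(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

static inline uint32_t
sketch_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* CSR write: base address plus byte offset, then one 32-bit store. */
static inline void
sketch_csr_wr(volatile void *csr_base, size_t byte_off, uint32_t val)
{
	sketch_writel(val, (volatile uint8_t *)csr_base + byte_off);
}

/* CSR read: base address plus byte offset, then one 32-bit load. */
static inline uint32_t
sketch_csr_rd(const volatile void *csr_base, size_t byte_off)
{
	return sketch_readl((const volatile uint8_t *)csr_base + byte_off);
}

/* Ordinary memory plays the role of the device BAR. */
static inline int
sketch_csr_selfcheck(void)
{
	volatile uint32_t fake_bar[4] = { 0, 0, 0, 0 };

	sketch_csr_wr(fake_bar, 8, 0xdeadbeefu);
	assert(fake_bar[2] == 0xdeadbeefu);
	assert(sketch_csr_rd(fake_bar, 8) == 0xdeadbeefu);
	assert(fake_bar[0] == 0 && fake_bar[1] == 0 && fake_bar[3] == 0);
	return 0;
}
```

Inline functions evaluate each argument once and type-check it, which is one reason they may be preferable to macros where the driver's base code allows it.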
Jerin Jacob
2016-12-14 01:55:45 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal abstraction
for I/O device memory read/write access to fix portability issues across
different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Harish Patil <***@cavium.com>
CC: Rasesh Mody <***@cavium.com>
---
drivers/net/bnx2x/bnx2x.h | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 5cefea4..9b6e49a 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -40,6 +40,7 @@
#include "bnx2x_vfpf.h"

#include "elink.h"
+#include <rte_io.h>

#ifndef __FreeBSD__
#include <linux/pci_regs.h>
@@ -1419,8 +1420,7 @@ bnx2x_reg_write8(struct bnx2x_softc *sc, size_t offset, uint8_t val)
{
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%02x",
(unsigned long)offset, val);
- *((volatile uint8_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_writeb(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
}

static inline void
@@ -1433,8 +1433,8 @@ bnx2x_reg_write16(struct bnx2x_softc *sc, size_t offset, uint16_t val)
#endif
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%04x",
(unsigned long)offset, val);
- *((volatile uint16_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_writew(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
+
}

static inline void
@@ -1448,8 +1448,7 @@ bnx2x_reg_write32(struct bnx2x_softc *sc, size_t offset, uint32_t val)

PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);
- *((volatile uint32_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_writel(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
}

static inline uint8_t
@@ -1457,8 +1456,7 @@ bnx2x_reg_read8(struct bnx2x_softc *sc, size_t offset)
{
uint8_t val;

- val = (uint8_t)(*((volatile uint8_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_readb((uint8_t *)sc->bar[BAR0].base_addr + offset);
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%02x",
(unsigned long)offset, val);

@@ -1476,8 +1474,7 @@ bnx2x_reg_read16(struct bnx2x_softc *sc, size_t offset)
(unsigned long)offset);
#endif

- val = (uint16_t)(*((volatile uint16_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_readw(((uint8_t *)sc->bar[BAR0].base_addr + offset));
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);

@@ -1495,8 +1492,7 @@ bnx2x_reg_read32(struct bnx2x_softc *sc, size_t offset)
(unsigned long)offset);
#endif

- val = (uint32_t)(*((volatile uint32_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_readl(((uint8_t *)sc->bar[BAR0].base_addr + offset));
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);

@@ -1560,11 +1556,15 @@ bnx2x_reg_read32(struct bnx2x_softc *sc, size_t offset)
#define DPM_TRIGGER_TYPE 0x40

/* Doorbell macro */
-#define BNX2X_DB_WRITE(db_bar, val) \
- *((volatile uint32_t *)(db_bar)) = (val)
+#define BNX2X_DB_WRITE(db_bar, val) ({ \
+ rte_writel(val, db_bar); \
+})

-#define BNX2X_DB_READ(db_bar) \
- *((volatile uint32_t *)(db_bar))
+#define BNX2X_DB_READ(db_bar) ({ \
+ uint32_t __val; \
+ __val = rte_readl(db_bar); \
+ __val; \
+})

#define DOORBELL_ADDR(sc, offset) \
(volatile uint32_t *)(((char *)(sc)->bar[BAR1].base_addr + (offset)))
--
2.5.5
Jerin Jacob
2016-12-14 01:55:46 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal abstraction
for I/O device memory read/write access to fix portability issues across
different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Stephen Hurd <***@broadcom.com>
CC: Ajit Khaparde <***@broadcom.com>
---
drivers/net/bnxt/bnxt_hwrm.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 07e7124..2067e15 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -50,6 +50,8 @@
#include "bnxt_vnic.h"
#include "hsi_struct_def_dpdk.h"

+#include <rte_io.h>
+
#define HWRM_CMD_TIMEOUT 2000

/*
@@ -72,19 +74,19 @@ static int bnxt_hwrm_send_message_locked(struct bnxt *bp, void *msg,
/* Write request msg to hwrm channel */
for (i = 0; i < msg_len; i += 4) {
bar = (uint8_t *)bp->bar0 + i;
- *(volatile uint32_t *)bar = *data;
+ rte_writel(*data, bar);
data++;
}

/* Zero the rest of the request space */
for (; i < bp->max_req_len; i += 4) {
bar = (uint8_t *)bp->bar0 + i;
- *(volatile uint32_t *)bar = 0;
+ rte_writel(0, bar);
}

/* Ring channel doorbell */
bar = (uint8_t *)bp->bar0 + 0x100;
- *(volatile uint32_t *)bar = 1;
+ rte_writel(1, bar);

/* Poll for the valid bit */
for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
--
2.5.5
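
The hunk above touches a three-step sequence: copy the request into the channel in 32-bit words, zero the remainder of the request space, then write the doorbell. A minimal sketch of that sequence follows, assuming msg_len and max_req_len are multiples of 4 as the original loop implies. The names are hypothetical, sketch_wr32 is a plain volatile store standing in for the proposed rte_writel, and the fake BAR is ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the proposed 32-bit write accessor: a plain volatile store. */
static inline void
sketch_wr32(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Copy the request, zero the rest of the request space, ring the doorbell. */
static inline void
sketch_send_request(volatile void *bar0, const uint32_t *data,
		    size_t msg_len, size_t max_req_len, size_t doorbell_off)
{
	volatile uint8_t *base = (volatile uint8_t *)bar0;
	size_t i;

	for (i = 0; i < msg_len; i += 4)
		sketch_wr32(*data++, base + i);
	for (; i < max_req_len; i += 4)
		sketch_wr32(0, base + i);
	sketch_wr32(1, base + doorbell_off);
}

/* Ordinary memory plays the role of BAR0; 0x100 is the doorbell offset. */
static inline int
sketch_request_selfcheck(void)
{
	volatile uint32_t fake_bar[65];
	const uint32_t msg[2] = { 0xau, 0xbu };
	size_t i;

	for (i = 0; i < 65; i++)
		fake_bar[i] = 0xffffffffu;

	sketch_send_request(fake_bar, msg, sizeof(msg), 16, 0x100);
	assert(fake_bar[0] == 0xau && fake_bar[1] == 0xbu);
	assert(fake_bar[2] == 0 && fake_bar[3] == 0);
	assert(fake_bar[4] == 0xffffffffu);
	assert(fake_bar[64] == 1);
	return 0;
}
```

On real hardware the ordering between the payload writes and the doorbell write is exactly what the barrier in the non-relaxed accessor is meant to provide; this sketch does not model that.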
Jerin Jacob
2016-12-14 01:55:47 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Rahul Lakkireddy <***@chelsio.com>
---
drivers/net/cxgbe/base/adapter.h | 13 +++++++++----
drivers/net/cxgbe/cxgbe_compat.h | 3 ++-
2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 5e3bd50..0ae4513 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -40,6 +40,7 @@

#include "cxgbe_compat.h"
#include "t4_regs_values.h"
+#include "rte_io.h"

enum {
MAX_ETH_QSETS = 64, /* # of Ethernet Tx/Rx queue sets */
@@ -324,7 +325,11 @@ struct adapter {
int use_unpacked_mode; /* unpacked rx mode state */
};

-#define CXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define CXGBE_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})

static inline uint64_t cxgbe_read_addr64(volatile void *addr)
{
@@ -351,15 +356,15 @@ static inline uint32_t cxgbe_read_addr(volatile void *addr)
cxgbe_read_addr64(CXGBE_PCI_REG_ADDR((adap), (reg)))

#define CXGBE_PCI_REG_WRITE(reg, value) ({ \
- CXGBE_PCI_REG((reg)) = (value); })
+ rte_writel(value, reg); })

#define CXGBE_WRITE_REG(adap, reg, value) \
CXGBE_PCI_REG_WRITE(CXGBE_PCI_REG_ADDR((adap), (reg)), (value))

static inline uint64_t cxgbe_write_addr64(volatile void *addr, uint64_t val)
{
- CXGBE_PCI_REG(addr) = val;
- CXGBE_PCI_REG(((volatile uint8_t *)(addr) + 4)) = (val >> 32);
+ CXGBE_PCI_REG_WRITE(addr, val);
+ CXGBE_PCI_REG_WRITE(((volatile uint8_t *)(addr) + 4), (val >> 32));
return val;
}

diff --git a/drivers/net/cxgbe/cxgbe_compat.h b/drivers/net/cxgbe/cxgbe_compat.h
index e68f8f5..95d8f27 100644
--- a/drivers/net/cxgbe/cxgbe_compat.h
+++ b/drivers/net/cxgbe/cxgbe_compat.h
@@ -45,6 +45,7 @@
#include <rte_cycles.h>
#include <rte_spinlock.h>
#include <rte_log.h>
+#include <rte_io.h>

#define dev_printf(level, fmt, args...) \
RTE_LOG(level, PMD, "rte_cxgbe_pmd: " fmt, ## args)
@@ -254,7 +255,7 @@ static inline unsigned long ilog2(unsigned long n)

static inline void writel(unsigned int val, volatile void __iomem *addr)
{
- *(volatile unsigned int *)addr = val;
+ rte_writel(val, addr);
}

static inline void writeq(u64 val, volatile void __iomem *addr)
--
2.5.5
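
In this patch cxgbe_write_addr64 issues two 32-bit writes: the low half at addr and the high half at addr + 4. The sketch below shows that split with explicit truncating casts, assuming the 32-bit accessor takes a uint32_t value. The names are hypothetical and sketch_write32 is a plain volatile store, not the proposed rte_writel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the proposed 32-bit write accessor: a plain volatile store. */
static inline void
sketch_write32(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Write a 64-bit value as two 32-bit halves: low word first, then high. */
static inline uint64_t
sketch_write_addr64(volatile void *addr, uint64_t val)
{
	sketch_write32((uint32_t)val, addr);
	sketch_write32((uint32_t)(val >> 32), (volatile uint8_t *)addr + 4);
	return val;
}

/* Ordinary memory plays the role of a 64-bit device register. */
static inline int
sketch_split_selfcheck(void)
{
	volatile uint32_t fake_reg[2] = { 0, 0 };

	assert(sketch_write_addr64(fake_reg, 0x1122334455667788ULL) ==
	       0x1122334455667788ULL);
	assert(fake_reg[0] == 0x55667788u);
	assert(fake_reg[1] == 0x11223344u);
	return 0;
}
```

Spelling out the casts makes the truncation visible, whereas the macro form in the patch relies on the implicit conversion at the accessor's parameter.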
Jerin Jacob
2016-12-14 01:55:48 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Wenzhuo Lu <***@intel.com>
---
drivers/net/e1000/base/e1000_osdep.h | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/net/e1000/base/e1000_osdep.h b/drivers/net/e1000/base/e1000_osdep.h
index 47a1948..dd9a2d8 100644
--- a/drivers/net/e1000/base/e1000_osdep.h
+++ b/drivers/net/e1000/base/e1000_osdep.h
@@ -44,6 +44,7 @@
#include <rte_log.h>
#include <rte_debug.h>
#include <rte_byteorder.h>
+#include <rte_io.h>

#include "../e1000_logs.h"

@@ -94,17 +95,25 @@ typedef int bool;

#define E1000_WRITE_FLUSH(a) E1000_READ_REG(a, E1000_STATUS)

-#define E1000_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define E1000_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})

-#define E1000_PCI_REG16(reg) (*((volatile uint16_t *)(reg)))
+#define E1000_PCI_REG16(reg) ({ \
+ uint16_t __val; \
+ __val = rte_readw(reg); \
+ __val; \
+})

-#define E1000_PCI_REG_WRITE(reg, value) do { \
- E1000_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while (0)
+#define E1000_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(rte_cpu_to_le_32(value), reg); \
+})

-#define E1000_PCI_REG_WRITE16(reg, value) do { \
- E1000_PCI_REG16((reg)) = (rte_cpu_to_le_16(value)); \
-} while (0)
+#define E1000_PCI_REG_WRITE16(reg, value) ({ \
+ rte_writew(rte_cpu_to_le_16(value), reg); \
+})

#define E1000_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
--
2.5.5
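
The write macros in this patch convert the value to little-endian before handing it to the accessor. The sketch below illustrates what that conversion achieves: the byte layout at the register is the same on either host endianness. It assumes a compiler that defines __BYTE_ORDER__ (GCC and Clang do) and treats the host as little-endian otherwise. sketch_cpu_to_le32 is a hypothetical stand-in for rte_cpu_to_le_32, and the store is a plain volatile write.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a host-order value to little-endian; a no-op on LE hosts. */
static inline uint32_t
sketch_cpu_to_le32(uint32_t x)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8) | ((x & 0xff000000u) >> 24);
#else
	return x;
#endif
}

/* Register write: convert first, then perform one 32-bit volatile store. */
static inline void
sketch_reg_write32(volatile void *reg, uint32_t value)
{
	*(volatile uint32_t *)reg = sketch_cpu_to_le32(value);
}

/* The least significant byte must land at the lowest address. */
static inline int
sketch_le_selfcheck(void)
{
	volatile uint32_t fake_reg = 0;
	const volatile uint8_t *bytes = (const volatile uint8_t *)&fake_reg;

	sketch_reg_write32(&fake_reg, 0x12345678u);
	assert(bytes[0] == 0x78 && bytes[1] == 0x56);
	assert(bytes[2] == 0x34 && bytes[3] == 0x12);
	return 0;
}
```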
Jerin Jacob
2016-12-14 01:55:49 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Jan Medala <***@semihalf.com>
CC: Jakub Palider <***@semihalf.com>
---
drivers/net/ena/base/ena_plat_dpdk.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 87c3bf1..4db07c7 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -50,6 +50,7 @@
#include <rte_spinlock.h>

#include <sys/time.h>
+#include <rte_io.h>

typedef uint64_t u64;
typedef uint32_t u32;
@@ -226,12 +227,12 @@ typedef uint64_t dma_addr_t;

static inline void writel(u32 value, volatile void *addr)
{
- *(volatile u32 *)addr = value;
+ rte_writel(value, addr);
}

static inline u32 readl(const volatile void *addr)
{
- return *(const volatile u32 *)addr;
+ return rte_readl(addr);
}

#define ENA_REG_WRITE32(value, reg) writel((value), (reg))
--
2.5.5
Jan Mędala
2016-12-14 14:36:02 UTC
Permalink
Despite the open question about the naming convention (whether it will be writel or
write32), I'm fine with this change and the new API.

Acked-by: Jan Medala <***@semihalf.com>

Jan
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
---
drivers/net/ena/base/ena_plat_dpdk.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h
b/drivers/net/ena/base/ena_plat_dpdk.h
index 87c3bf1..4db07c7 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -50,6 +50,7 @@
#include <rte_spinlock.h>
#include <sys/time.h>
+#include <rte_io.h>
typedef uint64_t u64;
typedef uint32_t u32;
@@ -226,12 +227,12 @@ typedef uint64_t dma_addr_t;
static inline void writel(u32 value, volatile void *addr)
{
- *(volatile u32 *)addr = value;
+ rte_writel(value, addr);
}
static inline u32 readl(const volatile void *addr)
{
- return *(const volatile u32 *)addr;
+ return rte_readl(addr);
}
#define ENA_REG_WRITE32(value, reg) writel((value), (reg))
--
2.5.5
Jerin Jacob
2016-12-14 01:55:50 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: John Daley <***@cisco.com>
CC: Nelson Escobar <***@cisco.com>
---
drivers/net/enic/enic_compat.h | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 5dbd983..1c9cdc6 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -41,6 +41,7 @@
#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
+#include <rte_io.h>

#define ENIC_PAGE_ALIGN 4096UL
#define ENIC_ALIGN ENIC_PAGE_ALIGN
@@ -95,42 +96,42 @@ typedef unsigned long long dma_addr_t;

static inline uint32_t ioread32(volatile void *addr)
{
- return *(volatile uint32_t *)addr;
+ return rte_readl(addr);
}

static inline uint16_t ioread16(volatile void *addr)
{
- return *(volatile uint16_t *)addr;
+ return rte_readw(addr);
}

static inline uint8_t ioread8(volatile void *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_readb(addr);
}

static inline void iowrite32(uint32_t val, volatile void *addr)
{
- *(volatile uint32_t *)addr = val;
+ rte_writel(val, addr);
}

static inline void iowrite16(uint16_t val, volatile void *addr)
{
- *(volatile uint16_t *)addr = val;
+ rte_writew(val, addr);
}

static inline void iowrite8(uint8_t val, volatile void *addr)
{
- *(volatile uint8_t *)addr = val;
+ rte_writeb(val, addr);
}

static inline unsigned int readl(volatile void __iomem *addr)
{
- return *(volatile unsigned int *)addr;
+ return rte_readl(addr);
}

static inline void writel(unsigned int val, volatile void __iomem *addr)
{
- *(volatile unsigned int *)addr = val;
+ rte_writel(val, addr);
}

#define min_t(type, x, y) ({ \
--
2.5.5
Jerin Jacob
2016-12-14 01:55:51 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Jing Chen <***@intel.com>
---
drivers/net/fm10k/base/fm10k_osdep.h | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/net/fm10k/base/fm10k_osdep.h b/drivers/net/fm10k/base/fm10k_osdep.h
index a21daa2..d91ff41 100644
--- a/drivers/net/fm10k/base/fm10k_osdep.h
+++ b/drivers/net/fm10k/base/fm10k_osdep.h
@@ -39,6 +39,7 @@ POSSIBILITY OF SUCH DAMAGE.
#include <rte_atomic.h>
#include <rte_byteorder.h>
#include <rte_cycles.h>
+#include <rte_io.h>
#include "../fm10k_logs.h"

/* TODO: this does not look like it should be used... */
@@ -88,17 +89,27 @@ typedef int bool;
#endif

/* offsets are WORD offsets, not BYTE offsets */
-#define FM10K_WRITE_REG(hw, reg, val) \
- ((((volatile uint32_t *)(hw)->hw_addr)[(reg)]) = ((uint32_t)(val)))
-#define FM10K_READ_REG(hw, reg) \
- (((volatile uint32_t *)(hw)->hw_addr)[(reg)])
+#define FM10K_WRITE_REG(hw, reg, val) ({ \
+ rte_writel(val, ((hw)->hw_addr + (reg))); \
+})
+
+#define FM10K_READ_REG(hw, reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl((hw)->hw_addr + (reg)); \
+ __val; \
+})
+
#define FM10K_WRITE_FLUSH(a) FM10K_READ_REG(a, FM10K_CTRL)

-#define FM10K_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define FM10K_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})

-#define FM10K_PCI_REG_WRITE(reg, value) do { \
- FM10K_PCI_REG((reg)) = (value); \
-} while (0)
+#define FM10K_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(value, reg); \
+})

/* not implemented */
#define FM10K_READ_PCI_WORD(hw, reg) 0
--
2.5.5
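
The comment in this hunk says register offsets are word offsets, not byte offsets. The converted macros keep that meaning only if hw_addr points to a 32-bit type, so that pointer arithmetic advances in 4-byte steps; the declaration is not shown in the patch, so treat that as an assumption. A small sketch of the two addressing styles, with hypothetical names and ordinary memory as the register file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word-offset addressing: the index is scaled by sizeof(uint32_t). */
static inline uint32_t
sketch_read_reg_word(const volatile uint32_t *hw_addr, size_t word_idx)
{
	return *(hw_addr + word_idx);
}

/* Byte-offset addressing: the offset is applied to a byte pointer. */
static inline uint32_t
sketch_read_reg_byte(const volatile void *hw_addr, size_t byte_off)
{
	return *(const volatile uint32_t *)
		((const volatile uint8_t *)hw_addr + byte_off);
}

/* Word index 3 and byte offset 12 name the same register. */
static inline int
sketch_offset_selfcheck(void)
{
	volatile uint32_t fake_regs[4] = { 0x10, 0x11, 0x12, 0x13 };

	assert(sketch_read_reg_word(fake_regs, 3) == 0x13);
	assert(sketch_read_reg_byte(fake_regs, 3 * sizeof(uint32_t)) == 0x13);
	return 0;
}
```

If hw_addr were a byte or void pointer instead, the same expression would address a different register, so the pointer type is worth checking when reviewing this conversion.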
Jerin Jacob
2016-12-14 01:55:52 UTC
Permalink
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal abstraction
for I/O device memory read/write access to fix portability issues across
different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Satha Rao <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Helin Zhang <***@intel.com>
CC: Jingjing Wu <***@intel.com>
---
drivers/net/i40e/base/i40e_osdep.h | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_osdep.h b/drivers/net/i40e/base/i40e_osdep.h
index 38e7ba5..8d5045a 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_spinlock.h>
#include <rte_log.h>
+#include <rte_io.h>

#include "../i40e_logs.h"

@@ -153,15 +154,22 @@ do { \
* I40E_PRTQF_FD_MSK
*/

-#define I40E_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define I40E_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})
+
#define I40E_PCI_REG_ADDR(a, reg) \
((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
static inline uint32_t i40e_read_addr(volatile void *addr)
{
return rte_le_to_cpu_32(I40E_PCI_REG(addr));
}
-#define I40E_PCI_REG_WRITE(reg, value) \
- do { I40E_PCI_REG((reg)) = rte_cpu_to_le_32(value); } while (0)
+
+#define I40E_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(rte_cpu_to_le_32(value), reg); \
+})

#define I40E_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_GLGEN_STAT)
#define I40EVF_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_VFGEN_RSTAT)
--
2.5.5
Jerin Jacob
2016-12-14 01:55:53 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Helin Zhang <***@intel.com>
CC: Konstantin Ananyev <***@intel.com>
---
drivers/net/ixgbe/base/ixgbe_osdep.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h b/drivers/net/ixgbe/base/ixgbe_osdep.h
index 77f0af5..9d16c21 100644
--- a/drivers/net/ixgbe/base/ixgbe_osdep.h
+++ b/drivers/net/ixgbe/base/ixgbe_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_log.h>
#include <rte_byteorder.h>
+#include <rte_io.h>

#include "../ixgbe_logs.h"
#include "../ixgbe_bypass_defines.h"
@@ -121,16 +122,20 @@ typedef int bool;

#define prefetch(x) rte_prefetch0(x)

-#define IXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define IXGBE_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})

static inline uint32_t ixgbe_read_addr(volatile void* addr)
{
return rte_le_to_cpu_32(IXGBE_PCI_REG(addr));
}

-#define IXGBE_PCI_REG_WRITE(reg, value) do { \
- IXGBE_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while(0)
+#define IXGBE_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(rte_cpu_to_le_32(value), reg); \
+})

#define IXGBE_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
--
2.5.5
Jianbo Liu
2016-12-15 08:37:12 UTC
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
---
drivers/net/ixgbe/base/ixgbe_osdep.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h b/drivers/net/ixgbe/base/ixgbe_osdep.h
index 77f0af5..9d16c21 100644
--- a/drivers/net/ixgbe/base/ixgbe_osdep.h
+++ b/drivers/net/ixgbe/base/ixgbe_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_log.h>
#include <rte_byteorder.h>
+#include <rte_io.h>
#include "../ixgbe_logs.h"
#include "../ixgbe_bypass_defines.h"
@@ -121,16 +122,20 @@ typedef int bool;
#define prefetch(x) rte_prefetch0(x)
-#define IXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define IXGBE_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})
static inline uint32_t ixgbe_read_addr(volatile void* addr)
{
return rte_le_to_cpu_32(IXGBE_PCI_REG(addr));
}
-#define IXGBE_PCI_REG_WRITE(reg, value) do { \
- IXGBE_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while(0)
+#define IXGBE_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(rte_cpu_to_le_32(value), reg); \
+})
The memory barrier operation is now inside IXGBE_PCI_REG_READ/WRITE with
your change, but I found that rte_*mb is called before these macros in some
places.
Can you remove all these redundant calls? And please do the same
check for the other drivers.
Post by Jerin Jacob
#define IXGBE_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
--
2.5.5
Santosh Shukla
2016-12-16 04:40:19 UTC
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
---
drivers/net/ixgbe/base/ixgbe_osdep.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h b/drivers/net/ixgbe/base/ixgbe_osdep.h
index 77f0af5..9d16c21 100644
--- a/drivers/net/ixgbe/base/ixgbe_osdep.h
+++ b/drivers/net/ixgbe/base/ixgbe_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_log.h>
#include <rte_byteorder.h>
+#include <rte_io.h>
#include "../ixgbe_logs.h"
#include "../ixgbe_bypass_defines.h"
@@ -121,16 +122,20 @@ typedef int bool;
#define prefetch(x) rte_prefetch0(x)
-#define IXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define IXGBE_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})
static inline uint32_t ixgbe_read_addr(volatile void* addr)
{
return rte_le_to_cpu_32(IXGBE_PCI_REG(addr));
}
-#define IXGBE_PCI_REG_WRITE(reg, value) do { \
- IXGBE_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while(0)
+#define IXGBE_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(rte_cpu_to_le_32(value), reg); \
+})
memory barrier operation is put inside IXGBE_PCI_REG_READ/WRITE in
your change, but I found rte_*mb is called before these macros in some
places.
Can you remove all these redundant calls? And please do the same
checking for other drivers.
Ok.

I am thinking of adding arch-agnostic _relaxed_rd/wr style macros, for the ixgbe
case in particular, so that at those call sites:
x86 case> first the default barrier, then the relaxed call.
arm case> first the default barrier, then the relaxed call.

Does that make sense to you? If so, I will take care of it in v2.

Santosh.
Post by Jianbo Liu
Post by Jerin Jacob
#define IXGBE_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
--
2.5.5
Santosh Shukla
2016-12-22 12:36:16 UTC
Hi Jianbo,
Post by Santosh Shukla
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
memory barrier operation is put inside IXGBE_PCI_REG_READ/WRITE in
your change, but I found rte_*mb is called before these macros in some
places.
Can you remove all these redundant calls? And please do the same
checking for other drivers.
Ok.
Thinking of adding _relaxed_rd/wr style macro agnostic to arch for ixgbe case
x86 case> first default barrier + relaxed call.
arm case> first default barrier + relaxed call.
Does that make sense to you? If so then will take care in v2.
Santosh.
We spent time looking at the driver code where a double barrier
may happen. Most occurrences are in the driver init path and
configuration/control path code, so keeping the double
barrier there won't impact performance.

We plan to replace only the fast path code with the _relaxed
style APIs. That way we won't impact any driver's
performance and we'll have a clean port.

Does that make sense? Thoughts?
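
To make the plan concrete, here is a small self-contained model. It is only a
sketch: model_wmb(), reg_write32(), reg_write32_relaxed(), ctrl_path_write()
and fast_path_kick() are hypothetical names, not DPDK APIs, and the barrier is
replaced by a counter so that the number of barriers per path can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a write barrier: it only counts calls so the
 * cost of each path is visible. A real driver would issue rte_wmb(). */
static unsigned int barrier_count;

static inline void model_wmb(void)
{
	barrier_count++;
}

/* Relaxed write: a plain volatile store, no barrier. */
static inline void reg_write32_relaxed(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Ordered write: barrier first, then the store. */
static inline void reg_write32(uint32_t val, volatile void *addr)
{
	model_wmb();
	reg_write32_relaxed(val, addr);
}

/* Control path: an explicit barrier already precedes the register write,
 * so the ordered accessor adds a second one. Tolerable here because this
 * code runs rarely. */
static void ctrl_path_write(volatile uint32_t *reg, uint32_t val)
{
	model_wmb();
	reg_write32(val, reg);
}

/* Fast path: keep the explicit barrier and pair it with the relaxed
 * accessor, so each doorbell costs exactly one barrier. */
static void fast_path_kick(volatile uint32_t *tail_reg, uint32_t tail)
{
	assert(tail_reg != NULL);
	model_wmb();
	reg_write32_relaxed(tail, tail_reg);
}
```

In this model the control path pays for two barriers per write and the fast
path for one, which is the trade-off described above.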
Post by Santosh Shukla
Post by Jianbo Liu
Post by Jerin Jacob
#define IXGBE_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
--
2.5.5
Jianbo Liu
2016-12-23 01:42:23 UTC
Hi Santosh,

On 22 December 2016 at 20:36, Santosh Shukla
Post by Santosh Shukla
Hi Jianbo,
Post by Santosh Shukla
Post by Jianbo Liu
On 14 December 2016 at 09:55, Jerin Jacob
memory barrier operation is put inside IXGBE_PCI_REG_READ/WRITE in
your change, but I found rte_*mb is called before these macros in some
places.
Can you remove all these redundant calls? And please do the same
checking for other drivers.
Ok.
Thinking of adding _relaxed_rd/wr style macro agnostic to arch for ixgbe case
x86 case> first default barrier + relaxed call.
arm case> first default barrier + relaxed call.
Does that make sense to you? If so then will take care in v2.
Santosh.
We spend time looking at drivers code where double barrier
may happen. Most of them are in driver init path,
configuration/control path code. So keeping double
barrier won't impact performance.
We plan to replace only fast path code with _relaxed
style API's. That way we won't impact each driver
performance and we'll have the clean port.
Does it make sense? Thought?
Yes, please continue your work.
Jerin Jacob
2016-12-14 01:55:54 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Alejandro Lucero <***@netronome.com>
---
drivers/net/nfp/nfp_net_pmd.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index c180972..ec3d35e 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -121,25 +121,26 @@ struct nfp_net_adapter;
#define NFD_CFG_MINOR_VERSION_of(x) (((x) >> 0) & 0xff)

#include <linux/types.h>
+#include <rte_io.h>

static inline uint8_t nn_readb(volatile const void *addr)
{
- return *((volatile const uint8_t *)(addr));
+ return rte_readb(addr);
}

static inline void nn_writeb(uint8_t val, volatile void *addr)
{
- *((volatile uint8_t *)(addr)) = val;
+ rte_writeb(val, addr);
}

static inline uint32_t nn_readl(volatile const void *addr)
{
- return *((volatile const uint32_t *)(addr));
+ return rte_readl(addr);
}

static inline void nn_writel(uint32_t val, volatile void *addr)
{
- *((volatile uint32_t *)(addr)) = val;
+ rte_writel(val, addr);
}

static inline uint64_t nn_readq(volatile void *addr)
--
2.5.5
Jerin Jacob
2016-12-14 01:55:55 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Harish Patil <***@cavium.com>
CC: Rasesh Mody <***@cavium.com>
---
drivers/net/qede/base/bcm_osal.h | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.h b/drivers/net/qede/base/bcm_osal.h
index 0b446f2..925660e 100644
--- a/drivers/net/qede/base/bcm_osal.h
+++ b/drivers/net/qede/base/bcm_osal.h
@@ -18,6 +18,7 @@
#include <rte_cycles.h>
#include <rte_debug.h>
#include <rte_ether.h>
+#include <rte_io.h>

/* Forward declaration */
struct ecore_dev;
@@ -113,18 +114,23 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *, dma_addr_t *,

/* HW reads/writes */

-#define DIRECT_REG_RD(_dev, _reg_addr) \
- (*((volatile u32 *) (_reg_addr)))
+#define DIRECT_REG_RD(_dev, _reg_addr) ({ \
+ uint32_t __val; \
+ __val = rte_readl((_reg_addr)); \
+ __val; \
+})

#define REG_RD(_p_hwfn, _reg_offset) \
DIRECT_REG_RD(_p_hwfn, \
((u8 *)(uintptr_t)(_p_hwfn->regview) + (_reg_offset)))

-#define DIRECT_REG_WR16(_reg_addr, _val) \
- (*((volatile u16 *)(_reg_addr)) = _val)
+#define DIRECT_REG_WR16(_reg_addr, _val) ({ \
+ rte_writew((_val), (_reg_addr)); \
+})

-#define DIRECT_REG_WR(_dev, _reg_addr, _val) \
- (*((volatile u32 *)(_reg_addr)) = _val)
+#define DIRECT_REG_WR(_dev, _reg_addr, _val) ({ \
+ rte_writel((_val), (_reg_addr)); \
+})

#define REG_WR(_p_hwfn, _reg_offset, _val) \
DIRECT_REG_WR(NULL, \
--
2.5.5
Jerin Jacob
2016-12-14 01:55:56 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Huawei Xie <***@intel.com>
CC: Yuanhan Liu <***@linux.intel.com>
---
drivers/net/virtio/virtio_pci.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 9b47165..47c5a2e 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -41,6 +41,8 @@
#include "virtio_logs.h"
#include "virtqueue.h"

+#include <rte_io.h>
+
/*
* Following macros are derived from linux/pci_regs.h, however,
* we can't simply include that header here, as there is no such
@@ -320,37 +322,37 @@ static const struct virtio_pci_ops legacy_ops = {
static inline uint8_t
io_read8(uint8_t *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_readb(addr);
}

static inline void
io_write8(uint8_t val, uint8_t *addr)
{
- *(volatile uint8_t *)addr = val;
+ rte_writeb(val, addr);
}

static inline uint16_t
io_read16(uint16_t *addr)
{
- return *(volatile uint16_t *)addr;
+ return rte_readw(addr);
}

static inline void
io_write16(uint16_t val, uint16_t *addr)
{
- *(volatile uint16_t *)addr = val;
+ rte_writew(val, addr);
}

static inline uint32_t
io_read32(uint32_t *addr)
{
- return *(volatile uint32_t *)addr;
+ return rte_readl(addr);
}

static inline void
io_write32(uint32_t val, uint32_t *addr)
{
- *(volatile uint32_t *)addr = val;
+ rte_writel(val, addr);
}

static inline void
--
2.5.5
Yuanhan Liu
2016-12-14 02:46:57 UTC
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
Not a big deal, but I think we normally put the 'Cc' above the SoB.

--yliu
Yuanhan Liu
2016-12-14 03:02:23 UTC
Post by Jerin Jacob
* Following macros are derived from linux/pci_regs.h, however,
* we can't simply include that header here, as there is no such
@@ -320,37 +322,37 @@ static const struct virtio_pci_ops legacy_ops = {
static inline uint8_t
io_read8(uint8_t *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_readb(addr);
}
Oh, one more comment: why not replace io_read8 with rte_readb() directly,
and do the same for the others? Then we don't have to define those wrappers.

I think you can also do something similar for the other patches?

--yliu
Post by Jerin Jacob
static inline void
io_write8(uint8_t val, uint8_t *addr)
{
- *(volatile uint8_t *)addr = val;
+ rte_writeb(val, addr);
}
static inline uint16_t
io_read16(uint16_t *addr)
{
- return *(volatile uint16_t *)addr;
+ return rte_readw(addr);
}
static inline void
io_write16(uint16_t val, uint16_t *addr)
{
- *(volatile uint16_t *)addr = val;
+ rte_writew(val, addr);
}
static inline uint32_t
io_read32(uint32_t *addr)
{
- return *(volatile uint32_t *)addr;
+ return rte_readl(addr);
}
static inline void
io_write32(uint32_t val, uint32_t *addr)
{
- *(volatile uint32_t *)addr = val;
+ rte_writel(val, addr);
}
static inline void
--
2.5.5
Santosh Shukla
2016-12-15 05:45:34 UTC
Post by Yuanhan Liu
Post by Jerin Jacob
* Following macros are derived from linux/pci_regs.h, however,
* we can't simply include that header here, as there is no such
@@ -320,37 +322,37 @@ static const struct virtio_pci_ops legacy_ops = {
static inline uint8_t
io_read8(uint8_t *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_readb(addr);
}
Oh, one more comments: why not replacing io_read8 with rte_readb(),
and do similar for others? Then we don't have to define those wrappers.
I think you can also do something similar for other patches?
That makes sense for the virtio-pci case, where the API names io_read/write are as good as
rte_read/write. However, IMO for other drivers, macros such as ADF_CSR_RD/WR
improve code readability compared to plain rte_read/write.

Also, IMO replacing code instances like the one below

static inline void writel(unsigned int val, volatile void __iomem *addr)
{
- *(volatile unsigned int *)addr = val;
+ rte_writel(val, addr);
}

with direct rte_read/write calls is more appropriate. Does that make sense
to you?
If so, I will take care of all such drivers in V2.
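
As a rough illustration of the distinction, the sketch below shows a wrapper
that earns its keep because it hides address arithmetic. Everything in it is
hypothetical: DEMO_CSR_RD/DEMO_CSR_WR, the ring stride and the demo_read32()/
demo_write32() stand-ins are invented for the example and are not the QAT
macros or the EAL accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the EAL accessors, assumed here to be plain volatile
 * loads and stores. */
static inline uint32_t demo_read32(const volatile void *addr)
{
	assert(addr != NULL);
	return *(const volatile uint32_t *)addr;
}

static inline void demo_write32(uint32_t val, volatile void *addr)
{
	assert(addr != NULL);
	*(volatile uint32_t *)addr = val;
}

/* Hypothetical layout: each ring owns a DEMO_RING_STRIDE-byte window. */
#define DEMO_RING_STRIDE 0x10

/* Byte address of register 'off' inside the window of ring 'ring'. */
#define DEMO_CSR_ADDR(base, ring, off) \
	((volatile uint8_t *)(base) + (size_t)(ring) * DEMO_RING_STRIDE + (off))

/* These wrappers add meaning (base + ring + offset), so call sites stay
 * short; a wrapper that only forwards its arguments would not. */
#define DEMO_CSR_RD(base, ring, off) \
	demo_read32(DEMO_CSR_ADDR(base, ring, off))

#define DEMO_CSR_WR(base, ring, off, val) \
	demo_write32((val), DEMO_CSR_ADDR(base, ring, off))
```

A call such as DEMO_CSR_WR(bar, 1, 0x4, v) reads better than spelling out the
pointer arithmetic at every call site, whereas a wrapper like writel() above
only renames the accessor.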

--Santosh.
Post by Yuanhan Liu
--yliu
Post by Jerin Jacob
static inline void
io_write8(uint8_t val, uint8_t *addr)
{
- *(volatile uint8_t *)addr = val;
+ rte_writeb(val, addr);
}
static inline uint16_t
io_read16(uint16_t *addr)
{
- return *(volatile uint16_t *)addr;
+ return rte_readw(addr);
}
static inline void
io_write16(uint16_t val, uint16_t *addr)
{
- *(volatile uint16_t *)addr = val;
+ rte_writew(val, addr);
}
static inline uint32_t
io_read32(uint32_t *addr)
{
- return *(volatile uint32_t *)addr;
+ return rte_readl(addr);
}
static inline void
io_write32(uint32_t val, uint32_t *addr)
{
- *(volatile uint32_t *)addr = val;
+ rte_writel(val, addr);
}
static inline void
--
2.5.5
Yuanhan Liu
2016-12-16 02:12:56 UTC
Post by Santosh Shukla
Post by Yuanhan Liu
Post by Jerin Jacob
* Following macros are derived from linux/pci_regs.h, however,
* we can't simply include that header here, as there is no such
@@ -320,37 +322,37 @@ static const struct virtio_pci_ops legacy_ops = {
static inline uint8_t
io_read8(uint8_t *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_readb(addr);
}
Oh, one more comments: why not replacing io_read8 with rte_readb(),
and do similar for others? Then we don't have to define those wrappers.
I think you can also do something similar for other patches?
Make sense for the virtio-pci case where API name io_read/write as good as
rte_read/write.
Yes, and I think there are a few others like this in your example.
Post by Santosh Shukla
However, IMO for other drivers for example ADF_CSR_RD/WR
improves code readability compared to plain rte_read/write.
Sure, for such cases, we should not replace the macro.

--yliu
Jerin Jacob
2016-12-14 01:55:57 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
CC: Yong Wang <***@vmware.com>
---
drivers/net/vmxnet3/vmxnet3_ethdev.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.h b/drivers/net/vmxnet3/vmxnet3_ethdev.h
index 7d3b11e..5b6501b 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.h
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.h
@@ -34,6 +34,8 @@
#ifndef _VMXNET3_ETHDEV_H_
#define _VMXNET3_ETHDEV_H_

+#include <rte_io.h>
+
#define VMXNET3_MAX_MAC_ADDRS 1

/* UPT feature to negotiate */
@@ -120,7 +122,11 @@ struct vmxnet3_hw {

/* Config space read/writes */

-#define VMXNET3_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define VMXNET3_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})

static inline uint32_t
vmxnet3_read_addr(volatile void *addr)
@@ -128,9 +134,9 @@ vmxnet3_read_addr(volatile void *addr)
return VMXNET3_PCI_REG(addr);
}

-#define VMXNET3_PCI_REG_WRITE(reg, value) do { \
- VMXNET3_PCI_REG((reg)) = (value); \
-} while(0)
+#define VMXNET3_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(value, reg); \
+})

#define VMXNET3_PCI_BAR0_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr0 + (reg)))
--
2.5.5
Yuanhan Liu
2016-12-14 02:55:34 UTC
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
---
drivers/net/vmxnet3/vmxnet3_ethdev.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.h b/drivers/net/vmxnet3/vmxnet3_ethdev.h
index 7d3b11e..5b6501b 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.h
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.h
@@ -34,6 +34,8 @@
#ifndef _VMXNET3_ETHDEV_H_
#define _VMXNET3_ETHDEV_H_
+#include <rte_io.h>
+
#define VMXNET3_MAX_MAC_ADDRS 1
/* UPT feature to negotiate */
@@ -120,7 +122,11 @@ struct vmxnet3_hw {
/* Config space read/writes */
-#define VMXNET3_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define VMXNET3_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})
Why not simply use rte_readl directly?

#define VMXNET3_PCI_REG(reg) rte_readl(reg)
Post by Jerin Jacob
static inline uint32_t
vmxnet3_read_addr(volatile void *addr)
@@ -128,9 +134,9 @@ vmxnet3_read_addr(volatile void *addr)
return VMXNET3_PCI_REG(addr);
}
-#define VMXNET3_PCI_REG_WRITE(reg, value) do { \
- VMXNET3_PCI_REG((reg)) = (value); \
-} while(0)
+#define VMXNET3_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(value, reg); \
+})
I think this could be done in one line.
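
A sketch of the one-line form, assuming the accessor is an ordinary function
that returns the value. demo_read32()/demo_write32() are stand-ins for the EAL
accessors and the DEMO_* macro names are hypothetical. Because the call is
already an expression, no GNU statement expression or temporary is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the EAL accessors (plain volatile load/store here). */
static inline uint32_t demo_read32(const volatile void *addr)
{
	assert(addr != NULL);
	return *(const volatile uint32_t *)addr;
}

static inline void demo_write32(uint32_t val, volatile void *addr)
{
	assert(addr != NULL);
	*(volatile uint32_t *)addr = val;
}

/* One-line forms: each macro expands directly to the accessor call. */
#define DEMO_PCI_REG(reg)              demo_read32((reg))
#define DEMO_PCI_REG_WRITE(reg, value) demo_write32((value), (reg))

/* The read helper keeps working unchanged on top of the macro. */
static inline uint32_t demo_read_addr(volatile void *addr)
{
	return DEMO_PCI_REG(addr);
}
```

One consequence worth noting: unlike the old dereference form, the read macro
is no longer an lvalue, so every write has to go through the write macro.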

--yliu
Santosh Shukla
2016-12-15 05:48:40 UTC
Post by Yuanhan Liu
Post by Jerin Jacob
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
---
drivers/net/vmxnet3/vmxnet3_ethdev.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.h b/drivers/net/vmxnet3/vmxnet3_ethdev.h
index 7d3b11e..5b6501b 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.h
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.h
@@ -34,6 +34,8 @@
#ifndef _VMXNET3_ETHDEV_H_
#define _VMXNET3_ETHDEV_H_
+#include <rte_io.h>
+
#define VMXNET3_MAX_MAC_ADDRS 1
/* UPT feature to negotiate */
@@ -120,7 +122,11 @@ struct vmxnet3_hw {
/* Config space read/writes */
-#define VMXNET3_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define VMXNET3_PCI_REG(reg) ({ \
+ uint32_t __val; \
+ __val = rte_readl(reg); \
+ __val; \
+})
Why not simply using rte_readl directly?
#define VMXNET3_PCI_REG(reg) rte_readl(reg)
Ok.
Post by Yuanhan Liu
Post by Jerin Jacob
static inline uint32_t
vmxnet3_read_addr(volatile void *addr)
@@ -128,9 +134,9 @@ vmxnet3_read_addr(volatile void *addr)
return VMXNET3_PCI_REG(addr);
}
-#define VMXNET3_PCI_REG_WRITE(reg, value) do { \
- VMXNET3_PCI_REG((reg)) = (value); \
-} while(0)
+#define VMXNET3_PCI_REG_WRITE(reg, value) ({ \
+ rte_writel(value, reg); \
+})
I think this could be done in one line.
Ok.
Will take care of it in V2.
Post by Yuanhan Liu
--yliu
Jerin Jacob
2016-12-14 01:55:58 UTC
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/thunderx/base/nicvf_plat.h | 45 ++++++++++------------------------
1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/drivers/net/thunderx/base/nicvf_plat.h b/drivers/net/thunderx/base/nicvf_plat.h
index 83c1844..25eeb7e 100644
--- a/drivers/net/thunderx/base/nicvf_plat.h
+++ b/drivers/net/thunderx/base/nicvf_plat.h
@@ -69,31 +69,24 @@
#include <rte_ether.h>
#define NICVF_MAC_ADDR_SIZE ETHER_ADDR_LEN

+#include <rte_io.h>
+static inline void __attribute__((always_inline))
+nicvf_addr_write(uintptr_t addr, uint64_t val)
+{
+ rte_writeq_relaxed(val, (void *)addr);
+}
+
+static inline uint64_t __attribute__((always_inline))
+nicvf_addr_read(uintptr_t addr)
+{
+ return rte_readq_relaxed((void *)addr);
+}
+
/* ARM64 specific functions */
#if defined(RTE_ARCH_ARM64)
#define nicvf_prefetch_store_keep(_ptr) ({\
asm volatile("prfm pstl1keep, %a0\n" : : "p" (_ptr)); })

-static inline void __attribute__((always_inline))
-nicvf_addr_write(uintptr_t addr, uint64_t val)
-{
- asm volatile(
- "str %x[val], [%x[addr]]"
- :
- : [val] "r" (val), [addr] "r" (addr));
-}
-
-static inline uint64_t __attribute__((always_inline))
-nicvf_addr_read(uintptr_t addr)
-{
- uint64_t val;
-
- asm volatile(
- "ldr %x[val], [%x[addr]]"
- : [val] "=r" (val)
- : [addr] "r" (addr));
- return val;
-}

#define NICVF_LOAD_PAIR(reg1, reg2, addr) ({ \
asm volatile( \
@@ -106,18 +99,6 @@ nicvf_addr_read(uintptr_t addr)

#define nicvf_prefetch_store_keep(_ptr) do {} while (0)

-static inline void __attribute__((always_inline))
-nicvf_addr_write(uintptr_t addr, uint64_t val)
-{
- *(volatile uint64_t *)addr = val;
-}
-
-static inline uint64_t __attribute__((always_inline))
-nicvf_addr_read(uintptr_t addr)
-{
- return *(volatile uint64_t *)addr;
-}
-
#define NICVF_LOAD_PAIR(reg1, reg2, addr) \
do { \
reg1 = nicvf_addr_read((uintptr_t)addr); \
--
2.5.5
Yuanhan Liu
2016-12-14 02:53:57 UTC
Post by Jerin Jacob
patchset 14-28: Replace the raw readl/writel in the drivers with
new rte_read[b/w/l/q], rte_write[b/w/l/q] eal abstraction
Instead of rte_read[b/w/l/q], there is another typical naming style:
rte_read[8/16/32/64]. Any preferences? If you ask me, I'd prefer the
latter.

--yliu
Bruce Richardson
2016-12-14 10:12:44 UTC
Post by Yuanhan Liu
Post by Jerin Jacob
patchset 14-28: Replace the raw readl/writel in the drivers with
new rte_read[b/w/l/q], rte_write[b/w/l/q] eal abstraction
rte_read[8/16/32/64]. Any preferences? If you ask me, I'd prefer the
later.
I think I prefer the latter too, as it aligns with our naming of atomic
functions and our use of uint16_t etc. types.

/Bruce
Jerin Jacob
2016-12-14 13:18:36 UTC
Post by Yuanhan Liu
Post by Jerin Jacob
patchset 14-28: Replace the raw readl/writel in the drivers with
new rte_read[b/w/l/q], rte_write[b/w/l/q] eal abstraction
rte_read[8/16/32/64]. Any preferences? If you ask me, I'd prefer the
later.
No strong opinion here. The rte_read[b/w/l/q] naming style comes from the Linux
kernel. I will change to rte_read[8/16/32/64] in v2 if there is no
objection.
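
For illustration, the numeric-width naming could look like the sketch below.
The demo_* functions are stand-ins modelled on the naming discussed here, not
the EAL definitions, and the relaxed accessors are assumed to be plain
volatile loads and stores. DEMO_DEFINE_IO is a hypothetical helper macro used
only to avoid writing the same body four times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generates demo_read<bits>_relaxed() and demo_write<bits>_relaxed().
 * The suffix is the access width in bits, matching the uintN_t types. */
#define DEMO_DEFINE_IO(bits)                                             \
static inline uint##bits##_t                                             \
demo_read##bits##_relaxed(const volatile void *addr)                     \
{                                                                        \
	assert(addr != NULL);                                            \
	return *(const volatile uint##bits##_t *)addr;                   \
}                                                                        \
static inline void                                                       \
demo_write##bits##_relaxed(uint##bits##_t val, volatile void *addr)      \
{                                                                        \
	assert(addr != NULL);                                            \
	*(volatile uint##bits##_t *)addr = val;                          \
}

DEMO_DEFINE_IO(8)
DEMO_DEFINE_IO(16)
DEMO_DEFINE_IO(32)
DEMO_DEFINE_IO(64)
```

With this scheme the name states the width directly, so demo_read16_relaxed()
pairs with uint16_t without having to remember that "w" means 16 bits.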
Thomas Monjalon
2016-12-16 17:04:29 UTC
Post by Jerin Jacob
Post by Yuanhan Liu
Post by Jerin Jacob
patchset 14-28: Replace the raw readl/writel in the drivers with
new rte_read[b/w/l/q], rte_write[b/w/l/q] eal abstraction
rte_read[8/16/32/64]. Any preferences? If you ask me, I'd prefer the
later.
No strong opinion here. The rte_read[b/w/l/q] naming style is from Linux
kernel. I will change to rte_read[8/16/32/64] in v2 if there is no
objection.
Yes please. Let's use numbers where possible.

See this recent explanation:
http://www.mail-archive.com/linux-***@vger.kernel.org/msg1293045.html
Jerin Jacob
2016-12-27 09:49:06 UTC
v1..v2:
1) Changed rte_[read/write]b/w/l/q_[relaxed] to rte_[read/write]8/16/32/64_[relaxed] (Yuanhan)
2) Changed rte_?mb to macros for arm64 (Jianbo)
3) rte_wmb() followed by rte_write* changed to rte_wmb() followed by the relaxed version (rte_write_relaxed)
in the _fast_ path to avoid an extra memory barrier for arm64 (Jianbo)
4) Replaced virtio io_read*/io_write* with rte_read*/rte_write* (Yuanhan)

Based on the discussion in the below-mentioned thread,
http://dev.dpdk.narkive.com/DpIRqDuy/dpdk-dev-patch-v2-i40e-fix-eth-i40e-dev-init-sequence-on-thunderx

This patchset introduces 8-bit, 16-bit, 32-bit and 64-bit I/O device
memory read/write operations along with their relaxed versions.

Weakly ordered machines like ARM need an additional I/O barrier for
device memory read/write access over the PCI bus.
By introducing the EAL abstraction for I/O device memory read/write access,
drivers can access I/O device memory in an architecture-agnostic manner.

The relaxed versions do not have the additional I/O memory barrier, which is
useful when accessing the device registers of integrated controllers that are
implicitly strongly ordered with respect to memory access.
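
A minimal sketch of how an ordered accessor could be layered on top of a
relaxed one. All demo_* names are hypothetical. The barriers are stand-ins
built from C11 fences, which only order ordinary memory between threads and
are NOT sufficient for real device memory; the actual EAL barriers are
architecture specific.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in I/O barriers (assumption: C11 fences, for illustration only). */
static inline void demo_io_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static inline void demo_io_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Relaxed accessors: plain volatile load/store, no barrier. */
static inline uint32_t demo_read32_relaxed(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

static inline void demo_write32_relaxed(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Ordered read: load the register, then keep later loads from being
 * reordered ahead of it. */
static inline uint32_t demo_read32(const volatile void *addr)
{
	uint32_t val;

	assert(addr != NULL);
	val = demo_read32_relaxed(addr);
	demo_io_rmb();
	return val;
}

/* Ordered write: make earlier stores visible first, then store to the
 * register. */
static inline void demo_write32(uint32_t val, volatile void *addr)
{
	assert(addr != NULL);
	demo_io_wmb();
	demo_write32_relaxed(val, addr);
}
```

The relaxed pair stays available for call sites that already issue their own
barrier or that talk to a strongly ordered integrated controller.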

This patchset is split into three functional sets:

patchset 1-9: Introduce the I/O device memory barrier EAL abstraction and
implement it for all the architectures.

patchset 10-13: Introduce the I/O device memory read/write operations EAL abstraction
and implement it for all the architectures using the previous I/O device memory
barrier.

patchset 14-28: Replace the raw readl/writel in the drivers with
new rte_read[8/16/32/64], rte_write[8/16/32/64] eal abstraction

Note:

1) We couldn't test the patch on all the hardware due to unavailability.
We would appreciate feedback from ARCH and PMD maintainers.

2) patch 13/28 has a false positive checkpatch error with ASM syntax

ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#92: FILE: lib/librte_eal/common/include/arch/arm/rte_io_64.h:54:
+ : [val] "=r" (val)

Jerin Jacob (15):
eal: introduce I/O device memory barriers
eal/x86: define I/O device memory barriers for IA
eal/tile: define I/O device memory barriers for tile
eal/ppc64: define I/O device memory barriers for ppc64
eal/arm: separate smp barrier definition for ARMv7 and ARMv8
eal/armv7: define I/O device memory barriers for ARMv7
eal/arm64: fix memory barrier definition for arm64
eal/arm64: define smp barrier definition for arm64
eal/arm64: define I/O device memory barriers for arm64
eal: introduce I/O device memory read/write operations
eal: generic implementation for I/O device read/write access
eal: let all architectures use generic I/O implementation
eal/arm64: override I/O device read/write access for arm64
eal/arm64: change barrier definitions to macros
net/thunderx: use eal I/O device memory read/write API

Santosh Shukla (14):
crypto/qat: use eal I/O device memory read/write API
net/bnxt: use eal I/O device memory read/write API
net/bnx2x: use eal I/O device memory read/write API
net/cxgbe: use eal I/O device memory read/write API
net/e1000: use eal I/O device memory read/write API
net/ena: use eal I/O device memory read/write API
net/enic: use eal I/O device memory read/write API
net/fm10k: use eal I/O device memory read/write API
net/i40e: use eal I/O device memory read/write API
net/ixgbe: use eal I/O device memory read/write API
net/nfp: use eal I/O device memory read/write API
net/qede: use eal I/O device memory read/write API
net/virtio: use eal I/O device memory read/write API
net/vmxnet3: use eal I/O device memory read/write API

doc/api/doxy-api-index.md | 3 +-
.../qat/qat_adf/adf_transport_access_macros.h | 11 +-
drivers/net/bnx2x/bnx2x.h | 26 +-
drivers/net/bnxt/bnxt_cpr.h | 13 +-
drivers/net/bnxt/bnxt_hwrm.c | 7 +-
drivers/net/bnxt/bnxt_txr.h | 6 +-
drivers/net/cxgbe/base/adapter.h | 34 ++-
drivers/net/cxgbe/cxgbe_compat.h | 8 +-
drivers/net/cxgbe/sge.c | 10 +-
drivers/net/e1000/base/e1000_osdep.h | 18 +-
drivers/net/e1000/em_rxtx.c | 2 +-
drivers/net/e1000/igb_rxtx.c | 2 +-
drivers/net/ena/base/ena_eth_com.h | 2 +-
drivers/net/ena/base/ena_plat_dpdk.h | 11 +-
drivers/net/enic/enic_compat.h | 27 +-
drivers/net/enic/enic_rxtx.c | 9 +-
drivers/net/fm10k/base/fm10k_osdep.h | 17 +-
drivers/net/i40e/base/i40e_osdep.h | 10 +-
drivers/net/i40e/i40e_rxtx.c | 4 +-
drivers/net/ixgbe/base/ixgbe_osdep.h | 11 +-
drivers/net/ixgbe/ixgbe_rxtx.c | 13 +-
drivers/net/nfp/nfp_net_pmd.h | 9 +-
drivers/net/qede/base/bcm_osal.h | 20 +-
drivers/net/qede/base/ecore_int_api.h | 28 +-
drivers/net/qede/base/ecore_spq.c | 3 +-
drivers/net/qede/qede_rxtx.c | 2 +-
drivers/net/thunderx/base/nicvf_plat.h | 36 +--
drivers/net/virtio/virtio_pci.c | 97 ++-----
drivers/net/vmxnet3/vmxnet3_ethdev.h | 8 +-
lib/librte_eal/common/Makefile | 3 +-
.../common/include/arch/arm/rte_atomic.h | 6 -
.../common/include/arch/arm/rte_atomic_32.h | 12 +
.../common/include/arch/arm/rte_atomic_64.h | 57 ++--
lib/librte_eal/common/include/arch/arm/rte_io.h | 51 ++++
lib/librte_eal/common/include/arch/arm/rte_io_64.h | 159 +++++++++++
.../common/include/arch/ppc_64/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/ppc_64/rte_io.h | 47 +++
.../common/include/arch/tile/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/tile/rte_io.h | 47 +++
.../common/include/arch/x86/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/x86/rte_io.h | 47 +++
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++
lib/librte_eal/common/include/generic/rte_io.h | 317 +++++++++++++++++++++
43 files changed, 980 insertions(+), 258 deletions(-)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io_64.h
create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/tile/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_io.h
create mode 100644 lib/librte_eal/common/include/generic/rte_io.h
--
2.5.5
Jerin Jacob
2016-12-27 09:49:07 UTC
This commit introduces rte_io_mb(), rte_io_wmb() and rte_io_rmb(), in
order to enable memory barriers between the I/O device and the CPU.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
index 43a704e..7b81705 100644
--- a/lib/librte_eal/common/include/generic/rte_atomic.h
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -100,6 +100,33 @@ static inline void rte_smp_wmb(void);
*/
static inline void rte_smp_rmb(void);

+/**
+ * General memory barrier for I/O device
+ *
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_io_mb() call are visible to I/O device or CPU before the
+ * LOAD and STORE operations that follow it.
+ */
+static inline void rte_io_mb(void);
+
+/**
+ * Write memory barrier for I/O device
+ *
+ * Guarantees that the STORE operations that precede the
+ * rte_io_wmb() call are visible to I/O device before the STORE
+ * operations that follow it.
+ */
+static inline void rte_io_wmb(void);
+
+/**
+ * Read memory barrier for IO device
+ *
+ * Guarantees that the LOAD operations on I/O device that precede the
+ * rte_io_rmb() call are visible to CPU before the LOAD
+ * operations that follow it.
+ */
+static inline void rte_io_rmb(void);
+
#endif /* __DOXYGEN__ */

/**
--
2.5.5
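
For illustration, here is a minimal sketch of where a read barrier of this kind might sit in driver code. It is not part of the patch: sketch_io_rmb(), SKETCH_STATUS_DONE and sketch_poll_completion() are hypothetical names, and the per-architecture mapping only mirrors what the later patches in this series describe (compiler barrier on x86, dsb on ARMv8), with a full fence assumed elsewhere.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_io_rmb(); the real definition lives in the
 * per-architecture rte_atomic.h headers touched by this series. */
#if defined(__aarch64__)
#define sketch_io_rmb() __asm__ volatile("dsb ld" : : : "memory")
#elif defined(__x86_64__) || defined(__i386__)
#define sketch_io_rmb() __asm__ volatile("" : : : "memory")
#else
#define sketch_io_rmb() __sync_synchronize() /* assumed conservative fallback */
#endif

#define SKETCH_STATUS_DONE 0x1u /* hypothetical "completion ready" bit */

/*
 * Poll a device status register and, only if the device reports completion,
 * read the payload it produced. The barrier keeps the payload load from
 * being satisfied before the status load.
 * Returns 1 and fills *out when a completion was consumed, 0 otherwise.
 */
static inline int
sketch_poll_completion(const volatile uint32_t *status_reg,
		       const uint32_t *payload, uint32_t *out)
{
	if ((*status_reg & SKETCH_STATUS_DONE) == 0)
		return 0;

	sketch_io_rmb(); /* order the status load before the payload load */
	*out = *payload;
	return 1;
}
```

Without the barrier, a weakly-ordered CPU could in principle return a stale payload even though the status load observed the done bit.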
Jerin Jacob
2016-12-27 09:49:08 UTC
The patch does not provide any functional change for IA.
I/O barriers are mapped to existing smp barriers.

CC: Bruce Richardson <***@intel.com>
CC: Konstantin Ananyev <***@intel.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/x86/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index 00b1cdf..4eac666 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -61,6 +61,12 @@ extern "C" {

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
/*------------------------- 16 bit atomic operations -------------------------*/

#ifndef RTE_FORCE_INTRINSICS
--
2.5.5
Jerin Jacob
2016-12-27 09:49:09 UTC
The patch does not provide any functional change for tile.
I/O barriers are mapped to existing smp barriers.

CC: Zhigang Lu <***@ezchip.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/tile/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/tile/rte_atomic.h b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
index 28825ff..1f332ee 100644
--- a/lib/librte_eal/common/include/arch/tile/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
@@ -85,6 +85,12 @@ static inline void rte_rmb(void)

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-27 09:49:10 UTC
The patch does not provide any functional change for ppc_64.
I/O barriers are mapped to existing smp barriers.

CC: Chao Zhu <***@linux.vnet.ibm.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
index fb4fccb..150810c 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -87,6 +87,12 @@ extern "C" {

#define rte_smp_rmb() rte_rmb()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
/*------------------------- 16 bit atomic operations -------------------------*/
/* To be compatible with Power7, use GCC built-in functions for 16 bit
* operations */
--
2.5.5
Jerin Jacob
2016-12-27 09:49:11 UTC
Separate the smp barrier definitions for arm and arm64 to allow fine
control over the smp barrier definitions of each architecture.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic.h | 6 ------
lib/librte_eal/common/include/arch/arm/rte_atomic_32.h | 6 ++++++
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
index 454a12b..f3f3b6e 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -39,10 +39,4 @@
#include <rte_atomic_32.h>
#endif

-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#endif /* _RTE_ATOMIC_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
index 9ae1e78..dd627a0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
@@ -67,6 +67,12 @@ extern "C" {
*/
#define rte_rmb() __sync_synchronize()

+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_wmb()
+
+#define rte_smp_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 671caa7..d854aac 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -81,6 +81,12 @@ static inline void rte_rmb(void)
dmb(ishld);
}

+#define rte_smp_mb() rte_mb()
+
+#define rte_smp_wmb() rte_wmb()
+
+#define rte_smp_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-27 09:49:12 UTC
The patch does not provide any functional change for ARMv7.
I/O barriers are mapped to existing smp barriers.

CC: Jan Viktorin <***@rehivetech.com>
CC: Jianbo Liu <***@linaro.org>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_32.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
index dd627a0..14c0486 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_32.h
@@ -73,6 +73,12 @@ extern "C" {

#define rte_smp_rmb() rte_rmb()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2016-12-27 09:49:13 UTC
A dsb instruction based barrier is used for the non-smp
version of each memory barrier.

Fixes: d708f01b7102 ("eal/arm: add atomic operations for ARMv8")

CC: Jianbo Liu <***@linaro.org>
CC: ***@dpdk.org
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index d854aac..bc7de64 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -43,7 +43,8 @@ extern "C" {

#include "generic/rte_atomic.h"

-#define dmb(opt) do { asm volatile("dmb " #opt : : : "memory"); } while (0)
+#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
+#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }

/**
* General memory barrier.
@@ -54,7 +55,7 @@ extern "C" {
*/
static inline void rte_mb(void)
{
- dmb(ish);
+ dsb(sy);
}

/**
@@ -66,7 +67,7 @@ static inline void rte_mb(void)
*/
static inline void rte_wmb(void)
{
- dmb(ishst);
+ dsb(st);
}

/**
@@ -78,7 +79,7 @@ static inline void rte_wmb(void)
*/
static inline void rte_rmb(void)
{
- dmb(ishld);
+ dsb(ld);
}

#define rte_smp_mb() rte_mb()
--
2.5.5
Jianbo Liu
2017-01-03 07:40:25 UTC
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
A dsb instruction based barrier is used for the non-smp
version of each memory barrier.
Fixes: d708f01b7102 ("eal/arm: add atomic operations for ARMv8")
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index d854aac..bc7de64 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -43,7 +43,8 @@ extern "C" {
#include "generic/rte_atomic.h"
-#define dmb(opt) do { asm volatile("dmb " #opt : : : "memory"); } while (0)
+#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
+#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }
/**
* General memory barrier.
@@ -54,7 +55,7 @@ extern "C" {
*/
static inline void rte_mb(void)
{
- dmb(ish);
+ dsb(sy);
}
/**
@@ -66,7 +67,7 @@ static inline void rte_mb(void)
*/
static inline void rte_wmb(void)
{
- dmb(ishst);
+ dsb(st);
}
/**
@@ -78,7 +79,7 @@ static inline void rte_wmb(void)
*/
static inline void rte_rmb(void)
{
- dmb(ishld);
+ dsb(ld);
}
#define rte_smp_mb() rte_mb()
--
2.5.5
Acked-by: Jianbo Liu <***@linaro.org>
Jerin Jacob
2016-12-27 09:49:14 UTC
A dmb instruction based barrier is used for the smp version of each memory barrier.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index bc7de64..78ebea2 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -82,11 +82,11 @@ static inline void rte_rmb(void)
dsb(ld);
}

-#define rte_smp_mb() rte_mb()
+#define rte_smp_mb() dmb(ish)

-#define rte_smp_wmb() rte_wmb()
+#define rte_smp_wmb() dmb(ishst)

-#define rte_smp_rmb() rte_rmb()
+#define rte_smp_rmb() dmb(ishld)

#ifdef __cplusplus
}
--
2.5.5
Jerin Jacob
2016-12-27 09:49:15 UTC
CC: Jianbo Liu <***@linaro.org>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)

#define rte_smp_rmb() dmb(ishld)

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jianbo Liu
2017-01-03 07:48:32 UTC
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)
#define rte_smp_rmb() dmb(ishld)
+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
Jerin Jacob
2017-01-04 10:01:05 UTC
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)
#define rte_smp_rmb() dmb(ishld)
+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
It is difficult to generalize. AFAIK, from the I/O barrier perspective,
dsb would be the right candidate. But for just the DMA barrier between I/O,
maybe an outer shareable dmb is enough. In terms of performance implications,
the fastpath code (doorbell write) has been changed to a relaxed write in all
the drivers in this patchset, and rte_io_* will only be
used by rte_[read/write]8/16/32/64, which will be in the slow path.
So, IMO, it is better to stick with dsb; it is safe from the complete I/O
barrier perspective.

At least on ThunderX, I couldn't see any performance difference between
using dsb(st) and dmb(oshst) for the DMA write barrier before the doorbell
register write in the fastpath. In case there are platforms which have such a
performance difference, maybe we could add rte_dma_wmb() and rte_dma_rmb() in
the future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
see all the drivers using them, though.)

Jerin
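
To make the two options in this exchange concrete, here is a rough sketch of a doorbell write preceded by a lighter DMA write barrier. Nothing here is defined by the patchset: sketch_dma_wmb(), struct sketch_desc and sketch_post_desc() are hypothetical, the dmb(oshst) mapping is only the alternative being discussed, and the non-ARMv8 branch is an assumed conservative fallback.

```c
#include <stdint.h>

/* Hypothetical DMA write barrier, along the lines of the rte_dma_wmb() idea
 * floated in this thread. Replacing "dmb oshst" with "dsb st" gives the
 * stronger barrier that rte_io_wmb() maps to on ARMv8 in this series. */
#if defined(__aarch64__)
#define sketch_dma_wmb() __asm__ volatile("dmb oshst" : : : "memory")
#else
#define sketch_dma_wmb() __sync_synchronize() /* assumed fallback */
#endif

/* Hypothetical descriptor layout, for the sketch only. */
struct sketch_desc {
	uint64_t buf_addr;
	uint32_t len;
};

/*
 * Fill one descriptor in memory shared with the device, then notify the
 * device through its doorbell register. The barrier makes the descriptor
 * stores visible before the doorbell store, so the device does not fetch
 * a half-written descriptor.
 */
static inline void
sketch_post_desc(struct sketch_desc *ring, uint32_t slot, uint64_t buf_addr,
		 uint32_t len, volatile uint32_t *doorbell)
{
	ring[slot].buf_addr = buf_addr;
	ring[slot].len = len;

	sketch_dma_wmb();     /* descriptor stores before the doorbell store */
	*doorbell = slot + 1; /* plain store, like a relaxed register write */
}
```

Whether the dmb variant is measurably cheaper than dsb depends on the platform; per the message above, no difference was seen on ThunderX.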
Jianbo Liu
2017-01-05 05:31:44 UTC
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)
#define rte_smp_rmb() dmb(ishld)
+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
It is difficult to generalize. AFAIK, from the I/O barrier perspective,
dsb would be the right candidate. But for just the DMA barrier between I/O,
maybe an outer shareable dmb is enough. In terms of performance implications,
the fastpath code (doorbell write) has been changed to a relaxed write in all
the drivers in this patchset, and rte_io_* will only be
used by rte_[read/write]8/16/32/64, which will be in the slow path.
So, IMO, it is better to stick with dsb; it is safe from the complete I/O
barrier perspective.
If so, why not use *mb() directly?
Post by Jerin Jacob
At least on ThunderX, I couldn't see any performance difference between
using dsb(st) and dmb(oshst) for the DMA write barrier before the doorbell
register write in the fastpath. In case there are platforms which have such a
performance difference, maybe we could add rte_dma_wmb() and rte_dma_rmb() in
the future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
see all the drivers using them, though.)
But there is no io_*mb() in the kernel, so you want to be different?
Jerin Jacob
2017-01-05 06:24:31 UTC
Post by Jianbo Liu
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)
#define rte_smp_rmb() dmb(ishld)
+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
It is difficult to generalize. AFAIK, from the I/O barrier perspective,
dsb would be the right candidate. But for just the DMA barrier between I/O,
maybe an outer shareable dmb is enough. In terms of performance implications,
the fastpath code (doorbell write) has been changed to a relaxed write in all
the drivers in this patchset, and rte_io_* will only be
used by rte_[read/write]8/16/32/64, which will be in the slow path.
So, IMO, it is better to stick with dsb; it is safe from the complete I/O
barrier perspective.
If so, why not use *mb() directly?
Adding David Marchand, EAL Maintainer.

Instead of rte_io_*? I thought I/O-specific constraints could be abstracted
here in rte_io_*. Apart from arm, other architectures like "arc" have similar
constraints. IMHO, there is no harm in keeping that abstraction.

Thoughts ?

http://lxr.free-electrons.com/ident?i=__iormb
Post by Jianbo Liu
Post by Jerin Jacob
At least on ThunderX, I couldn't see any performance difference between
using dsb(st) and dmb(oshst) for the DMA write barrier before the doorbell
register write in the fastpath. In case there are platforms which have such a
performance difference, maybe we could add rte_dma_wmb() and rte_dma_rmb() in
the future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
see all the drivers using them, though.)
But there is no io_*mb() in the kernel, so you want to be different?
It is there for the arm, arm64 and arc architectures in the Linux kernel. Please
check the writel implementation for arm64:

http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L143
Jianbo Liu
2017-01-05 06:47:27 UTC
Post by Jerin Jacob
Post by Jianbo Liu
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 78ebea2..ef0efc7 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -88,6 +88,12 @@ static inline void rte_rmb(void)
#define rte_smp_rmb() dmb(ishld)
+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
It is difficult to generalize. AFAIK, from the I/O barrier perspective,
dsb would be the right candidate. But for just the DMA barrier between I/O,
maybe an outer shareable dmb is enough. In terms of performance implications,
the fastpath code (doorbell write) has been changed to a relaxed write in all
the drivers in this patchset, and rte_io_* will only be
used by rte_[read/write]8/16/32/64, which will be in the slow path.
So, IMO, it is better to stick with dsb; it is safe from the complete I/O
barrier perspective.
If so, why not use *mb() directly?
Adding David Marchand, EAL Maintainer.
Instead of rte_io_*? I thought I/O-specific constraints could be abstracted
here in rte_io_*. Apart from arm, other architectures like "arc" have similar
constraints. IMHO, there is no harm in keeping that abstraction.
Thoughts ?
http://lxr.free-electrons.com/ident?i=__iormb
Post by Jianbo Liu
Post by Jerin Jacob
At least on ThunderX, I couldn't see any performance difference between
using dsb(st) and dmb(oshst) for the DMA write barrier before the doorbell
register write in the fastpath. In case there are platforms which have such a
performance difference, maybe we could add rte_dma_wmb() and rte_dma_rmb() in
the future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
see all the drivers using them, though.)
But there is no io_*mb() in the kernel, so you want to be different?
It is there for the arm, arm64 and arc architectures in the Linux kernel. Please
check the writel implementation for arm64:
http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L143
Yes, I know. But I'm afraid someone else will mix it up with dma_*mb.
Jerin Jacob
2017-01-05 07:22:46 UTC
Post by Jianbo Liu
Post by Jerin Jacob
Post by Jianbo Liu
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
I think it's better to use outer shareable dmb for io barrier, instead of dsb.
It is difficult to generalize. AFAIK, from the I/O barrier perspective,
dsb would be the right candidate. But for just the DMA barrier between I/O,
maybe an outer shareable dmb is enough. In terms of performance implications,
the fastpath code (doorbell write) has been changed to a relaxed write in all
the drivers in this patchset, and rte_io_* will only be
used by rte_[read/write]8/16/32/64, which will be in the slow path.
So, IMO, it is better to stick with dsb; it is safe from the complete I/O
barrier perspective.
If so, why not use *mb() directly?
Adding David Marchand, EAL Maintainer.
Instead of rte_io_*? I thought I/O-specific constraints could be abstracted
here in rte_io_*. Apart from arm, other architectures like "arc" have similar
constraints. IMHO, there is no harm in keeping that abstraction.
Thoughts ?
http://lxr.free-electrons.com/ident?i=__iormb
Post by Jianbo Liu
Post by Jerin Jacob
At least on ThunderX, I couldn't see any performance difference between
using dsb(st) and dmb(oshst) for the DMA write barrier before the doorbell
register write in the fastpath. In case there are platforms which have such a
performance difference, maybe we could add rte_dma_wmb() and rte_dma_rmb() in
the future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
see all the drivers using them, though.)
But there is no io_*mb() in the kernel, so you want to be different?
It is there for the arm, arm64 and arc architectures in the Linux kernel. Please
check the writel implementation for arm64:
http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L143
Yes, I know. But I'm afraid someone else will mix it up with dma_*mb.
OK, got it. To me, the two are totally different.
Feel free to introduce additional dma_*mb() barriers then, if you think they
solve any problem in DPDK. I am not seeing any performance difference between
dsb sy and the outer shareable dmb. If you have any platform that shows a
performance difference, then I _think_ you can introduce dma_*mb().
Jerin Jacob
2016-12-27 09:49:16 UTC
This commit introduces 8-bit, 16-bit, 32-bit and 64-bit I/O device
memory read/write operations along with their relaxed versions.

Weakly-ordered machines like ARM need an additional I/O barrier for
device memory read/write access over the PCI bus.
By introducing the eal abstraction for I/O device memory read/write access,
drivers can access I/O device memory in an architecture-agnostic manner.

The relaxed versions do not have the additional I/O memory barrier, which is
useful when accessing the device registers of integrated controllers that are
implicitly strongly ordered with respect to memory access.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
doc/api/doxy-api-index.md | 3 +-
lib/librte_eal/common/Makefile | 3 +-
lib/librte_eal/common/include/generic/rte_io.h | 263 +++++++++++++++++++++++++
3 files changed, 267 insertions(+), 2 deletions(-)
create mode 100644 lib/librte_eal/common/include/generic/rte_io.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 99a1b7a..47a3580 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -70,7 +70,8 @@ There are many libraries, so their headers may be grouped by topics:
[branch prediction] (@ref rte_branch_prediction.h),
[cache prefetch] (@ref rte_prefetch.h),
[byte order] (@ref rte_byteorder.h),
- [CPU flags] (@ref rte_cpuflags.h)
+ [CPU flags] (@ref rte_cpuflags.h),
+ [I/O access] (@ref rte_io.h)

- **CPU multicore**:
[interrupts] (@ref rte_interrupts.h),
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index a92c984..6498c15 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -43,7 +43,8 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
INC += rte_malloc.h rte_keepalive.h rte_time.h

GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
-GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
+GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h rte_io.h
+
# defined in mk/arch/$(RTE_ARCH)/rte.vars.mk
ARCH_DIR ?= $(RTE_ARCH)
ARCH_INC := $(notdir $(wildcard $(RTE_SDK)/lib/librte_eal/common/include/arch/$(ARCH_DIR)/*.h))
diff --git a/lib/librte_eal/common/include/generic/rte_io.h b/lib/librte_eal/common/include/generic/rte_io.h
new file mode 100644
index 0000000..edfebf8
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_io.h
@@ -0,0 +1,263 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_H_
+#define _RTE_IO_H_
+
+/**
+ * @file
+ * I/O device memory operations
+ *
+ * This file defines the generic API for I/O device memory read/write operations
+ */
+
+#include <stdint.h>
+#include <rte_common.h>
+#include <rte_atomic.h>
+
+#ifdef __DOXYGEN__
+
+/**
+ * Read a 8-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint8_t
+rte_read8_relaxed(const volatile void *addr);
+
+/**
+ * Read a 16-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint16_t
+rte_read16_relaxed(const volatile void *addr);
+
+/**
+ * Read a 32-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint32_t
+rte_read32_relaxed(const volatile void *addr);
+
+/**
+ * Read a 64-bit value from I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint64_t
+rte_read64_relaxed(const volatile void *addr);
+
+/**
+ * Write a 8-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+
+static inline void
+rte_write8_relaxed(uint8_t value, volatile void *addr);
+
+/**
+ * Write a 16-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write16_relaxed(uint16_t value, volatile void *addr);
+
+/**
+ * Write a 32-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write32_relaxed(uint32_t value, volatile void *addr);
+
+/**
+ * Write a 64-bit value to I/O device memory address *addr*.
+ *
+ * The relaxed version does not have additional I/O memory barrier, useful in
+ * accessing the device registers of integrated controllers which implicitly
+ * strongly ordered with respect to memory access.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write64_relaxed(uint64_t value, volatile void *addr);
+
+/**
+ * Read a 8-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint8_t
+rte_read8(const volatile void *addr);
+
+/**
+ * Read a 16-bit value from I/O device memory address *addr*.
+ *
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint16_t
+rte_read16(const volatile void *addr);
+
+/**
+ * Read a 32-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint32_t
+rte_read32(const volatile void *addr);
+
+/**
+ * Read a 64-bit value from I/O device memory address *addr*.
+ *
+ * @param addr
+ * I/O memory address to read the value from
+ * @return
+ * read value
+ */
+static inline uint64_t
+rte_read64(const volatile void *addr);
+
+/**
+ * Write a 8-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+
+static inline void
+rte_write8(uint8_t value, volatile void *addr);
+
+/**
+ * Write a 16-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write16(uint16_t value, volatile void *addr);
+
+/**
+ * Write a 32-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write32(uint32_t value, volatile void *addr);
+
+/**
+ * Write a 64-bit value to I/O device memory address *addr*.
+ *
+ * @param value
+ * Value to write
+ * @param addr
+ * I/O memory address to write the value to
+ */
+static inline void
+rte_write64(uint64_t value, volatile void *addr);
+
+#endif /* __DOXYGEN__ */
+
+#endif /* _RTE_IO_H_ */
--
2.5.5
Jerin Jacob
2016-12-27 09:49:17 UTC
This patch implements the generic versions of rte_read[8/16/32/64][_relaxed]
and rte_write[8/16/32/64][_relaxed] using rte_io_rmb() and rte_io_wmb().

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/generic/rte_io.h | 54 ++++++++++++++++++++++++++
1 file changed, 54 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_io.h b/lib/librte_eal/common/include/generic/rte_io.h
index edfebf8..342bfec 100644
--- a/lib/librte_eal/common/include/generic/rte_io.h
+++ b/lib/librte_eal/common/include/generic/rte_io.h
@@ -34,6 +34,8 @@
#ifndef _RTE_IO_H_
#define _RTE_IO_H_

+#include <rte_atomic.h>
+
/**
* @file
* I/O device memory operations
@@ -260,4 +262,56 @@ rte_write64(uint64_t value, volatile void *addr);

#endif /* __DOXYGEN__ */

+#ifndef RTE_OVERRIDE_IO_H
+
+#define rte_read8_relaxed(addr) \
+ ({ uint8_t __v = *(const volatile uint8_t *)addr; __v; })
+
+#define rte_read16_relaxed(addr) \
+ ({ uint16_t __v = *(const volatile uint16_t *)addr; __v; })
+
+#define rte_read32_relaxed(addr) \
+ ({ uint32_t __v = *(const volatile uint32_t *)addr; __v; })
+
+#define rte_read64_relaxed(addr) \
+ ({ uint64_t __v = *(const volatile uint64_t *)addr; __v; })
+
+#define rte_write8_relaxed(value, addr) \
+ ({ *(volatile uint8_t *)addr = value; })
+
+#define rte_write16_relaxed(value, addr) \
+ ({ *(volatile uint16_t *)addr = value; })
+
+#define rte_write32_relaxed(value, addr) \
+ ({ *(volatile uint32_t *)addr = value; })
+
+#define rte_write64_relaxed(value, addr) \
+ ({ *(volatile uint64_t *)addr = value; })
+
+#define rte_read8(addr) \
+ ({ uint8_t __v = *(const volatile uint8_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_read16(addr) \
+ ({uint16_t __v = *(const volatile uint16_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_read32(addr) \
+ ({uint32_t __v = *(const volatile uint32_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_read64(addr) \
+ ({uint64_t __v = *(const volatile uint64_t *)addr; rte_io_rmb(); __v; })
+
+#define rte_write8(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint8_t *)addr = value; })
+
+#define rte_write16(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint16_t *)addr = value; })
+
+#define rte_write32(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint32_t *)addr = value; })
+
+#define rte_write64(value, addr) \
+ ({ rte_io_wmb(); *(volatile uint64_t *)addr = value; })
+
+#endif /* RTE_OVERRIDE_IO_H */
+
#endif /* _RTE_IO_H_ */
--
2.5.5
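The generic macros in this patch put rte_io_rmb() after the load and rte_io_wmb() before the store, while the relaxed variants issue the bare volatile access. A minimal sketch of that layering, written as inline functions, might look as follows. All demo_* names are hypothetical, and the fences are placeholders built on the GCC/Clang __atomic_thread_fence builtin; they are not DPDK's per-architecture barrier definitions.

```c
#include <stdint.h>

/* Placeholders for rte_io_rmb()/rte_io_wmb(); the real barriers are
 * defined per architecture and are not reproduced here. */
static inline void demo_io_rmb(void) { __atomic_thread_fence(__ATOMIC_ACQUIRE); }
static inline void demo_io_wmb(void) { __atomic_thread_fence(__ATOMIC_RELEASE); }

/* Relaxed accessors: a single volatile access, no barrier. */
static inline uint32_t demo_read32_relaxed(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

static inline void demo_write32_relaxed(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Ordered read: load first, then the read barrier. */
static inline uint32_t demo_read32(const volatile void *addr)
{
	uint32_t val = demo_read32_relaxed(addr);

	demo_io_rmb();
	return val;
}

/* Ordered write: write barrier first, then the store. */
static inline void demo_write32(uint32_t value, volatile void *addr)
{
	demo_io_wmb();
	demo_write32_relaxed(value, addr);
}

/* Round trip through a plain variable standing in for a device register. */
static inline uint32_t demo_roundtrip(uint32_t value)
{
	volatile uint32_t fake_reg = 0;

	demo_write32(value, &fake_reg);
	return demo_read32(&fake_reg);
}
```

One reason to consider the function form: in the macro form the address argument is pasted directly after the cast, so an argument such as base + 4 would expand to *(const volatile uint32_t *)base + 4, which dereferences base and adds 4 to the loaded value. Callers of the macros therefore need to parenthesize compound address expressions.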
Jerin Jacob
2016-12-27 09:49:18 UTC
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/ppc_64/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/tile/rte_io.h | 47 ++++++++++++++++++++++
lib/librte_eal/common/include/arch/x86/rte_io.h | 47 ++++++++++++++++++++++
4 files changed, 188 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/tile/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_io.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_io.h b/lib/librte_eal/common/include/arch/arm/rte_io.h
new file mode 100644
index 0000000..74c1f2c
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_ARM_H_
+#define _RTE_IO_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_io.h b/lib/librte_eal/common/include/arch/ppc_64/rte_io.h
new file mode 100644
index 0000000..be192da
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_PPC_64_H_
+#define _RTE_IO_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_PPC_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/tile/rte_io.h b/lib/librte_eal/common/include/arch/tile/rte_io.h
new file mode 100644
index 0000000..9c8588f
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/tile/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_TILE_H_
+#define _RTE_IO_TILE_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_TILE_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_io.h b/lib/librte_eal/common/include/arch/x86/rte_io.h
new file mode 100644
index 0000000..c8d1404
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_io.h
@@ -0,0 +1,47 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Cavium networks. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_X86_H_
+#define _RTE_IO_X86_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_io.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_X86_H_ */
--
2.5.5
Jerin Jacob
2016-12-27 09:49:19 UTC
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/arm/rte_io.h | 4 +
lib/librte_eal/common/include/arch/arm/rte_io_64.h | 159 +++++++++++++++++++++
2 files changed, 163 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io_64.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_io.h b/lib/librte_eal/common/include/arch/arm/rte_io.h
index 74c1f2c..9593b42 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_io.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_io.h
@@ -38,7 +38,11 @@
extern "C" {
#endif

+#ifdef RTE_ARCH_64
+#include "rte_io_64.h"
+#else
#include "generic/rte_io.h"
+#endif

#ifdef __cplusplus
}
diff --git a/lib/librte_eal/common/include/arch/arm/rte_io_64.h b/lib/librte_eal/common/include/arch/arm/rte_io_64.h
new file mode 100644
index 0000000..7759595
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_io_64.h
@@ -0,0 +1,159 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (C) Cavium networks Ltd. 2016.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_IO_ARM64_H_
+#define _RTE_IO_ARM64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#define RTE_OVERRIDE_IO_H
+
+#include "generic/rte_io.h"
+#include "rte_atomic_64.h"
+
+static inline __attribute__((always_inline)) uint8_t
+rte_read8_relaxed(const volatile void *addr)
+{
+ uint8_t val;
+
+ asm volatile(
+ "ldrb %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint16_t
+rte_read16_relaxed(const volatile void *addr)
+{
+ uint16_t val;
+
+ asm volatile(
+ "ldrh %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint32_t
+rte_read32_relaxed(const volatile void *addr)
+{
+ uint32_t val;
+
+ asm volatile(
+ "ldr %w[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) uint64_t
+rte_read64_relaxed(const volatile void *addr)
+{
+ uint64_t val;
+
+ asm volatile(
+ "ldr %x[val], [%x[addr]]"
+ : [val] "=r" (val)
+ : [addr] "r" (addr));
+ return val;
+}
+
+static inline __attribute__((always_inline)) void
+rte_write8_relaxed(uint8_t val, volatile void *addr)
+{
+ asm volatile(
+ "strb %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+rte_write16_relaxed(uint16_t val, volatile void *addr)
+{
+ asm volatile(
+ "strh %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+rte_write32_relaxed(uint32_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %w[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+static inline __attribute__((always_inline)) void
+rte_write64_relaxed(uint64_t val, volatile void *addr)
+{
+ asm volatile(
+ "str %x[val], [%x[addr]]"
+ :
+ : [val] "r" (val), [addr] "r" (addr));
+}
+
+#define rte_read8(addr) \
+ ({ uint8_t __v = rte_read8_relaxed(addr); rte_io_rmb(); __v; })
+
+#define rte_read16(addr) \
+ ({ uint16_t __v = rte_read16_relaxed(addr); rte_io_rmb(); __v; })
+
+#define rte_read32(addr) \
+ ({ uint32_t __v = rte_read32_relaxed(addr); rte_io_rmb(); __v; })
+
+#define rte_read64(addr) \
+ ({ uint64_t __v = rte_read64_relaxed(addr); rte_io_rmb(); __v; })
+
+#define rte_write8(value, addr) \
+ ({ rte_io_wmb(); rte_write8_relaxed(value, addr); })
+
+#define rte_write16(value, addr) \
+ ({ rte_io_wmb(); rte_write16_relaxed(value, addr); })
+
+#define rte_write32(value, addr) \
+ ({ rte_io_wmb(); rte_write32_relaxed(value, addr); })
+
+#define rte_write64(value, addr) \
+ ({ rte_io_wmb(); rte_write64_relaxed(value, addr); })
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_IO_ARM64_H_ */
--
2.5.5
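The override mechanism in this patch works by defining RTE_OVERRIDE_IO_H before including the generic header, so the generic macros are skipped and the arm64 inline-asm versions take their place. A rough sketch of that pattern is below. The DEMO_OVERRIDE_IO guard and the demo_* names are hypothetical, and the "memory" clobber is an addition made for this sketch only (it is not in the patch) so that the self-check cannot be reordered around the asm.

```c
#include <stdint.h>

/* "Arch header" decides whether it will supply its own accessors. */
#if defined(__aarch64__)
#define DEMO_OVERRIDE_IO
#endif

/* "Generic header": defaults, skipped when the override guard is set. */
#ifndef DEMO_OVERRIDE_IO
#define demo_read32_relaxed(addr) \
	(*(const volatile uint32_t *)(addr))
#define demo_write32_relaxed(value, addr) \
	(*(volatile uint32_t *)(addr) = (value))
#endif

/* "Arch header": arm64 versions using a single load/store instruction. */
#ifdef DEMO_OVERRIDE_IO
static inline uint32_t demo_read32_relaxed(const volatile void *addr)
{
	uint32_t val;

	__asm__ volatile(
		"ldr %w[val], [%x[addr]]"
		: [val] "=r" (val)
		: [addr] "r" (addr)
		: "memory"); /* clobber added for this sketch only */
	return val;
}

static inline void demo_write32_relaxed(uint32_t val, volatile void *addr)
{
	__asm__ volatile(
		"str %w[val], [%x[addr]]"
		:
		: [val] "r" (val), [addr] "r" (addr)
		: "memory"); /* clobber added for this sketch only */
}
#endif

/* Same call site compiles against either implementation. */
static inline uint32_t demo_roundtrip(uint32_t value)
{
	uint32_t fake_reg = 0;

	demo_write32_relaxed(value, &fake_reg);
	return demo_read32_relaxed(&fake_reg);
}
```

On non-arm64 builds the sketch falls through to the generic macros, which is the same split the patch makes between rte_io_64.h and generic/rte_io.h.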
Jerin Jacob
2016-12-27 09:49:20 UTC
Change rte_?wb definitions to macros in order to
keep consistent with other barrier definitions in
the file.

Suggested-by: Jianbo Liu <***@linaro.org>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
.../common/include/arch/arm/rte_atomic_64.h | 36 ++--------------------
1 file changed, 3 insertions(+), 33 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index ef0efc7..dc3a0f3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -46,41 +46,11 @@ extern "C" {
#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }

-/**
- * General memory barrier.
- *
- * Guarantees that the LOAD and STORE operations generated before the
- * barrier occur before the LOAD and STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_mb(void)
-{
- dsb(sy);
-}
+#define rte_mb() dsb(sy)

-/**
- * Write memory barrier.
- *
- * Guarantees that the STORE operations generated before the barrier
- * occur before the STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_wmb(void)
-{
- dsb(st);
-}
+#define rte_wmb() dsb(st)

-/**
- * Read memory barrier.
- *
- * Guarantees that the LOAD operations generated before the barrier
- * occur before the LOAD operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_rmb(void)
-{
- dsb(ld);
-}
+#define rte_rmb() dsb(ld)

#define rte_smp_mb() dmb(ish)
--
2.5.5
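Once the barriers become macros, their expansion is whatever dsb() expands to, and the dsb()/dmb() helpers shown as context in the hunk are bare brace blocks. A brace-block macro followed by ';' does not compose with an unbraced if/else. The sketch below illustrates the point with one common alternative, the do { } while (0) idiom. This is not something the patch does: the demo_* names are hypothetical and an empty asm with a "memory" clobber stands in for the real dsb instruction.

```c
#include <stdint.h>

/* Compiler-only barrier used as a portable stand-in for "dsb <opt>". */

/* Brace-block form, mirroring the shape of the dsb()/dmb() helpers.
 * "if (c) demo_barrier_braces(); else ..." would not compile, because
 * the ';' after the closing brace ends the if statement early. */
#define demo_barrier_braces() \
	{ __asm__ volatile("" : : : "memory"); }

/* do/while(0) form: expands to exactly one statement. */
#define demo_barrier_dowhile() \
	do { __asm__ volatile("" : : : "memory"); } while (0)

/* Uses the barrier macro in an unbraced if/else. */
static inline uint32_t demo_publish(int ready)
{
	volatile uint32_t flag = 0;

	if (ready)
		demo_barrier_dowhile();
	else
		return flag;

	flag = 1;
	return flag;
}
```

Swapping demo_barrier_dowhile() for demo_barrier_braces() in demo_publish() should produce an "else without a previous if" style error, which is the hazard being illustrated.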
Jianbo Liu
2017-01-03 07:55:45 UTC
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
Change rte_?wb definitions to macros in order to
use rte_*mb?
Post by Jerin Jacob
keep consistent with other barrier definitions in
the file.
---
.../common/include/arch/arm/rte_atomic_64.h | 36 ++--------------------
1 file changed, 3 insertions(+), 33 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index ef0efc7..dc3a0f3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -46,41 +46,11 @@ extern "C" {
#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }
-/**
- * General memory barrier.
- *
- * Guarantees that the LOAD and STORE operations generated before the
- * barrier occur before the LOAD and STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_mb(void)
-{
- dsb(sy);
-}
+#define rte_mb() dsb(sy)
-/**
- * Write memory barrier.
- *
- * Guarantees that the STORE operations generated before the barrier
- * occur before the STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_wmb(void)
-{
- dsb(st);
-}
+#define rte_wmb() dsb(st)
-/**
- * Read memory barrier.
- *
- * Guarantees that the LOAD operations generated before the barrier
- * occur before the LOAD operations generated after.
- * This function is architecture dependent.
- */
How about keeping the comments for all these macros?
Post by Jerin Jacob
-static inline void rte_rmb(void)
-{
- dsb(ld);
-}
+#define rte_rmb() dsb(ld)
#define rte_smp_mb() dmb(ish)
--
2.5.5
Jerin Jacob
2017-01-04 10:09:14 UTC
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
Change rte_?wb definitions to macros in order to
use rte_*mb?
IMHO, regex ? is appropriate here.
https://en.wikipedia.org/wiki/Regular_expression
Post by Jianbo Liu
Post by Jerin Jacob
keep consistent with other barrier definitions in
the file.
---
.../common/include/arch/arm/rte_atomic_64.h | 36 ++--------------------
1 file changed, 3 insertions(+), 33 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index ef0efc7..dc3a0f3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -46,41 +46,11 @@ extern "C" {
#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); }
#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); }
-/**
- * General memory barrier.
- *
- * Guarantees that the LOAD and STORE operations generated before the
- * barrier occur before the LOAD and STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_mb(void)
-{
- dsb(sy);
-}
+#define rte_mb() dsb(sy)
-/**
- * Write memory barrier.
- *
- * Guarantees that the STORE operations generated before the barrier
- * occur before the STORE operations generated after.
- * This function is architecture dependent.
- */
-static inline void rte_wmb(void)
-{
- dsb(st);
-}
+#define rte_wmb() dsb(st)
-/**
- * Read memory barrier.
- *
- * Guarantees that the LOAD operations generated before the barrier
- * occur before the LOAD operations generated after.
- * This function is architecture dependent.
- */
How about keeping the comments for all these macros?
The lib/librte_eal/common/include/generic/rte_atomic.h file already describes
all the barriers. All the other architectures do it the same way.
Post by Jianbo Liu
Post by Jerin Jacob
-static inline void rte_rmb(void)
-{
- dsb(ld);
-}
+#define rte_rmb() dsb(ld)
#define rte_smp_mb() dmb(ish)
--
2.5.5
Tiwei Bie
2017-01-04 11:00:30 UTC
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
Change rte_?wb definitions to macros in order to
use rte_*mb?
IMHO, regex ? is appropriate here.
https://en.wikipedia.org/wiki/Regular_expression
+#define rte_mb() dsb(sy)
+#define rte_wmb() dsb(st)
+#define rte_rmb() dsb(ld)
If it's a regex, shouldn't it be: rte_[wr]?mb or rte_.?mb

If ? is a wildcard used by shell, it should at least be: rte_?mb
But rte_*mb is easier to recognize, and matches all of them. :-)

Best regards,
Tiwei Bie
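The pattern argument in this reply can be checked mechanically. The sketch below uses POSIX regcomp()/regexec() for the regex reading and fnmatch() for the shell-wildcard reading. The demo_* helper names are hypothetical, and the regex patterns are anchored with ^ and $ so that they must match the whole name.

```c
#include <fnmatch.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if `name` matches the POSIX extended regex `pattern`, else 0. */
static int demo_regex_match(const char *pattern, const char *name)
{
	regex_t re;
	int matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0; /* treat an invalid pattern as "no match" */
	matched = (regexec(&re, name, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}

/* Return 1 if `name` matches the shell glob `pattern`, else 0. */
static int demo_glob_match(const char *pattern, const char *name)
{
	return fnmatch(pattern, name, 0) == 0;
}
```

Read as a regex, rte_[wr]?mb and rte_.?mb both cover rte_mb, rte_wmb and rte_rmb, whereas rte_?wb makes only the underscore optional. Read as a glob, rte_?mb requires exactly one character in the middle, so it misses rte_mb, while rte_*mb covers all three.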
Jerin Jacob
2017-01-04 13:03:07 UTC
Post by Tiwei Bie
Post by Jerin Jacob
Post by Jianbo Liu
On 27 December 2016 at 17:49, Jerin Jacob
Post by Jerin Jacob
Change rte_?wb definitions to macros in order to
use rte_*mb?
IMHO, regex ? is appropriate here.
https://en.wikipedia.org/wiki/Regular_expression
+#define rte_mb() dsb(sy)
+#define rte_wmb() dsb(st)
+#define rte_rmb() dsb(ld)
If it's a regex, shouldn't it be: rte_[wr]?mb or rte_.?mb
If ? is a wildcard used by shell, it should at least be: rte_?mb
But rte_*mb is easier to recognize, and matches all of them. :-)
OK. I will wait for further comments on this patchset (especially comments
on the driver changes) and post a v3 to fix this.
Jerin Jacob
2016-12-27 09:49:21 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.

CC: John Griffin <***@intel.com>
CC: Fiona Trahe <***@intel.com>
CC: Deepak Kumar Jain <***@intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/crypto/qat/qat_adf/adf_transport_access_macros.h | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h b/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
index 47f1c91..d218f85 100644
--- a/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
+++ b/drivers/crypto/qat/qat_adf/adf_transport_access_macros.h
@@ -47,14 +47,15 @@
#ifndef ADF_TRANSPORT_ACCESS_MACROS_H
#define ADF_TRANSPORT_ACCESS_MACROS_H

+#include <rte_io.h>
+
/* CSR write macro */
-#define ADF_CSR_WR(csrAddr, csrOffset, val) \
- (void)((*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset)) \
- = (val)))
+#define ADF_CSR_WR(csrAddr, csrOffset, val) \
+ rte_write32(val, (((uint8_t *)csrAddr) + csrOffset))

/* CSR read macro */
-#define ADF_CSR_RD(csrAddr, csrOffset) \
- (*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset)))
+#define ADF_CSR_RD(csrAddr, csrOffset) \
+ rte_read32((((uint8_t *)csrAddr) + csrOffset))

#define ADF_BANK_INT_SRC_SEL_MASK_0 0x4444444CUL
#define ADF_BANK_INT_SRC_SEL_MASK_X 0x44444444UL
--
2.5.5
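The CSR macros in this patch reduce to "byte base plus offset, then a 32-bit ordered access". A small sketch of that addressing pattern over a fake register block is shown below. The demo_* and DEMO_* names are hypothetical, the array stands in for a mapped BAR, and the fences are placeholders for rte_io_wmb()/rte_io_rmb().

```c
#include <stdint.h>

static inline void demo_write32(uint32_t value, volatile void *addr)
{
	__atomic_thread_fence(__ATOMIC_RELEASE); /* placeholder for rte_io_wmb() */
	*(volatile uint32_t *)addr = value;
}

static inline uint32_t demo_read32(const volatile void *addr)
{
	uint32_t val = *(const volatile uint32_t *)addr;

	__atomic_thread_fence(__ATOMIC_ACQUIRE); /* placeholder for rte_io_rmb() */
	return val;
}

/* Offsets are in bytes, so the base is cast to uint8_t * before the add. */
#define DEMO_CSR_WR(base, offset, val) \
	demo_write32((val), ((uint8_t *)(base)) + (offset))

#define DEMO_CSR_RD(base, offset) \
	demo_read32(((uint8_t *)(base)) + (offset))

/* Write then read one 32-bit register in a fake 64-byte register block.
 * `offset` must be a multiple of 4 and smaller than 64. */
static inline uint32_t demo_csr_roundtrip(uint32_t offset, uint32_t val)
{
	static uint32_t fake_bar[16];

	DEMO_CSR_WR(fake_bar, offset, val);
	return DEMO_CSR_RD(fake_bar, offset);
}
```

Every macro parameter is parenthesized in the sketch, so arguments such as a sum of two offsets are added before the cast and pointer arithmetic apply.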
Jerin Jacob
2016-12-27 09:49:22 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write access with eal abstraction
for I/O device memory read/write access to fix portability issues across
different architectures.

CC: Stephen Hurd <***@broadcom.com>
CC: Ajit Khaparde <***@broadcom.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/bnxt/bnxt_cpr.h | 13 ++++++++-----
drivers/net/bnxt/bnxt_hwrm.c | 7 +++++--
drivers/net/bnxt/bnxt_txr.h | 6 +++---
3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.h b/drivers/net/bnxt/bnxt_cpr.h
index f9f2adb..83e5376 100644
--- a/drivers/net/bnxt/bnxt_cpr.h
+++ b/drivers/net/bnxt/bnxt_cpr.h
@@ -34,6 +34,8 @@
#ifndef _BNXT_CPR_H_
#define _BNXT_CPR_H_

+#include <rte_io.h>
+
#define CMP_VALID(cmp, raw_cons, ring) \
(!!(((struct cmpl_base *)(cmp))->info3_v & CMPL_BASE_V) == \
!((raw_cons) & ((ring)->ring_size)))
@@ -50,13 +52,14 @@
#define DB_CP_FLAGS (DB_KEY_CP | DB_IDX_VALID | DB_IRQ_DIS)

#define B_CP_DB_REARM(cpr, raw_cons) \
- (*(uint32_t *)((cpr)->cp_doorbell) = (DB_CP_REARM_FLAGS | \
- RING_CMP(cpr->cp_ring_struct, raw_cons)))
+ rte_write32((DB_CP_REARM_FLAGS | \
+ RING_CMP(((cpr)->cp_ring_struct), raw_cons)), \
+ ((cpr)->cp_doorbell))

#define B_CP_DIS_DB(cpr, raw_cons) \
- rte_smp_wmb(); \
- (*(uint32_t *)((cpr)->cp_doorbell) = (DB_CP_FLAGS | \
- RING_CMP(cpr->cp_ring_struct, raw_cons)))
+ rte_write32((DB_CP_FLAGS | \
+ RING_CMP(((cpr)->cp_ring_struct), raw_cons)), \
+ ((cpr)->cp_doorbell))

struct bnxt_ring;
struct bnxt_cp_ring_info {
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 07e7124..c182152 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -50,6 +50,8 @@
#include "bnxt_vnic.h"
#include "hsi_struct_def_dpdk.h"

+#include <rte_io.h>
+
#define HWRM_CMD_TIMEOUT 2000

/*
@@ -72,7 +74,7 @@ static int bnxt_hwrm_send_message_locked(struct bnxt *bp, void *msg,
/* Write request msg to hwrm channel */
for (i = 0; i < msg_len; i += 4) {
bar = (uint8_t *)bp->bar0 + i;
- *(volatile uint32_t *)bar = *data;
+ rte_write32(*data, bar);
data++;
}

@@ -80,11 +82,12 @@ static int bnxt_hwrm_send_message_locked(struct bnxt *bp, void *msg,
for (; i < bp->max_req_len; i += 4) {
bar = (uint8_t *)bp->bar0 + i;
*(volatile uint32_t *)bar = 0;
+ rte_write32(0, bar);
}

/* Ring channel doorbell */
bar = (uint8_t *)bp->bar0 + 0x100;
- *(volatile uint32_t *)bar = 1;
+ rte_write32(1, bar);

/* Poll for the valid bit */
for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
diff --git a/drivers/net/bnxt/bnxt_txr.h b/drivers/net/bnxt/bnxt_txr.h
index 4c16101..5b09711 100644
--- a/drivers/net/bnxt/bnxt_txr.h
+++ b/drivers/net/bnxt/bnxt_txr.h
@@ -34,12 +34,12 @@
#ifndef _BNXT_TXR_H_
#define _BNXT_TXR_H_

+#include <rte_io.h>
+
#define MAX_TX_RINGS 16
#define BNXT_TX_PUSH_THRESH 92

-#define B_TX_DB(db, prod) \
- rte_smp_wmb(); \
- (*(uint32_t *)db = (DB_KEY_TX | prod))
+#define B_TX_DB(db, prod) rte_write32((DB_KEY_TX | (prod)), db)

struct bnxt_tx_ring_info {
uint16_t tx_prod;
--
2.5.5
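
A note on the macro change in bnxt_txr.h and bnxt_cpr.h above: the old B_TX_DB and B_CP_DIS_DB bodies expanded to two statements (a barrier, then a store), so under an unbraced if only the barrier would have been conditional. The replacement is a single expression. The sketch below illustrates that property only. io_write32_demo(), DB_KEY_TX_DEMO, TX_DB_DEMO and ring_if() are hypothetical names; the fence is a placeholder and not the barrier the EAL rte_write32() actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_write32(): a fence followed by a volatile
 * store. The real EAL version uses an architecture-specific I/O barrier. */
static inline void io_write32_demo(uint32_t value, volatile void *addr)
{
	assert(((uintptr_t)addr & 3u) == 0);       /* 32-bit register alignment */
	atomic_thread_fence(memory_order_release); /* placeholder barrier */
	*(volatile uint32_t *)addr = value;
}

#define DB_KEY_TX_DEMO 0x10000000u /* hypothetical key bits */

/* One expression, so it stays intact inside an unbraced if/else. */
#define TX_DB_DEMO(db, prod) io_write32_demo((DB_KEY_TX_DEMO | (prod)), (db))

/* Rings the doorbell only when ring_it is non-zero, then returns the
 * current register contents. */
static uint32_t ring_if(int ring_it, volatile uint32_t *db, uint32_t prod)
{
	if (ring_it)
		TX_DB_DEMO(db, prod);
	return *db;
}
```

With the two-statement form, the store in ring_if() would have run unconditionally; with the single-expression form it runs only when requested.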
Jerin Jacob
2016-12-27 09:49:23 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Harish Patil <***@cavium.com>
CC: Rasesh Mody <***@cavium.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/bnx2x/bnx2x.h | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 5cefea4..59064d8 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -18,6 +18,7 @@

#include <rte_byteorder.h>
#include <rte_spinlock.h>
+#include <rte_io.h>

#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
#ifndef __LITTLE_ENDIAN
@@ -1419,8 +1420,7 @@ bnx2x_reg_write8(struct bnx2x_softc *sc, size_t offset, uint8_t val)
{
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%02x",
(unsigned long)offset, val);
- *((volatile uint8_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_write8(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
}

static inline void
@@ -1433,8 +1433,8 @@ bnx2x_reg_write16(struct bnx2x_softc *sc, size_t offset, uint16_t val)
#endif
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%04x",
(unsigned long)offset, val);
- *((volatile uint16_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_write16(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
+
}

static inline void
@@ -1448,8 +1448,7 @@ bnx2x_reg_write32(struct bnx2x_softc *sc, size_t offset, uint32_t val)

PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);
- *((volatile uint32_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)) = val;
+ rte_write32(val, ((uint8_t *)sc->bar[BAR0].base_addr + offset));
}

static inline uint8_t
@@ -1457,8 +1456,7 @@ bnx2x_reg_read8(struct bnx2x_softc *sc, size_t offset)
{
uint8_t val;

- val = (uint8_t)(*((volatile uint8_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_read8((uint8_t *)sc->bar[BAR0].base_addr + offset);
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%02x",
(unsigned long)offset, val);

@@ -1476,8 +1474,7 @@ bnx2x_reg_read16(struct bnx2x_softc *sc, size_t offset)
(unsigned long)offset);
#endif

- val = (uint16_t)(*((volatile uint16_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_read16(((uint8_t *)sc->bar[BAR0].base_addr + offset));
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);

@@ -1495,8 +1492,7 @@ bnx2x_reg_read32(struct bnx2x_softc *sc, size_t offset)
(unsigned long)offset);
#endif

- val = (uint32_t)(*((volatile uint32_t*)
- ((uintptr_t)sc->bar[BAR0].base_addr + offset)));
+ val = rte_read32(((uint8_t *)sc->bar[BAR0].base_addr + offset));
PMD_DEBUG_PERIODIC_LOG(DEBUG, "offset=0x%08lx val=0x%08x",
(unsigned long)offset, val);

@@ -1560,11 +1556,9 @@ bnx2x_reg_read32(struct bnx2x_softc *sc, size_t offset)
#define DPM_TRIGGER_TYPE 0x40

/* Doorbell macro */
-#define BNX2X_DB_WRITE(db_bar, val) \
- *((volatile uint32_t *)(db_bar)) = (val)
+#define BNX2X_DB_WRITE(db_bar, val) rte_write32_relaxed((val), (db_bar))

-#define BNX2X_DB_READ(db_bar) \
- *((volatile uint32_t *)(db_bar))
+#define BNX2X_DB_READ(db_bar) rte_read32_relaxed(db_bar)

#define DOORBELL_ADDR(sc, offset) \
(volatile uint32_t *)(((char *)(sc)->bar[BAR1].base_addr + (offset)))
--
2.5.5
Jerin Jacob
2016-12-27 09:49:24 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Rahul Lakkireddy <***@chelsio.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/cxgbe/base/adapter.h | 34 ++++++++++++++++++++++++++++------
drivers/net/cxgbe/cxgbe_compat.h | 8 +++++++-
drivers/net/cxgbe/sge.c | 10 +++++-----
3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 5e3bd50..beb1e3e 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -37,6 +37,7 @@
#define __T4_ADAPTER_H__

#include <rte_mbuf.h>
+#include <rte_io.h>

#include "cxgbe_compat.h"
#include "t4_regs_values.h"
@@ -324,7 +325,7 @@ struct adapter {
int use_unpacked_mode; /* unpacked rx mode state */
};

-#define CXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define CXGBE_PCI_REG(reg) rte_read32(reg)

static inline uint64_t cxgbe_read_addr64(volatile void *addr)
{
@@ -350,16 +351,21 @@ static inline uint32_t cxgbe_read_addr(volatile void *addr)
#define CXGBE_READ_REG64(adap, reg) \
cxgbe_read_addr64(CXGBE_PCI_REG_ADDR((adap), (reg)))

-#define CXGBE_PCI_REG_WRITE(reg, value) ({ \
- CXGBE_PCI_REG((reg)) = (value); })
+#define CXGBE_PCI_REG_WRITE(reg, value) rte_write32((value), (reg))
+
+#define CXGBE_PCI_REG_WRITE_RELAXED(reg, value) \
+ rte_write32_relaxed((value), (reg))

#define CXGBE_WRITE_REG(adap, reg, value) \
CXGBE_PCI_REG_WRITE(CXGBE_PCI_REG_ADDR((adap), (reg)), (value))

+#define CXGBE_WRITE_REG_RELAXED(adap, reg, value) \
+ CXGBE_PCI_REG_WRITE_RELAXED(CXGBE_PCI_REG_ADDR((adap), (reg)), (value))
+
static inline uint64_t cxgbe_write_addr64(volatile void *addr, uint64_t val)
{
- CXGBE_PCI_REG(addr) = val;
- CXGBE_PCI_REG(((volatile uint8_t *)(addr) + 4)) = (val >> 32);
+ CXGBE_PCI_REG_WRITE(addr, val);
+ CXGBE_PCI_REG_WRITE(((volatile uint8_t *)(addr) + 4), (val >> 32));
return val;
}

@@ -383,7 +389,7 @@ static inline u32 t4_read_reg(struct adapter *adapter, u32 reg_addr)
}

/**
- * t4_write_reg - write a HW register
+ * t4_write_reg - write a HW register with barrier
* @adapter: the adapter
* @reg_addr: the register address
* @val: the value to write
@@ -398,6 +404,22 @@ static inline void t4_write_reg(struct adapter *adapter, u32 reg_addr, u32 val)
}

/**
+ * t4_write_reg_relaxed - write a HW register with no barrier
+ * @adapter: the adapter
+ * @reg_addr: the register address
+ * @val: the value to write
+ *
+ * Write a 32-bit value into the given HW register.
+ */
+static inline void t4_write_reg_relaxed(struct adapter *adapter, u32 reg_addr,
+ u32 val)
+{
+ CXGBE_DEBUG_REG(adapter, "setting register 0x%x to 0x%x\n", reg_addr,
+ val);
+ CXGBE_WRITE_REG_RELAXED(adapter, reg_addr, val);
+}
+
+/**
* t4_read_reg64 - read a 64-bit HW register
* @adapter: the adapter
* @reg_addr: the register address
diff --git a/drivers/net/cxgbe/cxgbe_compat.h b/drivers/net/cxgbe/cxgbe_compat.h
index e68f8f5..1551cbf 100644
--- a/drivers/net/cxgbe/cxgbe_compat.h
+++ b/drivers/net/cxgbe/cxgbe_compat.h
@@ -45,6 +45,7 @@
#include <rte_cycles.h>
#include <rte_spinlock.h>
#include <rte_log.h>
+#include <rte_io.h>

#define dev_printf(level, fmt, args...) \
RTE_LOG(level, PMD, "rte_cxgbe_pmd: " fmt, ## args)
@@ -254,7 +255,7 @@ static inline unsigned long ilog2(unsigned long n)

static inline void writel(unsigned int val, volatile void __iomem *addr)
{
- *(volatile unsigned int *)addr = val;
+ rte_write32(val, addr);
}

static inline void writeq(u64 val, volatile void __iomem *addr)
@@ -263,4 +264,9 @@ static inline void writeq(u64 val, volatile void __iomem *addr)
writel(val >> 32, (void *)((uintptr_t)addr + 4));
}

+static inline void writel_relaxed(unsigned int val, volatile void __iomem *addr)
+{
+ rte_write32_relaxed(val, addr);
+}
+
#endif /* _CXGBE_COMPAT_H_ */
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 736f08c..fc03a0c 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -338,12 +338,12 @@ static inline void ring_fl_db(struct adapter *adap, struct sge_fl *q)
* mechanism.
*/
if (unlikely(!q->bar2_addr)) {
- t4_write_reg(adap, MYPF_REG(A_SGE_PF_KDOORBELL),
- val | V_QID(q->cntxt_id));
+ t4_write_reg_relaxed(adap, MYPF_REG(A_SGE_PF_KDOORBELL),
+ val | V_QID(q->cntxt_id));
} else {
- writel(val | V_QID(q->bar2_qid),
- (void *)((uintptr_t)q->bar2_addr +
- SGE_UDB_KDOORBELL));
+ writel_relaxed(val | V_QID(q->bar2_qid),
+ (void *)((uintptr_t)q->bar2_addr +
+ SGE_UDB_KDOORBELL));

/*
* This Write memory Barrier will force the write to
--
2.5.5
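
For readers following cxgbe_write_addr64() in the patch above: a 64-bit value is written as two 32-bit register writes, the low half at addr and the high half at addr + 4. A minimal sketch of that split, with hypothetical names (io_write32_demo, write_addr64_demo) and a plain volatile store standing in for rte_write32():

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit register write (no barrier here). */
static inline void io_write32_demo(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Write a 64-bit value as two 32-bit stores: low word first, then the
 * high word four bytes further on. Returns the value that was written. */
static inline uint64_t write_addr64_demo(volatile void *addr, uint64_t val)
{
	assert(((uintptr_t)addr & 3u) == 0); /* both halves must be aligned */
	io_write32_demo((uint32_t)val, addr);
	io_write32_demo((uint32_t)(val >> 32), (volatile uint8_t *)addr + 4);
	return val;
}
```

The explicit casts to uint32_t make the truncation of each half visible instead of leaving it to an implicit conversion at the call site.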
Jerin Jacob
2016-12-27 09:49:25 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Wenzhuo Lu <***@intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/e1000/base/e1000_osdep.h | 18 ++++++++++--------
drivers/net/e1000/em_rxtx.c | 2 +-
drivers/net/e1000/igb_rxtx.c | 2 +-
3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/e1000/base/e1000_osdep.h b/drivers/net/e1000/base/e1000_osdep.h
index 47a1948..b886804 100644
--- a/drivers/net/e1000/base/e1000_osdep.h
+++ b/drivers/net/e1000/base/e1000_osdep.h
@@ -44,6 +44,7 @@
#include <rte_log.h>
#include <rte_debug.h>
#include <rte_byteorder.h>
+#include <rte_io.h>

#include "../e1000_logs.h"

@@ -94,17 +95,18 @@ typedef int bool;

#define E1000_WRITE_FLUSH(a) E1000_READ_REG(a, E1000_STATUS)

-#define E1000_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define E1000_PCI_REG(reg) rte_read32(reg)

-#define E1000_PCI_REG16(reg) (*((volatile uint16_t *)(reg)))
+#define E1000_PCI_REG16(reg) rte_read16(reg)

-#define E1000_PCI_REG_WRITE(reg, value) do { \
- E1000_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while (0)
+#define E1000_PCI_REG_WRITE(reg, value) \
+ rte_write32((rte_cpu_to_le_32(value)), reg)

-#define E1000_PCI_REG_WRITE16(reg, value) do { \
- E1000_PCI_REG16((reg)) = (rte_cpu_to_le_16(value)); \
-} while (0)
+#define E1000_PCI_REG_WRITE_RELAXED(reg, value) \
+ rte_write32_relaxed((rte_cpu_to_le_32(value)), reg)
+
+#define E1000_PCI_REG_WRITE16(reg, value) \
+ rte_write16((rte_cpu_to_le_16(value)), reg)

#define E1000_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index 41f51c0..6ec38d4 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -610,7 +610,7 @@ eth_em_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
PMD_TX_LOG(DEBUG, "port_id=%u queue_id=%u tx_tail=%u nb_tx=%u",
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);
- E1000_PCI_REG_WRITE(txq->tdt_reg_addr, tx_id);
+ E1000_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, tx_id);
txq->tx_tail = tx_id;

return nb_tx;
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index dbd37ac..61edbfb 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -605,7 +605,7 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
/*
* Set the Transmit Descriptor Tail (TDT).
*/
- E1000_PCI_REG_WRITE(txq->tdt_reg_addr, tx_id);
+ E1000_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, tx_id);
PMD_TX_LOG(DEBUG, "port_id=%u queue_id=%u tx_tail=%u nb_tx=%u",
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);
--
2.5.5
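
The e1000 write macros above pass the value through rte_cpu_to_le_32() or rte_cpu_to_le_16() before the register write, so the device always sees little-endian data regardless of host byte order. A minimal sketch of what such a conversion does, using hypothetical helpers (bswap32_demo, cpu_to_le32_demo, stored_byte) and a runtime endianness probe; the EAL versions are not implemented this way as far as this sketch assumes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t bswap32_demo(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Non-zero when the lowest-addressed byte of an integer is its least
 * significant byte. */
static inline int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Identity on little-endian hosts, byte swap on big-endian hosts. */
static inline uint32_t cpu_to_le32_demo(uint32_t v)
{
	return host_is_little_endian() ? v : bswap32_demo(v);
}

/* Byte n (0 = lowest address) of v as it is laid out in memory. */
static uint8_t stored_byte(uint32_t v, unsigned int n)
{
	uint8_t b[4];

	assert(n < 4);
	memcpy(b, &v, sizeof(b));
	return b[n];
}
```

After conversion, the byte at the lowest address is always the least significant byte of the original value, which is what a little-endian register expects.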
Jerin Jacob
2016-12-27 09:49:26 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Jan Medala <***@semihalf.com>
CC: Jakub Palider <***@semihalf.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
Acked-by: Jan Medala <***@semihalf.com>
---
drivers/net/ena/base/ena_eth_com.h | 2 +-
drivers/net/ena/base/ena_plat_dpdk.h | 11 +++++++++--
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_eth_com.h b/drivers/net/ena/base/ena_eth_com.h
index 71a880c..ee62685 100644
--- a/drivers/net/ena/base/ena_eth_com.h
+++ b/drivers/net/ena/base/ena_eth_com.h
@@ -118,7 +118,7 @@ static inline int ena_com_write_sq_doorbell(struct ena_com_io_sq *io_sq)
ena_trc_dbg("write submission queue doorbell for queue: %d tail: %d\n",
io_sq->qid, tail);

- ENA_REG_WRITE32(tail, io_sq->db_addr);
+ ENA_REG_WRITE32_RELAXED(tail, io_sq->db_addr);

return 0;
}
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 87c3bf1..09d540a 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -48,6 +48,7 @@
#include <rte_malloc.h>
#include <rte_memzone.h>
#include <rte_spinlock.h>
+#include <rte_io.h>

#include <sys/time.h>

@@ -226,15 +227,21 @@ typedef uint64_t dma_addr_t;

static inline void writel(u32 value, volatile void *addr)
{
- *(volatile u32 *)addr = value;
+ rte_write32(value, addr);
+}
+
+static inline void writel_relaxed(u32 value, volatile void *addr)
+{
+ rte_write32_relaxed(value, addr);
}

static inline u32 readl(const volatile void *addr)
{
- return *(const volatile u32 *)addr;
+ return rte_read32(addr);
}

#define ENA_REG_WRITE32(value, reg) writel((value), (reg))
+#define ENA_REG_WRITE32_RELAXED(value, reg) writel_relaxed((value), (reg))
#define ENA_REG_READ32(reg) readl((reg))

#define ATOMIC32_INC(i32_ptr) rte_atomic32_inc(i32_ptr)
--
2.5.5
Jerin Jacob
2016-12-27 09:49:27 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: John Daley <***@cisco.com>
CC: Nelson Escobar <***@cisco.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/enic/enic_compat.h | 27 +++++++++++++++++++--------
drivers/net/enic/enic_rxtx.c | 9 +++++----
2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 5dbd983..fc58bb4 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -41,6 +41,7 @@
#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
+#include <rte_io.h>

#define ENIC_PAGE_ALIGN 4096UL
#define ENIC_ALIGN ENIC_PAGE_ALIGN
@@ -95,42 +96,52 @@ typedef unsigned long long dma_addr_t;

static inline uint32_t ioread32(volatile void *addr)
{
- return *(volatile uint32_t *)addr;
+ return rte_read32(addr);
}

static inline uint16_t ioread16(volatile void *addr)
{
- return *(volatile uint16_t *)addr;
+ return rte_read16(addr);
}

static inline uint8_t ioread8(volatile void *addr)
{
- return *(volatile uint8_t *)addr;
+ return rte_read8(addr);
}

static inline void iowrite32(uint32_t val, volatile void *addr)
{
- *(volatile uint32_t *)addr = val;
+ rte_write32(val, addr);
+}
+
+static inline void iowrite32_relaxed(uint32_t val, volatile void *addr)
+{
+ rte_write32_relaxed(val, addr);
}

static inline void iowrite16(uint16_t val, volatile void *addr)
{
- *(volatile uint16_t *)addr = val;
+ rte_write16(val, addr);
}

static inline void iowrite8(uint8_t val, volatile void *addr)
{
- *(volatile uint8_t *)addr = val;
+ rte_write8(val, addr);
}

static inline unsigned int readl(volatile void __iomem *addr)
{
- return *(volatile unsigned int *)addr;
+ return rte_read32(addr);
+}
+
+static inline unsigned int readl_relaxed(volatile void __iomem *addr)
+{
+ return rte_read32_relaxed(addr);
}

static inline void writel(unsigned int val, volatile void __iomem *addr)
{
- *(volatile unsigned int *)addr = val;
+ rte_write32(val, addr);
}

#define min_t(type, x, y) ({ \
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index f762a26..382d1ab 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -380,10 +380,11 @@ enic_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,

rte_mb();
if (data_rq->in_use)
- iowrite32(data_rq->posted_index,
- &data_rq->ctrl->posted_index);
+ iowrite32_relaxed(data_rq->posted_index,
+ &data_rq->ctrl->posted_index);
rte_compiler_barrier();
- iowrite32(sop_rq->posted_index, &sop_rq->ctrl->posted_index);
+ iowrite32_relaxed(sop_rq->posted_index,
+ &sop_rq->ctrl->posted_index);
}


@@ -550,7 +551,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
}
post:
rte_wmb();
- iowrite32(head_idx, &wq->ctrl->posted_index);
+ iowrite32_relaxed(head_idx, &wq->ctrl->posted_index);
done:
wq->ring.desc_avail = wq_desc_avail;
wq->head_idx = head_idx;
--
2.5.5
Jerin Jacob
2016-12-27 09:49:28 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Jing Chen <***@intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/fm10k/base/fm10k_osdep.h | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/fm10k/base/fm10k_osdep.h b/drivers/net/fm10k/base/fm10k_osdep.h
index a21daa2..f07b678 100644
--- a/drivers/net/fm10k/base/fm10k_osdep.h
+++ b/drivers/net/fm10k/base/fm10k_osdep.h
@@ -39,6 +39,8 @@ POSSIBILITY OF SUCH DAMAGE.
#include <rte_atomic.h>
#include <rte_byteorder.h>
#include <rte_cycles.h>
+#include <rte_io.h>
+
#include "../fm10k_logs.h"

/* TODO: this does not look like it should be used... */
@@ -88,17 +90,16 @@ typedef int bool;
#endif

/* offsets are WORD offsets, not BYTE offsets */
-#define FM10K_WRITE_REG(hw, reg, val) \
- ((((volatile uint32_t *)(hw)->hw_addr)[(reg)]) = ((uint32_t)(val)))
-#define FM10K_READ_REG(hw, reg) \
- (((volatile uint32_t *)(hw)->hw_addr)[(reg)])
+#define FM10K_WRITE_REG(hw, reg, val) \
+ rte_write32((val), ((hw)->hw_addr + (reg)))
+
+#define FM10K_READ_REG(hw, reg) rte_read32(((hw)->hw_addr + (reg)))
+
#define FM10K_WRITE_FLUSH(a) FM10K_READ_REG(a, FM10K_CTRL)

-#define FM10K_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define FM10K_PCI_REG(reg) rte_read32(reg)

-#define FM10K_PCI_REG_WRITE(reg, value) do { \
- FM10K_PCI_REG((reg)) = (value); \
-} while (0)
+#define FM10K_PCI_REG_WRITE(reg, value) rte_write32((value), (reg))

/* not implemented */
#define FM10K_READ_PCI_WORD(hw, reg) 0
--
2.5.5
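
On the "offsets are WORD offsets, not BYTE offsets" comment in the fm10k patch above: the old macro indexed hw_addr as an array of 32-bit words, and the new one relies on pointer arithmetic in (hw)->hw_addr + (reg). The two agree only if hw_addr is a pointer to 32-bit words, which this sketch assumes and the patch itself does not show. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word-indexed: reg counts 32-bit registers. Arithmetic on a uint32_t
 * pointer scales the offset by sizeof(uint32_t). */
static inline volatile uint32_t *reg_addr_word(volatile uint32_t *base,
					       size_t reg)
{
	return base + reg;
}

/* Byte-indexed: offset counts bytes and must be 4-byte aligned. */
static inline volatile uint32_t *reg_addr_byte(volatile void *base,
					       size_t offset)
{
	assert((offset & 3u) == 0);
	return (volatile uint32_t *)((volatile uint8_t *)base + offset);
}
```

If the base pointer were a byte or void pointer instead, base + reg would address a location four times closer to the base than intended, so the pointer type is worth checking when reviewing this kind of conversion.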
Jerin Jacob
2016-12-27 09:49:29 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Helin Zhang <***@intel.com>
CC: Jingjing Wu <***@intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Satha Rao <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/i40e/base/i40e_osdep.h | 10 +++++++---
drivers/net/i40e/i40e_rxtx.c | 4 ++--
2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_osdep.h b/drivers/net/i40e/base/i40e_osdep.h
index 38e7ba5..c57ecde 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_spinlock.h>
#include <rte_log.h>
+#include <rte_io.h>

#include "../i40e_logs.h"

@@ -153,15 +154,18 @@ do { \
* I40E_PRTQF_FD_MSK
*/

-#define I40E_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define I40E_PCI_REG(reg) rte_read32(reg)
#define I40E_PCI_REG_ADDR(a, reg) \
((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
static inline uint32_t i40e_read_addr(volatile void *addr)
{
return rte_le_to_cpu_32(I40E_PCI_REG(addr));
}
-#define I40E_PCI_REG_WRITE(reg, value) \
- do { I40E_PCI_REG((reg)) = rte_cpu_to_le_32(value); } while (0)
+
+#define I40E_PCI_REG_WRITE(reg, value) \
+ rte_write32((rte_cpu_to_le_32(value)), reg)
+#define I40E_PCI_REG_WRITE_RELAXED(reg, value) \
+ rte_write32_relaxed((rte_cpu_to_le_32(value)), reg)

#define I40E_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_GLGEN_STAT)
#define I40EVF_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_VFGEN_RSTAT)
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..5c41a90 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1228,7 +1228,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);

- I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
txq->tx_tail = tx_id;

return nb_tx;
@@ -1380,7 +1380,7 @@ tx_xmit_pkts(struct i40e_tx_queue *txq,

/* Update the tx tail register */
rte_wmb();
- I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, txq->tx_tail);

return nb_pkts;
}
--
2.5.5
Tiwei Bie
2017-01-04 13:53:40 UTC
Post by Jerin Jacob
Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.
[...]
Post by Jerin Jacob
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..5c41a90 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1228,7 +1228,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);
- I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
txq->tx_tail = tx_id;
return nb_tx;
@@ -1380,7 +1380,7 @@ tx_xmit_pkts(struct i40e_tx_queue *txq,
/* Update the tx tail register */
rte_wmb();
- I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, txq->tx_tail);
return nb_pkts;
}
Besides i40e_xmit_pkts() and tx_xmit_pkts(), i40e_rx_alloc_bufs(), which is
called by rx_recv_pkts(), is also in the fast path, so the
I40E_PCI_REG_WRITE() call in it should also be replaced by the relaxed
version:

diff --git i/drivers/net/i40e/i40e_rxtx.c w/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..55a707a 100644
--- i/drivers/net/i40e/i40e_rxtx.c
+++ w/drivers/net/i40e/i40e_rxtx.c
@@ -581,7 +581,7 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)

/* Update rx tail regsiter */
rte_wmb();
- I40E_PCI_REG_WRITE(rxq->qrx_tail, rxq->rx_free_trigger);
+ I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rxq->rx_free_trigger);

rxq->rx_free_trigger =
(uint16_t)(rxq->rx_free_trigger + rxq->rx_free_thresh);
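
To illustrate the fast-path pattern in question (one explicit write barrier for the whole batch, followed by a relaxed tail write so the barrier is not paid twice), here is a minimal sketch. All names are hypothetical, the ring lives in ordinary memory, and atomic_thread_fence() is only a stand-in for rte_wmb(); a real device needs the platform's I/O write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE_DEMO 8u

struct rx_ring_demo {
	uint64_t desc[RING_SIZE_DEMO]; /* descriptors read by the device */
	volatile uint32_t tail_reg;    /* stand-in for the tail register */
};

/* Relaxed write: a plain volatile store with no barrier of its own. */
static inline void io_write32_relaxed_demo(uint32_t value,
					   volatile uint32_t *addr)
{
	*addr = value;
}

/* Fill count descriptors starting at first, issue one barrier, then
 * publish the index of the last filled slot. Returns that index. */
static uint32_t refill_and_bump_tail(struct rx_ring_demo *r, uint32_t first,
				     uint32_t count, uint64_t buf_addr)
{
	uint32_t i, last = first;

	assert(count > 0 && count <= RING_SIZE_DEMO);
	for (i = 0; i < count; i++) {
		last = (first + i) % RING_SIZE_DEMO;
		r->desc[last] = buf_addr + i;
	}
	/* Descriptor stores must be visible before the tail update. */
	atomic_thread_fence(memory_order_release);
	io_write32_relaxed_demo(last, &r->tail_reg);
	return last;
}
```

Using the non-relaxed write here would add a second barrier immediately after the explicit one, which is the cost the relaxed variant avoids.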

Thanks & regards,
Tiwei Bie
Santosh Shukla
2017-01-04 15:22:27 UTC
Post by Tiwei Bie
Post by Jerin Jacob
/* Update the tx tail register */
rte_wmb();
- I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, txq->tx_tail);
return nb_pkts;
}
Besides i40e_xmit_pkts() and tx_xmit_pkts(), i40e_rx_alloc_bufs(), which is
called by rx_recv_pkts(), is also in the fast path, so the
I40E_PCI_REG_WRITE() call in it should also be replaced by the relaxed
version:
diff --git i/drivers/net/i40e/i40e_rxtx.c w/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..55a707a 100644
--- i/drivers/net/i40e/i40e_rxtx.c
+++ w/drivers/net/i40e/i40e_rxtx.c
@@ -581,7 +581,7 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
/* Update rx tail regsiter */
rte_wmb();
- I40E_PCI_REG_WRITE(rxq->qrx_tail, rxq->rx_free_trigger);
+ I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rxq->rx_free_trigger);
rxq->rx_free_trigger =
(uint16_t)(rxq->rx_free_trigger + rxq->rx_free_thresh);
Yes.

Will queue it in v3.
Jerin Jacob
2016-12-27 09:49:30 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Helin Zhang <***@intel.com>
CC: Konstantin Ananyev <***@intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/ixgbe/base/ixgbe_osdep.h | 11 +++++++----
drivers/net/ixgbe/ixgbe_rxtx.c | 13 +++++++------
2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h b/drivers/net/ixgbe/base/ixgbe_osdep.h
index 77f0af5..9b874b8 100644
--- a/drivers/net/ixgbe/base/ixgbe_osdep.h
+++ b/drivers/net/ixgbe/base/ixgbe_osdep.h
@@ -44,6 +44,7 @@
#include <rte_cycles.h>
#include <rte_log.h>
#include <rte_byteorder.h>
+#include <rte_io.h>

#include "../ixgbe_logs.h"
#include "../ixgbe_bypass_defines.h"
@@ -121,16 +122,18 @@ typedef int bool;

#define prefetch(x) rte_prefetch0(x)

-#define IXGBE_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define IXGBE_PCI_REG(reg) rte_read32(reg)

static inline uint32_t ixgbe_read_addr(volatile void* addr)
{
return rte_le_to_cpu_32(IXGBE_PCI_REG(addr));
}

-#define IXGBE_PCI_REG_WRITE(reg, value) do { \
- IXGBE_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
-} while(0)
+#define IXGBE_PCI_REG_WRITE(reg, value) \
+ rte_write32((rte_cpu_to_le_32(value)), reg)
+
+#define IXGBE_PCI_REG_WRITE_RELAXED(reg, value) \
+ rte_write32_relaxed((rte_cpu_to_le_32(value)), reg)

#define IXGBE_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b2d9f45..81544bb 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -321,7 +321,7 @@ tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,

/* update tail pointer */
rte_wmb();
- IXGBE_PCI_REG_WRITE(txq->tdt_reg_addr, txq->tx_tail);
+ IXGBE_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, txq->tx_tail);

return nb_pkts;
}
@@ -897,7 +897,7 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
PMD_TX_LOG(DEBUG, "port_id=%u queue_id=%u tx_tail=%u nb_tx=%u",
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);
- IXGBE_PCI_REG_WRITE(txq->tdt_reg_addr, tx_id);
+ IXGBE_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, tx_id);
txq->tx_tail = tx_id;

return nb_tx;
@@ -1581,7 +1581,8 @@ rx_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,

/* update tail pointer */
rte_wmb();
- IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, cur_free_trigger);
+ IXGBE_PCI_REG_WRITE_RELAXED(rxq->rdt_reg_addr,
+ cur_free_trigger);
}

if (rxq->rx_tail >= rxq->nb_rx_desc)
@@ -1985,8 +1986,8 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,

if (!ixgbe_rx_alloc_bufs(rxq, false)) {
rte_wmb();
- IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr,
- next_rdt);
+ IXGBE_PCI_REG_WRITE_RELAXED(rxq->rdt_reg_addr,
+ next_rdt);
nb_hold -= rxq->rx_free_thresh;
} else {
PMD_RX_LOG(DEBUG, "RX bulk alloc failed "
@@ -2157,7 +2158,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
rxq->port_id, rxq->queue_id, rx_id, nb_hold, nb_rx);

rte_wmb();
- IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, prev_id);
+ IXGBE_PCI_REG_WRITE_RELAXED(rxq->rdt_reg_addr, prev_id);
nb_hold = 0;
}
--
2.5.5
Jerin Jacob
2016-12-27 09:49:31 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Alejandro Lucero <***@netronome.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/nfp/nfp_net_pmd.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index c180972..f11b32e 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -121,25 +121,26 @@ struct nfp_net_adapter;
#define NFD_CFG_MINOR_VERSION_of(x) (((x) >> 0) & 0xff)

#include <linux/types.h>
+#include <rte_io.h>

static inline uint8_t nn_readb(volatile const void *addr)
{
- return *((volatile const uint8_t *)(addr));
+ return rte_read8(addr);
}

static inline void nn_writeb(uint8_t val, volatile void *addr)
{
- *((volatile uint8_t *)(addr)) = val;
+ rte_write8(val, addr);
}

static inline uint32_t nn_readl(volatile const void *addr)
{
- return *((volatile const uint32_t *)(addr));
+ return rte_read32(addr);
}

static inline void nn_writel(uint32_t val, volatile void *addr)
{
- *((volatile uint32_t *)(addr)) = val;
+ rte_write32(val, addr);
}

static inline uint64_t nn_readq(volatile void *addr)
--
2.5.5
Jerin Jacob
2016-12-27 09:49:32 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace raw I/O device memory reads and writes with the EAL abstraction
for I/O device memory access to fix portability issues across different
architectures.

CC: Harish Patil <***@cavium.com>
CC: Rasesh Mody <***@cavium.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/qede/base/bcm_osal.h | 20 +++++++++++---------
drivers/net/qede/base/ecore_int_api.h | 28 +++++++++++++++++++++++-----
drivers/net/qede/base/ecore_spq.c | 3 ++-
drivers/net/qede/qede_rxtx.c | 2 +-
4 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.h b/drivers/net/qede/base/bcm_osal.h
index 0b446f2..33d43c6 100644
--- a/drivers/net/qede/base/bcm_osal.h
+++ b/drivers/net/qede/base/bcm_osal.h
@@ -18,6 +18,7 @@
#include <rte_cycles.h>
#include <rte_debug.h>
#include <rte_ether.h>
+#include <rte_io.h>

/* Forward declaration */
struct ecore_dev;
@@ -113,18 +114,18 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *, dma_addr_t *,

/* HW reads/writes */

-#define DIRECT_REG_RD(_dev, _reg_addr) \
- (*((volatile u32 *) (_reg_addr)))
+#define DIRECT_REG_RD(_dev, _reg_addr) rte_read32(_reg_addr)

#define REG_RD(_p_hwfn, _reg_offset) \
DIRECT_REG_RD(_p_hwfn, \
((u8 *)(uintptr_t)(_p_hwfn->regview) + (_reg_offset)))

-#define DIRECT_REG_WR16(_reg_addr, _val) \
- (*((volatile u16 *)(_reg_addr)) = _val)
+#define DIRECT_REG_WR16(_reg_addr, _val) rte_write16((_val), (_reg_addr))

-#define DIRECT_REG_WR(_dev, _reg_addr, _val) \
- (*((volatile u32 *)(_reg_addr)) = _val)
+#define DIRECT_REG_WR(_dev, _reg_addr, _val) rte_write32((_val), (_reg_addr))
+
+#define DIRECT_REG_WR_RELAXED(_dev, _reg_addr, _val) \
+ rte_write32_relaxed((_val), (_reg_addr))

#define REG_WR(_p_hwfn, _reg_offset, _val) \
DIRECT_REG_WR(NULL, \
@@ -134,9 +135,10 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *, dma_addr_t *,
DIRECT_REG_WR16(((u8 *)(uintptr_t)(_p_hwfn->regview) + \
(_reg_offset)), (u16)_val)

-#define DOORBELL(_p_hwfn, _db_addr, _val) \
- DIRECT_REG_WR(_p_hwfn, \
- ((u8 *)(uintptr_t)(_p_hwfn->doorbells) + (_db_addr)), (u32)_val)
+#define DOORBELL(_p_hwfn, _db_addr, _val) \
+ DIRECT_REG_WR_RELAXED((_p_hwfn), \
+ ((u8 *)(uintptr_t)(_p_hwfn->doorbells) + \
+ (_db_addr)), (u32)_val)

/* Mutexes */

diff --git a/drivers/net/qede/base/ecore_int_api.h b/drivers/net/qede/base/ecore_int_api.h
index fc873e7..a0d6a43 100644
--- a/drivers/net/qede/base/ecore_int_api.h
+++ b/drivers/net/qede/base/ecore_int_api.h
@@ -120,19 +120,37 @@ static OSAL_INLINE void __internal_ram_wr(void *p_hwfn,
}

#ifdef ECORE_CONFIG_DIRECT_HWFN
+static OSAL_INLINE void __internal_ram_wr_relaxed(struct ecore_hwfn *p_hwfn,
+ void OSAL_IOMEM * addr,
+ int size, u32 *data)
+#else
+static OSAL_INLINE void __internal_ram_wr_relaxed(void *p_hwfn,
+ void OSAL_IOMEM * addr,
+ int size, u32 *data)
+#endif
+{
+ unsigned int i;
+
+ for (i = 0; i < size / sizeof(*data); i++)
+ DIRECT_REG_WR_RELAXED(p_hwfn, &((u32 OSAL_IOMEM *)addr)[i],
+ data[i]);
+}
+
+#ifdef ECORE_CONFIG_DIRECT_HWFN
static OSAL_INLINE void internal_ram_wr(struct ecore_hwfn *p_hwfn,
- void OSAL_IOMEM *addr,
- int size, u32 *data)
+ void OSAL_IOMEM * addr,
+ int size, u32 *data)
{
- __internal_ram_wr(p_hwfn, addr, size, data);
+ __internal_ram_wr_relaxed(p_hwfn, addr, size, data);
}
#else
static OSAL_INLINE void internal_ram_wr(void OSAL_IOMEM *addr,
- int size, u32 *data)
+ int size, u32 *data)
{
- __internal_ram_wr(OSAL_NULL, addr, size, data);
+ __internal_ram_wr_relaxed(OSAL_NULL, addr, size, data);
}
#endif
+
#endif

struct ecore_hwfn;
diff --git a/drivers/net/qede/base/ecore_spq.c b/drivers/net/qede/base/ecore_spq.c
index 0d744dd..6e5ce5d 100644
--- a/drivers/net/qede/base/ecore_spq.c
+++ b/drivers/net/qede/base/ecore_spq.c
@@ -248,7 +248,8 @@ static enum _ecore_status_t ecore_spq_hw_post(struct ecore_hwfn *p_hwfn,
/* make sure the SPQE is updated before the doorbell */
OSAL_WMB(p_hwfn->p_dev);

- DOORBELL(p_hwfn, DB_ADDR(p_spq->cid, DQ_DEMS_LEGACY), *(u32 *)&db);
+ DOORBELL(p_hwfn, DB_ADDR(p_spq->cid, DQ_DEMS_LEGACY),
+ *(u32 *)&db);

/* make sure doorbell is rang */
OSAL_WMB(p_hwfn->p_dev);
diff --git a/drivers/net/qede/qede_rxtx.c b/drivers/net/qede/qede_rxtx.c
index 2e181c8..e1e9956 100644
--- a/drivers/net/qede/qede_rxtx.c
+++ b/drivers/net/qede/qede_rxtx.c
@@ -1246,7 +1246,7 @@ qede_xmit_pkts(void *p_txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
txq->tx_db.data.bd_prod = bd_prod;
rte_wmb();
rte_compiler_barrier();
- DIRECT_REG_WR(edev, txq->doorbell_addr, txq->tx_db.raw);
+ DIRECT_REG_WR_RELAXED(edev, txq->doorbell_addr, txq->tx_db.raw);
rte_wmb();

/* Check again for Tx completions */
--
2.5.5
Jerin Jacob
2016-12-27 09:49:33 UTC
Replace the raw I/O device memory read/write accesses with the EAL
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/thunderx/base/nicvf_plat.h | 36 ++++------------------------------
1 file changed, 4 insertions(+), 32 deletions(-)

diff --git a/drivers/net/thunderx/base/nicvf_plat.h b/drivers/net/thunderx/base/nicvf_plat.h
index 83c1844..3754e1b 100644
--- a/drivers/net/thunderx/base/nicvf_plat.h
+++ b/drivers/net/thunderx/base/nicvf_plat.h
@@ -69,31 +69,15 @@
#include <rte_ether.h>
#define NICVF_MAC_ADDR_SIZE ETHER_ADDR_LEN

+#include <rte_io.h>
+#define nicvf_addr_write(addr, val) rte_write64_relaxed((val), (void *)(addr))
+#define nicvf_addr_read(addr) rte_read64_relaxed((void *)(addr))
+
/* ARM64 specific functions */
#if defined(RTE_ARCH_ARM64)
#define nicvf_prefetch_store_keep(_ptr) ({\
asm volatile("prfm pstl1keep, %a0\n" : : "p" (_ptr)); })

-static inline void __attribute__((always_inline))
-nicvf_addr_write(uintptr_t addr, uint64_t val)
-{
- asm volatile(
- "str %x[val], [%x[addr]]"
- :
- : [val] "r" (val), [addr] "r" (addr));
-}
-
-static inline uint64_t __attribute__((always_inline))
-nicvf_addr_read(uintptr_t addr)
-{
- uint64_t val;
-
- asm volatile(
- "ldr %x[val], [%x[addr]]"
- : [val] "=r" (val)
- : [addr] "r" (addr));
- return val;
-}

#define NICVF_LOAD_PAIR(reg1, reg2, addr) ({ \
asm volatile( \
@@ -106,18 +90,6 @@ nicvf_addr_read(uintptr_t addr)

#define nicvf_prefetch_store_keep(_ptr) do {} while (0)

-static inline void __attribute__((always_inline))
-nicvf_addr_write(uintptr_t addr, uint64_t val)
-{
- *(volatile uint64_t *)addr = val;
-}
-
-static inline uint64_t __attribute__((always_inline))
-nicvf_addr_read(uintptr_t addr)
-{
- return *(volatile uint64_t *)addr;
-}
-
#define NICVF_LOAD_PAIR(reg1, reg2, addr) \
do { \
reg1 = nicvf_addr_read((uintptr_t)addr); \
--
2.5.5
Jerin Jacob
2016-12-27 09:49:34 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write accesses with the EAL
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
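
The two-part 64-bit write that this patch keeps (io_write64_twopart) can be sketched in isolation as below. This is only an illustration: the sketch_* names are hypothetical, plain volatile accesses stand in for rte_write32()/rte_read32(), ordinary variables stand in for device registers, and the read helper is an assumption added to mirror how modern_get_features() combines its high and low halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_write32()/rte_read32(): plain volatile
 * accesses, with no I/O barrier, so the sketch builds without DPDK. */
static inline void
sketch_write32(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

static inline uint32_t
sketch_read32(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Write a 64-bit value as two 32-bit register writes, low half first. */
static inline void
sketch_write64_twopart(uint64_t val, volatile uint32_t *lo,
		       volatile uint32_t *hi)
{
	sketch_write32((uint32_t)(val & 0xffffffffu), lo);
	sketch_write32((uint32_t)(val >> 32), hi);
}

/* Hypothetical helper: rebuild a 64-bit value from two 32-bit reads. */
static inline uint64_t
sketch_read64_twopart(const volatile uint32_t *lo,
		      const volatile uint32_t *hi)
{
	return ((uint64_t)sketch_read32(hi) << 32) | sketch_read32(lo);
}
```

The split exists because the registers are exposed as separate 32-bit low/high fields, so a single 64-bit store is not an option here.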

CC: Huawei Xie <***@intel.com>
CC: Yuanhan Liu <***@linux.intel.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
Acked-by: Yuanhan Liu <***@linux.intel.com>
---
drivers/net/virtio/virtio_pci.c | 97 +++++++++++++----------------------------
1 file changed, 31 insertions(+), 66 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 9b47165..7c1cb4c 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -37,6 +37,8 @@
#include <fcntl.h>
#endif

+#include <rte_io.h>
+
#include "virtio_pci.h"
#include "virtio_logs.h"
#include "virtqueue.h"
@@ -316,48 +318,11 @@ static const struct virtio_pci_ops legacy_ops = {
.notify_queue = legacy_notify_queue,
};

-
-static inline uint8_t
-io_read8(uint8_t *addr)
-{
- return *(volatile uint8_t *)addr;
-}
-
-static inline void
-io_write8(uint8_t val, uint8_t *addr)
-{
- *(volatile uint8_t *)addr = val;
-}
-
-static inline uint16_t
-io_read16(uint16_t *addr)
-{
- return *(volatile uint16_t *)addr;
-}
-
-static inline void
-io_write16(uint16_t val, uint16_t *addr)
-{
- *(volatile uint16_t *)addr = val;
-}
-
-static inline uint32_t
-io_read32(uint32_t *addr)
-{
- return *(volatile uint32_t *)addr;
-}
-
-static inline void
-io_write32(uint32_t val, uint32_t *addr)
-{
- *(volatile uint32_t *)addr = val;
-}
-
static inline void
io_write64_twopart(uint64_t val, uint32_t *lo, uint32_t *hi)
{
- io_write32(val & ((1ULL << 32) - 1), lo);
- io_write32(val >> 32, hi);
+ rte_write32(val & ((1ULL << 32) - 1), lo);
+ rte_write32(val >> 32, hi);
}

static void
@@ -369,13 +334,13 @@ modern_read_dev_config(struct virtio_hw *hw, size_t offset,
uint8_t old_gen, new_gen;

do {
- old_gen = io_read8(&hw->common_cfg->config_generation);
+ old_gen = rte_read8(&hw->common_cfg->config_generation);

p = dst;
for (i = 0; i < length; i++)
- *p++ = io_read8((uint8_t *)hw->dev_cfg + offset + i);
+ *p++ = rte_read8((uint8_t *)hw->dev_cfg + offset + i);

- new_gen = io_read8(&hw->common_cfg->config_generation);
+ new_gen = rte_read8(&hw->common_cfg->config_generation);
} while (old_gen != new_gen);
}

@@ -387,7 +352,7 @@ modern_write_dev_config(struct virtio_hw *hw, size_t offset,
const uint8_t *p = src;

for (i = 0; i < length; i++)
- io_write8(*p++, (uint8_t *)hw->dev_cfg + offset + i);
+ rte_write8((*p++), (((uint8_t *)hw->dev_cfg) + offset + i));
}

static uint64_t
@@ -395,11 +360,11 @@ modern_get_features(struct virtio_hw *hw)
{
uint32_t features_lo, features_hi;

- io_write32(0, &hw->common_cfg->device_feature_select);
- features_lo = io_read32(&hw->common_cfg->device_feature);
+ rte_write32(0, &hw->common_cfg->device_feature_select);
+ features_lo = rte_read32(&hw->common_cfg->device_feature);

- io_write32(1, &hw->common_cfg->device_feature_select);
- features_hi = io_read32(&hw->common_cfg->device_feature);
+ rte_write32(1, &hw->common_cfg->device_feature_select);
+ features_hi = rte_read32(&hw->common_cfg->device_feature);

return ((uint64_t)features_hi << 32) | features_lo;
}
@@ -407,25 +372,25 @@ modern_get_features(struct virtio_hw *hw)
static void
modern_set_features(struct virtio_hw *hw, uint64_t features)
{
- io_write32(0, &hw->common_cfg->guest_feature_select);
- io_write32(features & ((1ULL << 32) - 1),
- &hw->common_cfg->guest_feature);
+ rte_write32(0, &hw->common_cfg->guest_feature_select);
+ rte_write32(features & ((1ULL << 32) - 1),
+ &hw->common_cfg->guest_feature);

- io_write32(1, &hw->common_cfg->guest_feature_select);
- io_write32(features >> 32,
- &hw->common_cfg->guest_feature);
+ rte_write32(1, &hw->common_cfg->guest_feature_select);
+ rte_write32(features >> 32,
+ &hw->common_cfg->guest_feature);
}

static uint8_t
modern_get_status(struct virtio_hw *hw)
{
- return io_read8(&hw->common_cfg->device_status);
+ return rte_read8(&hw->common_cfg->device_status);
}

static void
modern_set_status(struct virtio_hw *hw, uint8_t status)
{
- io_write8(status, &hw->common_cfg->device_status);
+ rte_write8(status, &hw->common_cfg->device_status);
}

static void
@@ -438,21 +403,21 @@ modern_reset(struct virtio_hw *hw)
static uint8_t
modern_get_isr(struct virtio_hw *hw)
{
- return io_read8(hw->isr);
+ return rte_read8(hw->isr);
}

static uint16_t
modern_set_config_irq(struct virtio_hw *hw, uint16_t vec)
{
- io_write16(vec, &hw->common_cfg->msix_config);
- return io_read16(&hw->common_cfg->msix_config);
+ rte_write16(vec, &hw->common_cfg->msix_config);
+ return rte_read16(&hw->common_cfg->msix_config);
}

static uint16_t
modern_get_queue_num(struct virtio_hw *hw, uint16_t queue_id)
{
- io_write16(queue_id, &hw->common_cfg->queue_select);
- return io_read16(&hw->common_cfg->queue_size);
+ rte_write16(queue_id, &hw->common_cfg->queue_select);
+ return rte_read16(&hw->common_cfg->queue_size);
}

static int
@@ -470,7 +435,7 @@ modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
ring[vq->vq_nentries]),
VIRTIO_PCI_VRING_ALIGN);

- io_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+ rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);

io_write64_twopart(desc_addr, &hw->common_cfg->queue_desc_lo,
&hw->common_cfg->queue_desc_hi);
@@ -479,11 +444,11 @@ modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
io_write64_twopart(used_addr, &hw->common_cfg->queue_used_lo,
&hw->common_cfg->queue_used_hi);

- notify_off = io_read16(&hw->common_cfg->queue_notify_off);
+ notify_off = rte_read16(&hw->common_cfg->queue_notify_off);
vq->notify_addr = (void *)((uint8_t *)hw->notify_base +
notify_off * hw->notify_off_multiplier);

- io_write16(1, &hw->common_cfg->queue_enable);
+ rte_write16(1, &hw->common_cfg->queue_enable);

PMD_INIT_LOG(DEBUG, "queue %u addresses:", vq->vq_queue_index);
PMD_INIT_LOG(DEBUG, "\t desc_addr: %" PRIx64, desc_addr);
@@ -498,7 +463,7 @@ modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
static void
modern_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
- io_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+ rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);

io_write64_twopart(0, &hw->common_cfg->queue_desc_lo,
&hw->common_cfg->queue_desc_hi);
@@ -507,13 +472,13 @@ modern_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
io_write64_twopart(0, &hw->common_cfg->queue_used_lo,
&hw->common_cfg->queue_used_hi);

- io_write16(0, &hw->common_cfg->queue_enable);
+ rte_write16(0, &hw->common_cfg->queue_enable);
}

static void
modern_notify_queue(struct virtio_hw *hw __rte_unused, struct virtqueue *vq)
{
- io_write16(1, vq->notify_addr);
+ rte_write16(1, vq->notify_addr);
}

static const struct virtio_pci_ops modern_ops = {
--
2.5.5
Jerin Jacob
2016-12-27 09:49:35 UTC
From: Santosh Shukla <***@caviumnetworks.com>

Replace the raw I/O device memory read/write accesses with the EAL
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.

CC: Yong Wang <***@vmware.com>
Signed-off-by: Santosh Shukla <***@caviumnetworks.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
drivers/net/vmxnet3/vmxnet3_ethdev.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.h b/drivers/net/vmxnet3/vmxnet3_ethdev.h
index 7d3b11e..85c00e4 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.h
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.h
@@ -34,6 +34,8 @@
#ifndef _VMXNET3_ETHDEV_H_
#define _VMXNET3_ETHDEV_H_

+#include <rte_io.h>
+
#define VMXNET3_MAX_MAC_ADDRS 1

/* UPT feature to negotiate */
@@ -120,7 +122,7 @@ struct vmxnet3_hw {

/* Config space read/writes */

-#define VMXNET3_PCI_REG(reg) (*((volatile uint32_t *)(reg)))
+#define VMXNET3_PCI_REG(reg) rte_read32(reg)

static inline uint32_t
vmxnet3_read_addr(volatile void *addr)
@@ -128,9 +130,7 @@ vmxnet3_read_addr(volatile void *addr)
return VMXNET3_PCI_REG(addr);
}

-#define VMXNET3_PCI_REG_WRITE(reg, value) do { \
- VMXNET3_PCI_REG((reg)) = (value); \
-} while(0)
+#define VMXNET3_PCI_REG_WRITE(reg, value) rte_write32((value), (reg))

#define VMXNET3_PCI_BAR0_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr0 + (reg)))
--
2.5.5
Jerin Jacob
2017-01-12 09:16:57 UTC
v2..v3:

1) Changed I40E_PCI_REG_WRITE to I40E_PCI_REG_WRITE_RELAXED in fastpath
i40e_rx_alloc_bufs function(Tiwei)
2) Changed rte_?wb to rte_*wb in the git commit log of
"eal/arm64: change barrier definitions to macros"(Jianbo)
3) Re-based to latest dpdk master(Jan 12)

v1..v2:
1) Changed rte_[read/write]b/w/l/q_[relaxed] to rte_[read/write]8/16/32/64_[relaxed](Yuanhan)
2) Changed rte_?mb to macros for arm64(Jianbo)
3) Changed rte_wmb() followed by rte_write* to rte_wmb() followed by the relaxed
version (rte_write*_relaxed) in the _fast_ path, to avoid an extra memory
barrier on arm64 (Jianbo)
4) Replaced virtio io_read*/io_write* with rte_read*/rte_write* (Yuanhan)
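
The fast-path change in the v1..v2 list (explicit write barrier followed by a relaxed write) can be illustrated with the sketch below. It is not driver code: the sketch_* names and the queue structure are hypothetical, a C11 fence stands in for the real barrier, and the barrier counter exists only to make the difference visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Counts barriers issued; for illustration only. */
static unsigned int sketch_barriers;

/* Hypothetical stand-in for rte_wmb(). */
static inline void
sketch_wmb(void)
{
	atomic_thread_fence(memory_order_release);
	sketch_barriers++;
}

/* Relaxed write: plain volatile store, no barrier. */
static inline void
sketch_write32_relaxed(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

/* Ordered write: barrier first, then the store. */
static inline void
sketch_write32(uint32_t value, volatile void *addr)
{
	sketch_wmb();
	sketch_write32_relaxed(value, addr);
}

struct sketch_txq {
	uint32_t desc_prod;		/* producer index kept in host memory */
	volatile uint32_t *doorbell;	/* stands in for a device register */
};

/* Explicit barrier plus ordered write: two barriers per doorbell. */
static inline void
sketch_ring_doorbell_v1(struct sketch_txq *q, uint32_t prod)
{
	q->desc_prod = prod;
	sketch_wmb();
	sketch_write32(prod, q->doorbell);
}

/* Explicit barrier plus relaxed write: one barrier per doorbell. */
static inline void
sketch_ring_doorbell_v2(struct sketch_txq *q, uint32_t prod)
{
	q->desc_prod = prod;
	sketch_wmb();
	sketch_write32_relaxed(prod, q->doorbell);
}
```

Both variants order the descriptor update before the doorbell; the second one just avoids paying for the barrier twice.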

Based on the discussion in the below-mentioned thread,
http://dev.dpdk.narkive.com/DpIRqDuy/dpdk-dev-patch-v2-i40e-fix-eth-i40e-dev-init-sequence-on-thunderx

This patchset introduces 8-bit, 16-bit, 32-bit and 64-bit I/O device
memory read/write operations, along with their relaxed versions.

Weakly ordered machines such as ARM need an additional I/O barrier for
device memory read/write access over the PCI bus.
By introducing an EAL abstraction for I/O device memory read/write access,
drivers can access I/O device memory in an architecture-agnostic manner.

The relaxed versions do not include the additional I/O memory barrier. They
are useful for accessing the device registers of integrated controllers, which
are implicitly strongly ordered with respect to memory access.
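
As a rough illustration of the ordered/relaxed split described above, an ordered accessor can be thought of as the relaxed accessor plus an I/O barrier. The sketch below uses hypothetical sketch_* names and C11 fences as stand-ins; a real port would use the architecture's device barrier instead, and the actual definitions live in the rte_io.h headers added by this series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the I/O barriers (C11 fences, not real
 * device barriers). */
static inline void
sketch_io_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void
sketch_io_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Relaxed accessors: a plain volatile access, no barrier. */
static inline void
sketch_write32_relaxed(uint32_t value, volatile void *addr)
{
	*(volatile uint32_t *)addr = value;
}

static inline uint32_t
sketch_read32_relaxed(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Ordered write: earlier stores are ordered before the register write. */
static inline void
sketch_write32(uint32_t value, volatile void *addr)
{
	sketch_io_wmb();
	sketch_write32_relaxed(value, addr);
}

/* Ordered read: later loads are ordered after the register read. */
static inline uint32_t
sketch_read32(const volatile void *addr)
{
	uint32_t val = sketch_read32_relaxed(addr);

	sketch_io_rmb();
	return val;
}
```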

This patchset is split into three functional sets:

patchset 1-9: Introduce the I/O device memory barrier EAL abstraction and
implement it for all the architectures.

patchset 10-13: Introduce the I/O device memory read/write EAL abstraction
and implement it for all the architectures using the preceding I/O device
memory barriers.

patchset 14-28: Replace the raw readl/writel in the drivers with the
new rte_read[8/16/32/64], rte_write[8/16/32/64] EAL abstraction

Note:

1) We couldn't test the patches on all the hardware due to unavailability.
We would appreciate feedback from ARCH and PMD maintainers.

2) patch 13/28 has a false positive checkpatch error with the asm syntax

ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#92: FILE: lib/librte_eal/common/include/arch/arm/rte_io_64.h:54:
+ : [val] "=r" (val)

Jerin Jacob (15):
eal: introduce I/O device memory barriers
eal/x86: define I/O device memory barriers for IA
eal/tile: define I/O device memory barriers for tile
eal/ppc64: define I/O device memory barriers for ppc64
eal/arm: separate smp barrier definition for ARMv7 and ARMv8
eal/armv7: define I/O device memory barriers for ARMv7
eal/arm64: fix memory barrier definition for arm64
eal/arm64: define smp barrier definition for arm64
eal/arm64: define I/O device memory barriers for arm64
eal: introduce I/O device memory read/write operations
eal: generic implementation for I/O device read/write access
eal: let all architectures use generic I/O implementation
eal/arm64: override I/O device read/write access for arm64
eal/arm64: change barrier definitions to macros
net/thunderx: use eal I/O device memory read/write API

Santosh Shukla (14):
crypto/qat: use eal I/O device memory read/write API
net/bnxt: use eal I/O device memory read/write API
net/bnx2x: use eal I/O device memory read/write API
net/cxgbe: use eal I/O device memory read/write API
net/e1000: use eal I/O device memory read/write API
net/ena: use eal I/O device memory read/write API
net/enic: use eal I/O device memory read/write API
net/fm10k: use eal I/O device memory read/write API
net/i40e: use eal I/O device memory read/write API
net/ixgbe: use eal I/O device memory read/write API
net/nfp: use eal I/O device memory read/write API
net/qede: use eal I/O device memory read/write API
net/virtio: use eal I/O device memory read/write API
net/vmxnet3: use eal I/O device memory read/write API

doc/api/doxy-api-index.md | 3 +-
.../qat/qat_adf/adf_transport_access_macros.h | 11 +-
drivers/net/bnx2x/bnx2x.h | 26 +-
drivers/net/bnxt/bnxt_cpr.h | 13 +-
drivers/net/bnxt/bnxt_hwrm.c | 7 +-
drivers/net/bnxt/bnxt_txr.h | 6 +-
drivers/net/cxgbe/base/adapter.h | 34 ++-
drivers/net/cxgbe/cxgbe_compat.h | 8 +-
drivers/net/cxgbe/sge.c | 10 +-
drivers/net/e1000/base/e1000_osdep.h | 18 +-
drivers/net/e1000/em_rxtx.c | 2 +-
drivers/net/e1000/igb_rxtx.c | 2 +-
drivers/net/ena/base/ena_eth_com.h | 2 +-
drivers/net/ena/base/ena_plat_dpdk.h | 11 +-
drivers/net/enic/enic_compat.h | 27 +-
drivers/net/enic/enic_rxtx.c | 9 +-
drivers/net/fm10k/base/fm10k_osdep.h | 17 +-
drivers/net/i40e/base/i40e_osdep.h | 10 +-
drivers/net/i40e/i40e_rxtx.c | 6 +-
drivers/net/ixgbe/base/ixgbe_osdep.h | 11 +-
drivers/net/ixgbe/ixgbe_rxtx.c | 13 +-
drivers/net/nfp/nfp_net_pmd.h | 9 +-
drivers/net/qede/base/bcm_osal.h | 20 +-
drivers/net/qede/base/ecore_int_api.h | 28 +-
drivers/net/qede/base/ecore_spq.c | 3 +-
drivers/net/qede/qede_rxtx.c | 2 +-
drivers/net/thunderx/base/nicvf_plat.h | 36 +--
drivers/net/virtio/virtio_pci.c | 97 ++-----
drivers/net/vmxnet3/vmxnet3_ethdev.h | 8 +-
lib/librte_eal/common/Makefile | 3 +-
.../common/include/arch/arm/rte_atomic.h | 6 -
.../common/include/arch/arm/rte_atomic_32.h | 12 +
.../common/include/arch/arm/rte_atomic_64.h | 57 ++--
lib/librte_eal/common/include/arch/arm/rte_io.h | 51 ++++
lib/librte_eal/common/include/arch/arm/rte_io_64.h | 159 +++++++++++
.../common/include/arch/ppc_64/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/ppc_64/rte_io.h | 47 +++
.../common/include/arch/tile/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/tile/rte_io.h | 47 +++
.../common/include/arch/x86/rte_atomic.h | 6 +
lib/librte_eal/common/include/arch/x86/rte_io.h | 47 +++
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++
lib/librte_eal/common/include/generic/rte_io.h | 317 +++++++++++++++++++++
43 files changed, 981 insertions(+), 259 deletions(-)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_io_64.h
create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/tile/rte_io.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_io.h
create mode 100644 lib/librte_eal/common/include/generic/rte_io.h
--
2.5.5
Jerin Jacob
2017-01-12 09:16:58 UTC
This commit introduces rte_io_mb(), rte_io_wmb() and rte_io_rmb() in
order to enable memory barriers between I/O device and CPU.
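
A minimal sketch of where a driver might place the read and write I/O barriers is shown below. Everything here is illustrative: the sketch_* names and the device layout are hypothetical, C11 fences stand in for rte_io_rmb()/rte_io_wmb(), and ordinary struct fields stand in for device registers and DMA-written memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_io_rmb()/rte_io_wmb(). */
static inline void
sketch_io_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static inline void
sketch_io_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

struct sketch_dev {
	volatile uint32_t status;	/* stands in for a status register */
	volatile uint32_t doorbell;	/* stands in for a doorbell register */
	uint32_t rx_len;		/* stands in for data written by DMA */
	uint32_t tx_len;		/* descriptor field read by the device */
};

/* Return the received length, or 0 when the device is not ready. */
static inline uint32_t
sketch_poll_rx(struct sketch_dev *dev)
{
	if ((dev->status & 1u) == 0)
		return 0;
	/* Read rx_len only after the ready bit has been observed. */
	sketch_io_rmb();
	return dev->rx_len;
}

/* Fill the descriptor, then notify the device. */
static inline void
sketch_post_tx(struct sketch_dev *dev, uint32_t len)
{
	dev->tx_len = len;
	/* The descriptor store must be visible before the doorbell. */
	sketch_io_wmb();
	dev->doorbell = 1;
}
```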

Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/generic/rte_atomic.h | 27 ++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h b/lib/librte_eal/common/include/generic/rte_atomic.h
index 43a704e..7b81705 100644
--- a/lib/librte_eal/common/include/generic/rte_atomic.h
+++ b/lib/librte_eal/common/include/generic/rte_atomic.h
@@ -100,6 +100,33 @@ static inline void rte_smp_wmb(void);
*/
static inline void rte_smp_rmb(void);

+/**
+ * General memory barrier for I/O device
+ *
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_io_mb() call are visible to I/O device or CPU before the
+ * LOAD and STORE operations that follow it.
+ */
+static inline void rte_io_mb(void);
+
+/**
+ * Write memory barrier for I/O device
+ *
+ * Guarantees that the STORE operations that precede the
+ * rte_io_wmb() call are visible to I/O device before the STORE
+ * operations that follow it.
+ */
+static inline void rte_io_wmb(void);
+
+/**
+ * Read memory barrier for IO device
+ *
+ * Guarantees that the LOAD operations on I/O device that precede the
+ * rte_io_rmb() call are visible to CPU before the LOAD
+ * operations that follow it.
+ */
+static inline void rte_io_rmb(void);
+
#endif /* __DOXYGEN__ */

/**
--
2.5.5
Jerin Jacob
2017-01-12 09:16:59 UTC
The patch does not provide any functional change for IA.
I/O barriers are mapped to existing smp barriers.
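
The per-architecture mapping idea can be sketched as follows, assuming a build where the read and write I/O barriers reduce to compiler barriers on x86 and to hardware fences elsewhere. The sketch_* names are hypothetical, and C11 atomic_signal_fence()/atomic_thread_fence() are used only as portable stand-ins for rte_compiler_barrier() and the hardware barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for a compiler-only barrier: no fence instruction is emitted. */
#define sketch_compiler_barrier() atomic_signal_fence(memory_order_seq_cst)

#if defined(__x86_64__) || defined(__i386__)
/* x86: a compiler barrier is assumed to be enough for these two. */
#define sketch_io_wmb() sketch_compiler_barrier()
#define sketch_io_rmb() sketch_compiler_barrier()
#define SKETCH_IO_BARRIER_IS_COMPILER_ONLY 1
#else
/* Weakly ordered machines: fall back to a hardware fence. */
#define sketch_io_wmb() atomic_thread_fence(memory_order_release)
#define sketch_io_rmb() atomic_thread_fence(memory_order_acquire)
#define SKETCH_IO_BARRIER_IS_COMPILER_ONLY 0
#endif

/* Driver-style code stays the same regardless of the mapping above. */
static inline void
sketch_publish(uint32_t *data, volatile uint32_t *flag, uint32_t value)
{
	*data = value;
	sketch_io_wmb();
	*flag = 1;
}
```

The caller never sees which branch was selected, which is the point of hiding the barrier choice behind one name per architecture.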

CC: Bruce Richardson <***@intel.com>
CC: Konstantin Ananyev <***@intel.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/x86/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index 00b1cdf..4eac666 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -61,6 +61,12 @@ extern "C" {

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
/*------------------------- 16 bit atomic operations -------------------------*/

#ifndef RTE_FORCE_INTRINSICS
--
2.5.5
Jerin Jacob
2017-01-12 09:17:00 UTC
The patch does not provide any functional change for tile.
I/O barriers are mapped to existing smp barriers.

CC: Zhigang Lu <***@ezchip.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/tile/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/tile/rte_atomic.h b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
index 28825ff..1f332ee 100644
--- a/lib/librte_eal/common/include/arch/tile/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/tile/rte_atomic.h
@@ -85,6 +85,12 @@ static inline void rte_rmb(void)

#define rte_smp_rmb() rte_compiler_barrier()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_compiler_barrier()
+
+#define rte_io_rmb() rte_compiler_barrier()
+
#ifdef __cplusplus
}
#endif
--
2.5.5
Jerin Jacob
2017-01-12 09:17:01 UTC
The patch does not provide any functional change for ppc_64.
I/O barriers are mapped to existing smp barriers.

CC: Chao Zhu <***@linux.vnet.ibm.com>
Signed-off-by: Jerin Jacob <***@caviumnetworks.com>
---
lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
index fb4fccb..150810c 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -87,6 +87,12 @@ extern "C" {

#define rte_smp_rmb() rte_rmb()

+#define rte_io_mb() rte_mb()
+
+#define rte_io_wmb() rte_wmb()
+
+#define rte_io_rmb() rte_rmb()
+
/*------------------------- 16 bit atomic operations -------------------------*/
/* To be compatible with Power7, use GCC built-in functions for 16 bit
* operations */
--
2.5.5