Using both DMA and separate AXI Slave as PCIe requester? #40

Open

dbarrie opened this issue Oct 3, 2023 · 2 comments

dbarrie commented Oct 3, 2023

I currently have a design set up with pcie_us_axi_dma as the sole user of the PCIe requester interface. This works just as I'd expect, and I can DMA between the device and the host. However, the design has changed and now calls for the device to be able to access the host's memory using individual AXI transactions from a separate AXI slave module.

I realize that having a DMA engine lets me do basically the same thing as having the device access CPU memory directly, but the behavior of parts of the system external to the design is forcing my hand a bit here. The device must be able to handle AXI transactions generated by the design that potentially cross the PCIe interface and end up in the host's address space.

Is there any way, with the library as it stands, to split the RQ and RC interfaces and share them between two separate users? Are there any plans to add a drop-in AXI slave module that would let the design treat the entire PCIe host address space as an AXI bus?
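
For the RQ side, what I'm imagining is roughly the following sketch - signal names and widths are made up, not the library's actual RQ interface, and the keep/user/sequence-number sideband is omitted - two requesters arbitrated onto one stream at packet boundaries:

```verilog
// Rough sketch of a 2:1 requester mux with fixed priority and switching only
// at packet boundaries. Signal names and widths are made up; the real RQ path
// also carries keep/user/sequence-number sideband that is omitted here.
module rq_mux_2to1 #(
    parameter DATA_W = 256
) (
    input  wire              clk,
    input  wire              rst,

    // input stream 0 (e.g. the DMA engine)
    input  wire [DATA_W-1:0] s0_tdata,
    input  wire              s0_tvalid,
    output wire              s0_tready,
    input  wire              s0_tlast,

    // input stream 1 (e.g. the AXI slave bridge)
    input  wire [DATA_W-1:0] s1_tdata,
    input  wire              s1_tvalid,
    output wire              s1_tready,
    input  wire              s1_tlast,

    // muxed output toward the PCIe requester (RQ) interface
    output wire [DATA_W-1:0] m_tdata,
    output wire              m_tvalid,
    input  wire              m_tready,
    output wire              m_tlast
);

    reg sel_reg  = 1'b0;  // which input currently owns the output
    reg busy_reg = 1'b0;  // a packet is in flight; hold the grant until tlast

    assign m_tdata   = sel_reg ? s1_tdata : s0_tdata;
    assign m_tlast   = sel_reg ? s1_tlast : s0_tlast;
    assign m_tvalid  = busy_reg && (sel_reg ? s1_tvalid : s0_tvalid);
    assign s0_tready = busy_reg && m_tready && !sel_reg;
    assign s1_tready = busy_reg && m_tready &&  sel_reg;

    always @(posedge clk) begin
        if (!busy_reg) begin
            // arbitrate only between packets; port 0 has fixed priority
            if (s0_tvalid) begin
                sel_reg  <= 1'b0;
                busy_reg <= 1'b1;
            end else if (s1_tvalid) begin
                sel_reg  <= 1'b1;
                busy_reg <= 1'b1;
            end
        end else if (m_tvalid && m_tready && m_tlast) begin
            // release the grant at the end of the packet
            busy_reg <= 1'b0;
        end
        if (rst) begin
            sel_reg  <= 1'b0;
            busy_reg <= 1'b0;
        end
    end

endmodule
```

The RC direction is the part I'm less sure about, since completions have to get back to whichever requester issued the read.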

alexforencich (Owner) commented

The pcie_us_axi_dma module is basically deprecated; there will be no extensive modifications or extensions to that module. The more recent DMA engine is dma_if_pcie, which is used in combination with device-specific interface shims (pcie_us_if, etc.), and there are several FIFO and mux/demux modules available. There isn't really a clean way to share the DMA side, though, due to how the tag space works. However, the dma_if_pcie module also supports immediate data for performing small writes without having to manage internal buffer space. Currently I only have client modules for AXI stream; I'm planning on making both master and slave AXI DMA clients, but have not yet had the time to do so.

dbarrie (Author) commented Oct 3, 2023

I originally chose pcie_us_axi_dma simply because it let me go straight from the PCIe interface to an AXI interface that I could plug into the rest of the design without any additional steps; when looking through the library, I didn't see any other built-in way to connect the PCIe interface to an AXI bus master. Since that module is deprecated, though, it seems the intention is for the user to write their own conversion between the RAM interface exposed by dma_if_pcie and an AXI master, translating the RAM requests into AXI transactions? It doesn't look like that interface has any mechanism for performing memory transactions larger than the data width - could doing it this way potentially reduce the performance of the DMA?
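
For the read direction, I'm picturing something like the sketch below - the command format and RAM-side port are made up (not dma_if_pcie's actual segmented RAM interface), and 4 KB boundary splitting is ignored - where a block-level command still turns into a single multi-beat AXI burst rather than one transaction per data word:

```verilog
// Hypothetical sketch: turn one block command into a single AXI read burst
// and write the returned beats into a local RAM port. The command and RAM
// signals are placeholders, not dma_if_pcie's segmented RAM interface, and
// 4 KB boundary / >255-beat splitting is not handled.
module axi_rd_to_ram #(
    parameter DATA_W     = 128,
    parameter ADDR_W     = 64,
    parameter RAM_ADDR_W = 16
) (
    input  wire                  clk,
    input  wire                  rst,

    // block command: read cmd_len_beats beats starting at cmd_axi_addr,
    // store them at cmd_ram_addr and upward
    input  wire [ADDR_W-1:0]     cmd_axi_addr,
    input  wire [RAM_ADDR_W-1:0] cmd_ram_addr,
    input  wire [7:0]            cmd_len_beats,   // number of beats, 1..255
    input  wire                  cmd_valid,
    output wire                  cmd_ready,

    // AXI read address channel
    output reg  [ADDR_W-1:0]     m_axi_araddr,
    output reg  [7:0]            m_axi_arlen,
    output wire [2:0]            m_axi_arsize,
    output wire [1:0]            m_axi_arburst,
    output reg                   m_axi_arvalid,
    input  wire                  m_axi_arready,

    // AXI read data channel
    input  wire [DATA_W-1:0]     m_axi_rdata,
    input  wire                  m_axi_rlast,
    input  wire                  m_axi_rvalid,
    output wire                  m_axi_rready,

    // simple RAM write port (one word per AXI beat)
    output reg  [RAM_ADDR_W-1:0] ram_wr_addr,
    output wire [DATA_W-1:0]     ram_wr_data,
    output wire                  ram_wr_en,

    output wire                  busy
);

    localparam [2:0] ARSIZE = $clog2(DATA_W/8);

    reg busy_reg = 1'b0;

    assign m_axi_arsize  = ARSIZE;
    assign m_axi_arburst = 2'b01;       // INCR burst
    assign m_axi_rready  = busy_reg;    // accept beats while a burst is in flight
    assign ram_wr_data   = m_axi_rdata;
    assign ram_wr_en     = m_axi_rvalid && m_axi_rready;
    assign cmd_ready     = !busy_reg;
    assign busy          = busy_reg;

    always @(posedge clk) begin
        if (cmd_valid && cmd_ready) begin
            // launch one burst; AXI arlen is "beats minus one"
            m_axi_araddr  <= cmd_axi_addr;
            m_axi_arlen   <= cmd_len_beats - 1;
            m_axi_arvalid <= 1'b1;
            ram_wr_addr   <= cmd_ram_addr;
            busy_reg      <= 1'b1;
        end
        if (m_axi_arvalid && m_axi_arready) begin
            m_axi_arvalid <= 1'b0;
        end
        if (m_axi_rvalid && m_axi_rready) begin
            ram_wr_addr <= ram_wr_addr + 1;
            if (m_axi_rlast) begin
                busy_reg <= 1'b0;
            end
        end
        if (rst) begin
            m_axi_arvalid <= 1'b0;
            busy_reg      <= 1'b0;
        end
    end

endmodule
```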

If I refactor my design to use dma_if_pcie, it looks like I should be able to mux the TLPs (exactly like I'm currently doing for the CQ/CC interface that lets the host poke at registers and memory directly) and then just write my own AXI slave -> TLP module? For the CQ interface, I demux the TLPs based on the BAR specified by the TLP, but I'm not sure how I would route the RC TLPs back to the correct destination (either dma_if_pcie or my own AXI slave). Is there something obvious I'm missing that would let me distinguish the correct destination for RC TLPs? Could I maybe have the DMA use fewer tag bits than are available and use one of the remaining tag bits to select which of the two modules an RC TLP should be routed to?
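
Concretely, the RC routing I have in mind is something like the sketch below. The tag sideband and signal names are made up for illustration (in the real design the tag would come from the RC descriptor or the generic TLP interface's sideband), and it assumes the DMA engine is restricted to tags with the MSB clear:

```verilog
// Hypothetical sketch: route read completions to one of two requesters based
// on the MSB of the completion tag. Assumes the tag arrives as a sideband
// field valid on the first beat of each completion; signal names are
// placeholders, not the library's actual RC/TLP interface.
module rc_demux_by_tag #(
    parameter DATA_W = 256,
    parameter TAG_W  = 8
) (
    input  wire              clk,
    input  wire              rst,

    // completion stream from the PCIe core
    input  wire [DATA_W-1:0] s_tdata,
    input  wire [TAG_W-1:0]  s_ttag,     // valid on the first beat
    input  wire              s_tvalid,
    output wire              s_tready,
    input  wire              s_tlast,

    // output 0: DMA engine (tags with MSB = 0)
    output wire [DATA_W-1:0] m0_tdata,
    output wire              m0_tvalid,
    input  wire              m0_tready,
    output wire              m0_tlast,

    // output 1: AXI slave bridge (tags with MSB = 1)
    output wire [DATA_W-1:0] m1_tdata,
    output wire              m1_tvalid,
    input  wire              m1_tready,
    output wire              m1_tlast
);

    reg frame_reg = 1'b0;   // inside a multi-beat completion
    reg sel_reg   = 1'b0;   // routing decision latched on the first beat

    // decide from the tag MSB on the first beat; hold the decision afterwards
    wire sel = frame_reg ? sel_reg : s_ttag[TAG_W-1];

    assign m0_tdata  = s_tdata;
    assign m1_tdata  = s_tdata;
    assign m0_tlast  = s_tlast;
    assign m1_tlast  = s_tlast;
    assign m0_tvalid = s_tvalid && (sel == 1'b0);
    assign m1_tvalid = s_tvalid && (sel == 1'b1);
    assign s_tready  = sel ? m1_tready : m0_tready;

    always @(posedge clk) begin
        if (s_tvalid && s_tready) begin
            sel_reg   <= sel;
            frame_reg <= !s_tlast;   // stay locked until the last beat
        end
        if (rst) begin
            frame_reg <= 1'b0;
            sel_reg   <= 1'b0;
        end
    end

endmodule
```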

The design itself needs the DMA to move data quickly between the host and device, but the AXI slave I'm now trying to add is meant to give any other AXI master in the design (of which there are many - this is a toy GPU!) access to the CPU's memory space through a page-table mapping similar to a GART.
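
The translation step itself would be simple enough - something like the following, where the page size, table format, and port names are all made up for illustration; the translated host address would then feed whatever generates the outbound requests toward the host:

```verilog
// Hypothetical sketch of a GART-style translation step: the upper bits of an
// incoming AXI address index a small page table, and the entry supplies the
// host (PCIe bus) page number. Table format, page size, and the single-cycle
// registered lookup are illustrative assumptions, not part of verilog-pcie.
module gart_lookup #(
    parameter AXI_ADDR_W  = 32,
    parameter HOST_ADDR_W = 64,
    parameter PAGE_SHIFT  = 12,              // 4 KB pages
    parameter ENTRIES     = 1024,            // aperture of ENTRIES * 4 KB
    parameter INDEX_W     = $clog2(ENTRIES)
) (
    input  wire                              clk,

    // page table update port (e.g. written by the driver through a BAR)
    input  wire [INDEX_W-1:0]                pt_wr_index,
    input  wire [HOST_ADDR_W-PAGE_SHIFT-1:0] pt_wr_page,
    input  wire                              pt_wr_en,

    // translation: device-local AXI address in, host address out
    input  wire [AXI_ADDR_W-1:0]             in_addr,
    input  wire                              in_valid,
    output reg  [HOST_ADDR_W-1:0]            out_addr,
    output reg                               out_valid
);

    // one host page number per aperture page
    reg [HOST_ADDR_W-PAGE_SHIFT-1:0] page_table [0:ENTRIES-1];

    wire [INDEX_W-1:0]    index  = in_addr[PAGE_SHIFT +: INDEX_W];
    wire [PAGE_SHIFT-1:0] offset = in_addr[PAGE_SHIFT-1:0];

    always @(posedge clk) begin
        if (pt_wr_en) begin
            page_table[pt_wr_index] <= pt_wr_page;
        end
        // registered lookup: host address = {page number from table, offset}
        out_addr  <= {page_table[index], offset};
        out_valid <= in_valid;
    end

endmodule
```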

(As an aside, I just wanted to say how awesome your verilog-pcie and verilog-axi libraries are! I'm using them extensively in my design, and after writing some custom SV wrappers for them, they've been nothing but easy to use and a massive time saver compared to dealing with Xilinx's versions. My familiarity with PCIe is pretty limited, and I was still able to drop verilog-pcie into my design and get things stood up incredibly quickly. The benefit these libraries provide to the HDL community cannot be overstated!)
