

Intilop Corporation 4800 Great America Pkwy Ste-231 Santa Clara, CA 95054 Ph: 408-496-0333 Fax:408-496-0444 www.Intilop.com

## 10G bit TCP+UDP Offload Engine MAC + PCIe + Host\_IF (Same PHY Port)

## INT 25012 (Ultra-Low Latency SXTOE+UOE+MAC+PCIe+Host\_IF)

# **SOC IP**

## **Top Level Product Specifications**

Intilop does not assume any liability arising out of the application or use of any product described or shown herein; nor does it convey any license under its patents, copyrights, or design work rights or any rights of others. Intilop reserves the right to make changes, at any time, in order to improve functionality, performance, supportability and reliability of this product. Intilop will not assume responsibility for the use of any code described herein other than the code entirely embodied in its own products or developed under a legally binding contract. Intilop provides design, code, or information shown or described herein "as is." By providing the design, code, or information as one possible implementation of a feature, application, or standard, Intilop makes no representation that such implementation is free from any claims of infringement. End users are responsible for obtaining any rights they may require for their implementation. Intilop expressly disclaims any warranty whatsoever with respect to the adequacy of any such implementation, including but not limited to any warranties or representations that the implementation is free from claims of infringement, as well as any implied warranties of merchantability or fitness for a particular purpose.

Intilop will not assume any liability for the accuracy or correctness of any engineering or software support or assistance provided to a user. Intilop products are not intended for use in life support appliances, devices, or systems. Use of Intilop's product in such applications without the written consent of the appropriate executive of Intilop is prohibited.

# **Product brief, features and benefits summary:**

Highly customizable hardware IP core. Easily portable to Xilinx/Altera FPGAs and to ASIC or structured-ASIC flows.

# **Provides Ultra-Low latency and highest bandwidth (NETWORK PROVEN)**

- Latency through the 10G TOE/UOE: less than 100 ns
- Ultra-high throughput: receives and sends sustained large TCP/UDP payloads, subject to the remote server's or client's capability
- Fully Integrated and tested on Altera/Xilinx FPGAs; TOE+UOE+MAC+Host\_I/F SoC IP bundle

INT 25012 is the only SoC IP core that implements a full 10 Gbit TCP and UDP stack in a handcrafted, ultra-high-performance, innovative, flexible, and scalable architecture that can also be easily customized for end-product differentiation. It provides the lowest latency and highest performance in the industry, without exception.

INT 25012 is also the only SoC that integrates 10G TOE+UOE, 10G EMAC, PCIe, and Host\_IF interfaces in the smallest logic footprint. It is highly flexible and can be customized for Layer 3 and Layer 4-7 network infrastructure and network security system applications. It is recommended for, among other uses, high-performance financial servers and data center equipment designs. It provides key IP building blocks for very high performance 10 Gigabit Ethernet ASICs, ASSPs, and FPGAs.

INT 25012 has built in advanced architectural flexibility that provides capability for enterprises to differentiate their Network infrastructure appliances from others and customize them for their specific design application.

INT 25012 can process TCP/UDP session traffic for network equipment at a 10 Gbit/s rate. This relieves the host CPU of costly software UDP session handling, data copying, and maintenance tasks, delivering an 8x to 15x UDP network performance improvement compared with a software UDP stack.

Intilop offers a wide range of TOE and UOE processing hardware cores for 10-GE to 1-GE applications using PCI Express or embedded system interfaces. TOE/UOE products support full TCP/UDP offload as well as conventional NIC mode operation (as an option in UDP Bypass Mode) and feature advanced PCIe/DMA software support (optional) where applications need little modification/integration to take advantage of TOE+UOE acceleration.

It provides easy-to-use frameworks for the Xilinx Virtex-5/6 and Altera Stratix-IV/V and, as an option, PCIe/DMA hard-core IPs, enabling rapid and efficient system application development.

The 10 Gbit TOE+UOE is based on the proven, mature, patent-pending 10 Gbit TOE+UOE architecture from Intilop Corporation.

The same architecture is scalable to 40 G bit.

## TOE/UOE design version options, in addition to the standard TOE/UOE version:

- Generic TOE+UOE for network infrastructure design applications:
  - a) Optional very high performance DMA blocks also available to integrate with a high performance PCIe Gen 2 interface.
  - b) PCIe driver for Linux available as an option.
- TOE+UOE with enhanced features (available upon request). All of the options available in the standard TOE/UOE, plus:
  - i. IP and port number filter block
  - ii. Specific IP and port filtered traffic routed to optional selected MAC interface/s, PCIe interface, or memory interface directly at line rate without CPU involvement.
  - iii. MAC filter block; traffic routed to any of the selected interfaces
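
As a rough illustration of the IP and port filter options above, the following Python sketch models a filter that maps matching traffic to an output interface. The rule format, the interface labels (`pcie`, `mac1`, `host`), and the `classify` function are hypothetical; the core's actual filter configuration is not described in this brief.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    network: str          # destination IP or CIDR prefix to match
    port: Optional[int]   # destination port, or None to match any port
    route_to: str         # hypothetical interface label


def classify(rules, dst_ip, dst_port, default="host"):
    """Return the interface label of the first rule that matches."""
    addr = ip_address(dst_ip)
    for rule in rules:
        if addr in ip_network(rule.network) and rule.port in (None, dst_port):
            return rule.route_to
    return default  # unmatched traffic goes to the default interface


RULES = [
    FilterRule("10.0.0.5/32", 5000, "pcie"),   # one host, one port
    FilterRule("10.0.1.0/24", None, "mac1"),   # whole subnet, any port
]

print(classify(RULES, "10.0.0.5", 5000))     # pcie
print(classify(RULES, "10.0.1.77", 80))      # mac1
print(classify(RULES, "192.168.1.1", 443))   # host
```

In hardware, a lookup of this kind would typically be done in parallel rather than by walking a list, but the first-match semantics shown here are a common way to reason about the rule set.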

## **Benefits of Intilop TOE+UOE:**

Featuring APIs at different levels, the standard TOE and UOE allow the application developer to migrate easily from software, to TOE/UOE hardware, to custom hardware, to achieve higher performance.

Advantages and benefits of TOE/UOE:

- 20 Gbps throughput (wire speed in full duplex).
- Very low application-to-application latency
- Scalable solution: the architecture scales to 40G

## APIs

Network applications use the Socket API. Typically, the OS implements the Socket API with a UDP/IP software stack. The Intilop TOE+UOE instead implements a standard hardware API that bypasses the kernel and places the user\_payload data directly in user\_space, allowing higher-level applications to take full advantage of the TOE+UOE's hardware offload.

Optionally, to achieve higher performance, Intilop has implemented an equivalent Socket API, named the TOE+UOE Socket API, through the PCIe driver. It enables plug-and-play acceleration through a simple intercept of legacy standard calls.

• Hardware API: Enables dedicated processing in the FPGA for application specific acceleration
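
For context, the sketch below shows the kind of unmodified Socket API code that an intercept layer of this sort would target. It uses only Python's standard `socket` module over the loopback interface and runs on the ordinary OS stack; it does not call any Intilop API, and how the TOE+UOE Socket API intercepts such calls is not specified in this brief. The function name `udp_round_trip` is made up for the example.

```python
import socket


def udp_round_trip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one UDP datagram to a local receiver and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))        # let the OS pick a free port
        rx.settimeout(timeout)           # do not block forever on loss
        tx.sendto(payload, rx.getsockname())
        data, _sender = rx.recvfrom(65535)
        return data


if __name__ == "__main__":
    print(udp_round_trip(b"market-data tick"))
```

The point of an intercept-style API is that application code like this would stay the same while the calls underneath are serviced by offload hardware instead of the kernel stack.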

#### TOE:

• Fully verified using a comprehensive verification methodology for ASIC ports; network-system-tested core.

• Smallest logic footprint: less than 30,000 Xilinx slices or Altera ALMs, or 250,000 ASIC gates, plus on-chip memory

• Fully integrated 10 G bit high performance Ethernet MAC.

• Scalable MAC Rx FIFOs and Tx FIFOs make it ideal for optimizing system performance.

• Hardware implementation of the TCP/IP stack's control plane and data plane.

• Hardware implementation of ARP protocol processing.

• Extended ARP table creation, deletion management (optional)

• Adheres to RFCs 793, 1500, 1700, 813, 791, and 2001

• Hardware implementation of ICMP or Ping processing.

• 'Sliding window' or a similar mechanism implemented in hardware, allowing flow control

• 'Slow start' transfer control in hardware (opt)

• Customizable for IP-protocol only.

• Non-TCP Bypass mode lets all non-TCP/IP related traffic go directly to the host interface via User\_FIFO for TCP/IP software to handle

• Can be deployed behind a gateway which will respond to Gateway-IP request as opposed to ARP request

• On-chip DDR or SSRAM memory controller which can address from 4 KB to 4 MB of on-chip memory or 256 MB of off-chip memory (optional)

• Simple user-side interface for easy hardware integration, or a slightly more complex one for more powerful and controlled 'streaming' data transfers.

• Many trade-offs for some functions performed in hardware or software

• Configurable packet buffers and session table buffers in on-chip or off-chip memories, with attached DDR I/II interface. User customizable depending on system, performance, and ASIC/FPGA size requirements (optional)

• Interfaces directly to XGMII, 10 Gbit serial interfaces

• Customizable to handle jumbo frames

• Integrated PLB interface (Xilinx) or Altera PLB. AXI bus interface available

• Integrated AMBA 2.0 interface or MIPS CPU bus for local processor control. (opt)

• User programmable/prioritize-able interrupts

• Performs connection/session management

• Monitors, Stores, Maintains and processes up to 1024 live TCP sessions. Customizable to implement more, depending upon on-chip memory availability and other FPGA limitations.

• Extendable to 4K TCP sessions. Internal Memory dependent.

• Wire-speed 20-Gbps Ethernet and TCP performance in full duplex

• Multiple TOEs can process up to 4K connections per second

• TCP + IP checksum generation and checking performed in hardware in less than 3 clock cycles (under 20 ns at 200 MHz) vs. 1-2 µs for a typical software TCP stack

• Connection Set up, tear down/termination and TCP data transfer without CPU involvement.

• User programmable Session table parameters

• Dedicated set of hardware Timers for each TCP/IP session (opt) or customizable for sharing one set of common timers for all stale sessions.

• Multiple 'slot storage' for fragmented packets. More slots are allocated when more on-chip memory is available. Self-checking available-memory logic. (optional)

• Out of sequence packet detection/storage and Reassembly/Segmentation (optional)

• Direct data placement in the application's buffer at full wire speed without the CPU, which reduces the CPU's buffer copy time and utilization by 95%

• Supports VLAN mode (optional)

• Easily customizable for filtering various IP and TCP traffic protocols directed towards any port or IP (ideal for security appliances)

• Implements full TCP/IP offload. No CPU involvement at any TCP stage

• Future proof: flexible implementation of TCP offload

• Fully integrated and FPGA-ported PHY+MAC+TOE+PCIe/DMA system (opt)

• Basic mini API available for easy integration with Linux/Windows. Other OSs/CPUs also available

• Fully integrated system with PCIe/DMA and driver (optional)

• EMAC+TOE+Host\_Interface as one bundle SoC.

• Future TCP Specs updates easily adaptable
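
The checksum that the TOE generates and checks in hardware is the standard Internet checksum used by IP, TCP, and UDP (a 16-bit ones' complement sum, described in RFC 1071). A minimal software reference, useful for example when checking packets produced by a test bench, might look like the sketch below. This is not Intilop code; the function name and the sample IPv4 header are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over `data` (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)
print(hex(checksum))  # 0xb861

# A receiver verifies by summing the header with the checksum filled in;
# a valid header yields 0.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(filled))  # 0
```

A software stack runs a loop like this per packet, which is where the microsecond-scale cost quoted for software comes from; a hardware datapath can accumulate the sum as the words stream past.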

#### UDP:

• Fully verified using a comprehensive verification methodology for ASIC ports; network-system-tested core.

• Smallest logic footprint: less than 30,000 Xilinx slices or Altera ALMs, or 250,000 ASIC gates, plus on-chip memory

• Fully integrated 10 G bit high performance Ethernet MAC.

• Scalable MAC Rx FIFOs and Tx FIFOs make it ideal for optimizing system performance.

• Hardware implementation of the UDP stack's control plane and data plane.

• UDP/IP unicast or Broadcast

• Hardware implementation of ARP protocol processing.

• Extended ARP table creation, deletion management (optional)

• Hardware implementation of ICMP or Ping processing/Replies.

• Filters for IP addresses and UDP Port numbers

• Non-UDP Bypass mode lets all Non UDP related traffic go directly to host interface via user\_fifo for UDP software to handle

• Can be deployed behind a gateway which will respond to Gateway-IP request as opposed to ARP request

• On-chip DDR or SSRAM memory controller which can address from 4 KB to 4 MB of on-chip memory or 256 MB of off-chip memory (optional)

• Simple user-side interface for easy hardware integration, or a slightly more complex one for more powerful and controlled 'streaming' data transfers.

• Many trade-offs for some functions performed in hardware or software

• Configurable packet buffers and port table buffers in on-chip or off-chip memories, with attached DDRx interface. User customizable depending on system, performance, and ASIC/FPGA size requirements (optional)

• Interfaces directly to XGMII, 10 G Bit serial interfaces

• Architecture can be scaled up to 40-G bits

• Integrated PLB interface (Xilinx) or Altera PLB. AXI bus interface available

• Integrated AMBA 2.0 interface or MIPS CPU bus for local processor control. (opt)

• User programmable/prioritize-able interrupts

• Wire-speed 20-Gbps Ethernet performance in full duplex

• UDP + IP checksum generation and checking performed in hardware in less than 3 clock cycles (20 ns at 156 MHz) vs. 1-2 µs for a typical software UDP stack

• User programmable Session table parameters

• Dedicated set of hardware Timers for each UDP session (opt) or customizable for sharing one set of common timers for all stale sessions.

• Direct placement of payload data in the application's buffer at full wire speed without the CPU, which reduces the CPU's buffer copy time and utilization by 95%

• Support VLAN mode (optional)

• Easily customizable for filtering various IP and UDP traffic Protocols, directed towards any port or IP (Ideal for Trading Appliances)

• Implements Full UDP Offload. No CPU involvement at any stage

• Future Proof- Flexible implementation of UDP Offload

• Fully integrated and FPGA ported PHY+MAC+TOE+UOE+PCIe+Host\_IF System (opt)

• Basic mini API available for easy integration with Linux/Windows. Other OSs/CPUs also available

• Fully integrated SoC with PHY+MAC+TOE+UOE+PCIe/DMA and driver

• Future UDP Specs updates easily adaptable
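
The cycle counts and line rates quoted in the feature lists can be sanity-checked with simple arithmetic. The sketch below converts clock cycles to nanoseconds and computes the raw bandwidth of a datapath. The 64-bit datapath width and the exact 156.25 MHz clock are assumptions based on common 10 Gigabit Ethernet designs, not figures taken from this brief (which quotes 156 MHz and 200 MHz).

```python
def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Time taken by `cycles` clock cycles at `clock_mhz`, in nanoseconds."""
    return cycles * 1000.0 / clock_mhz


def datapath_gbps(width_bits: int, clock_mhz: float) -> float:
    """Raw bandwidth of a datapath moving `width_bits` per clock, in Gbit/s."""
    return width_bits * clock_mhz / 1000.0


print(cycles_to_ns(3, 156.25))      # 19.2  (about 20 ns)
print(cycles_to_ns(3, 200.0))       # 15.0
print(datapath_gbps(64, 156.25))    # 10.0  (one direction)
```

Three cycles therefore stay under 20 ns at either clock, roughly two orders of magnitude below the 1-2 µs quoted for a software stack.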



## PCI Express IP core and TOE/UOE+PCIe FPGA NIC Key features:

- Compliant with the PCI Express® Base Specification, revision 2.0 and 1.1
- Supports Native and Legacy Endpoint: x1, x4, x8 lanes
- 1 Virtual Channel (VC) with standard TOE/UOE+DMA+PCIe NIC System
- Up to 32 PCIe Virtual Channels available as Option
- Direct UOE Register access via PCIe interface.
- Dedicated/Independent high performance TCP Payload Data Path between TOE/UOE and PCIe
- TOE/UOE-PCIe driver API for easy Linux Host System Application integration
- Standard UOE+PCIe+DMA FPGA-NIC implements up to 4 DMAs.
- Includes Physical, Data Link, Transaction, and EZDMA Application layers
- Optimized for high throughput and minimal latency
- PIPE interface to PHY
  - 16-bit/125 MHz or 8-bit/250 MHz
- Maximum payload size up to 2KB
- Number of outstanding read requests: up to 16
- Up to 6 BARs plus expansion ROM
- DMA-based user interface
  - Up to 8 DMA channel option
  - Scatter-gather support with host-based descriptors
- Integrated DMA arbitration optimized for maximum throughput
- PCIe Standard Linux Driver with fully Integrated FPGA-NIC-System/development Kit
- Ultra-High Performance, Ultra-low latency PCIe driver with fully Integrated FPGA-NIC-System/development Kit, available as Option
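
To illustrate what scatter-gather with host-based descriptors means in practice, the sketch below builds a descriptor list for a payload spread across several host buffers. The descriptor layout (64-bit address, 32-bit length, 32-bit flags, little-endian), the `FLAG_LAST` marker, and the 2 KB per-descriptor limit are hypothetical choices for the example; the actual descriptor format of the DMA engine is not given in this brief.

```python
import struct

DESC_FORMAT = "<QII"   # hypothetical: 64-bit address, 32-bit length, 32-bit flags
FLAG_LAST = 0x1        # hypothetical end-of-chain marker
MAX_CHUNK = 2048       # assumed per-descriptor limit for this example


def build_descriptors(fragments):
    """fragments: list of (host_address, length). Returns packed descriptors."""
    chunks = []
    for addr, length in fragments:
        offset = 0
        while offset < length:               # split fragments larger than MAX_CHUNK
            size = min(MAX_CHUNK, length - offset)
            chunks.append((addr + offset, size))
            offset += size
    descs = []
    for i, (addr, size) in enumerate(chunks):
        flags = FLAG_LAST if i == len(chunks) - 1 else 0
        descs.append(struct.pack(DESC_FORMAT, addr, size, flags))
    return descs


descs = build_descriptors([(0x1000_0000, 3000), (0x2000_0000, 512)])
for d in descs:
    addr, size, flags = struct.unpack(DESC_FORMAT, d)
    print(hex(addr), size, flags)
# 0x10000000 2048 0
# 0x10000800 952 0
# 0x20000000 512 1
```

The host driver would place such a list in memory and hand the DMA engine a pointer to it, so that non-contiguous buffers can be transferred without an intermediate copy.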





# FPGA Development System with fully integrated and tested PHY+MAC+TOE+UOE+PCIe+Host\_IF available as an option

# Specifications brief:

- Second Gen-TOE+UOE. Protocol Compliance and functionality proven in multiple networking equipment
- Complete header and flag processing of UDP sessions and UDP payloads in hardware → accelerates by 8x to 15x
- UDP Offload Engine- 10-G b/s Wire-speed performance
- Scalable to 40 G b/s
- UDP + IP check sum- hardware
- UDP port address tracking/automatic DMA
- MAC Address search logic/filter (opt)
- IP address search logic/filter (opt)
- iRDMA implementation: direct data placement in the application's buffer, which reduces CPU utilization by 95+%
- Future proof: flexible implementation of UDP offload that accommodates future specification changes.

## AMBA/PLB/AXI CPU interface features:

- Basic transfers
- Various Transfer types
- Master/slave Bus Arbitration (optional)
- Bus slave transactions
- Bus master transactions (optional)
- Address decoder
- Bus Arbiter (optional)
- System Endianness: Little-Endian

## **Deliverables:**

- NetList.
- Test bench, .vcd files, configuration code/API for easy Linux port
- Linux Driver for TOE+PCIe/DMA
- Verilog models for various components e.g. UDP Client and Server models, transaction model (optional)
- External memory interface/model (optional).
- UDP Model (optional)
- Verification suite (optional)
- Test packet-traffic suite (optional)

## **CONTACT INTILOP FOR LATEST SPECIFICATIONS** (Specifications are subject to change)

**Technology and Solutions License Purchasing Options:** 

- IP Core in Netlist form.
- Fully ported and Network tested FPGA development System Platform
- IP Customization and Customer Hardware and Application Software integration services.

Contact: info@intilop.com for details

