Zettar ZX enables the "distributed" in distributed data-intensive engineering and sciences

Highly scalable, reliable, and petascale-proven in real high-end HPC deployments

About Zettar

"An HPC data transfer solution turns network-bound problems into I/O-bound problems." - Chin Fang, Zettar Inc.

Zettar Inc. delivers a production-ready, scale-out (i.e. cluster-oriented), petascale-proven, highly available, high-speed data transfer solution capable of multi-100+Gbps for distributed data-intensive engineering and science applications.

Zettar's solution is valuable for artificial intelligence, machine learning, digital transformation, and intelligent enterprises. All four demand the movement of massive amounts of data, which the solution delivers. Zettar is a US National Science Foundation (NSF) funded startup in Palo Alto, California.

The solution consists of 1) the data transfer software, Zettar zx; 2) a multi-node (i.e. cluster) data transfer system reference design; and 3) an HPC all-NVMe storage tier (aka "burst buffer") reference design. Zettar zx is capable of scale-out encryption for data transfers. Together, these capabilities let an existing storage and transfer setup transparently gain higher throughput with each additional transfer node, with high reliability and, when encryption is enabled, strong privacy for data in motion. The Intel blog post "Transferring data at 100Gbps and beyond for distributed data-intensive engineering and sciences" explains the solution's overall co-design approach.

As of 2017, the solution is in its third generation. Each generation has been developed, tested, and verified on best-of-breed hardware provided by tier 1 vendor partners in the storage, compute, and networking spaces.

Zettar collaborates with the US Department of Energy (DOE) Office of Science (SC) National Laboratories and the Energy Sciences Network (ESnet), which operates a state-of-the-art national 100Gbps network infrastructure. As a result, both Zettar's software and its data transfer system reference design are proven in this demanding production environment. For example, from early April to early May 2017, Zettar zx repeatedly transferred 1PiB of files in 1.4 days over a 5000-mile shared production ESnet path with 120+ms latency. Even though the transfer bandwidth was capped at 80Gbps, the results were still more than 5X faster than using Amazon AWS Snowball, and much simpler and cleaner too!

Zettar is a founding member of the exclusive and elite "Exascale-ready Club". To be admitted in 2017, an organization or team MUST be capable of using a shared, production, point-to-point WAN link to attain a production data transfer rate of at least 270 PB-km/hour.
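As a back-of-the-envelope check, the headline figures above are internally consistent. The sketch below (assuming 1 PiB = 2^50 bytes, a 5000-mile path of roughly 8047 km, and decimal petabytes of 10^15 bytes for the PB-km figure; these conventions are assumptions, not taken from Zettar documentation) derives the effective throughput and the PB-km/hour rate from the 1 PiB-in-1.4-days transfer:

```python
# Sanity check of the quoted ESnet transfer figures.
# Assumptions: 1 PiB = 2**50 bytes, 5000 miles ~= 8047 km,
# and "PB" in PB-km/hour means a decimal petabyte (10**15 bytes).

PIB_BYTES = 2**50
DURATION_S = 1.4 * 86400           # 1.4 days in seconds
DISTANCE_KM = 5000 * 1.609344     # the 5000-mile ESnet loop

# Effective throughput in Gbps
gbps = PIB_BYTES * 8 / DURATION_S / 1e9
print(f"effective throughput: {gbps:.1f} Gbps")    # ~74.5 Gbps, under the 80 Gbps cap

# Data-moved-times-distance rate in PB-km/hour
pb_moved = PIB_BYTES / 1e15
pb_km_per_hour = pb_moved * DISTANCE_KM / (DURATION_S / 3600)
print(f"rate: {pb_km_per_hour:.0f} PB-km/hour")    # ~270 PB-km/hour
```

Under these assumptions the sustained rate works out to roughly 270 PB-km/hour, consistent with the club's admission threshold.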

Zettar demoed an earlier version of its solution at the Supercomputing 2014 conference (SC14) and showcased its third-generation solution at various vendor partners' booths and the DOE booth at the Supercomputing 2016 conference (SC16). All presentations included a real-time live 100Gbps demo of production preparation runs (with and without TLS encryption) over an innovative and unique 100Gbps, 5000-mile loop provided by ESnet (120+ms latency).

Zettar has deployed its solution at a well-known research university, a project within the US DOE Exascale Computing Project (ECP), and a hyperscale Web property. Many of these use cases involve moving multiple petabytes of data over distance.

During ESCC Spring 2017, Zettar zx transferred 1PiB of files over ESnet, generating more than one third of the live traffic.

Each end employed only a modest two-node cluster of two inexpensive 1U commodity servers, each with 4x10Gbps unbonded Ethernet ports (thus 2 x 4 x 10Gbps = 80Gbps, the bandwidth cap).

Zettar zx generated more than one third of live ESnet traffic

News & Publications

  1. Press release, “Zettar Transferred 1 Petabyte of Data in Just 34 Hours Using AIC Servers”, May 24, 2017
  2. Chin Fang, Les Cottrell, “High Performance PetaByte Transfer” (PB-SLAC-LesCottrell.pdf - restricted to US DOE audience only), ESnet Site Coordinators Committee, May 2, 2017, Berkeley, California
  3. Andrey Kudryavtsev, Chin Fang, “Transferring data at 100Gbps and beyond for distributed data-intensive engineering and sciences”, Intel IT Peer Network article, November 2016
  4. Chin Fang, “Using AIC SB122A-PH 1U 10-Bay NVMe Storage Servers as the Building Blocks of High-Performance Scale-out Storage for HPC Applications”, white paper commissioned by AIC with Intel’s participation, November 2016
  5. Les Cottrell, Chin Fang, Andy Hanushevsky, Wilko Kroeger, Wei Yang, “High Performance Data Transfer”, CHEP 2016, October 10-14, 2016, San Francisco, California
  6. Chin Fang, Les Cottrell, Andy Hanushevsky, Wilko Kroeger, Wei Yang, “High Performance Data Transfer for Distributed Data Intensive Sciences”, SLAC Technical Note: SLAC-TN-16-001, September 2016, SLAC National Accelerator Laboratory, Menlo Park, California
  7. Les Cottrell, Antonio Ceseracciu, Chin Fang, “Zettar Data Transfer Node Appliance testing” (zettar_escc-spring16.pptx - restricted to US DOE audience only), ESnet Site Coordinators Committee, Spring 2016, Berkeley, California
  8. Les Cottrell, Antonio Ceseracciu, Chin Fang, “Zettar Data Transfer Node Appliance testing” (zettar_fq.pptx - restricted to US DOE audience only), ESnet Site Coordinators Committee, September 2015, Austin, Texas
  9. Chin Fang, R. “Les” A. Cottrell, “An Overview of High-performance Parallel Big Data transfers over multiple network channels with Transport Layer Security (TLS) and TLS plus Perfect Forward Secrecy (PFS)”, SLAC Technical Note: SLAC-TN-15-002, April 25, 2015
  10. Chin Fang, “Using NVMe Gen3 PCIe SSD Cards in High-density Servers for High-performance Big Data Transfer Over Multiple Network Channels”, SLAC Technical Note: SLAC-TN-15-001, February 24, 2015
  11. Scot Schultz, “Mellanox and Zettar Crush World Record LOSF Performance Using ESnet OSCARS Test Circuit”, Mellanox blog, Mellanox Technologies, December 12, 2016

Contact Us