I/O System

In the era of data-intensive computing, large-scale applications in both the scientific and BigData communities exhibit unique I/O requirements, leading to a variety of storage solutions that are often incompatible with one another. How can we support a wide variety of conflicting I/O workloads under a single storage system?

“The tools and cultures of HPC and BigData analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains.”

D. Reed & J. Dongarra 

We introduce the idea of a Label, a new data representation, and we present LABIOS: a new, distributed, Label-based I/O system. LABIOS boosts I/O performance by up to 17x via asynchronous I/O, supports heterogeneous storage resources, offers storage elasticity, and promotes in-situ analytics via data provisioning. LABIOS demonstrates the effectiveness of storage consolidation to support the convergence of HPC and BigData workloads on a single platform.

High-Level Architecture


I/O requests are transformed into a configurable unit, called a (data) Label


Labels are pushed to a distributed queue


Data or contents are pushed into a warehouse


A dispatcher distributes labels to the workers


Workers execute labels independently (i.e., fully decoupled)
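The flow above can be sketched as a minimal Python model. All names here (Label, warehouse, dispatch) are illustrative assumptions for exposition, not the actual LABIOS API:

```python
from collections import deque

# Illustrative sketch of the LABIOS flow; class and function names
# are assumptions for exposition, not the actual LABIOS API.

class Label:
    """A label couples an operation with a pointer to its data."""
    def __init__(self, op, key, size):
        self.op = op      # e.g., "write" or "read"
        self.key = key    # where the payload lives in the warehouse
        self.size = size  # payload size in bytes

label_queue = deque()   # distributed queue (modeled as a local deque)
warehouse = {}          # key-value warehouse holding label payloads
storage = {}            # backing storage the workers write to

def submit_write(key, data):
    """Client side: stage data in the warehouse, push a label."""
    warehouse[key] = data
    label_queue.append(Label("write", key, len(data)))

def dispatch(workers):
    """Dispatcher: hand each queued label to some worker."""
    i = 0
    while label_queue:
        workers[i % len(workers)](label_queue.popleft())
        i += 1

def worker(label):
    """Worker: execute the label independently of the client."""
    if label.op == "write":
        storage[label.key] = warehouse.pop(label.key)

submit_write("block-0", b"hello")
dispatch([worker])
```

Because the client only touches the queue and the warehouse, and workers only consume labels, the two sides are fully decoupled.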


- Distributed, scalable, and adaptive storage 
- Fully decoupled architecture
- Software defined storage (SDS)
- Energy-aware, enabling power-capped I/O
- Reactive storage with tunable I/O performance
- Flexible API
- Intersection of HPC and BigData


Agile and Flexible

Adaptive to the environment.
POSIX, MPI-IO, HDF5, REST/Swift, Hadoop
Lustre, GPFS, HDFS, Hive, Object Stores


Power-capped I/O
Elastic I/O resources
Containerized environments


Software Defined Storage (SDS)

Offloading computation to servers
Data-centric architecture.
Fully decoupled architecture.

Reactive Design

Tunable I/O performance
Concurrency control
Guaranteed Storage QoS based on job size


Decomposition of write and read operations

Dispatcher runs on a dedicated node
100K auto generated labels
Mixed read and write
Equal size
Linear scalability
Round-robin and random selection: 55-125K labels/sec
Constraint-based selection is more communication-intensive
MinMax is more CPU-intensive due to its dynamic programming (DP) approach
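The trade-off between the cheap and the balancing policies can be contrasted in a short sketch. The worker-selection logic and label sizes below are illustrative assumptions, not LABIOS internals:

```python
import itertools

# Illustrative dispatcher policies; simplified assumptions, not LABIOS code.

def round_robin(labels, n_workers):
    """Cheap: assign labels cyclically, ignoring worker state."""
    rr = itertools.cycle(range(n_workers))
    return [next(rr) for _ in labels]

def min_load(labels, loads):
    """MinMax-style heuristic: always pick the least-loaded worker.
    Costlier per label, but it balances the maximum load."""
    plan = []
    for size in labels:
        w = loads.index(min(loads))
        loads[w] += size
        plan.append(w)
    return plan

labels = [4, 1, 1, 1, 1]          # label sizes (e.g., MB), assumed
print(round_robin(labels, 2))     # [0, 1, 0, 1, 0] -> worker 0 gets 6 MB
print(min_load(labels, [0, 0]))   # [0, 1, 1, 1, 1] -> max load only 4 MB
```

Round-robin costs almost nothing per label, which is why it dispatches faster; the load-aware policy spends extra cycles per decision, mirroring the CPU cost noted above.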

4096 labels of 1MB each
Varying the ratio of active to suspended workers
Worker activation takes 3 sec on average
Worker allocation techniques:
Static: labels go only to active workers
Elastic: labels go to all workers (even suspended ones, paying the activation penalty)
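A back-of-the-envelope model shows when the elastic policy pays off. The 3-second activation time comes from the text; the one-label-per-second execution rate and even label distribution are simplifying assumptions:

```python
import math

# Illustrative model of static vs. elastic worker allocation.
# ACTIVATION (3 s) is from the measurements above; LABEL_TIME and
# the even distribution of labels are simplifying assumptions.

LABEL_TIME = 1.0       # assumed time for a worker to execute one label
ACTIVATION = 3.0       # observed average worker activation time

def makespan_static(n_labels, active):
    """Static: only active workers receive labels."""
    return math.ceil(n_labels / active) * LABEL_TIME

def makespan_elastic(n_labels, active, suspended):
    """Elastic: suspended workers also receive labels, after paying
    the activation penalty once (pessimistically, on the critical path)."""
    total = active + suspended
    return ACTIVATION + math.ceil(n_labels / total) * LABEL_TIME

print(makespan_static(64, 8))      # 8.0
print(makespan_elastic(64, 8, 8))  # 7.0 -> activation pays off
print(makespan_static(8, 8))       # 1.0
print(makespan_elastic(8, 8, 8))   # 4.0 -> activation dominates
```

The crossover is the point the experiment above probes: with enough queued labels, waking suspended workers wins despite the 3-second penalty.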

Supports both sync and async modes
The Label paradigm fits naturally in async mode
CM1 simulation scaled up to 3072 processes with 16 time steps
Each process writes 32MB of I/O
100GB per step for the 3072 case
Sync mode competitive with PFS baseline
Async mode overlaps label execution with computations
16x boost in I/O performance
40% reduction in execution time
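The async-mode overlap of label execution with computation can be illustrated with a thread-based sketch. The background worker, sleep-based timings, and label names are assumptions for exposition:

```python
import queue
import threading
import time

# Illustrative async-mode sketch: a background worker drains labels
# while the application keeps computing. Not LABIOS code.

labels = queue.Queue()
done = []

def worker():
    while True:
        label = labels.get()
        if label is None:          # sentinel: no more labels
            break
        time.sleep(0.01)           # stand-in for executing the label
        done.append(label)

t = threading.Thread(target=worker)
t.start()

for step in range(4):                  # simulation time steps
    labels.put(f"write-step-{step}")   # async submit: returns immediately
    time.sleep(0.01)                   # computation, overlapped with I/O

labels.put(None)
t.join()
print(done)   # ['write-step-0', 'write-step-1', 'write-step-2', 'write-step-3']
```

Because `labels.put` returns immediately, each time step's I/O executes while the next step computes, which is the source of the overlap described above.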

Montage application
Multiple executables that share data
50GB of intermediate results in temporary files on the PFS
LABIOS shares data via the warehouse (i.e., in memory)
Label destinations are the analysis compute nodes
Performance acceleration
No temporary files are created in remote storage
Simulation and analysis can be pipelined
17x boost in I/O performance
65% reduction in execution time
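Warehouse-based sharing between two workflow stages can be sketched as follows. The stage names and the dictionary-as-warehouse model are illustrative assumptions; the point is that intermediates never touch the PFS:

```python
# Illustrative sketch of warehouse-based data sharing between two
# stages of a workflow (names are assumptions, not the LABIOS API).
# Instead of writing intermediate files to the PFS, the producer
# publishes results into the in-memory warehouse and the consumer
# reads them back directly.

warehouse = {}

def projection_stage(tiles):
    """Producer: publish intermediate results into the warehouse."""
    for i, tile in enumerate(tiles):
        warehouse[f"proj-{i}"] = tile.upper()   # stand-in transform

def coadd_stage(n):
    """Consumer: pop intermediates from the warehouse; no PFS I/O."""
    return "+".join(warehouse.pop(f"proj-{i}") for i in range(n))

projection_stage(["a", "b", "c"])
result = coadd_stage(3)
print(result)            # A+B+C
assert not warehouse     # no temporary state left behind
```

Since the consumer pops entries as it reads them, no temporary files (or leftover in-memory copies) survive the pipeline, matching the behavior described above.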

Two modes for LABIOS:
Node-local I/O (similar to HDFS)
Remote external I/O (similar to HPC)
Map processes read 32MB each and then write them back to storage
Reduce processes read 32MB each
Shuffle sends 32MB through network
Memory-optimized Hadoop version
No disk I/O for intermediate results
LABIOS employs collective I/O to perform data aggregations
LABIOS successfully integrates MapReduce with HPC
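The collective-I/O aggregation mentioned above can be sketched as a two-phase grouping step. The 32 MB per-process write size comes from the text; the 128 MB aggregation target and the grouping logic are illustrative assumptions:

```python
# Illustrative collective-I/O aggregation: many small per-process
# buffers are combined into fewer, larger labels before hitting
# storage. Assumed sizes and logic, not LABIOS internals.

CHUNK = 32          # per-process write size (MB, from the text)
AGG_TARGET = 128    # assumed aggregated label size (MB)

def aggregate(process_writes):
    """Group per-process (rank, size) chunks into larger labels."""
    labels, current, size = [], [], 0
    for rank, chunk in process_writes:
        current.append((rank, chunk))
        size += chunk
        if size >= AGG_TARGET:
            labels.append(current)
            current, size = [], 0
    if current:
        labels.append(current)
    return labels

writes = [(rank, CHUNK) for rank in range(8)]    # 8 ranks x 32 MB
agg = aggregate(writes)
print(len(agg))                    # 2 aggregated labels instead of 8
print(sum(c for _, c in agg[0]))   # 128 MB each
```

Fewer, larger labels mean fewer storage operations, which is what lets LABIOS serve the shuffle-heavy MapReduce pattern efficiently on HPC-style remote storage.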


Related Projects

A Multi-tiered Shared Log Store
Distributed Multi-tiered I/O Buffering
I/O Redirection via Integrated Storage

