Data production from DNA sequencers, mass spectrometers, and cell imaging has grown explosively, overwhelming traditional storage systems. For example, the Illumina HiSeq 2500 DNA sequencer can sequence a single whole human genome sample in 27 hours. Many of the subsequent data analysis steps are I/O intensive and involve repeatedly reading and writing large 100 GB files to and from disk. Another critical requirement in life sciences research is to concurrently support hundreds or even thousands of researchers seeking easy access to the data.
Specialized metadata cluster technology is designed to handle both large and small files efficiently, providing high aggregate sequential I/O as well as random I/O. Because metadata is separated from storage, a metadata cluster can scale up to 128 servers. A single system can support millions of file operations per second, which greatly accelerates data delivery and access.
Our high-performance scale-out NAS systems do not have to rely on all-flash arrays. Customers have a choice of hardware. Built on Linux, our software suite runs on commodity hardware, on dedicated appliances, or in virtual machines. Customers can leverage the benefits of cost-effective commodity hardware and avoid expensive hardware vendor lock-in. With 80% utilization, we provide the freedom and flexibility to deploy Scala Storage systems anywhere, whether on-premises or as a virtual machine.
Our software suite combines the file system, data protection, and volume manager into one unified software layer, creating a single highly intelligent file system that reduces complexity and eliminates storage silos. Within that single file system, it delivers industry-leading scalability, from 3 to thousands of storage nodes in a single cluster.
Scala Storage Solutions offers a new infrastructure that incorporates next-generation scale-out technologies to handle petabytes of data. Data is available 24/7/365, so customers can run iteration after iteration of analysis and testing without running into performance or storage issues. Our system continues to provide constant data availability, helping ensure zero disruption to workflows even in the face of major hardware failures.
- Great Performance: Hundreds of GB/s throughput, growing linearly as storage nodes are added
- Limitless Scalability: Single system up to 300 PB
- Excellent Reliability: No single point of failure, high-end industry-standard reliability without traditional RAID
- Easy Management: Hundreds of TB can be installed in minutes, and one engineer can easily manage a PB-level system
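The linear-growth claim in the list above can be illustrated with a back-of-the-envelope estimate. The sketch below is an idealized model, not a measured result: it assumes every storage node contributes the same throughput and that network and metadata overheads are negligible. The function name `estimate_aggregate_throughput` is hypothetical, and the per-node rates of 1.8 and 2.1 GB/s are borrowed from the oilfield deployment described later on this page.

```python
def estimate_aggregate_throughput(node_count: int, per_node_gb_s: float) -> float:
    """Idealized linear model: aggregate GB/s = node count * per-node GB/s.

    Assumption: every node delivers the same rate and nothing else
    (network, metadata servers, clients) becomes the bottleneck.
    """
    if node_count < 0 or per_node_gb_s < 0:
        raise ValueError("node_count and per_node_gb_s must be non-negative")
    return node_count * per_node_gb_s


if __name__ == "__main__":
    # Per-node range quoted for the 4U 36-drive servers in the oilfield case.
    low_rate, high_rate = 1.8, 2.1
    for nodes in (50, 100, 200):
        low = estimate_aggregate_throughput(nodes, low_rate)
        high = estimate_aggregate_throughput(nodes, high_rate)
        print(f"{nodes} nodes: roughly {low:.0f} to {high:.0f} GB/s (idealized)")
```

Under these assumptions, doubling the node count doubles the estimate; real deployments should be sized from measured per-node rates.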
This customer, a leading computing center, is a scientific research institution supported by the United Nations Development Bureau. In 2009, the center began to focus on high-performance computing and cloud computing.
The center provides a platform and market promotion for engineering calculations and project consulting, CAE customization services, scientific computing, rendering, 3D scanning and rapid prototyping, virtual simulation, and biocomputing services. It has successfully provided computing services to more than 200 enterprises and research institutes.
- The center has been a customer since 2012. They currently run a 1.2 PB system with 20 storage nodes, each a 2U server with 12 SATA drives, delivering throughput above 12 GB/s.
SCALA Storage provides 5 PB of storage holding 40 billion image files for one of the largest social networking websites in Asia. For this New York Stock Exchange-listed company with 31 million monthly active users, SCALA Storage's high scalability is very important.
- Solution tryout in 2008, 50 TB
- Currently over 5 PB deployed, with over 40 billion files in storage
- Average daily upload of 20 million small picture files
- Effectively replaced previous solutions built with open source file systems
- Easy management with only one part-time engineer
This customer, one of the largest TV stations, is ranked 85th in the world by Alexa. It is a leading national web-based TV broadcaster in Asia, with local and foreign language services including English, French, Spanish, Chinese, and Russian. It receives about 3 million unique visitors and 20 million page views per day.
- Solution tryout in 2010, less than 100 TB
- In 2011 they began switching from their previous NAS solutions and now have 15 PB deployed across 3 data centers.
- No system downtime or interrupted operations in over 5 years.
- Over 500 storage nodes across 3 data centers, with SCALASync (replication package) effectively managing data across the different locations
One of the largest oilfields has produced over 10 billion barrels of oil since production started in 1960. The current production rate is about 1 million barrels per day. Capacity and performance were expanded online from 200 TB to over 4.5 PB.
- Extensive 9-month product testing between 2008 and 2009
- Phase I: In March 2009 they purchased 200 TB. The commodity hardware delivered 2x the throughput of FC SAN
- Phase II: In 2012 the system grew to over 1 PB, on a 10 GbE network delivering 12 GB/s
- Currently over 4.5 PB; built on 4U 36-drive servers with 4 TB SATA disks, the system delivers 1.8 GB/s to 2.1 GB/s per storage node
The Research Institute of Petroleum Exploration and Development (RIPED) is the R&D center of one of the world's largest oil fields. Holding proven reserves of 3.7 billion barrels of oil, this R&D center has led 30 international exploration and production projects with operations in Azerbaijan, Canada, Indonesia, Myanmar, Oman, Peru, Sudan, Niger, Thailand, Turkmenistan, and Venezuela.
- Phase I: In 2011 they purchased 400 TB
- Phase II: They purchased a 2 PB system built on 3U 16-drive commodity hardware with 4 TB SATA disks
- Aggregate throughput of over 20 GB/s on 31 storage nodes
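The Phase II figures above can be cross-checked with simple arithmetic. The sketch below is only a sanity check under stated assumptions: that the 31 storage nodes in the throughput test make up the whole Phase II system, that each node holds 16 drives of 4 TB, and that 1 PB = 1000 TB. The helper names `raw_capacity_tb` and `per_node_throughput` are hypothetical, and the capacity is raw, before any data protection overhead.

```python
def raw_capacity_tb(node_count: int, drives_per_node: int, drive_tb: int) -> int:
    """Raw capacity in TB, ignoring data protection and file system overhead."""
    return node_count * drives_per_node * drive_tb


def per_node_throughput(aggregate_gb_s: float, node_count: int) -> float:
    """Average throughput per storage node in GB/s."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return aggregate_gb_s / node_count


if __name__ == "__main__":
    # Assumed configuration: 31 nodes, 16 drives per node, 4 TB per drive.
    capacity_tb = raw_capacity_tb(31, 16, 4)
    print(f"Raw capacity: {capacity_tb} TB ({capacity_tb / 1000:.2f} PB)")

    # The source says "over 20 GB/s", so this average is a lower bound.
    average = per_node_throughput(20.0, 31)
    print(f"Average per node: at least {average:.2f} GB/s")
```

With these inputs the raw capacity comes to 1984 TB, consistent with the quoted 2 PB system, and the average works out to roughly 0.65 GB/s per node.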
A leading institute of genomics is one of the newest institutes of life science in Asia. It has 311 staff members, including 22 research professors and 219 postgraduate students.
The mission of the institute is to develop advanced genome sequencing and bioinformatic techniques, explore efficient statistical and computational strategies to take advantage of various omics data, and address fundamental questions of life science by integrating the above abilities.
Research infrastructure is organized in four sections: Genome science & bioinformatics, Computational & evolutionary genomics, Genomic variations & personalised medicine, and Core omics facility. The core facility has over 20 sequencers, including SOLiD, Solexa, HiSeq, Ion Proton, and PacBio, and a high-performance computing platform consisting of computer clusters with a total of 5,000 CPU cores and over 3 PB of high-performance storage.
- They have been a customer since 2012 and run a 1 PB system.