Data Intensive Research Facility

IDIA has established a “Tier 2” data-intensive research facility to store and process data from the imaging Large Survey Projects (LSPs) of MeerKAT, an SKA precursor, and data from other SKA pathfinder projects. The facility gives highest priority to MeerKAT projects that are led by researchers from South Africa within the IDIA university partnership (UCT, UWC, NWU and UP), or that include significant participation by researchers at IDIA institutions. Projects with IDIA participation currently include LADUMA, MIGHTEE, MHONGOOSE, Fornax, ThunderKAT, and the related MeerLICHT project. The facility is also open to LSPs from South African institutions outside IDIA and to international collaborators, subject to agreement and provision of resources.

The IDIA facility infrastructure currently consists of:

  • 40 compute nodes with:
    • 2.6 GHz Xeon processors
    • 32 cores
    • 256 GB RAM
    • 4 nodes with 2x NVIDIA P100 GPUs
  • a combination of POSIX, Block and Object storage
  • 0.5 PB initial storage
  • 10 Gb/s network access to SANReN
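
To put the figures above in perspective, the following is a minimal back-of-the-envelope sketch in Python. It assumes that the core count and RAM listed apply to each of the 40 compute nodes, that 1 PB means 10^15 bytes, and that the 10 Gb/s link is fully available with no protocol overhead. None of these assumptions is stated explicitly in the list, and the function names are purely illustrative.

```python
# Rough capacity estimate based on the infrastructure list.
# Assumption: 32 cores and 256 GB RAM are per-node figures for all 40 nodes.
NODES = 40
CORES_PER_NODE = 32
RAM_GB_PER_NODE = 256
GPU_NODES = 4
GPUS_PER_GPU_NODE = 2
STORAGE_PB = 0.5          # initial storage, taken as 0.5e15 bytes
LINK_GBIT_PER_S = 10      # nominal network rate to SANReN


def aggregate_resources():
    """Return facility-wide totals under the per-node assumption."""
    return {
        "cores": NODES * CORES_PER_NODE,
        "ram_gb": NODES * RAM_GB_PER_NODE,
        "gpus": GPU_NODES * GPUS_PER_GPU_NODE,
    }


def ideal_transfer_days(storage_pb=STORAGE_PB, link_gbit_per_s=LINK_GBIT_PER_S):
    """Days needed to move the given volume over the link at its nominal rate.

    This is a lower bound: real transfers are slower because of protocol
    overhead, shared links and storage throughput.
    """
    bits = storage_pb * 1e15 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 86400


if __name__ == "__main__":
    print(aggregate_resources())
    print(f"Ideal time to fill initial storage: {ideal_transfer_days():.1f} days")
```

Under these assumptions the totals come to 1280 cores, 10240 GB of RAM and 8 GPUs, and filling the initial 0.5 PB over the network would take several days even at the full nominal rate.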

The facility is intended to provide sufficient storage capacity for persistent storage of the aggregated LSP visibility data from the MeerKAT data store over the lifetime of the project, as well as storage of intermediate science data products and post-processing products.  The storage capacity will increase as data from MeerKAT and other SKA pathfinder projects are ingested.

The facility is designed to be an agile development platform for pipeline and post-processing algorithms, and for analytics and data mining. One goal is to foster coordinated pipeline development across the LSPs, so that common processing needs are identified and expertise is shared. The facility is managed by the university partners with the participation of the community of research users. As a node of the prototype distributed African Data Intensive Research Cloud, it will serve not only as a pipeline development platform but also as a testbed for cloud-based provision of resources, tools and platforms for data-intensive research.

IDIA is also the lead organization in a proposed Western Cape Data Intensive Research Facility (WCDIRF). The WCDIRF will be a data-centric high-performance computing facility that provides data-intensive research capacity for astronomy and bioinformatics, as part of a national tiered, distributed infrastructure within the Data Intensive Research Initiative for South Africa (DIRISA). IDIA will lead the development and implementation of astronomy-focused data-intensive research solutions and of more general system access and data distribution tools, with the major goal of providing the infrastructure and software systems needed to execute the MeerKAT Large Survey Projects.
