Ophidia is a CMCC Foundation research project addressing Big Data challenges in eScience. It exploits advanced parallel computing techniques and a hierarchical storage organization to run intensive data analysis over multi-terabyte datasets.
Ophidia provides a Big Data analytics framework for parallel I/O and the analysis of multi-dimensional datasets. It builds on the datacube abstraction and comes with an extensive set of OLAP-oriented parallel operators, including datacube sub-setting, datacube aggregation, NetCDF file import and export, and datacube intercomparison. Additionally, it provides several primitives that operate on n-dimensional arrays, for example sub-setting, data aggregation, array concatenation, algebraic expressions, predicate evaluation, statistical analysis and regression.
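To make the datacube operations named above more concrete, the following is a minimal sketch in plain Python of sub-setting and aggregation over a tiny in-memory cube. It does not use Ophidia or PyOphidia: the function names (make_cube, subset_lat, reduce_time), the (lat, lon, time) dimension layout and the sample values are all assumptions chosen for illustration, and the real operators run in parallel over a hierarchical storage back end rather than over a dictionary.

```python
# Illustrative only: a toy datacube, NOT the Ophidia API.
# A cube maps a (lat, lon, time) coordinate tuple to one measure value.

def make_cube(lats, lons, times, measure):
    """Build a dense cube by evaluating measure(lat, lon, time) on every cell."""
    return {(la, lo, t): measure(la, lo, t)
            for la in lats for lo in lons for t in times}

def subset_lat(cube, lat_min, lat_max):
    """Sub-setting: keep only cells whose latitude lies in [lat_min, lat_max]."""
    return {key: value for key, value in cube.items()
            if lat_min <= key[0] <= lat_max}

def reduce_time(cube, operation):
    """Aggregation: collapse the time dimension with the given operation."""
    series = {}
    for (la, lo, _t), value in cube.items():
        series.setdefault((la, lo), []).append(value)
    return {key: operation(values) for key, values in series.items()}

# Hypothetical sample data: 3 latitudes x 2 longitudes x 4 time steps.
cube = make_cube([40, 41, 42], [10, 11], [0, 1, 2, 3],
                 lambda la, lo, t: la + lo + t)

sub = subset_lat(cube, 41, 42)      # 2 x 2 x 4 = 16 cells remain
reduced = reduce_time(sub, max)     # one maximum per (lat, lon) pair
```

With this sample data, the sub-setting step keeps 16 of the 24 cells, and the aggregation step yields four values, one per remaining (lat, lon) pair; for instance the maximum over time at (41, 10) is 54.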
Key features and benefits.
Essential information for potential users
The latest Ophidia release is v1.0.0 (released in March 2017).
Open source framework released under the GPLv3 license.
It can be installed on Debian-based and Red Hat-based Linux distributions. Most library and tool dependencies are resolved automatically when installing the binary packages, while MySQL server and Slurm must be installed and configured manually. Automating this step (as a cluster deployment in a cloud environment) is being addressed within EUBra-BIGSEA, and fully automatic installation is expected to be feasible by the end of that project.
Users can access Ophidia through the Ophidia terminal (a shell-like client) or through PyOphidia (Python bindings). To support the user, the terminal provides auto-completion and an online manual for all available commands and operators.
Ophidia is used mainly in scientific domains such as climate change research. It has been extended and used in several research projects, including FP7 EUBrazilCloudConnect, FP7 CLIP-C and H2020 INDIGO-DataCloud.