EUROPE - BRAZIL COLLABORATION OF BIG DATA SCIENTIFIC RESEARCH THROUGH CLOUD-CENTRIC APPLICATIONS

dagSim

CATEGORY

Administrators & cloud service providers

Open Source communities & application developers

QoS Cloud services


dagSim is a discrete event simulator, developed by Politecnico di Milano, that can be used to study the performance of DAG-based processes such as MapReduce, Tez or Apache Spark applications. It simulates the execution of jobs composed of several stages, each divided into a set of identical tasks that can run in parallel on multiple computation resources. A Directed Acyclic Graph (DAG) defines the order in which stages are executed. The tool offers advanced scripting features based on the Lua programming language, and it supports several probability distributions (including non-parametric models obtained directly from a dataset) to define the execution times of tasks.
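The idea of simulating a job as a DAG of stages whose tasks compete for a fixed pool of cores can be sketched in a few lines of Python. This is only an illustration of the general technique and not dagSim's code or input format: the function `simulate_job`, its arguments, the FIFO scheduling policy and the sample task times are all assumptions made for this example. Task durations are drawn from an empirical (non-parametric) sample by picking measured values at random.

```python
import heapq
import random
from collections import deque


def simulate_job(stages, deps, cores, rng):
    """Return the simulated completion time of one job.

    stages: {name: (num_tasks, sampler)}, sampler(rng) -> task duration
    deps:   {name: [names of stages that must finish first]}
    Assumes an acyclic graph, cores >= 1 and at least one task per stage.
    """
    remaining = {name: n for name, (n, _) in stages.items()}
    released, done = set(), set()
    waiting = deque()   # tasks ready to run, served first-in first-out
    running = []        # min-heap of (finish_time, sequence_number, stage)
    now, seq, free = 0.0, 0, cores

    while len(done) < len(stages):
        # Release every stage whose predecessors have all completed.
        for name in stages:
            if name not in released and all(p in done for p in deps.get(name, ())):
                released.add(name)
                waiting.extend([name] * stages[name][0])
        # Start waiting tasks while cores are free.
        while free > 0 and waiting:
            name = waiting.popleft()
            duration = stages[name][1](rng)
            heapq.heappush(running, (now + duration, seq, name))
            seq += 1
            free -= 1
        # Advance the clock to the next task completion event.
        now, _, name = heapq.heappop(running)
        free += 1
        remaining[name] -= 1
        if remaining[name] == 0:
            done.add(name)
    return now


# Hypothetical measured task times (seconds), used as an empirical distribution.
map_times = [1.8, 2.1, 2.0, 2.4, 1.9]
reduce_times = [4.7, 5.2, 5.0]

job = {
    "map": (40, lambda rng: rng.choice(map_times)),
    "reduce": (10, lambda rng: rng.choice(reduce_times)),
}
order = {"reduce": ["map"]}

rng = random.Random(0)
runs = [simulate_job(job, order, cores=8, rng=rng) for _ in range(100)]
print("mean completion time:", sum(runs) / len(runs))
```

Repeating the run many times, as in the last lines, gives an estimate of the mean completion time for a given number of cores.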

 

Who should use it? 


The tool targets distributed cloud services, Big Data infrastructures, business analytics software development, performance assessment and, more generally, the benchmarking of computing infrastructures. dagSim is used in academic environments and for research purposes, and is currently integrated into model-based resource provisioning and auto-scaling algorithms.

User needs: 

Predict the performance of DAG-based parallel computation frameworks

Validate benchmarking suites

Obtain accurate results quickly

Specific benefits: 

Accuracy and efficiency of the simulations

Performance prediction and optimization

 

User scenario

An autoscaling component of a Big Data application deployment whose queries can be described as DAG-based workflows can use the tool, after a minimal benchmarking campaign, to define the initial resource requirements in terms of Virtual Machines to be provisioned, and to decide when to acquire or release resources in order to respect Service Level Agreements.
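A minimal sketch of the sizing step in this scenario follows, assuming the autoscaler has some function that predicts the completion time for a given number of cores. In practice that prediction would come from simulator runs; here `wave_estimate` is a deliberately crude, hypothetical stand-in (stages run one after another, each in waves of mean-duration tasks), and `min_vms_for_deadline`, the stage figures and the deadline are likewise illustrative assumptions.

```python
import math


def wave_estimate(stages, cores):
    """Crude stand-in for a simulator run (an assumption of this sketch).

    stages: list of (num_tasks, mean_task_seconds), executed in sequence,
    each stage taking ceil(num_tasks / cores) waves of mean duration.
    """
    return sum(math.ceil(n / cores) * mean for n, mean in stages)


def min_vms_for_deadline(predict, deadline, cores_per_vm, max_vms):
    """Smallest number of VMs whose predicted time meets the deadline.

    predict(cores) -> predicted completion time in seconds.
    Returns None when even max_vms is not enough.
    """
    for vms in range(1, max_vms + 1):
        if predict(vms * cores_per_vm) <= deadline:
            return vms
    return None


# Hypothetical job: 40 tasks of 2 s on average, then 10 tasks of 5 s on average.
stages = [(40, 2.0), (10, 5.0)]
vms = min_vms_for_deadline(
    lambda cores: wave_estimate(stages, cores),
    deadline=30.0,
    cores_per_vm=4,
    max_vms=16,
)
print("VMs to provision:", vms)
```

The same search can be repeated whenever the workload or the deadline changes, which is one simple way to decide when to acquire or release Virtual Machines.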

 

Download & Resources


The development of the first version of the tool has been completed. The next version is being debugged. New features are being implemented to model allocation policies, in order to support different Big Data and High Performance Computing frameworks.

The tool can be downloaded from https://github.com/eubr-bigsea/dagSim

The tool is configured through Lua scripts, which resemble classical text-based configuration files. For this reason, running the tool does not require specific skills, although defining the model requires at least a BSc that included a course in Statistics. Script generation can also be automated, for example with the SparkLogParser developed in the project. Knowledge of the Lua programming language can improve the results a user obtains from the tool.

 

License


The tool is based on the Lua programming language, which is distributed under the MIT license (https://www.lua.org/license.html); the license allows the use of Lua's source code at no cost and imposes no “copyleft” restrictions.

The cost of using the tool is very limited, since it produces results very efficiently. An organisation that produces Big Data analytics software that can be described with DAGs would need around one day to perform basic benchmarking, and one hour to produce the configuration script and run the tool. Benchmarking and script production can, however, be automated in the deployment phase, for example with the SparkLogParser developed in the project, reducing the usage cost to a few minutes of machine execution time.

 

Contact


Prof. Marco Gribaudo of Politecnico di Milano: marco.gribaudo@polimi.it

 

Want to learn more?


View related publications 

--> E. Barbierato, M. Gribaudo, and D. Manini, "Fluid approximation of pool depletion systems", 23rd International Conference on Analytical & Stochastic Modelling Techniques & Applications (ASMTA '16), pages 60-75. Springer International Publishing, Cham, 2016.

--> E. Gianniti, A. M. Rizzi, E. Barbierato, M. Gribaudo, and D. Ardagna, "Fluid Petri Nets for the Performance Evaluation of MapReduce Applications", InfQ 2016 - New Frontiers in Quantitative Methods in Informatics, INFQ 2016 Workshop Proceedings.

--> D. Ardagna, E. Barbierato, A. Evangelinou, E. Gianniti, M. Gribaudo, T. B. M. Pinto, A. Guimarães, A. P. Couto da Silva, and J. M. Almeida, "Performance Prediction of Cloud-Based Big Data Applications", ICPE 2018, to appear.

Not directly connected to dagSim, but supporting the tool:

--> E. Barbierato, M. Gribaudo, and M. Iacono, "Modeling Hybrid Systems in SIMTHESys", Eighth International Workshop on Practical Applications of Stochastic Modelling (PASM '16), Electronic Notes in Theoretical Computer Science, vol. 327, pp. 5-25 (October 2016), Elsevier, ISSN: 1571-0661, DOI: 10.1016/j.entcs.2016.09.021

--> E. Barbierato, M. Gribaudo, and M. Iacono, "Simulating hybrid systems within SIMTHESys multi-formalism models", 13th European Workshop on Performance Engineering, Chios (Greece), October 5-7 2016, Lecture Notes in Computer Science, vol. 9951, pp. 189-203 (October 2016), Springer, ISSN: 0302-9743


EUBra-BIGSEA is funded by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement No 690116. This project results from the 3rd BR-EU Coordinated Call in Information and Communication Technologies (ICT), announced by the Ministry of Science, Technology and Innovation (MCTI).
