文章摘要
刘国良,陈蜀宇.具有O(n)消息复杂度的非阻塞检查点算法[J].高技术通讯(中文),2012,22(12):
具有O(n)消息复杂度的非阻塞检查点算法
A non blocking checkpointing algorithm with message complexity O(n)
  修订日期:2011-03-28
DOI:
中文关键词: 容错, 非阻塞检查点, 回卷恢复, 单阶段提交算法
英文关键词: fault tolerance, non blocking checkpointing, rollback recovery, single phase commit algorithm
基金项目:重庆市自然科学基金计划(CSTC2008BB2307)资助项目
作者单位
刘国良 重庆大学计算机学院 重庆;重庆市计量质量检测研究院 重庆 
陈蜀宇 重庆大学计算机学院 重庆 
摘要点击次数: 2918
全文下载次数: 2226
中文摘要:
      为了用检查点设置及回卷恢复技术提高并行分布式系统容错性能时降低设置检查点的时间和空间开销,提出了一种非阻塞协调检查点算法。与传统的两阶段提交算法不同,该算法是单阶段提交算法,可跳过临时检查点阶段直接获得永久检查点,减少了同步控制消息的数量,加快了检查点的形成时间。它通过发送进程排除孤儿消息,实现了并行计算;通过设置检查点算法启动周期,解决中途消息问题。该算法的时间复杂度由通常的O(n2)降低到O(n),只需要n-1个同步消息。
英文摘要:
      In order to reduce the time and space overheads for setting checkpoints when using the checkpointing rollback recovery technique to enhance the error tolerance performance of parallel and distributed systems, a non blocking cooperative checkpointing algorithm is proposed. It is a one phase method. Different from the conventional (two phase) approach, it jumps over the temporary checkpoint phase to obtain permanent checkpoints to reduce the number of controlled messages and the time of checkpoint forming. The proposed checkpointing algorithm allows processes to take permanent checkpoints directly, without taking temporary checkpoints . The character of the algorithm contributes to its speed of execution. Orphan messages are eliminated by sender processes and in transit messages are eliminated by the checkpointing interval and retransmission mechanism. Its time complexity for obtaining checkpoints is O(n), not the ordinary O(n2).
查看全文   查看/发表评论  下载PDF阅读器
关闭

分享按钮