Slides for Chapter 11: Coordination and Agreement From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3, © Addison-Wesley 2001
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000 Figure 11.1 A network partition
Figure 11.2 Server managing a mutual exclusion token for a set of processes
Figure 11.3 A ring of processes transferring a mutual exclusion token
Ricart and Agrawala's algorithm
- Ricart and Agrawala [1981].
- Implements distributed mutual exclusion among N processes.
- Uses multicast and logical clocks.
- Basic idea: a process that wants to enter a critical section multicasts a request message, and may enter only when all the other processes have replied to that request.
Ricart and Agrawala's algorithm
- The conditions under which a process replies to a request are designed to ensure that conditions ME1 (safety), ME2 (liveness) and ME3 (ordering) are met.
- The processes p1, p2, ..., pN bear distinct numeric identifiers.
- The processes are assumed to have communication channels to one another, and each process keeps a Lamport logical clock, updated according to rules LC1 and LC2 (Chapter 10, Time in distributed systems).
Ricart and Agrawala's algorithm
- Request messages are of the form <T, pi>, where T is the sender's timestamp and pi is the sender's identifier.
- Each process records its state in a variable state:
  - outside the critical section (RELEASED),
  - waiting to enter the critical section (WANTED),
  - or inside the critical section (HELD).
Ricart and Agrawala's algorithm
- If a process requests entry and the state of all the other processes is RELEASED, then all of them reply immediately and the requester obtains entry to the critical section.
- If some process is in state HELD, then that process does not reply to requests until it has finished with its critical section.
Ricart and Agrawala's algorithm
- A requester therefore cannot gain entry while another process is in state HELD.
- If two or more processes request entry at the same time, then whichever request bears the lowest timestamp will be the first to collect N-1 replies, granting its sender entry next.
Ricart and Agrawala's algorithm
- If requests bear equal timestamps, they are ordered by the identifiers of the corresponding processes.
- Note that when a process requests entry, it defers processing requests from other processes until its own request has been sent and it has recorded the timestamp T of that request.
- This is so that processes make consistent decisions when processing requests.
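The ordering rule above can be sketched in Python, where tuples are compared element by element from left to right. This is only an illustration: the function name `precedes` and the sample timestamps and identifiers are assumptions, not taken from the slides.

```python
def precedes(request_a, request_b):
    """Return True if request_a is ordered before request_b.

    Each request is a (timestamp, process_id) pair. Python compares
    tuples lexicographically, so the timestamp decides first and the
    process identifier breaks ties between equal timestamps.
    """
    return request_a < request_b


# Hypothetical requests, chosen only for illustration.
print(precedes((5, 2), (7, 1)))   # lower timestamp is ordered first
print(precedes((5, 1), (5, 2)))   # equal timestamps: lower identifier first
```

Because process identifiers are distinct, two different requests never compare as equal, which is what makes the order total.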
Ricart and Agrawala's algorithm
- The algorithm achieves the safety property ME1:
- If it were possible for two processes pi and pj (i ≠ j) to enter the critical section at the same time, then each of them would have had to reply to the other.
- But since the pairs <Ti, pi> are totally ordered, this is impossible.
- The algorithm also meets ME2 and ME3.
Figure 11.4 Ricart and Agrawala's algorithm
On initialization
    state := RELEASED;
To enter the section
    state := WANTED;
    Multicast request to all processes;        (request processing deferred here)
    T := request's timestamp;                  (request processing deferred here)
    Wait until (number of replies received = (N – 1));
    state := HELD;
On receipt of a request <Ti, pi> at pj (i ≠ j)
    if (state = HELD or (state = WANTED and (T, pj) < (Ti, pi)))
    then
        queue request from pi without replying;
    else
        reply immediately to pi;
    end if
To exit the critical section
    state := RELEASED;
    reply to any queued requests;
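A minimal, single-threaded Python sketch of the rules in Figure 11.4 may help to see them in action. It is not the book's code: the `Process` class, its method names and the starting clock values are assumptions made for illustration, and direct method calls stand in for multicast messages over the network.

```python
RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"


class Process:
    def __init__(self, pid, clock=0):
        self.pid = pid
        self.clock = clock        # Lamport logical clock
        self.state = RELEASED
        self.request = None       # own (T, pid) pair while WANTED or HELD
        self.replies = 0
        self.deferred = []        # requesters queued without a reply
        self.peers = []           # the other N-1 processes

    def want(self):
        """Record the request timestamp T before handling other requests."""
        self.state = WANTED
        self.clock += 1
        self.request = (self.clock, self.pid)
        self.replies = 0

    def multicast_request(self):
        for peer in self.peers:
            peer.on_request(self.request, self)

    def on_request(self, req, sender):
        self.clock = max(self.clock, req[0]) + 1
        # Defer if we hold the section, or if our own pending request wins.
        if self.state == HELD or (self.state == WANTED and self.request < req):
            self.deferred.append(sender)
        else:
            sender.on_reply()

    def on_reply(self):
        self.replies += 1
        if self.state == WANTED and self.replies == len(self.peers):
            self.state = HELD

    def release(self):
        self.state = RELEASED
        self.request = None
        waiting, self.deferred = self.deferred, []
        for peer in waiting:
            peer.on_reply()


# Hypothetical run: three processes, two of which request concurrently.
p1, p2, p3 = Process(1, clock=9), Process(2, clock=4), Process(3)
for p in (p1, p2, p3):
    p.peers = [q for q in (p1, p2, p3) if q is not p]

p1.want()                 # request (10, 1)
p2.want()                 # request (5, 2), the lower timestamp
p1.multicast_request()
p2.multicast_request()
states_during = (p1.state, p2.state)
p2.release()
states_after = (p1.state, p2.state)
print(states_during, states_after)
```

In this run the process with the lower timestamp enters first, and the other one enters only after the first calls `release`.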
Ricart and Agrawala's algorithm
- To illustrate the algorithm, consider Figure 11.5, which follows.
Figure 11.5 Multicast synchronization (diagram labels: p1, p2, p3; timestamp 34; Reply)
Ricart and Agrawala's algorithm
- Assume that p3 is not interested in entering the critical section.
- p1 and p2 request entry concurrently.
- The timestamp of p1's request, T1, is 41.
- The timestamp of p2's request, T2, is 34.
Ricart and Agrawala's algorithm
- When p3 receives their requests, it replies immediately to both p1 and p2.
- When p2 receives p1's request, it finds that its own request has the lower timestamp and so does not reply, holding p1 off.
- However, p1 finds that p2's request has a lower timestamp than its own and so replies immediately.
Ricart and Agrawala's algorithm
- On receiving this second reply, p2 can enter the critical section.
- When p2 exits the critical section, it replies to p1's request and so grants it entry.
- Gaining entry takes 2(N-1) messages: N-1 to multicast the request, followed by N-1 replies.
- The client delay in requesting entry is a round-trip time (the request out and the replies back). The advantage of the algorithm is its synchronization delay, which is only one message transmission time, because the exiting process replies directly to the waiting process.
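As a quick arithmetic check on the example, the following Python sketch orders the concurrent requests by their (timestamp, identifier) pairs and counts 2(N-1) messages per entry. It does not simulate the protocol itself; the function name `entry_order_and_messages` and its dictionary argument are assumptions made for illustration.

```python
def entry_order_and_messages(requests, n):
    """requests maps a process identifier to its request timestamp.

    Returns the order in which the requesters enter the critical
    section and the total number of messages, counting N-1 request
    messages plus N-1 replies for each entry.
    """
    ordered = sorted((t, pid) for pid, t in requests.items())
    order = [pid for _, pid in ordered]
    messages = len(order) * 2 * (n - 1)
    return order, messages


# p1 requests with T1 = 41 and p2 with T2 = 34, among N = 3 processes.
order, messages = entry_order_and_messages({1: 41, 2: 34}, n=3)
print(order, messages)  # [2, 1] 8
```

With N = 3, each entry costs 4 messages, so the two entries in the example cost 8 in total.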
Figure 11.6 Maekawa's algorithm – part 1
On initialization
    state := RELEASED;
    voted := FALSE;
For pi to enter the critical section
    state := WANTED;
    Multicast request to all processes in Vi – {pi};
    Wait until (number of replies received = (K – 1));
    state := HELD;
On receipt of a request from pi at pj (i ≠ j)
    if (state = HELD or voted = TRUE)
    then
        queue request from pi without replying;
    else
        send reply to pi;
        voted := TRUE;
    end if
Continues on next slide
Figure 11.6 Maekawa's algorithm – part 2
For pi to exit the critical section
    state := RELEASED;
    Multicast release to all processes in Vi – {pi};
On receipt of a release from pi at pj (i ≠ j)
    if (queue of requests is non-empty)
    then
        remove head of queue – from pk, say;
        send reply to pk;
        voted := TRUE;
    else
        voted := FALSE;
    end if
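The voter side of Figure 11.6 (the two "On receipt" handlers) can be sketched in Python as below. This is a partial illustration only: it leaves out the requester side and the construction of the voting sets Vi, and the `Voter` class, its method names and the `sent` log that stands in for the network are all assumptions.

```python
from collections import deque


class Voter:
    def __init__(self, pid):
        self.pid = pid
        self.state = "RELEASED"
        self.voted = False
        self.queue = deque()   # requests waiting for this voter's vote
        self.sent = []         # (message kind, destination) log, not a real network

    def on_request(self, sender):
        # A voter grants at most one vote at a time.
        if self.state == "HELD" or self.voted:
            self.queue.append(sender)
        else:
            self.sent.append(("reply", sender))
            self.voted = True

    def on_release(self, sender):
        # Pass the vote to the next queued requester, if any.
        if self.queue:
            next_pid = self.queue.popleft()
            self.sent.append(("reply", next_pid))
            self.voted = True
        else:
            self.voted = False


# Hypothetical run: processes 1 and 3 both ask voter 2 for its vote.
v = Voter(2)
v.on_request(1)      # vote granted to 1
v.on_request(3)      # queued, because the vote is already given
v.on_release(1)      # vote moves on to 3
print(v.sent, v.voted)
```

The queue is served in arrival order here, matching "remove head of queue" in the figure.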
Figure 11.7 A ring-based election in progress
Note: The election was started by process 17. The highest process identifier encountered so far is 24. Participant processes are shown darkened.
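A ring-based election of the kind shown in Figure 11.7 can be sketched in Python for the simple case of a single initiator. The slide gives only the starter (17) and the highest identifier seen so far (24), so the ring layout below is hypothetical, as are the function name `ring_election` and the assumption that identifiers are distinct. The sketch counts election messages only and omits the final round that announces the winner.

```python
def ring_election(ring, starter):
    """Simulate one election on a ring of distinct process identifiers.

    The election message carries the highest identifier seen so far.
    Each process forwards it to its neighbour, substituting its own
    identifier if that is larger. A process that receives its own
    identifier back becomes the coordinator.
    Returns (coordinator, number of election messages sent).
    """
    n = len(ring)
    i = ring.index(starter)
    candidate = starter
    hops = 0
    while True:
        i = (i + 1) % n          # message travels to the next process
        hops += 1
        if ring[i] == candidate:
            return candidate, hops
        if ring[i] > candidate:
            candidate = ring[i]


# Hypothetical ring order; only 17 and 24 come from the figure note.
coordinator, election_messages = ring_election(
    [17, 24, 1, 28, 15, 9, 4, 3], starter=17
)
print(coordinator, election_messages)  # 28 11
```

In this layout the message first reaches the eventual winner after 3 hops and then needs one full circuit of 8 hops to return to it, giving 11 election messages.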
Figure 11.8 The bully algorithm
The election of coordinator p2, after the failure of p4 and then p3
Figure 11.9 Open and closed groups
Figure Reliable multicast algorithm
Figure The hold-back queue for arriving multicast messages
Figure Total, FIFO and causal ordering of multicast messages
Notice the consistent ordering of totally ordered messages T1 and T2, the FIFO-related messages F1 and F2 and the causally related messages C1 and C3, and the otherwise arbitrary delivery ordering of messages.
Figure Display from bulletin board program
Bulletin board: os.interesting
Item  From          Subject
23    A.Hanlon      Mach
24    G.Joseph      Microkernels
25    A.Hanlon      Re: Microkernels
26    T.L’Heureux   RPC performance
27    M.Walker      Re: Mach
end
Figure Total ordering using a sequencer
Figure The ISIS algorithm for total ordering (diagram labels: processes P1, P2, P3, P4; Message; 2 Proposed Seq; 3 Agreed Seq)
Figure Causal ordering using vector timestamps
Figure Consensus for three processes
Figure Consensus in a synchronous system
Figure Three byzantine generals
First scenario: p1 (Commander), p2, p3; message labels 1:v, 2:1:v, 3:1:u
Second scenario: p1 (Commander), p2, p3; message labels 1:x, 1:w, 2:1:w, 3:1:x
Faulty processes are shown shaded
Figure Four byzantine generals
First scenario: p1 (Commander), p2, p3, p4; message labels 1:v, 2:1:v, 3:1:u, 3:1:w, 4:1:v
Second scenario: p1 (Commander), p2, p3, p4; message labels 1:u, 1:w, 1:v, 2:1:u, 3:1:w, 4:1:v
Faulty processes are shown shaded