Randomized Algorithms Eduardo Laber Loana T. Nogueira
Quicksort
Objective: sort a list of n elements
An Idea
Imagine we could find an element y ∈ S such that half the members of S are smaller than y. Then we could use the following scheme:
– Partition S\{y} into two sets S1 and S2
  S1: elements of S smaller than y
  S2: elements of S greater than y
– Recursively sort S1 and S2
Suppose we know how to find y
– Time to find y: cn steps, for some constant c
– We could then partition S\{y} into S1 and S2 in n−1 additional steps
The total number of steps in our sorting procedure would be given by the recurrence
T(n) ≤ 2T(n/2) + (c+1)n,
whose solution is T(n) ≤ c′ n log n.
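Unrolling the recurrence makes the c′ n log n bound explicit (a standard expansion, assuming for simplicity that n is a power of 2):

```latex
T(n) \le 2\,T(n/2) + (c+1)n
     \le 4\,T(n/4) + 2(c+1)n
     \le \dots
     \le 2^{k}\,T(n/2^{k}) + k\,(c+1)n .
```

Taking k = log₂ n levels, each level contributes (c+1)n steps, so T(n) ≤ n·T(1) + (c+1) n log₂ n = O(n log n).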
What is the problem with the scheme above?
How do we find y?
Deterministic Quicksort
– Let y be the first element of S
– Split S into two sets:
  S<: elements smaller than y
  S>: elements greater than y
– Recursively apply Qsort to S< and S>
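A minimal runnable sketch of this deterministic scheme (the function name `qsort` and the list-based partition are our choices; elements are assumed distinct, as in the analysis):

```python
def qsort(s):
    """Deterministic quicksort: the pivot is the first element of s."""
    if len(s) <= 1:
        return s
    y = s[0]                                   # pivot: first element of S
    smaller = [e for e in s[1:] if e < y]      # S<
    greater = [e for e in s[1:] if e > y]      # S>
    return qsort(smaller) + [y] + qsort(greater)

print(qsort([3, 6, 2, 5, 4, 1]))  # [1, 2, 3, 4, 5, 6]
```

Note that on an already-sorted input every pivot is the minimum, so one of the two sides is always empty: this is exactly the O(n²) worst case mentioned below.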
Performance
– Worst case: O(n²) (e.g. when the set is already sorted)
– Average case: O(n log n)
A Randomized Algorithm
An algorithm that makes random choices during its execution.
Randomized Quicksort (RandQS)
– Choose an element y of S uniformly at random: every element of S has equal probability of being chosen
– By comparing each element of S with y, determine S< and S>
– Recursively sort S< and S>
– OUTPUT: sorted S<, then y, then sorted S>
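The same sketch with the randomized pivot rule (again our own minimal rendering of the slide's pseudocode):

```python
import random

def rand_qs(s):
    """Randomized quicksort: the pivot is chosen uniformly at random."""
    if len(s) <= 1:
        return s
    y = random.choice(s)                  # every element equally likely
    smaller = [e for e in s if e < y]     # S<
    greater = [e for e in s if e > y]     # S>
    return rand_qs(smaller) + [y] + rand_qs(greater)

print(rand_qs([3, 6, 2, 5, 4, 1]))  # [1, 2, 3, 4, 5, 6]
```

The output is always the correctly sorted sequence; only the running time depends on the random choices.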
Intuition
For some instances, deterministic Quicksort performs very badly: O(n²).
Randomization produces different executions for the same input; there is no instance on which RandQS performs badly on average.
Analysis
As for other sorting algorithms, we measure the running time of RandQS by the number of comparisons it performs; this is the dominant cost in any reasonable implementation.
Our goal: analyze the expected number of comparisons in an execution of RandQS.
Analysis
S_i: the i-th smallest element of S, so S_1 is the smallest and S_n is the largest element of S.
Define the random variable
X_ij = 1 if S_i and S_j are compared, 0 otherwise.
(Given a random experiment with sample space S, a random variable is a function that assigns a real number to each sample point.)
Analysis
X_ij counts the comparisons between S_i and S_j, so the total number of comparisons is
Σ_{i=1}^{n} Σ_{j>i} X_ij.
We are interested in the expected number of comparisons:
E[ Σ_{i=1}^{n} Σ_{j>i} X_ij ] = Σ_{i=1}^{n} Σ_{j>i} E[X_ij],
by the linearity of expectation.
Analysis
p_ij: the probability that S_i and S_j are compared in an execution.
Since X_ij only assumes the values 0 and 1, E[X_ij] = p_ij.
Analysis – Binary Tree T of RandQS
Each node of T is labeled with a distinct element of S: the root is the pivot y, the left subtree contains S<, and the right subtree contains S>.
– The root of T is compared to the elements in its two subtrees, but no comparison is performed between an element of the left subtree and an element of the right subtree.
– There is a comparison between S_i and S_j if and only if one of these elements is an ancestor of the other.
Analysis – Binary Tree T of RandQS
Consider the permutation π obtained by visiting the nodes of T in increasing order of level number, and in left-to-right order within each level.
Example: S = (3, 6, 2, 5, 4, 1)
Pivot 2 splits S into {1} and {3, 6, 5, 4}; in the right subtree, pivot 5 splits {3, 6, 5, 4} into {3, 4} and {6}; pivot 4 then splits {3, 4} into {3} and the empty set.
Reading the resulting tree level by level gives π = (2, 1, 5, 4, 6, 3).
Back to the Analysis
To compute p_ij we make two observations:
– There is a comparison between S_i and S_j if and only if S_i or S_j occurs earlier in the permutation π than any element S_l such that i < l < j.
– Any of the elements S_i, S_{i+1}, ..., S_j is equally likely to be the first of these elements to be chosen as a partitioning element, and hence to appear first in π.
The probability that this first element is either S_i or S_j is exactly 2/(j−i+1).
Analysis
Therefore, p_ij = 2/(j−i+1), and
Σ_{i=1}^{n} Σ_{j>i} E[X_ij] = Σ_{i=1}^{n} Σ_{j>i} p_ij
  = Σ_{i=1}^{n} Σ_{j>i} 2/(j−i+1)
  ≤ Σ_{i=1}^{n} Σ_{k=1}^{n−i+1} 2/k
  ≤ 2 Σ_{i=1}^{n} Σ_{k=1}^{n} 1/k     (the inner sum is the harmonic series H_n)
  ≈ 2 n ln n.
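A quick simulation (our own sanity check, not part of the slides) shows the average comparison count of RandQS staying below the 2 n ln n bound:

```python
import math
import random

def rand_qs_comparisons(s):
    """Return the number of comparisons RandQS performs on list s."""
    if len(s) <= 1:
        return 0
    y = random.choice(s)
    rest = [e for e in s if e != y]
    smaller = [e for e in rest if e < y]
    greater = [e for e in rest if e > y]
    # every element of s \ {y} is compared with the pivot exactly once
    return len(rest) + rand_qs_comparisons(smaller) + rand_qs_comparisons(greater)

n, trials = 200, 500
avg = sum(rand_qs_comparisons(list(range(n))) for _ in range(trials)) / trials
print(avg, 2 * n * math.log(n))   # average comparisons vs. the 2 n ln n bound
```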
RandQS vs. DetQS
Expected time of RandQS: O(n log n).
A given expected value, however, may not guarantee a reasonable probability of success. We could have, for example, some probability ε of executing O(n²) operations and probability 1−ε of executing O(n log n) operations.
For n = 100, in 7% of the cases the algorithm would execute in O(n²) time. Sometimes we want to guarantee that the algorithm's performance will not be far from its average.
Objective: prove that with high probability the RandQS algorithm works well.
High Probability Bound
The previous analysis only says that the expected running time is O(n log n). This leaves open the possibility of large deviations from this expected value.
High Probability Bound
RECALL:
– Quicksort chooses a pivot at random from the input array
– Splits it into smaller and larger elements
– Recurses on both subarrays
Fix an element x in the input. Then x belongs to a sequence of subarrays, and x's contribution to the running time is proportional to the number of different subarrays it belongs to.
High Probability Bound
Every time x is compared to a pivot, its current subarray is split and x goes into one of the two resulting subarrays.
With high probability (probability 1 − n^(−c), for some positive constant c), x is compared to O(log n) pivots.
GOOD and BAD Splits
We say that a pivot is good if each of the two subarrays has size at most 3/4 (equivalently, at least 1/4) of the size of the split subarray; otherwise, it is bad.
– The probability of a good split, and of a bad one, is 1/2.
– x can participate in at most log_{4/3} n good splits.
High Probability Bound
We upper-bound the probability of fewer than M/4 good splits in M splits, and set M so that log_{4/3} n ≤ M/4.
Let M = 32 ln n. This is a good choice:
– log_{4/3} n ≤ 8 ln n = M/4
– exp(−M/8) = 1/n⁴ (by a Chernoff bound on the number of good splits among M splits)
Hence the probability that x participates in more than M = 32 ln n splits is less than 1/n⁴.
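We can check this numerically. Treating each split as a fair coin flip (heads = good split), the probability of fewer than M/4 good splits in M splits is an exact binomial tail, which indeed sits below exp(−M/8). The choice n = 100 here is just for illustration:

```python
import math

n = 100
M = math.ceil(32 * math.log(n))   # M = 32 ln n, rounded up to an integer
threshold = M // 4                # "fewer than M/4 good splits"

# exact P[Binomial(M, 1/2) < M/4]: each of the M splits is good w.p. 1/2
tail = sum(math.comb(M, k) for k in range(threshold)) / 2 ** M

print(tail, math.exp(-M / 8))     # the exact tail lies below the bound
```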
High Probability Bound
By a union bound over the n elements, the probability that any element participates in more than M = 32 ln n splits is less than n · 1/n⁴ = 1/n³.
Hence, with probability at least 1 − 1/n³, the running time of quicksort is O(n log n).
Advantages of Randomized Algorithms
– For many problems, randomized algorithms run faster than the best known deterministic algorithms.
– Many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance.
Minimum Cut Problem
Input: a graph G = (V, E)
Output: a set S ⊂ V that minimizes the number of edges with one endpoint in S and the other in V \ S.
Minimum Cut Problem
Notation:
– d(v): degree of vertex v in the graph
– N(v): neighborhood of v
Minimum Cut Problem – Applications
– Network reliability: if a graph has a small min cut, then it is poorly connected.
– Clustering: web pages = nodes, hyperlinks = edges; divide the graph into clusters with few connections between different clusters.
Edge Contraction
Given a graph G = (V, E) and an edge e = (u, v), contracting e produces the graph G/e = (V′, E′), in which u and v are replaced by a single supervertex adjacent to all former neighbors of u and v (self-loops created by the contraction are discarded; parallel edges are kept).
Edge Contraction – Example
[figure: a graph G with an edge e = (u, v); in G/e, u and v are merged into a single supervertex uv, while the other vertices a, b, c, d, f, g and their edges are unchanged]
Lemma 1
Lemma 1: the size of the minimum cut in G/e is greater than or equal to the size of the minimum cut in G.
Proof: we can associate each cut in G/e with a cut in G of the same size; simply replace, in S, the nodes obtained by contraction with the original nodes.
Lemma 2
Lemma 2: if the minimum cut in a graph has size k, then d(v) ≥ k for every v ∈ V.
Proof: otherwise S = {v} would be a cut of size smaller than k.
Corollary
Corollary 1: if the minimum cut in a graph has size k, then |E| ≥ k·n/2.
Proof: follows from Lemma 2 and the fact that Σ_{v∈V} d(v) = 2|E|.
Randomized MinCut
G₀ ← G
For i = 1 to |V| − 2:
– select an edge eᵢ of G_{i−1} uniformly at random
– set Gᵢ = G_{i−1} / eᵢ
Return the vertex set of one of the two resulting supervertices.
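A sketch of the contraction algorithm, already wrapped in the repetition loop discussed later (the edge-list representation and the union-find-style merging are our implementation choices, not from the slides):

```python
import random

def karger_min_cut(edges, n, reps):
    """Repeat the random-contraction algorithm `reps` times on a graph
    with vertices 0..n-1 and return the smallest cut size found."""
    best = len(edges)
    for _ in range(reps):
        comp = list(range(n))                 # supervertex label of each vertex

        def find(v):                          # follow merge pointers
            while comp[v] != v:
                comp[v] = comp[comp[v]]
                v = comp[v]
            return v

        remaining = n
        pool = list(edges)                    # edges whose endpoints differ
        while remaining > 2:
            u, v = random.choice(pool)        # uniform random crossing edge
            ru, rv = find(u), find(v)
            comp[ru] = rv                     # contract the chosen edge
            remaining -= 1
            # drop edges that became self-loops inside a supervertex
            pool = [(a, b) for (a, b) in pool if find(a) != find(b)]
        best = min(best, len(pool))           # edges crossing the final 2 supervertices
    return best

# two triangles joined by a single bridge edge: the min cut has size 1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cut = karger_min_cut(edges, 6, reps=60)
print(cut)
```

With 60 repetitions on this small graph, the bridge cut of size 1 is found with overwhelming probability, matching the amplification argument below.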
Probability
Lemma 3: let t₁, t₂, ..., t_k be a collection of events. Then
Pr(t₁ ∩ t₂ ∩ ... ∩ t_k) = Pr(t₁) · Pr(t₂ | t₁) ··· Pr(t_k | t₁ ∩ ... ∩ t_{k−1}).
Proof: by induction on k. The base case is the definition of conditional probability; assuming the identity holds for k, prove it for k + 1.
Theorem
Let C = {e₁, e₂, ..., e_k} be a minimum cut in the graph G = (V, E). If no edge of C is chosen by RndMinCut, then the edges remaining in the final graph are exactly the edges of C.
Theorem – proof
Let A and B be the two supervertices obtained, and let A_C and B_C be the connected components obtained by removing C. Since no edge of C was chosen, every contracted edge lies entirely inside A_C or entirely inside B_C.
Hence A = A_C and B = B_C. Indeed, suppose there exist a ∈ A_C and b ∈ B_C such that a, b ∈ A. Then there is a path in A between a and b that uses only edges chosen by the algorithm. Since any path between a and b must use an edge of C, an edge of C was chosen. Contradiction!
Analysis
Let C be the edge set of a minimum cut in G. We compute the probability that the algorithm never chooses an edge of C for contraction. When this happens, the algorithm returns a minimum cut.
Analysis
Luck_i: the event that the algorithm does not pick an edge of C in the i-th iteration.
Analysis
We have Pr(Luck₁) = 1 − |C|/|E|. From the relation between the cut size and the number of edges (Corollary 1), |E| ≥ |C|·n/2, so
Pr(Luck₁) ≥ 1 − 2/n.
Analysis
In the second iteration, the graph G₁ has n − 1 vertices and its minimum cut still has size |C|. Hence
Pr(Luck₂ | Luck₁) ≥ 1 − 2/(n − 1).
Analysis
In general,
Pr(Luck_i | Luck₁ ∩ ... ∩ Luck_{i−1}) ≥ 1 − 2/(n − i + 1).
By Lemma 3 it follows that
Pr(Luck₁ ∩ ... ∩ Luck_{n−2}) ≥ ∏_{i=1}^{n−2} (1 − 2/(n − i + 1)) = ∏_{i=1}^{n−2} (n − i − 1)/(n − i + 1) = 2/(n(n − 1)).
Analysis
Hence we obtain a min cut with probability at least 2/(n(n − 1)).
For n = 100 this is about 0.02% (BAD). What can we do?
Randomized MinCut 2
Repeat the RndMinCut procedure several times and return the best cut found.
Repeating K times, the probability of finding the minimum cut is at least
1 − (1 − 2/(n(n − 1)))^K.
Analysis
Repeating n²/2 times, the success probability is at least (e − 1)/e ≈ 64%.
K repetitions take O(K·m) time.
Complexity of Karger's fastest algorithm
Running time: O(n² log n); space: O(n²).
This algorithm finds a min cut with probability Ω(1/log n) [D. Karger and C. Stein, STOC 1993].
Any takers?
Minimum Cut Problem – Deterministic Algorithm
Complexity O(nm log(n²/m)), J. Hao and J. B. Orlin [1994] (based on network flows).
Two types of randomized algorithms
Las Vegas algorithms:
– always produce the correct answer
– the running time is a random variable
Example: RandQS always produces a sorted sequence, but its finishing time varies from execution to execution on a given instance.
Two types of randomized algorithms
Monte Carlo algorithms:
– may produce incorrect answers
– the error probability can be bounded
– by running the algorithm several times, we can make the error probability as small as we like
Example: Min-Cut.
Lemma
If the contraction algorithm stops when the number of vertices in the graph is exactly t, then the probability that a given min cut survives is at least t(t − 1)/(n(n − 1)).
Fast Cut Algorithm
Run the contraction algorithm down to a given number of vertices t, then run a deterministic algorithm on the contracted graph.
Problem: the deterministic algorithm is slow.
Theorems
Theorem: Fast Cut runs in O(n² log n) time.
Theorem: the Fast Cut algorithm succeeds with probability Ω(1/log n).
MAX SAT
Input:
– n boolean variables x₁, ..., x_n
– m clauses C₁, ..., C_m
– a weight w_i ≥ 0 for each clause C_i
Objective: find a true/false assignment for the x_i that maximizes the total weight of the satisfied clauses.
MAX SAT – Randomized Algorithm
For i = 1, ..., n:
– if random(1/2) = 1 then x_i ← true, else x_i ← false
That is, each variable is independently set to true or false with probability 1/2.
MAX SAT
Theorem: the algorithm is a 1/2-approximation.
Proof: consider the random variable X_j, the indicator that clause j is satisfied. Then the expected weight of the satisfied clauses is Σ_j w_j · E[X_j].
MAX SAT
E[X_j] = Pr(clause j is satisfied).
L_j: number of literals in clause j.
Note: clause j fails to be satisfied only if all of its literals evaluate to false. For example, for C_j = (x₁ ∨ ¬x₃ ∨ x₅ ∨ x₆) we would need x₁ = 0, x₃ = 1, x₅ = 0, x₆ = 0, which has probability (1/2)⁴. In the general case the probability is (1/2)^{L_j}.
MAX SAT
So the probability that clause j is satisfied is 1 − (1/2)^{L_j} ≥ 1/2, and
E[weight] = Σ_j w_j (1 − (1/2)^{L_j}) ≥ (1/2) Σ_j w_j ≥ OPT/2,
a 0.5-approximation. Note: Σ_j w_j is an upper bound on OPT.
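A sketch of the random-assignment algorithm and its guarantee (the clause encoding, a list of signed literal indices with a weight, is our own choice):

```python
import random

def random_assignment_weight(n, clauses, trials=2000):
    """Average satisfied weight of the random 1/2 assignment over many trials.
    clauses: list of (weight, literals); literal +i means x_i, -i means NOT x_i."""
    total = 0.0
    for _ in range(trials):
        x = [random.random() < 0.5 for _ in range(n + 1)]  # x[1..n], x[0] unused
        for w, lits in clauses:
            # a clause is satisfied if at least one literal is true
            if any(x[abs(l)] == (l > 0) for l in lits):
                total += w
    return total / trials

# example instance: C1 = (x1 or x2 or x3), weight 8; C2 = (not x2 or x3), weight 4
clauses = [(8, [1, 2, 3]), (4, [-2, 3])]
W = 8 + 4
avg = random_assignment_weight(3, clauses)
print(avg, W / 2)   # the average satisfied weight is at least W/2
```

For this instance the exact expectation is 8·(7/8) + 4·(3/4) = 10, comfortably above W/2 = 6.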
MAX SAT
What would happen if every clause had exactly 3 literals? We would get a 7/8-approximation.
(Håstad '97): if MAX-E3SAT admits a (7/8 + ε)-approximation for some ε > 0, then P = NP.
MAX SAT: Derandomization
C₁ = (x₁ ∨ x₂ ∨ x₃), weight w₁; C₂ = (x₂ ∨ x₃), weight w₂.
Consider the binary tree of all assignments: each leaf of the tree corresponds to an assignment, and each leaf is associated with a weight (the sum of the weights of the clauses satisfied by the assignment corresponding to that leaf).
MAX SAT: Derandomization
E(c(I)) = (7/8)·w₁ + w₂/2
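The tree of assignments need not be explored exhaustively: the standard method of conditional expectations fixes one variable at a time, always keeping the conditional expectation at least as large as before. This is a generic sketch of that method under our own clause encoding, not the exact procedure on the slide:

```python
def conditional_expectation(n, clauses):
    """Derandomize the 1/2 random assignment by fixing x_1..x_n greedily.
    clauses: list of (weight, literals); literal +i means x_i, -i means NOT x_i."""
    def expected_weight(fixed):
        # E[satisfied weight] under the partial assignment `fixed`
        total = 0.0
        for w, lits in clauses:
            unfixed = 0
            satisfied = False
            for l in lits:
                v = fixed.get(abs(l))
                if v is None:
                    unfixed += 1
                elif v == (l > 0):
                    satisfied = True
            if satisfied:
                total += w
            else:
                # prob. some unfixed literal comes out true; 0 if none remain
                total += w * (1 - 0.5 ** unfixed)
        return total

    fixed = {}
    for i in range(1, n + 1):
        # pick the value of x_i that keeps the conditional expectation highest
        fixed[i] = max((True, False),
                       key=lambda b: expected_weight({**fixed, i: b}))
    return fixed

# C1 = (x1 or x2 or x3), weight 8; C2 = (not x2 or x3), weight 4
clauses = [(8, [1, 2, 3]), (4, [-2, 3])]
assignment = conditional_expectation(3, clauses)
```

Since the expectation never decreases, the deterministic assignment found this way satisfies at least the initial expected weight, hence at least half the total weight.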