Randomized Algorithms
Eduardo Laber and Loana T. Nogueira

Quicksort
Objective: sort a list of n elements

An Idea
Imagine we could find an element y ∈ S such that half the members of S are smaller than y. Then we could use the following scheme:
– Partition S \ {y} into two sets S1 and S2
– S1: elements of S that are smaller than y
– S2: elements of S that are greater than y
– Recursively sort S1 and S2

Suppose we know how to find y
– Time to find y: cn steps, for some constant c
– We could partition S \ {y} into S1 and S2 in n − 1 additional steps
– The total number of steps in our sorting procedure would be given by the recurrence T(n) ≤ 2T(n/2) + (c+1)n, which solves to T(n) ≤ c'n log n (unrolled below)
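
Unrolling the recurrence (a sketch, assuming n is a power of two) shows where the n log n comes from:

\[
T(n) \le 2T(n/2) + (c+1)n \le 4T(n/4) + 2(c+1)n \le \dots \le 2^k T(n/2^k) + k(c+1)n.
\]

Stopping at k = \log_2 n levels, the additive terms contribute (c+1) n \log n in total, so T(n) \le c' n \log n for some constant c' depending only on c.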

Quicksort
What's the problem with the scheme above? How do we find y?

Deterministic Quicksort
– Let y be the first element of S
– Split S into two sets: S< (elements smaller than y) and S> (elements greater than y)
– Recurse: Qsort(S<) and Qsort(S>)

Performance
– Worst case: O(n²), e.g., when the set is already sorted
– Average case: O(n log n)

A Randomized Algorithm
An algorithm that makes random choices during its execution.

Randomized Quicksort (RandQS)
– Choose an element y uniformly at random from S: every element of S has equal probability of being chosen
– By comparing each element of S with y, determine S< and S>
– Recursively sort S< and S>
– Output: the sorted S<, followed by y, followed by the sorted S>
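
A minimal Python sketch of RandQS as described above (the function name is ours, and we assume distinct elements, as the slides do; choosing S[0] instead of a random pivot gives the deterministic variant):

```python
import random

def rand_qs(S):
    """Randomized quicksort: returns a sorted copy of the list S."""
    if len(S) <= 1:
        return list(S)
    y = random.choice(S)                  # pivot chosen uniformly at random
    smaller = [x for x in S if x < y]     # S<: elements smaller than y
    greater = [x for x in S if x > y]     # S>: elements greater than y
    # Output: sorted S<, then y, then sorted S>.
    return rand_qs(smaller) + [y] + rand_qs(greater)

print(rand_qs([3, 6, 2, 5, 4, 1]))  # [1, 2, 3, 4, 5, 6]
```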

Intuition
– For some instances, deterministic Quicksort performs very badly: O(n²)
– Randomization produces different executions for the same input: there is no instance on which RandQS performs badly on average

Analysis
– As usual for sorting algorithms, we measure the running time of RandQS in terms of the number of comparisons it performs; this is the dominant cost in any reasonable implementation
– Our goal: analyze the expected number of comparisons in an execution of RandQS

Analysis
– S_i: the i-th smallest element of S, so S_1 is the smallest element of S and S_n is the largest
– Define the random variable X_ij = 1 if S_i and S_j are compared, and 0 otherwise
– (Recall: given a random experiment with sample space S, a random variable is a function that assigns a real number to each sample point)

Analysis
– X_ij is a count of comparisons between S_i and S_j, so the total number of comparisons is \sum_{i=1}^{n} \sum_{j>i} X_{ij}
– We are interested in the expected number of comparisons: by linearity of expectation, E[\sum_{i=1}^{n} \sum_{j>i} X_{ij}] = \sum_{i=1}^{n} \sum_{j>i} E[X_{ij}]

Analysis
– p_ij: the probability that S_i and S_j are compared in an execution
– Since X_ij only takes the values 0 and 1, E[X_ij] = p_ij

Analysis: the binary tree T of RandQS
– Each node of T is labeled with a distinct element of S: the root is the pivot y, its left subtree holds S< and its right subtree holds S>
– The root of T is compared to the elements in the two subtrees, but no comparison is performed between an element of the left subtree and an element of the right subtree
– There is a comparison between S_i and S_j if and only if one of these elements is an ancestor of the other
– Consider the permutation π obtained by visiting the nodes of T in increasing order of level, and in left-to-right order within each level

Example: S = (3, 6, 2, 5, 4, 1)
– The first pivot, 2, splits S into {1} and {3, 6, 5, 4}
– The pivot 5 splits {3, 6, 5, 4} into {3, 4} and {6}
– The pivot 4 splits {3, 4} into {3} and {}
– Visiting the resulting tree level by level, left to right: π = (2, 1, 5, 4, 6, 3)

Back to the Analysis
To compute p_ij we make two observations:
– There is a comparison between S_i and S_j if and only if S_i or S_j occurs earlier in the permutation π than every element S_l such that i < l < j
– Each of the elements S_i, S_{i+1}, ..., S_j is equally likely to be the first of these elements to be chosen as a partitioning element, and hence to appear first in π
– The probability that this first element is either S_i or S_j is exactly 2/(j − i + 1)

Analysis
Therefore p_ij = 2/(j − i + 1), and the expected number of comparisons is

\[
\sum_{i=1}^{n} \sum_{j>i} E[X_{ij}]
= \sum_{i=1}^{n} \sum_{j>i} p_{ij}
= \sum_{i=1}^{n} \sum_{j>i} \frac{2}{j-i+1}
\le \sum_{i=1}^{n} \sum_{k=1}^{n-i+1} \frac{2}{k}
\le 2 \sum_{i=1}^{n} \sum_{k=1}^{n} \frac{1}{k}
\le 2n H_n \approx 2n \ln n,
\]

where H_n = \sum_{k=1}^{n} 1/k is the harmonic series.
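
A quick empirical check of this bound (a sketch; the instrumented RandQS and the trial counts are ours):

```python
import math
import random

def count_comparisons(S):
    """Number of comparisons performed by one RandQS run on the list S."""
    if len(S) <= 1:
        return 0
    y = random.choice(S)
    smaller = [x for x in S if x < y]
    greater = [x for x in S if x > y]
    # The pivot is compared once with each of the other len(S) - 1 elements.
    return (len(S) - 1) + count_comparisons(smaller) + count_comparisons(greater)

n, trials = 1000, 100
avg = sum(count_comparisons(list(range(n))) for _ in range(trials)) / trials
print(f"average comparisons: {avg:.0f}, bound 2n ln n: {2 * n * math.log(n):.0f}")
```

The observed average stays below 2n ln n ≈ 13816 for n = 1000, as the analysis predicts.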

RandQS vs. DetQS
– Expected time of RandQS: O(n log n)
– A given expected value alone may not guarantee a reasonable probability of success: we could have, for example, probability ε of executing O(n²) operations and probability 1 − ε of executing O(n log n) operations
– For n = 100, the algorithm could then execute in O(n²) time in 7% of the cases. Sometimes we want to guarantee that the algorithm's performance will not be far from its average
– Objective: prove that, with high probability, RandQS works well

High Probability Bound
– The previous analysis only says that the expected running time is O(n log n); this leaves open the possibility of large deviations from the expected value
– Recall: Quicksort chooses a pivot at random from the input array, splits it into smaller and larger elements, and recurses on both subarrays
– Fix an element x in the input: x belongs to a sequence of subarrays, and x's contribution to the running time is proportional to the number of different subarrays it belongs to

High Probability Bound
– Every time x is compared to a pivot, its current subarray is split and x goes into one of the resulting subarrays
– With high probability, x is compared to O(log n) pivots: with probability 1 − n^{−c}, for some positive constant c

Good and Bad Splits
– We say that a pivot is good if each of the resulting subarrays has size at most 3/4 (equivalently, at least 1/4) of the size of the split subarray; otherwise, it is bad
– The probability of a good split is 1/2, and likewise for a bad split
– x can participate in at most log_{4/3} n good splits

High Probability Bound
– Upper bound the probability of fewer than M/4 good splits among M splits
– Set M so that log_{4/3} n ≤ M/4: letting M = 32 ln n, we have log_{4/3} n ≤ 8 ln n = M/4, so this is a good choice
– Moreover exp(−M/8) = exp(−4 ln n) = 1/n⁴ (the deviation bound is sketched below)
– Hence the probability that x participates in more than M = 32 ln n splits is less than 1/n⁴
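
The slides quote the tail bound without derivation; here is a sketch of the arithmetic via Hoeffding's inequality (each of M splits is independently good with probability 1/2; let G be the number of good ones):

\[
\Pr[G < M/4] = \Pr\!\left[G < \left(\tfrac{1}{2} - \tfrac{1}{4}\right)M\right] \le \exp\!\left(-2\left(\tfrac{1}{4}\right)^2 M\right) = \exp(-M/8),
\]

and with M = 32 ln n this equals exp(−4 ln n) = 1/n⁴. Since x needs at most log_{4/3} n ≤ M/4 good splits before its subarray shrinks to a single element, surviving more than M splits forces fewer than M/4 good ones among the first M.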

High Probability Bound
– By a union bound over the n elements, the probability that any element participates in more than M = 32 ln n splits is less than n · 1/n⁴ = 1/n³
– Therefore, with probability at least 1 − 1/n³, the running time of Quicksort is O(n log n)

Advantages of Randomized Algorithms
– For many problems, randomized algorithms run faster than the best known deterministic algorithms
– Many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance

Minimum Cut Problem
– Input: a graph G = (V, E)
– Output: a set S ⊂ V that minimizes the number of edges with one endpoint in S and the other in V \ S

Minimum Cut Problem
Notation:
– d(v): degree of vertex v in the graph
– N(v): neighborhood of v

Minimum Cut Problem: Applications
– Network reliability: if a graph has a small min-cut, then it is poorly connected
– Clustering: web pages = nodes, hyperlinks = edges; divide the graph into clusters with few connections between different clusters

Edge Contraction
Given a graph G = (V, E) and an edge e = (u, v), contracting e produces the graph G/e = (V', E'), where u and v are merged into a single supervertex; self-loops created by the contraction are discarded, and parallel edges are kept.

Edge Contraction: Example
(Figure: a graph G with vertices a, b, c, d, u, v, f, g; contracting the edge e = (u, v) merges u and v into one supervertex in G/e.)

Lemma 1
Lemma 1: the size of the minimum cut in G/e is greater than or equal to the size of the minimum cut in G.
Proof: we can associate each cut in G/e with a cut in G of the same size: it suffices to replace, in S, the nodes obtained by contraction with the original nodes.

Lemma 2
Lemma 2: if the minimum cut in a graph has size k, then d(v) ≥ k for every v ∈ V.
Proof: otherwise S = {v} would be a cut of size smaller than k.

Corollary
Corollary 1: if the minimum cut in a graph has size k, then |E| ≥ k·n/2.
Proof: follows from Lemma 2 and from the fact that \sum_{v \in V} d(v) = 2|E|.

Randomized MinCut
– G_0 ← G
– For i = 1 to |V| − 2: select an edge e_i uniformly at random in G_{i−1} and set G_i = G_{i−1} / e_i
– Return the vertices of one of the two resulting supervertices (a sketch in code follows)
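
A compact Python sketch of one run of this contraction process (the representation is ours: a connected multigraph given as an edge list over vertices 0..n−1, with supervertices tracked by union-find):

```python
import random

def karger_min_cut(n, edges):
    """One run of randomized contraction; returns the size of the cut found.

    n: number of vertices, labeled 0..n-1.
    edges: list of (u, v) pairs; parallel edges are allowed.
    """
    parent = list(range(n))                 # union-find over supervertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = random.choice(edges)         # uniform among original edges...
        ru, rv = find(u), find(v)
        if ru != rv:                        # ...skipping contracted self-loops
            parent[ru] = rv                 # contract the chosen edge
            remaining -= 1

    # The cut consists of edges whose endpoints lie in different supervertices.
    return sum(1 for u, v in edges if find(u) != find(v))

# Example: a 4-cycle has min-cut 2; repeated runs find it with high probability.
print(min(karger_min_cut(4, [(0, 1), (1, 2), (2, 3), (3, 0)]) for _ in range(20)))
```

Skipping self-loops keeps the choice uniform over the edges that survive in the contracted multigraph.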

Probability
Lemma 3: let t_1, t_2, ..., t_k be a collection of events. Then

Pr(t_1 ∩ t_2 ∩ ... ∩ t_k) = Pr(t_1) · Pr(t_2 | t_1) · ... · Pr(t_k | t_1 ∩ ... ∩ t_{k−1}).

Proof: by induction on k; the base case is the definition of conditional probability, and assuming the identity holds for k, it follows for k + 1.

Theorem
Let C = {e_1, e_2, ..., e_k} be a minimum cut in the graph G = (V, E). If no edge of C is chosen by RandMinCut, then the edges remaining in the final graph are exactly the edges of C.

Theorem: Proof
Let A and B be the two resulting supervertices, and let A_C and B_C be the connected components obtained by removing C. Since no edge of C was chosen, A = A_C and B = B_C.
Indeed, suppose there exist a ∈ A_C and b ∈ B_C such that a, b ∈ A. Then there is a path in A between a and b that uses only edges chosen by the algorithm. But any path between a and b must use an edge of C, so an edge of C would have been chosen. Contradiction!

Analysis
Let C be the edge set of a minimum cut in G. We compute the probability that the algorithm never chooses an edge of C for contraction. When this happens, the algorithm returns a minimum cut.

Analysis
Lucky_i: the event that the algorithm does not pick an edge of C in the i-th iteration.

Analysis
We have Pr(Lucky_1) = 1 − |C| / |E|. It follows from the relation between the cut size and the number of edges (Corollary 1) that |E| ≥ |C| · n / 2, and hence Pr(Lucky_1) ≥ 1 − 2/n.

Analysis
In the second iteration, the graph G_1 has n − 1 vertices and its minimum cut still has size |C| (by Lemma 1). Hence Pr(Lucky_2 | Lucky_1) ≥ 1 − 2/(n − 1).

Analysis
In general, Pr(Lucky_i | Lucky_1 ∩ ... ∩ Lucky_{i−1}) ≥ 1 − 2/(n − i + 1). By Lemma 3, it follows that

Pr(Lucky_1 ∩ ... ∩ Lucky_{n−2}) ≥ ∏_{i=1}^{n−2} (1 − 2/(n − i + 1)) = 2 / (n(n − 1))

(the telescoping is worked out below).
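
Writing out the telescoping product (a sketch):

\[
\prod_{i=1}^{n-2}\left(1 - \frac{2}{n-i+1}\right)
= \prod_{i=1}^{n-2}\frac{n-i-1}{n-i+1}
= \frac{n-2}{n}\cdot\frac{n-3}{n-1}\cdot\frac{n-4}{n-2}\cdots\frac{2}{4}\cdot\frac{1}{3}
= \frac{2}{n(n-1)},
\]

since each numerator cancels the denominator appearing two factors later, leaving only 2 · 1 on top and n(n − 1) below.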

Analysis
Hence we find a min-cut with probability at least 2/(n(n − 1)). For n = 100 this is about 0.02% (bad!). What can we do?

Randomized MinCut 2
– Repeat the RandMinCut process several times and return the best cut found
– Repeating K times, the probability of finding the minimum cut is at least 1 − (1 − 2/(n(n − 1)))^K

Analysis
– Repeating n²/2 times, the success probability is at least (e − 1)/e ≈ 63% (arithmetic sketched below)
– K repetitions => O(K · m) time
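
The arithmetic behind that figure, as a sketch: each run succeeds with probability p ≥ 2/n² (slightly weakening 2/(n(n−1))), so with K = n²/2 independent runs, using 1 − x ≤ e^{−x},

\[
1 - (1-p)^K \ge 1 - \left(1 - \frac{2}{n^2}\right)^{n^2/2} \ge 1 - e^{-1} = \frac{e-1}{e} \approx 0.63.
\]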

Complexity of Karger's Fastest Algorithm
– Running time: O(n² log n); space: O(n²)
– This algorithm finds a min-cut with probability Ω(1/log n) [D. Karger and C. Stein, STOC 1993]
– Any takers?

Minimum Cut Problem: Deterministic Algorithm
– Complexity: O(nm log(n²/m)), J. Hao and J. B. Orlin [1994] (based on network flows)

Two Types of Randomized Algorithms
Las Vegas algorithms:
– Always produce the correct answer
– The running time is a random variable
– Example: RandQS always produces a sorted sequence; its running time varies from execution to execution on a given instance

Two Types of Randomized Algorithms
Monte Carlo algorithms:
– May produce incorrect answers
– The error probability can be bounded
– By running the algorithm several times, we can make the error probability as small as we like
– Example: Min-Cut

Lemma
If the contraction algorithm stops when the number of vertices in the graph is exactly t, then the probability that a given min-cut survives is at least \binom{t}{2} / \binom{n}{2} = t(t − 1) / (n(n − 1)).

Fast Cut Algorithm
Run the contraction algorithm until the graph reaches a given number of vertices t, then run a deterministic algorithm from that point on.
Problem: the deterministic algorithm is slow.

Theorems
– Theorem 10.16: Fast Cut runs in O(n² log n) time
– Theorem 10.17: the Fast Cut algorithm succeeds with probability Ω(1/log n)

MAX SAT
Input:
– n boolean variables: x_1, ..., x_n
– m clauses: C_1, ..., C_m
– a weight w_i ≥ 0 for each clause C_i
Objective: find a true/false assignment to the x_i that maximizes the total weight of the satisfied clauses

MAX SAT: Randomized Algorithm
For i = 1, ..., n: if random(1/2) = 1 then x_i ← true, else x_i ← false.
That is, with probability 1/2 we set each variable to true, and with probability 1/2 to false (a sketch in code follows).
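
A Python sketch of this algorithm (the clause encoding is ours: a clause is a list of nonzero integers, +i for x_i and -i for ¬x_i, with variables 1-indexed):

```python
import random

def random_assignment(n):
    """Set each of the n variables to True/False independently with prob. 1/2."""
    return [random.random() < 0.5 for _ in range(n)]

def satisfied_weight(clauses, weights, assignment):
    """Total weight of the clauses satisfied by a 0-indexed assignment list."""
    total = 0.0
    for clause, w in zip(clauses, weights):
        # A literal +i is satisfied when x_i is True; -i when x_i is False.
        if any(assignment[abs(l) - 1] == (l > 0) for l in clause):
            total += w
    return total

# Example: C1 = (x1 v x2 v ~x3) with weight 1, C2 = (~x2 v ~x3) with weight 2.
clauses, weights = [[1, 2, -3], [-2, -3]], [1.0, 2.0]
x = random_assignment(3)
print(x, satisfied_weight(clauses, weights, x))
```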

MAX SAT
Theorem: the algorithm is a 1/2-approximation.
Proof: consider the indicator random variable X_j for clause j being satisfied. Then the expected weight obtained is E[W] = ∑_j w_j E[X_j].

MAX SAT
– E[X_j] = Pr(clause j is satisfied)
– L_j: number of literals in clause j
– Note: clause j fails to be satisfied only if all of its literals evaluate to false
– Example: for C_j = (x_1 ∨ ¬x_3 ∨ x_5 ∨ x_6) we must have x_1 = 0, x_3 = 1, x_5 = 0 and x_6 = 0, which happens with probability (1/2)⁴
– General case: Pr(clause j not satisfied) = (1/2)^{L_j}

MAX SAT
The probability that clause j is satisfied is 1 − (1/2)^{L_j} ≥ 1/2, since L_j ≥ 1. Hence E[W] = ∑_j w_j (1 − (1/2)^{L_j}) ≥ (1/2) ∑_j w_j ≥ OPT/2: a 0.5-approximation.
Note: ∑_j w_j is an upper bound on OPT.

MAX SAT
What would happen if every clause had exactly 3 literals? Each clause would be satisfied with probability 1 − (1/2)³ = 7/8: a 7/8-approximation.
(Håstad, 1997): if MAX-E3SAT has a (7/8 + ε)-approximation for some ε > 0, then P = NP.

MAX SAT: Derandomization
– C_1 = (x_1 ∨ x_2 ∨ ¬x_3) with weight w_1; C_2 = (¬x_2 ∨ ¬x_3) with weight w_2
– Consider the complete binary tree of partial assignments: each leaf corresponds to a full assignment, and each leaf is associated with a weight (the sum of the weights of the clauses satisfied by the assignment corresponding to that leaf)

MAX SAT: Derandomization
E[c(I)] = (1 − (1/2)³) w_1 + (1 − (1/2)²) w_2 = (7/8) w_1 + (3/4) w_2
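
A sketch of the method of conditional expectations that this tree illustrates (the encoding and helper names are ours): at each step, fix the next variable to the value that does not decrease the conditional expected satisfied weight; the final assignment is then guaranteed to achieve at least the root expectation.

```python
def expected_weight(clauses, weights, fixed):
    """Conditional expectation of the satisfied weight given a partial
    assignment `fixed` (dict var -> bool); free variables are uniform."""
    total = 0.0
    for clause, w in zip(clauses, weights):
        sat, free = False, 0
        for l in clause:
            v = abs(l)
            if v in fixed:
                sat = sat or (fixed[v] == (l > 0))
            else:
                free += 1
        # Pr(clause satisfied) = 1 if already satisfied, else 1 - (1/2)^free.
        total += w if sat else w * (1 - 0.5 ** free)
    return total

def derandomized_max_sat(n, clauses, weights):
    """Greedy conditional-expectations assignment for weighted MAX SAT."""
    fixed = {}
    for v in range(1, n + 1):
        if expected_weight(clauses, weights, {**fixed, v: True}) >= \
           expected_weight(clauses, weights, {**fixed, v: False}):
            fixed[v] = True
        else:
            fixed[v] = False
    return fixed

# The two-clause example above: C1 = (x1 v x2 v ~x3), C2 = (~x2 v ~x3).
print(derandomized_max_sat(3, [[1, 2, -3], [-2, -3]], [1.0, 1.0]))
```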

