First, edit distance:
```python
def edit_dist_dp(str1, str2, m, n):
    store = [[0 for x in range(n + 1)] for x in range(m + 1)]

    for i in range(m + 1):
        for j in range(n + 1):

            if i == 0:
                store[i][j] = j

            elif j == 0:
                store[i][j] = i

            elif str1[i - 1] == str2[j - 1]:
                store[i][j] = store[i - 1][j - 1]

            else:
                store[i][j] = 1 + min(store[i][j - 1],      # insert
                                      store[i - 1][j],      # remove
                                      store[i - 1][j - 1])  # replace

    return store[m][n]
```
This is the dynamic programming version; the naive recursive one is extremely inefficient. Following the DP paradigm, it first creates a table to store intermediate values (`store`, where m and n are the lengths of the two input strings) and then fills it bottom-up in a loop, so that `store[i][j]` ends up holding the minimum edit distance between the first i characters of `str1` and the first j characters of `str2`.
If the first string is empty, the only option is to insert all the characters of the second string (the `i == 0` branch, j operations). If the second string is empty, we remove all the characters of the first string (the `j == 0` branch, i operations).
If the last characters are the same, we ignore them and move on to the remaining strings (the `str1[i - 1] == str2[j - 1]` branch).
If the last characters are different, the distance grows by one on top of the cheapest of the three possibilities: insert, remove and replace (the `else` branch). Each of them is just a lookup of an already computed cell, not a recursive call. The complexity of the whole thing is O(m*n), both in time and in memory.
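For comparison, here is a minimal sketch of what the naive recursive version mentioned above might look like. The name `edit_dist_naive` is mine, not from the original code; it handles the same four cases, but recomputes the same prefixes over and over, which is exactly what the table avoids.

```python
def edit_dist_naive(str1, str2, m, n):
    """Edit distance between str1[:m] and str2[:n] by plain recursion."""
    if m == 0:
        return n  # first string empty: insert everything from the second
    if n == 0:
        return m  # second string empty: remove everything from the first
    if str1[m - 1] == str2[n - 1]:
        # Last characters match: nothing to pay, look at the rest.
        return edit_dist_naive(str1, str2, m - 1, n - 1)
    return 1 + min(
        edit_dist_naive(str1, str2, m, n - 1),      # insert
        edit_dist_naive(str1, str2, m - 1, n),      # remove
        edit_dist_naive(str1, str2, m - 1, n - 1),  # replace
    )


print(edit_dist_naive("kitten", "sitting", 6, 7))  # 3
print(edit_dist_naive("", "abc", 0, 3))            # 3
```

It is fine for short strings like these, but the number of calls grows exponentially with the string lengths.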
Another one is a simple but nice algorithm to shuffle the elements of a given array.
```python
# algorithm for shuffling a given subscriptable data structure (Knuth)
import random


def swap(alist, i, j):
    """Swap the elements at indices i and j of alist."""
    alist[i], alist[j] = alist[j], alist[i]


def shuffle(data):
    """Randomly shuffle the elements of data in place."""
    n = len(data)
    for token in range(n - 1):
        swap(data, token, random.randrange(token, n))
```
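Maybe it's obvious (in that case, sorry :)), but I think the reason this algorithm gets published is that it is easy to make a mistake and write the last line like this, drawing the swap partner from the whole array instead of from the part not yet shuffled:

```python
swap(data, token, random.randrange(0, n))
```

Which is definitely no good :) With that change the shuffle is biased: some permutations come up more often than others.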
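To see why this kind of slip matters, here is a small sketch that walks through every possible sequence of random draws for a three-element list and counts how often each permutation comes out. I am assuming the mistake in question is the common one of picking the swap partner from the whole range, i.e. `random.randrange(0, n)`; the helper `count_outcomes` is made up for this illustration.

```python
from collections import Counter
from itertools import product


def count_outcomes(n, choices_for):
    """Count the final permutations over every possible sequence of draws.

    choices_for(token) returns the indices the shuffle may pick at step token.
    """
    counts = Counter()
    steps = [list(choices_for(token)) for token in range(n - 1)]
    for picks in product(*steps):
        data = list(range(n))
        for token, j in enumerate(picks):
            data[token], data[j] = data[j], data[token]
        counts[tuple(data)] += 1
    return counts


n = 3
# Draws as in random.randrange(token, n), the version from the listing.
correct = count_outcomes(n, lambda token: range(token, n))
# Draws as in random.randrange(0, n), the assumed mistake.
biased = count_outcomes(n, lambda token: range(0, n))

print(sorted(correct.values()))  # [1, 1, 1, 1, 1, 1]
print(sorted(biased.values()))   # [1, 1, 1, 2, 2, 2]
```

With the correct range every permutation is reached by exactly one sequence of draws, so all six are equally likely. With the whole range there are nine sequences for six permutations, so three of them show up twice as often as the others.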
That's it for now. There is more stuff on GitHub, including the classics: merge sort, quicksort and binary search. Obviously more algorithms will come, so stay in touch!