Levenshtein Edit Distance Algorithm

Jun 24, 2005

See www.merriampark.com/ld.htm for information about the algorithm. That page links to a T-SQL implementation by Joseph Gama (http://www.merriampark.com/ldtsql.htm); unfortunately, that function doesn't work. A debugged version is included in the package of T-SQL functions also referenced there (http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=502&lngWId=5), but it still has the fundamental problem that it only works on pairs of strings of up to 49 characters. The function below keeps each row of the distance matrix as 2-byte integers packed into a varbinary(8000), so it accepts strings of up to 3999 characters.

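CREATE FUNCTION edit_distance(@s1 nvarchar(3999), @s2 nvarchar(3999))
RETURNS int
AS
BEGIN
    -- @cv1 holds the previous row of the distance matrix, @cv0 the row being built.
    -- Each cell is a 2-byte integer, so 8000 bytes hold the 4000 cells needed for 3999 characters.
    DECLARE @s1_len int, @s2_len int, @i int, @j int, @s1_char nchar, @c int, @c_temp int,
        @cv0 varbinary(8000), @cv1 varbinary(8000)
    -- Note: LEN ignores trailing spaces, so they do not count towards the distance.
    SELECT @s1_len = LEN(@s1), @s2_len = LEN(@s2), @cv1 = 0x0000, @j = 1, @i = 1, @c = 0
    -- Row 0: the distance from the empty string to the first @j characters of @s2 is @j.
    WHILE @j <= @s2_len
        SELECT @cv1 = @cv1 + CAST(@j AS binary(2)), @j = @j + 1
    WHILE @i <= @s1_len
    BEGIN
        -- Column 0 of row @i is @i; @c always carries the cell to the left.
        SELECT @s1_char = SUBSTRING(@s1, @i, 1), @c = @i, @cv0 = CAST(@i AS binary(2)), @j = 1
        WHILE @j <= @s2_len
        BEGIN
            -- Candidate 1: cell to the left + 1.
            SET @c = @c + 1
            -- Candidate 2: diagonal cell, + 1 if the characters differ.
            SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j-1, 2) AS int) +
                CASE WHEN @s1_char = SUBSTRING(@s2, @j, 1) THEN 0 ELSE 1 END
            IF @c > @c_temp SET @c = @c_temp
            -- Candidate 3: cell above + 1.
            SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j+1, 2) AS int)+1
            IF @c > @c_temp SET @c = @c_temp
            SELECT @cv0 = @cv0 + CAST(@c AS binary(2)), @j = @j + 1
        END
        SELECT @cv1 = @cv0, @i = @i + 1
    END
    -- Fix: when @s1 is empty the outer loop never runs and @c is still 0,
    -- but the distance is then the length of @s2.
    RETURN CASE WHEN @s1_len = 0 THEN @s2_len ELSE @c END
END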
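
A quick sanity check, as a sketch only: the calls below assume the function was created in the dbo schema (a scalar function has to be called with at least a two-part name). The two string pairs are standard textbook examples for Levenshtein distance, not data from the pages linked above.

```sql
-- Assumption: edit_distance lives in the dbo schema.
SELECT dbo.edit_distance(N'kitten', N'sitting') AS distance;  -- 3
SELECT dbo.edit_distance(N'flaw', N'lawn') AS distance;       -- 2
```

Character comparisons inside the function use the collation in effect, so under a case-insensitive collation 'A' and 'a' should count as equal.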
