# Levenshtein Edit Distance Algorithm

Jun 24, 2005
See www.merriampark.com/ld.htm for information about the algorithm. That page links to a T-SQL implementation by Joseph Gama (http://www.merriampark.com/ldtsql.htm); unfortunately, that function doesn't work. There is a debugged version in the also-referenced package of T-SQL functions (http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=502&lngWId=5), but it still has the fundamental problem that it only works on pairs of strings of up to 49 characters.

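    CREATE FUNCTION edit_distance(@s1 nvarchar(3999), @s2 nvarchar(3999))
    RETURNS int
    AS
    BEGIN
      DECLARE @s1_len int, @s2_len int, @i int, @j int, @s1_char nchar, @c int, @c_temp int,
        @cv0 varbinary(8000), @cv1 varbinary(8000)

      -- @cv1 holds the previous row of the distance matrix, two bytes per cell.
      -- @c starts at LEN(@s2) so that an empty @s1 returns the length of @s2 rather than 0.
      SELECT @s1_len = LEN(@s1), @s2_len = LEN(@s2), @cv1 = 0x0000, @j = 1, @i = 1, @c = LEN(@s2)

      -- Row 0: distance from the empty string to each prefix of @s2
      WHILE @j <= @s2_len
        SELECT @cv1 = @cv1 + CAST(@j AS binary(2)), @j = @j + 1

      WHILE @i <= @s1_len
      BEGIN
        -- @cv0 is the row being built; its first cell is the distance to the empty prefix of @s2
        SELECT @s1_char = SUBSTRING(@s1, @i, 1), @c = @i, @cv0 = CAST(@i AS binary(2)), @j = 1

        WHILE @j <= @s2_len
        BEGIN
          -- Insertion: cell to the left plus one
          SET @c = @c + 1

          -- Substitution (or match): diagonal cell plus 0 or 1
          SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j-1, 2) AS int) +
            CASE WHEN @s1_char = SUBSTRING(@s2, @j, 1) THEN 0 ELSE 1 END
          IF @c > @c_temp SET @c = @c_temp

          -- Deletion: cell above plus one
          SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j+1, 2) AS int)+1
          IF @c > @c_temp SET @c = @c_temp

          SELECT @cv0 = @cv0 + CAST(@c AS binary(2)), @j = @j + 1
        END

        SELECT @cv1 = @cv0, @i = @i + 1
      END

      RETURN @c
    END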
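
A quick way to sanity-check the function is to call it on a pair of strings with a well-known distance. This is only a usage sketch; it assumes the function ended up in the dbo schema, which is the usual default when no schema is given.

```sql
-- 'kitten' -> 'sitting' needs 3 edits: k->s, e->i, and an appended g
SELECT dbo.edit_distance(N'kitten', N'sitting') AS distance;  -- returns 3

-- Identical strings are zero edits apart
SELECT dbo.edit_distance(N'flaw', N'flaw') AS distance;       -- returns 0
```

Note that the function measures lengths with LEN, which ignores trailing spaces, so trailing blanks do not count towards the distance.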

Mar 18, 2008

Hi,

Is it possible to know which edit distance is used in Fuzzy Lookup/Fuzzy Grouping?

On this forum I read that Fuzzy Lookup uses fixed-size 4-grams.

Is there any document explaining how Fuzzy Lookup calculates similarity? In other words, what kind of edit distance or algorithm is used by Fuzzy Lookup/Grouping?

I hope I was clear enough despite my poor English.

Thanks All

Jan 21, 2004

Hi

How do I get the nearest distance to a point? For example, I have two tables, A and B, and I want to find the nearest distance between the records of the two tables. In addition, one of the tables should also give me the distance. The data I have is geospatial data. Can this be done in SQL?

Help will be appreciated

Mar 1, 2007

Is there a recommended practice for mirroring in regards to distance? Is it best practice to mirror with both nodes at the same physical location and use another method for failing over to a remote location or can one just put the other node in the mirror a few thousand miles away? I'm suspecting not.

Any comments??

Mar 28, 2007

I'm trying to run a dynamic query that returns all records within a specific distance of a certain point. The longitude and latitude of each record are stored in the database. The query is constructed from two dynamic variables, $StartLatitude and $StartLongitude, which represent the starting point.

SELECT UserID, ACOS(SIN($StartLatitude * PI() / 180) * SIN(Latitude * PI() / 180) + COS($StartLatitude * PI() / 180) * COS(Latitude * PI() / 180) * COS(($StartLongitude - Longitude) * PI() / 180)) * 180 / PI() * 60 * 1.1515 AS Distance

FROM HPN_Painters

HAVING (Distance <= 150)

It runs fine until I add the 'HAVING (Distance <= 150)' clause, at which point I receive the error: Invalid column name 'Distance'. It seems that Distance cannot be referenced in the HAVING clause.
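
The error arises because a column alias defined in the SELECT list is not visible to WHERE or HAVING at the same query level. One common workaround is to compute the expression in a derived table and filter in the outer query. The following is a sketch rather than a drop-in fix: `@StartLatitude` and `@StartLongitude` are stand-ins for the `$StartLatitude`/`$StartLongitude` placeholders, and the sample start point is made up.

```sql
DECLARE @StartLatitude float, @StartLongitude float;
SET @StartLatitude  = 40.0;   -- hypothetical start point
SET @StartLongitude = -75.0;

SELECT d.UserID, d.Distance
FROM (
    -- Same expression as the original query, computed once per row
    SELECT UserID,
           ACOS(SIN(@StartLatitude * PI() / 180) * SIN(Latitude * PI() / 180)
              + COS(@StartLatitude * PI() / 180) * COS(Latitude * PI() / 180)
              * COS((@StartLongitude - Longitude) * PI() / 180))
           * 180 / PI() * 60 * 1.1515 AS Distance
    FROM HPN_Painters
) AS d
WHERE d.Distance <= 150;   -- the alias is visible here, one level up
```

One caveat: for a record at (or extremely close to) the start point, rounding can push the ACOS argument slightly above 1, which can raise a floating-point error, so clamping the argument to 1 may be worth adding.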

Jul 23, 2005

I'm looking to find out how I'd go about setting up a database where a visitor to my site could punch in their postal code, and find out how far they are from another postal code. For example, AutoTrader has this feature, I believe, to tell you how far the vehicle is from you. Dating sites have them so you can do proximity searches.

Anyone have any ideas where I could start? I'm thinking the post office, but if anyone else has suggestions, I'm open to hear them.

Thanks!

Mar 20, 2007

Various posts have noted that mirroring over distance is not advisable, or that async connections should be used.

Are there any limits/recommendations? For example, if two datacenters are a couple of miles apart with 10GBs fibre links and <50ms response times, would this be acceptable for high-availability mirroring?

Mar 27, 2007

I am new to data mining so please excuse my ignorance. Let's assume

- I have created a cluster model

- identified 3 clusters ( a, b, c)

- each record consists of 15 columns

- collecting new records( 15 variables) real time

What I would like to do is plot these new records programmatically as I collect them in real time. I assume each new record will belong to one of these three clusters. I believe we can find the cluster a new record belongs to with 'SELECT Cluster()....' and its distance from the center of the cluster with ClusterDistance(). To plot this in a 2-dimensional space I need (x, y).

ClusterDistance() could be Y, but what will be X?

thanks.

Jan 12, 2008

I have a user-defined function with which I want to determine the distance between two points. I have it working, but I'm having a problem getting the result to print.

    create function dbo.Distance( @lat1 float , @long1 float , @lat2 float , @long2 float)
    returns float
    as
    begin
      declare @DegToRad as float
      declare @Ans as float
      declare @Miles as float

      set @DegToRad = 57.29577951
      set @Ans = 0
      set @Miles = 0

      if @lat1 is null or @lat1 = 0 or @long1 is null or @long1 = 0 or @lat2 is null or @lat2 = 0 or @long2 is null or @long2 = 0
      begin
        return ( @Miles )
      end

      set @Ans = SIN(@lat1 / @DegToRad) * SIN(@lat2 / @DegToRad) + COS(@lat1 / @DegToRad ) * COS( @lat2 / @DegToRad ) * COS(ABS(@long2 - @long1 )/@DegToRad)
      set @Miles = 3959 * ATAN(SQRT(1 - SQUARE(@Ans)) / @Ans)
      set @Miles = CEILING(@Miles)

      return ( @Miles )
    end

    DECLARE @RC float
    EXEC Distance '39.943762', '-78.122265', '32.334709', '-96.633546'
    PRINT @RC /* in miles */
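
A scalar function hands its value back to the caller, so the result has to be assigned to the variable before it can be printed; in the call above, @RC is never set. A minimal sketch of one way to do that, assuming the function was created as dbo.Distance exactly as posted:

```sql
DECLARE @RC float;

-- Call the scalar function with its schema prefix and keep the result
SET @RC = dbo.Distance(39.943762, -78.122265, 32.334709, -96.633546);

-- Convert explicitly so the printed format is under your control
PRINT CAST(@RC AS varchar(30)) + ' miles';
```

The form `EXEC @RC = dbo.Distance ...` should also capture the return value, if you prefer to keep the EXEC style.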

Oct 15, 2007

Great Circle distance calculation

Is there any stored procedure or application that implements Great Circle distance calculation

Oct 15, 2015

    DECLARE @Latitude NUMERIC(9, 6), @Longitude NUMERIC(9, 6)
    DECLARE @MyLatitude NUMERIC(9, 6), @MyLongitude NUMERIC(9, 6)

    Set @Latitude = 42.329596;
    Set @Longitude = -83.709286;
    Set @MyLatitude = 42.430883;
    Set @MyLongitude = -82.923642;

Question: How do we calculate the distance in miles between the 2 points?
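
One option, on SQL Server 2008 or later, is the built-in geography type, whose STDistance method returns metres for SRID 4326. The sketch below repeats the declarations so it runs on its own; the variable names `@there` and `@here` are introduced only for illustration, and 1609.344 is the number of metres in a mile.

```sql
DECLARE @Latitude NUMERIC(9, 6), @Longitude NUMERIC(9, 6);
DECLARE @MyLatitude NUMERIC(9, 6), @MyLongitude NUMERIC(9, 6);

SET @Latitude    = 42.329596;
SET @Longitude   = -83.709286;
SET @MyLatitude  = 42.430883;
SET @MyLongitude = -82.923642;

-- Build two points; note the argument order is latitude, longitude, SRID
DECLARE @there geography, @here geography;
SET @there = geography::Point(@Latitude, @Longitude, 4326);
SET @here  = geography::Point(@MyLatitude, @MyLongitude, 4326);

-- STDistance gives metres for SRID 4326; divide to get miles
SELECT @here.STDistance(@there) / 1609.344 AS DistanceInMiles;
```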

Nov 22, 2013

I am trying to write a piece of SQL which gives me a list of enquiries within 10 metres of an enquiry.

The idea is to identify possible duplicates.

Table: enquiry

Primary key: enquiry_number

Co-ordinates data fields: enquiry.enquiry_easting and enquiry.enquiry_northing.

I will need to self-search on the same table to find possible enquiries within 10m distance.
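
Since eastings and northings are planar coordinates, a self-join with a Pythagorean distance test is one way to approach this. The sketch uses the table and column names given above and assumes the coordinates are numeric and expressed in metres.

```sql
SELECT a.enquiry_number AS enquiry,
       b.enquiry_number AS possible_duplicate
FROM enquiry AS a
JOIN enquiry AS b
  ON b.enquiry_number <> a.enquiry_number
 -- cheap bounding-box test first, so indexes on the coordinates can help
 AND b.enquiry_easting  BETWEEN a.enquiry_easting  - 10 AND a.enquiry_easting  + 10
 AND b.enquiry_northing BETWEEN a.enquiry_northing - 10 AND a.enquiry_northing + 10
 -- exact test: squared distance <= 10 m squared (no SQRT needed)
 AND SQUARE(b.enquiry_easting  - a.enquiry_easting)
   + SQUARE(b.enquiry_northing - a.enquiry_northing) <= 100;
```

Each close pair is listed twice (once in each direction). Replacing `<>` with `<` in the first join condition would list each pair only once.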

Jul 20, 2005

I am trying to use the haversine function to find the distance between two points on a sphere, specifically two zip codes in my database. I'm neither horribly familiar with SQL syntax nor math equations :), so I was hoping I could get some help. Below is what I'm using and it is, as best as I can figure, the correct formula. It is not, however, giving me correct results. Some are close, others don't seem right at all. Any ideas?

    SET @lat1 = RADIANS(@lat1)
    SET @log1 = RADIANS(@log1)
    SET @lat2 = RADIANS(@lat2)
    SET @log2 = RADIANS(@log2)
    SET @Dlat = ABS(@lat2 - @lat1)
    SET @Dlog = ABS(@log2 - @log1)
    SET @R = 3956 /*Approximate radius of earth in miles*/
    SET @A = SQUARE(SIN(@Dlat/2)) + COS(@lat1) * COS(@lat2) * SQUARE(SIN(@Dlog/2))
    SET @C = 2 * ATN2(SQRT(@A), SQRT(1 - @A))
    /*SET @C = 2 * ASIN(min(SQRT(@A))) Alternative calculation*/
    SET @distance = @R * @C

thnx,
cjrsumner

Feb 1, 2007

hi everyone:

The report shows two tables and two matrices.

How can I control the distance between them?

I want to set the same distance between the table and the matrix

(or between table and table).

Jan 2, 2008

Could I implement a failover cluster solution on the two DBs which are based in two different cities?

Possible?

Jun 14, 2006

I need to be able to take the latitude and longitude of two locations and compare them to determine the number of miles between the two points. It doesn't need to account for elevation, but can assume a flat plane with lat and long.

Does anyone have any algorithms in T-SQL to do this?

Mar 11, 2014

Given the following example;

    declare @CustIfno table (AccountNumber int, StoreID int, Distance decimal(14,10))

    insert into @CustIfno values ('1','44','2.145223'),('1','45','4.567834'),
    ('1','46','8.4325654'),('2','44','7.8754345'),('2','45','1.54654323'),
    ('2','46','11.5436543'), ('3','44','9.145223'),('3','45','8.567834'),
    ('3','46','17.4325654'),('4','44','7.8754345'),('4','45','1.54654323'),
    ('4','46','11.5436543')

How can I show the shortest Distance for each AccountNumber, together with its StoreID? Results would look like this:

    AccountNumber  StoreID  Distance
    1              44       2.1452230000
    2              45       1.5465432300
    3              45       8.5678340000
    4              45       1.5465432300
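
One way to get the closest store per account is to number the rows within each account by ascending distance and keep the first. This is a sketch against the sample data above, repeated here so the batch runs on its own; the CTE name `ranked` and the column `rn` are introduced only for illustration.

```sql
DECLARE @CustIfno TABLE (AccountNumber int, StoreID int, Distance decimal(14,10));

INSERT INTO @CustIfno VALUES
 (1, 44, 2.145223),  (1, 45, 4.567834),  (1, 46, 8.4325654),
 (2, 44, 7.8754345), (2, 45, 1.54654323), (2, 46, 11.5436543),
 (3, 44, 9.145223),  (3, 45, 8.567834),  (3, 46, 17.4325654),
 (4, 44, 7.8754345), (4, 45, 1.54654323), (4, 46, 11.5436543);

-- Rank the stores of each account from nearest to farthest
WITH ranked AS (
    SELECT AccountNumber, StoreID, Distance,
           ROW_NUMBER() OVER (PARTITION BY AccountNumber
                              ORDER BY Distance) AS rn
    FROM @CustIfno
)
SELECT AccountNumber, StoreID, Distance
FROM ranked
WHERE rn = 1              -- nearest store only
ORDER BY AccountNumber;
```

If two stores can tie for the shortest distance and both should be shown, RANK() in place of ROW_NUMBER() would keep the ties.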

Mar 28, 2007

This function computes the great circle distance in Kilometers using the Haversine formula distance calculation.

If you want it in miles, change the average radius of Earth to miles in the function.

    create function dbo.F_GREAT_CIRCLE_DISTANCE
        (
        @Latitude1   float,
        @Longitude1  float,
        @Latitude2   float,
        @Longitude2  float
        )
    returns float
    as
    /*
    Function: F_GREAT_CIRCLE_DISTANCE

        Computes the Great Circle distance in kilometers
        between two points on the Earth using the
        Haversine formula distance calculation.

    Input Parameters:
        @Longitude1 - Longitude in degrees of point 1
        @Latitude1  - Latitude in degrees of point 1
        @Longitude2 - Longitude in degrees of point 2
        @Latitude2  - Latitude in degrees of point 2
    */
    begin
    declare @radius float

    declare @lon1 float
    declare @lon2 float
    declare @lat1 float
    declare @lat2 float

    declare @a float
    declare @distance float

    -- Sets average radius of Earth in Kilometers
    set @radius = 6371.0E

    -- Convert degrees to radians
    set @lon1 = radians( @Longitude1 )
    set @lon2 = radians( @Longitude2 )
    set @lat1 = radians( @Latitude1 )
    set @lat2 = radians( @Latitude2 )

    set @a = sqrt(square(sin((@lat2-@lat1)/2.0E)) +
        (cos(@lat1) * cos(@lat2) * square(sin((@lon2-@lon1)/2.0E))) )

    set @distance =
        @radius * ( 2.0E *asin(case when 1.0E < @a then 1.0E else @a end ))

    return @distance

    end

Edit: corrected spelling

CODO ERGO SUM
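
A short usage sketch for the function above. The coordinates are arbitrary sample points chosen for illustration, and the miles conversion is done on the result (1 mile = 1.609344 km) as an alternative to editing the radius inside the function.

```sql
-- Arguments are latitude1, longitude1, latitude2, longitude2, all in degrees
SELECT dbo.F_GREAT_CIRCLE_DISTANCE(42.329596, -83.709286,
                                   42.430883, -82.923642) AS DistanceKm;

-- Same distance in miles, converting the kilometre result
SELECT dbo.F_GREAT_CIRCLE_DISTANCE(42.329596, -83.709286,
                                   42.430883, -82.923642) / 1.609344 AS DistanceMiles;
```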

Apr 28, 2008

Hi All,

Does anyone have a stored procedure that works perfectly to retrieve all zipcodes within a specified distance radius of a given zipcode? A zipcode and radius are passed in, and the stored procedure result shows all zipcodes that fall within that range.

Thanks in advance

Ade

Apr 29, 2015

I have the two following locations.

They're both towns in Australia , State of Victoria

Fitzroy,-37.798701, 144.978687

Footscray,-37.799736, 144.899734

After running geography::Point(Latitude, Longitude , 4326) on the latitude and longitude provided for each location, my Geography column for each row is populated with the following:

Fitzroy, 0xE6100000010C292499D53BE642C0A7406667511F6240

Footscray, 0xE6100000010C89B7CEBF5DE642C02D23F59ECA1C6240

In my SQL Query, I have the following which works out the distance between both towns. Geo being my Geography column

DECLARE @s geography = 0xE6100000010C292499D53BE642C0A7406667511F6240 -- Fitzroy

DECLARE @t geography = 0xE6100000010C89B7CEBF5DE642C02D23F59ECA1C6240 -- Footscray

SELECT @s.STDistance(@t)

The result I get is

6954.44911927616

I then looked at formatting this, as in Australia we go by km. After some searching I found two solutions, one for miles and the other for km.

So I changed Select statement to look like this

select @s.STDistance(@t)/1000 -- format to KM

My result is then

6.95444911927616

When I go to google maps and do a direction request between the locations provided above it says 10.2km (depending on traffic)

Now I'm new to this spatial data within SQL, why would I get a different result from google maps?

Also, I would like to round this number so it's easier to use within my where statement, so I'm using Ceiling as shown here:

SELECT CEILING(@s.STDistance(@t)/1000)

Is ceiling the correct way to go?

Reason I need to round this is because we are allowing the end user to search by radius so if they pass in 50km I will then say

Where CEILING(@s.STDistance(@t)/1000) < 50
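
On the two questions: STDistance returns the shortest distance over the Earth's surface between the two points, while a directions request measures a driving route along roads, so the larger Google Maps figure is expected. For the radius filter, rounding is not needed; comparing the raw metres with the radius avoids a side effect of `CEILING(...) < 50`, which drops everything beyond 49 km. A sketch, assuming a hypothetical table `Towns` with a `Name` column and the `Geo` geography column:

```sql
DECLARE @origin geography, @radiusKm float;
SET @origin   = geography::Point(-37.798701, 144.978687, 4326);  -- Fitzroy
SET @radiusKm = 50;   -- radius supplied by the end user

SELECT t.Name,
       t.Geo.STDistance(@origin) / 1000.0 AS DistanceKm   -- metres to km
FROM Towns AS t        -- hypothetical table name
WHERE t.Geo.STDistance(@origin) <= @radiusKm * 1000;      -- compare in metres
```

Keeping STDistance un-wrapped in the WHERE clause may also let the optimizer use a spatial index on the column, if one exists. Rounding can then be applied only to the value that is displayed.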

Nov 19, 2014

I've got a working query which returns all leads within a supplied proximity to a city. I followed a tutorial I googled a couple months ago (can't find it now). It works, but would love others to look the query over (provided DDL and sample data) and tell me if it's as it should be.

Two things I don't like about query:

1. I have to do a UNION to another query that retrieves everything that is in the same city in order to have complete results.

2. very slow to retrieve results (> 1 minute)

Sample DDL: 2 tables

create table dim_lead

(

date_created datetime,

[contact_first_name] varchar(20),

[contact_last_name] varchar(20),

lead_id int,

[Code] .....

May 22, 2002

Does anyone have an algorithm that can divide A into B without using the division sign (/) or the multiplication sign (*)?
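
For integers, the classic approach is repeated subtraction: keep subtracting the divisor and count how many times it fits. A minimal sketch, assuming a non-negative dividend and a positive divisor; the variable names and sample values are made up for illustration.

```sql
DECLARE @dividend int, @divisor int, @quotient int, @remainder int;
SET @dividend  = 47;   -- hypothetical inputs
SET @divisor   = 5;
SET @quotient  = 0;
SET @remainder = @dividend;

-- Guard against a zero or negative divisor, which would never terminate
IF @divisor > 0
    WHILE @remainder >= @divisor
    BEGIN
        SET @remainder = @remainder - @divisor;  -- take one divisor away
        SET @quotient  = @quotient + 1;          -- and count it
    END

SELECT @quotient AS quotient, @remainder AS remainder;  -- 9 and 2
```

This takes one loop iteration per unit of quotient, so it is only practical for small results.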

Nov 24, 2006

I am new to DM and I am not sure which algorithm would be best to use.

I am trying to build a custom comparator application that companies can use to compare themselves against other companies based on certain pieces of information. I need to group a company with 10 other companies based on 6 attributes. I need the ability to apply weightings to each of the 6 attributes and have those taken into consideration when determining which 10 other companies each company is grouped with. Each group must contain 11 members: the company for the user logged in and 10 other companies that it will be compared against.

At first I thought that clustering would be a good fit for this, but I cannot see a way to mandate that each cluster contain exactly 11 members, I cannot see a way to weight the inputs, and I think each company can only be in one cluster at a time, which does not meet my requirements.

Any help will be greatly appreciated!

Jun 8, 2006

Well, I have read in Claude Seidman's book about data mining that some of the algorithms inside Microsoft Decision Trees are the CART, CHAID and C4.5 algorithms. Could anyone explain these three algorithms to me, and how the three are used together in one case?

thank you so much

Dec 11, 2006

Use this to check if Luhn has valid check digit

    CREATE FUNCTION dbo.fnIsLuhnValid
    (
        @Luhn VARCHAR(8000)
    )
    RETURNS BIT
    AS

    BEGIN
        IF @Luhn LIKE '%[^0-9]%'
            RETURN 0

        DECLARE @Index SMALLINT,
                @Multiplier TINYINT,
                @Sum INT,
                @Plus TINYINT

        SELECT  @Index = LEN(@Luhn),
                @Multiplier = 1,
                @Sum = 0

        WHILE @Index >= 1
            SELECT  @Plus = @Multiplier * CAST(SUBSTRING(@Luhn, @Index, 1) AS TINYINT),
                    @Multiplier = 3 - @Multiplier,
                    @Sum = @Sum + @Plus / 10 + @Plus % 10,
                    @Index = @Index - 1

        RETURN CASE WHEN @Sum % 10 = 0 THEN 1 ELSE 0 END
    END

Peter Larsson

Helsingborg, Sweden
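
A brief usage sketch for the function above, using 79927398713, a number commonly used as a Luhn test value, and the same number with its check digit altered:

```sql
-- Digit sum after doubling every second digit from the right is 70, so it passes
SELECT dbo.fnIsLuhnValid('79927398713') AS IsValid;  -- returns 1

-- Changing the check digit from 3 to 0 makes the sum 67, so it fails
SELECT dbo.fnIsLuhnValid('79927398710') AS IsValid;  -- returns 0

-- Any non-digit character is rejected up front by the LIKE test
SELECT dbo.fnIsLuhnValid('7992-7398-713') AS IsValid;  -- returns 0
```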

Jul 23, 2005

Hello,

Do you know if the algorithm for the BINARY_CHECKSUM function is documented somewhere?

I would like to use it to avoid returning some string fields from the server. By returning only the checksum I could look up the string in a hashtable, and I think this could make the code more efficient on slow connections.

Thanks in advance and kind regards,

Orly Junior

Dec 7, 2007

What kind of algorithm does the MAX command use? I have a table where I need to get the last value of the Transaction ID and increment it by 1, so I can use it as the next TransID every time I insert a new record into the table. I use the MAX command to obtain the last TransID in the table in this process. However, someone suggested that there is a problem with this: if multiple users are trying to insert a record into the same table and processing is slow, they might essentially come up with the same next TransID. He came up with the idea of having a separate table that contains only the TransID and using this table to determine the next TransID. Will this really make a difference as far as processing speed is concerned, or is using a MAX command on the same table to come up with the next TransID enough? Do you have a better suggestion?

Thanks
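
The concern raised is about concurrency more than speed: two sessions can both read the same MAX value before either one inserts, and then both try to use the same next TransID. One common alternative is to let the server generate the value with an IDENTITY column. The sketch below uses a hypothetical table and column layout purely for illustration.

```sql
-- Hypothetical table: the server assigns TransID on every insert
CREATE TABLE dbo.Transactions
(
    TransID int IDENTITY(1, 1) PRIMARY KEY,
    Amount  money NOT NULL
);

-- No TransID is supplied; concurrent inserts each get their own value
INSERT INTO dbo.Transactions (Amount) VALUES (19.99);

-- The ID generated by the INSERT above, in this session and scope
SELECT SCOPE_IDENTITY() AS NewTransID;
```

Identity values can have gaps (for example after a rolled-back insert), so this fits when the IDs only need to be unique and increasing rather than strictly consecutive.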

Sep 15, 2006

Hi,

Would anyone be able to provide a reference paper on the neural net algorithm implemented in SQL Server 2005 to better understand how it works?

Thanxs for any info.

Oct 29, 2007

Hi All!

I have few questions regarding Clustering algorithm.

If I process the clustering model with Ks (K is number of clusters) from 2 to n how to find a measure of variation and loss of information in each model (any kind of measure)? (Purpose would be decision which K to take.)

Which clustering method is better to use when segmenting data K-means or EM?

Thanks in advance!

Jan 10, 2006

Hi.

Does anyone know of, or know where I can find, C# implementations or class libraries for these algorithms:

a) RLS - Recursive Least Squares algorithm?

b) MWAR - Multi-resolution Wavelet Auto-regressive algorithm?

c) AR - Autoregressive moving average algorithm?

d) EWMA - Exponentially Weighted Moving Average

The .NET Framework System.Math class does not seem to have these.

Regards

Shorin

Jul 12, 2006

Hi

I want to predict which products can be sold together. Please help me work out which algorithm is best (association, clustering or decision trees), and please let me know how to use a case table and a nested table. My table structure is:

Cust_ID

Age

Product

Location

Income

Thanks

Rajesh Ladda

Feb 14, 2008

hi,

I am using SQL Server 2005 as the back end for my project.

We are developing a stand-alone web application for a client, so we need to host this application on his server. He is not willing to install SQL Server 2005 on his server, so we are placing the .mdf file in the data directory of the project.

Before that, when I developed against SQL Server 2005, I used the aes_256 algorithm to encrypt and decrypt the pwd column using symmetric keys. It was working fine.

But when I took the .mdf file and added it to my project, it throws this error at creation of the symmetric key:

"Either no algorithm has been specified or the bitlength and the algorithm specified for the key are not available in this installation of Windows."

Please suggest a solution.
