A Relational Technique To Strip The HTML Tags Out Of A Ntext Datatype Field
Nov 27, 2007
I had a problem with the ntext datatype. I need to strip the HTML tags out of a ntext datatype column. I have sample query for that, which works fine for STRING, as stuff is the string function, what to do for ntext field.
=======The Process follows like this =========
--**************************************
--
-- Name: A relational technique to strip
-- the HTML tags out of a string
-- Description:A relational technique to
-- strip the HTML tags out of a string. Th
-- is solution demonstrates how to use simp
-- le tables & search functions effectively
-- in SQL Server to solve procedural / ite
-- rative problems.
-- This table contains the tags to be re
-- placed. The % in <head%>
-- will take care of any extra informati
-- on in the tag that you needn't worry
-- about as a whole. In any case, this t
-- able contains all the tags that needs
-- to be search & replaced.
CREATE TABLE #html ( tag varchar(30) )
INSERT #html VALUES ( '<html>' )
INSERT #html VALUES ( '<head%>' )
INSERT #html VALUES ( '<title%>' )
INSERT #html VALUES ( '<link%>' )
INSERT #html VALUES ( '</title>' )
INSERT #html VALUES ( '</head>' )
INSERT #html VALUES ( '<body%>' )
INSERT #html VALUES ( '</html>' )
go
-- A simple table with the HTML strings
CREATE TABLE #t ( id tinyint IDENTITY , string varchar(255) )
INSERT #t VALUES (
'<HTML><HEAD><TITLE>Some Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css" TYPE="text/css" ></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">
SOME HTML text after the body</HTML>'
)
INSERT #t VALUES (
'<HTML><HEAD><TITLE>Another Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css"></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">Another HTML text after the body</HTML>'
)
go
-- This is the code to strip the tags out.
-- It finds the starting location of eac
-- h tag in the HTML string ,
-- finds the length of the tag with the
-- extra properties if any. This is
-- done by locating the end of the tag n
-- amely '>'. The same is done
-- in a loop till all tags are replaced.
BEGIN TRAN
WHILE exists(select * FROM #t JOIN #html on patindex('%' + tag + '%' , string ) > 0 )
UPDATE #t
SET string = stuff( string , patindex('%' + tag + '%' , string ) ,
charindex( '>' , string , patindex('%' + tag + '%' , string ) )
- patindex('%' + tag + '%' , string ) + 1 , '' )
FROM #t JOIN #html
ON patindex('%' + tag + '%' , string ) > 0
SELECT * FROM #t
rollback
View 1 Replies
ADVERTISEMENT
Oct 28, 2011
I have a table with a column that has html text. The column with html text is pretty big datatye varchar(max)... I wanted to check if any of you have any function that I can use to Strip out the HTML tags... I saw couple of version online, but it was running too slow..
This is the one I used: [URL] .....
View 9 Replies
View Related
Sep 26, 2007
This algorithm can be used to strip out HTML tags too.
With reference to http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=89973
and http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=90000CREATE FUNCTIONdbo.fnParseRTF
(
@rtf VARCHAR(8000)
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE@Stage TABLE
(
Chr CHAR(1),
Pos INT
)
INSERT@Stage
(
Chr,
Pos
)
SELECTSUBSTRING(@rtf, Number, 1),
Number
FROMmaster..spt_values
WHEREType = 'p'
AND SUBSTRING(@rtf, Number, 1) IN ('{', '}')
DECLARE@Pos1 INT,
@Pos2 INT
SELECT@Pos1 = MIN(Pos),
@Pos2 = MAX(Pos)
FROM@Stage
DELETE
FROM@Stage
WHEREPos IN (@Pos1, @Pos2)
WHILE 1 = 1
BEGIN
SELECT TOP 1@Pos1 = s1.Pos,
@Pos2 = s2.Pos
FROM@Stage AS s1
INNER JOIN@Stage AS s2 ON s2.Pos > s1.Pos
WHEREs1.Chr = '{'
AND s2.Chr = '}'
ORDER BYs2.Pos - s1.Pos
IF @@ROWCOUNT = 0
BREAK
DELETE
FROM@Stage
WHEREPos IN (@Pos1, @Pos2)
UPDATE@Stage
SETPos = Pos - @Pos2 + @Pos1 - 1
WHEREPos > @Pos2
SET @rtf = STUFF(@rtf, @Pos1, @Pos2 - @Pos1 + 1, '')
END
SET@Pos1 = PATINDEX('%cf[0123456789][0123456789 ]%', @rtf)
WHILE @Pos1 > 0
SELECT@Pos2 = CHARINDEX(' ', @rtf, @Pos1 + 1),
@rtf = STUFF(@rtf, @Pos1, @Pos2 - @Pos1 + 1, ''),
@Pos1 = PATINDEX('%cf[0123456789][0123456789 ]%', @rtf)
SELECT@rtf = REPLACE(@rtf, 'pard', ''),
@rtf = REPLACE(@rtf, 'par', ''),
@rtf = LEFT(@rtf, LEN(@rtf) - 1)
SELECT@rtf = REPLACE(@rtf, '0 ', ''),
@rtf = REPLACE(@rtf, ' ', '')
SELECT@rtf = STUFF(@rtf, 1, CHARINDEX(' ', @rtf), '')
RETURN@rtf
ENDE 12°55'05.25"
N 56°04'39.16"
View 10 Replies
View Related
Jul 20, 2005
Hi, I've read conflicting articles on updating an ntext field in acolumn.My ntext field will exceed 8,000 characters (typically twice that size-- but just a text string).One article (I think from MicroSoft) said you could NOT use ntext inan UPDATE statement, but I've seen examples from other people usingit...but don't know if it's related to the size/characters issue.Is this true or not?Thanks very much...Kathy
View 2 Replies
View Related
May 5, 2004
does any one has any sql server function that passes some text and returns a string without html tags.
example:
nice day
should return nice day
or if other html tags strip them off.
thanks for your help.
-Fr
View 2 Replies
View Related
Feb 13, 2008
I have a column of string which has html tags attached to it. How can I remove them..other than manually going and doing it? Any funtions?
Thanks!!
Tanya
View 9 Replies
View Related
May 20, 2008
Hi !
i have a function written in c# which removes all html tags from the provide string like
public static string RemoveHTML(string HTML)
{
return Regex.Replace(HTML, "<(.|)*?>", "");
}
how can i apply such functionality to varchar field which removes all the html tags from it in stored procedure
Regards,
DiL
View 12 Replies
View Related
Oct 18, 2007
I ran a CONTAINS query for the word "target" in a bunch of index web pages. I came up with lots of matches -- but they were all inside html tags:
<a href="www.foo.com" target = "_blank">lorem ipsum</a>
Is there a good way to exclude tags (and their attributes) from the full-text index?
Thanks!
View 4 Replies
View Related
Apr 3, 2008
I am trying to do string scrubbing in a sql clr function, including removing certain HTML formatting. I would like to use HtmlDecode method, but it's my understanding that System.Web is not available for Sql Clr (without marking code unsafe - not an option for me as this is for an application we sell externally, and unsafe calls woudl not go over well with customers). Is there any class that IS supported for Sql Clr that exposes this functionality? Thanks.
View 10 Replies
View Related
Jul 10, 2007
Hi, I was wondering if any SQL Server gurus out there could help me...I
have a table which contains text resources for my application. The text
resources are multi-lingual so I've read that if I add a html language
indicator meta tag e.g.<META NAME="MS.LOCALE" CONTENT="ES">and
store the text in a varbinary column with a supporting Document Type
column containing ".html" of varchar(5) then the full text index
service should be intelligent about the language word breakers it
applies when indexing the text. (I hope this is correct technique for
best multi-lingual support in a single table?)However, when I come to query this data the results always return 0 rows (no errors are encountered). e.g.DECLARE @SearchWord nvarchar(256)SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.etc.Any pointers / suggestions would be greatly appreciated. Cheers,Gavin.UPDATE: I have tried wrapping the text in more usual html tags and re-built the full text index but I still never get any rows returned for my query results. Example of content wrapping tried - <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>I've also tried stripping all html tags from the content and set the Document Type column = .txt but I still get no rows returned?!?
View 1 Replies
View Related
Jul 11, 2007
Hi, I was wondering if any SQL Server gurus out there could help me...
I have a table which contains text resources for my application. The text resources are multi-lingual so I've read that if I add a html language indicator meta tag e.g.
<META NAME="MS.LOCALE" CONTENT="ES">
and store the text in a varbinary column with a supporting Document Type column containing ".html" of varchar(5) then the full text index service should be intelligent about the language word breakers it applies when indexing the text. (I hope this is correct technique for best multi-lingual support in a single table?)
However, when I come to query this data the results always return 0 rows (no errors are encountered). e.g.
DECLARE @SearchWord nvarchar(256)
SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.
SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)
I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).
Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):
Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:
Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.
etc.
Any pointers / suggestions would be greatly appreciated. Cheers,
Gavin.
UPDATE: I have tried wrapping the text in more usual html tags and re-built the full text index but I still never get any rows returned for my query results. Example of content wrapping tried - <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>
I've also tried stripping all html tags from the content and set the Document Type column = .txt but I still get no rows returned?!?
View 1 Replies
View Related
Oct 16, 2007
Is there a way to strip off the time portion of a datetime datatype without changing the datatype?
I know I can convert it using CONVERT (NVARCHAR(10), dbo.tblPayments.PaymentDate, 101) but I need to keep it as a datetime datatype?
View 5 Replies
View Related
Oct 26, 2005
Hallo. I need help, how pull out some inquiry string from type "ntext" in MS SQL(it is xml document). Sring has invariable length, in note is always on other position and includes variable text (e.g .:<actionId>xx</actionId>) . Position I can find out by the help of "patindex" but I don't know what then. I tryed to write procedures, but I had trouble with declaration variables (data type). Thanks and sorry for my horrible English.
View 6 Replies
View Related
Oct 3, 2005
I am trying to insert more than 255 characters into ntext dataype in sql server 2000.
But, string is being truncated to 255 characters.
Please help
View 3 Replies
View Related
Mar 4, 2004
Hi, I have a table with ntext and image datatype. I did BCP out succesfully but when I try to to BCP in, I am getting error.
----------
Starting copy...
SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Invalid character value for cast specification
SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Invalid character value for cast specification
SQLState = 22005, NativeError = 0
------------------
I read somewhere that I can use BULK INSERT with a format file.
Can someone suggest how to BCP in siccesfully or How does the format file looks like for this kind of task?. I am using SQL Server 2000.
---------
Here is my table structure..
-----------
CREATE TABLE [dbo].[NOTIFY_TEMPLATE] (
[ID] [numeric](28, 0) NOT NULL ,
[SENDER] [varchar] (128) NOT NULL ,
[SUBJECT] [varchar] (512) NOT NULL ,
[BODY] [ntext] NOT NULL ,
[PRIORITY] [numeric](28, 0) NULL DEFAULT (2),
[ENABLED] [numeric](28, 0) NULL DEFAULT (1),
[LANGID] [numeric](28, 0) NOT NULL DEFAULT (0),
[NOTIFY_TYPE] [numeric](28, 0) NULL DEFAULT (0),
[REQUEST_TYPE] [numeric](28, 0) NULL ,
[CUSTOMIZED] [numeric](28, 0) NOT NULL DEFAULT (0)
) ON [DATA1] TEXTIMAGE_ON [DATA1]
View 13 Replies
View Related
May 30, 2007
Dear experts,
how can i find the ntext datatype columns in a database?
please guide me
View 4 Replies
View Related
Aug 30, 2006
What is the max. number of characters in ntext?
Are there any way we can format the output of ntext? Or it will just come out as one long line?
Thanks.
View 1 Replies
View Related
Nov 8, 2001
hey,
what the best way of stripping out a list of characters from a specified field in a table. e.g If first name consists of ABCD'E-FSA, we wnat to strip the ' and the -. There is about 15-20 characters like that.
what's the best way of doing it other encapsulating in the replace function that many times.
thanks
zoey
View 2 Replies
View Related
Jul 22, 2004
I tried searching forums for the last hour and could not find much.
Rather then doing...
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(RE PLACE(ISNULL(PHONE, ''), ' ', ''), '(', ''), ')', ''), '-', ''), '.', ''), ':', ''), '+', '')
I was hoping there was an equally efficient alternative that utilizes PATINDEX('%[^0-9]%', PHONE)
to some capacity (i.e. uses regular expressions to strip all non-numeric characters from the phone field)
I am certain this has been addressed before, just not sure if the "REPLACE to the nth degree" is the only solution.
THANKS!
View 7 Replies
View Related
Dec 3, 2004
Hi all,
I have a table that I've imported into SQL - there is a field in there for date that must have used now(); as its default value (access).
So the values are something like:
01/11/2004 12:16:42
I need a way to change the data so that the time element removed is the field just holds the date. Failing that a way to insert this from the existing field into a new field stripping the date off en route would be a great help.
Many thanks
Phil
View 2 Replies
View Related
Jun 29, 2015
What would be the best process to change the datatype of ntext fields.
I assume just changing the datatype - is not the way to go?
View 5 Replies
View Related
Sep 11, 2007
Hi folks,
We have a nice issue here. We are running SQL 2005 Dev edition Service Pack 2 and we are trying to copy the contents of one table in a local sql server database to another table in another database on the same local sql server. We use an oledb source and a sql server destination. The table structure is exactly the same. One column is of the datatype ntext, when we try to load the contents the package will stop with the error:
OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 The attempt to send a row to SQL Server failed with error code 0x80004005.
OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "<TABLE>" (3382) failed with error code 0xC02020C7. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0xC02020C7. There may be error messages posted before this with more information on why the thread has exited.
OnError 11-9-2007 14:38:26 11-9-2007 14:38:26 00:00:00 SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E07.
An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E07 Description: "Error converting data type DBTYPE_DBTIMESTAMP to datetime.".
OnError 11-9-2007 14:38:26 11-9-2007 14:38:26 00:00:00 A commit failed.
Removing the column from the sql server destination will result in loading the complete table. Using an oledb destination instead of sql server destination fixes the problem. Is this a bug in the SQL server destination component?
Thanks,
Marc
View 4 Replies
View Related
Apr 16, 2008
What datatype is best for storing HTML content?
nText or Nvarchar(MAX) or ?
View 3 Replies
View Related
May 6, 2015
I want to replace div tags with p tags in a column in sql.
<div style: bold> abc </abc>
<div> efgh></div>
required output:
<p>abc</p>
<p>efgh</p>
View 1 Replies
View Related
Jul 11, 2005
I'm using DTS to import data from an Access memo field into a SQL Server ntext field. DTS is only importing the first 255 characters of the memo field and truncating the rest.I'd appreciate any insights into what may be causing this problem, and what I can do about it.Thanks in advance for any help!
View 4 Replies
View Related
Jan 28, 2015
Need to know if the varchar datatype field will ingore leading zeros when compared with numeric datatype ?
create table #temp
(
code varchar(4) null,
id int not null
)
insert into #temp
[Code] .....
View 4 Replies
View Related
Mar 19, 2007
From the SQL Server documentation : "The input parameters and the type returned from a SVF can be any of the scalar
data types supported by SQL Server, except rowversion, text,
ntext, image, timestamp, table, or cursor"This is a problem for me. Here's what I'm trying to do :I have an NTEXT field in one of my tables. I want to run regular expressions on this field, and return the results from a stored procedure. Since SQL Server doesn't provide facilities to perform regular expressions, I need to use an SQLCLR function. I would have no problem doing this if my field was nvarchar. However, this field needs to be variable in length - I cannot set an upper bound. This is why I'm using NTEXT and not nvarchar in the first place.Is there a solution to this problem? I can't imagine that I'm the only person who wants to pass strings of arbitrary size to an SQLCLR function.
View 2 Replies
View Related
Oct 15, 2007
I need to store HTML markup in my database. Is there a prefered datatype to store this? Some of them could be quite long.
View 2 Replies
View Related
Nov 8, 2005
For some reasons I have to use a ntext field for both small strings like "10" and large binalry files.
I need to sort the field to some extend to present the small strings on a sorted nice way - answers to " What country are you from" etc.
To trick the sorting I use a calculated field:
ORDER BY RSort - where Rsort is:
convert(varchar(4), RD.response) as RSort
It works but put a high load on the SQL server when the number of responses increases.
I though of making a non clustered index based on the calculated field, but is not sure that it will work as intended.
What do I do. The last thing would be to change the ntext to vchar(3800) or something like that. :confused:
View 3 Replies
View Related
Jan 15, 2004
Hello All,
Maybe a stupid question but I'm new to the db admin work so please bear with me.
I've imported an Access db into SQL, in the Access db the field type was 'memo' to accomodate the large amount of text (on avg ruffly 4100 chars. with spaces). Now in SQL the field in the table I have set up as an ntext field, which I understood to be equivalent to a memo field in Access.
My problem is when saving data to the field the first time it saves all the data correctly with the exception of the field in question. The data in the field is '<LongText>', now when I try to update the data in the table I get a 'Data Truncated' error message and no update takes place throughout the table.
After testing this and trying different things, I've found that if I shorten this one field and try to save to the db I still get the 'Data Truncated' error message. If I shorten the data in the field AND delete the record from the SQL table then it will save just fine from there on out (which won't work for the reports).
I'm not sure what I'm missing here to get this to work the way it did in Access.
View 11 Replies
View Related
Aug 17, 2004
Hi All,
I will be doing stress test for my app.
Loading thousands of records to the DB through bulk insert.
There's one field NText which I have left NULL because it will be hard to gen dummy flat file to it.
I have another table which has the Ntext Value which i will want to copy and duplicate to the other table.
what is the way to do it?
simply said i want to update a record with NULL value from one table with NText field with the value from another table..
View 1 Replies
View Related
Jul 20, 2005
I am trying to view all the ntext from a profiler trace. The data istruncated at 256 and I am not sure why... The max length is 1820 viathis command:select max(datalength(textdata)) from "monitor forms usage"where textdata like '%gforms%' .I then issueset textsize 8000select (textdata) from "monitor forms usage" where textdata like'%gforms%' and datalength(textdata) >1800and still only 256 is returned. this is true even if I redirect theoutput to a file.Any ideas on how a humble man like me can see all of the data.Mike--Posted via http://dbforums.com
View 2 Replies
View Related
Jan 29, 2008
I'm trying to parse an ntext field that in my SQL View contains an invoice comment in order to be able to group on parts of the comment. I have two problems--one, the syntax to do this, and two, the best way to deal with the parts that I want.
The comment is like: "standard text ABCDE : $99.99" but can have multiple "ABCDE"s, e.g. "standard text ABCDE FGH IJKL $999.99" and I found some that had duplicates like "standard text standard text...".
I want to be able to report in SSRS 2005 by grouping the "ABCDE", "FGH", "IJKL" items.
Any ideas? Please be specific as I'm still learning.
View 1 Replies
View Related