A Relational Technique To Strip The HTML Tags Out Of A Ntext Datatype Field

Nov 27, 2007

I have a problem with the ntext datatype. I need to strip the HTML tags out of an ntext column. I have a sample query for that, which works fine for a string, since STUFF is a string function, but what should I do for an ntext field?

=======The Process follows like this =========

--**************************************
--
-- Name: A relational technique to strip the HTML tags out of a string
-- Description: A relational technique to strip the HTML tags out of a
-- string. This solution demonstrates how to use simple tables & search
-- functions effectively in SQL Server to solve procedural / iterative
-- problems.


-- This table contains the tags to be replaced. The % in <head%>
-- will take care of any extra information in the tag that you needn't
-- worry about as a whole. In any case, this table contains all the
-- tags that need to be searched & replaced.
CREATE TABLE #html ( tag varchar(30) )
INSERT #html VALUES ( '<html>' )
INSERT #html VALUES ( '<head%>' )
INSERT #html VALUES ( '<title%>' )
INSERT #html VALUES ( '<link%>' )
INSERT #html VALUES ( '</title>' )
INSERT #html VALUES ( '</head>' )
INSERT #html VALUES ( '<body%>' )
INSERT #html VALUES ( '</html>' )
go
-- A simple table with the HTML strings
CREATE TABLE #t ( id tinyint IDENTITY , string varchar(255) )
INSERT #t VALUES (
'<HTML><HEAD><TITLE>Some Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css" TYPE="text/css" ></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">
SOME HTML text after the body</HTML>'
)
INSERT #t VALUES (
'<HTML><HEAD><TITLE>Another Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css"></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">Another HTML text after the body</HTML>'
)
go
-- This is the code to strip the tags out.
-- It finds the starting location of each tag in the HTML string,
-- then finds the length of the tag with the extra properties, if any.
-- This is done by locating the end of the tag, namely '>'.
-- The same is done in a loop till all tags are replaced.

BEGIN TRAN
WHILE exists(select * FROM #t JOIN #html on patindex('%' + tag + '%' , string ) > 0 )
UPDATE #t
SET string = stuff( string , patindex('%' + tag + '%' , string ) ,
charindex( '>' , string , patindex('%' + tag + '%' , string ) )
- patindex('%' + tag + '%' , string ) + 1 , '' )
FROM #t JOIN #html
ON patindex('%' + tag + '%' , string ) > 0
SELECT * FROM #t
rollback
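
For an ntext column, one possible approach is sketched below. It is only a minimal sketch and assumes SQL Server 2005 or later, where ntext can be cast to nvarchar(max) and STUFF and CHARINDEX accept that type. It is a simpler, swapped-in technique rather than the tag-table method above: it removes everything between each '<' and the next '>', so every tag goes, not just a chosen list. The variable names and the sample string are made up for illustration.

```sql
-- Sketch only: for a real ntext column, load the value with
-- CAST(YourNtextColumn AS nvarchar(max)) (hypothetical column name).
DECLARE @html nvarchar(max), @start int, @end int;
SET @html = N'<HTML><BODY BGCOLOR="FFFFFF">Some text</BODY></HTML>';

SET @start = CHARINDEX(N'<', @html);
WHILE @start > 0
BEGIN
    -- Find the '>' that closes the tag starting at @start
    SET @end = CHARINDEX(N'>', @html, @start);
    IF @end = 0 BREAK;  -- unclosed '<': stop rather than loop forever

    -- Cut the tag out, then look for the next one
    SET @html = STUFF(@html, @start, @end - @start + 1, N'');
    SET @start = CHARINDEX(N'<', @html);
END

SELECT @html AS stripped;  -- Some text
```

Wrapped in a scalar function, the same loop could be applied row by row to a varchar(max) or nvarchar(max) column, though a loop like this is unlikely to be fast on very large values.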


SQL Server 2008 :: Strip HTML Tags

Oct 28, 2011

I have a table with a column that has HTML text. The column is a pretty big datatype, varchar(max)... I wanted to check if any of you have a function that I can use to strip out the HTML tags... I saw a couple of versions online, but they ran too slow..

This is the one I used: [URL] .....


Strip Those RTF Tags Away

Sep 26, 2007

This algorithm can be used to strip out HTML tags too.
With reference to http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=89973
and http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=90000

CREATE FUNCTION dbo.fnParseRTF
(
    @rtf VARCHAR(8000)
)
RETURNS VARCHAR(8000)
AS
BEGIN
    -- Record the position of every brace in the RTF string
    DECLARE @Stage TABLE
    (
        Chr CHAR(1),
        Pos INT
    )

    INSERT @Stage
    (
        Chr,
        Pos
    )
    SELECT SUBSTRING(@rtf, Number, 1),
           Number
    FROM master..spt_values
    WHERE Type = 'p'
      AND SUBSTRING(@rtf, Number, 1) IN ('{', '}')

    DECLARE @Pos1 INT,
            @Pos2 INT

    -- Ignore the outermost pair of braces
    SELECT @Pos1 = MIN(Pos),
           @Pos2 = MAX(Pos)
    FROM @Stage

    DELETE
    FROM @Stage
    WHERE Pos IN (@Pos1, @Pos2)

    -- Repeatedly remove the innermost {...} group
    WHILE 1 = 1
    BEGIN
        SELECT TOP 1 @Pos1 = s1.Pos,
                     @Pos2 = s2.Pos
        FROM @Stage AS s1
        INNER JOIN @Stage AS s2 ON s2.Pos > s1.Pos
        WHERE s1.Chr = '{'
          AND s2.Chr = '}'
        ORDER BY s2.Pos - s1.Pos

        IF @@ROWCOUNT = 0
            BREAK

        DELETE
        FROM @Stage
        WHERE Pos IN (@Pos1, @Pos2)

        -- Shift the remaining positions left by the removed length
        UPDATE @Stage
        SET Pos = Pos - @Pos2 + @Pos1 - 1
        WHERE Pos > @Pos2

        SET @rtf = STUFF(@rtf, @Pos1, @Pos2 - @Pos1 + 1, '')
    END

    -- Remove colour control words such as \cf1
    SET @Pos1 = PATINDEX('%\cf[0123456789][0123456789 ]%', @rtf)

    WHILE @Pos1 > 0
        SELECT @Pos2 = CHARINDEX(' ', @rtf, @Pos1 + 1),
               @rtf = STUFF(@rtf, @Pos1, @Pos2 - @Pos1 + 1, ''),
               @Pos1 = PATINDEX('%\cf[0123456789][0123456789 ]%', @rtf)

    -- Remove paragraph control words and the trailing brace
    SELECT @rtf = REPLACE(@rtf, '\pard', ''),
           @rtf = REPLACE(@rtf, '\par', ''),
           @rtf = LEFT(@rtf, LEN(@rtf) - 1)

    -- Remove bold on/off control words
    SELECT @rtf = REPLACE(@rtf, '\b0 ', ''),
           @rtf = REPLACE(@rtf, '\b ', '')

    -- Drop the remaining header up to the first space
    SELECT @rtf = STUFF(@rtf, 1, CHARINDEX(' ', @rtf), '')

    RETURN @rtf
END

E 12°55'05.25"
N 56°04'39.16"


DEFINITIVE ANSWER PLEASE -- Can You UPDATE Ntext Datatype Field???

Jul 20, 2005

Hi, I've read conflicting articles on updating an ntext field in a column.

My ntext field will exceed 8,000 characters (typically twice that size -- but just a text string).

One article (I think from Microsoft) said you could NOT use ntext in an UPDATE statement, but I've seen examples from other people using it... but don't know if it's related to the size/characters issue.

Is this true or not?

Thanks very much...

Kathy


Cleaning Html Tags.

May 5, 2004

Does anyone have a SQL Server function that takes some text and returns a string without HTML tags?

example:

nice day
should return nice day

or, if there are other HTML tags, strip them off.


thanks for your help.

-Fr


Remove Html Tags From A String!!!

Feb 13, 2008



I have a string column which has HTML tags in it. How can I remove them, other than going through and doing it manually? Any functions?

Thanks!!

Tanya


How To Remove Html Tags From Varchar Value

May 20, 2008

Hi !
I have a function written in C# which removes all HTML tags from the provided string, like

using System.Text.RegularExpressions;

public static string RemoveHTML(string HTML)
{
    // Lazily match '<', any characters (including newlines), then '>'
    return Regex.Replace(HTML, "<(.|\n)*?>", "");
}

How can I apply such functionality to a varchar field, removing all the HTML tags from it in a stored procedure?

Regards,
DiL


Exclude Html Tags From Full-text Index?

Oct 18, 2007

I ran a CONTAINS query for the word "target" in a bunch of indexed web pages. I came up with lots of matches -- but they were all inside HTML tags:

<a href="www.foo.com" target = "_blank">lorem ipsum</a>



Is there a good way to exclude tags (and their attributes) from the full-text index?


Thanks!


Strip HTML Encoding Out Of A String In Sql Clr

Apr 3, 2008

I am trying to do string scrubbing in a SQL CLR function, including removing certain HTML formatting. I would like to use the HtmlDecode method, but it's my understanding that System.Web is not available for SQL CLR (without marking code unsafe - not an option for me, as this is for an application we sell externally, and unsafe calls would not go over well with customers). Is there any class that IS supported for SQL CLR that exposes this functionality? Thanks.


Full Text Search Indexing HTML - Does The Filter Expect Certain Tags To Be Present As Standard?

Jul 10, 2007

Hi, I was wondering if any SQL Server gurus out there could help me...

I have a table which contains text resources for my application. The text resources are multi-lingual so I've read that if I add a html language indicator meta tag e.g.
<META NAME="MS.LOCALE" CONTENT="ES">
and store the text in a varbinary column with a supporting Document Type column containing ".html" of varchar(5) then the full text index service should be intelligent about the language word breakers it applies when indexing the text. (I hope this is correct technique for best multi-lingual support in a single table?)

However, when I come to query this data the results always return 0 rows (no errors are encountered). e.g.
DECLARE @SearchWord nvarchar(256)
SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.
SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)

I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).

Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):

Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:

Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.

etc.

Any pointers / suggestions would be greatly appreciated. Cheers,
Gavin.

UPDATE: I have tried wrapping the text in more usual html tags and re-built the full text index but I still never get any rows returned for my query results. Example of content wrapping tried - <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>

I've also tried stripping all html tags from the content and set the Document Type column = .txt but I still get no rows returned?!?


Full Text Search Indexing HTML - Does The Filter Expect Certain Tags To Be Present As Standard?

Jul 11, 2007

Hi, I was wondering if any SQL Server gurus out there could help me...

I have a table which contains text resources for my application. The text resources are multi-lingual so I've read that if I add a html language indicator meta tag e.g.
<META NAME="MS.LOCALE" CONTENT="ES">
and store the text in a varbinary column with a supporting Document Type column containing ".html" of varchar(5) then the full text index service should be intelligent about the language word breakers it applies when indexing the text. (I hope this is correct technique for best multi-lingual support in a single table?)

However, when I come to query this data the results always return 0 rows (no errors are encountered). e.g.
DECLARE @SearchWord nvarchar(256)
SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.
SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)

I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).

Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):

Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:

Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.

etc.

Any pointers / suggestions would be greatly appreciated. Cheers,
Gavin.

UPDATE: I have tried wrapping the text in more usual html tags and re-built the full text index but I still never get any rows returned for my query results. Example of content wrapping tried - <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>

I've also tried stripping all html tags from the content and set the Document Type column = .txt but I still get no rows returned?!?


How Can I Strip Off The Time Portion Without Changing The Datatype

Oct 16, 2007

Is there a way to strip off the time portion of a datetime datatype without changing the datatype?
I know I can convert it using CONVERT (NVARCHAR(10), dbo.tblPayments.PaymentDate, 101) but I need to keep it as a datetime datatype?
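
One commonly used idiom that keeps the datetime type is to count whole days from a base date and add them back. The snippet below is a minimal sketch; the variable and its sample value are made up for illustration, and the same expression could be applied to dbo.tblPayments.PaymentDate.

```sql
DECLARE @d datetime;
SET @d = '2007-10-16T14:35:20';

-- DATEDIFF counts whole days since the base date 0 (1900-01-01);
-- DATEADD adds them back to that base, so the time part becomes 00:00:00.
SELECT DATEADD(day, DATEDIFF(day, 0, @d), 0) AS DateOnly;
-- 2007-10-16 00:00:00.000, still of type datetime
```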


Datatype Ntext

Oct 26, 2005

Hello. I need help pulling a particular string out of an ntext column in MS SQL (it holds an XML document). The string has a fixed length, is at a different position in each note, and contains variable text (e.g. <actionId>xx</actionId>). I can find the position with PATINDEX, but I don't know what to do next. I tried to write procedures, but I had trouble declaring the variables (data type). Thanks, and sorry for my horrible English.
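
A possible next step after PATINDEX is sketched below. PATINDEX and SUBSTRING both accept ntext, so the idea is to cut an nvarchar chunk that starts just after the opening tag and then trim it at the closing tag with ordinary string functions. The temp table, its column names and the sample row are hypothetical, used only to make the sketch self-contained.

```sql
-- Hypothetical table standing in for the real one
CREATE TABLE #notes ( id int IDENTITY, note ntext );
INSERT #notes ( note ) VALUES ( N'<root><actionId>42</actionId></root>' );

SELECT id,
       -- Appending the closing tag guarantees CHARINDEX finds one,
       -- so LEFT never receives a negative length
       LEFT(chunk, CHARINDEX(N'</actionId>', chunk + N'</actionId>') - 1) AS actionId
FROM (
    -- SUBSTRING on ntext returns nvarchar; the chunk starts right after <actionId>
    SELECT id,
           SUBSTRING(note, PATINDEX(N'%<actionId>%', note) + LEN(N'<actionId>'), 4000) AS chunk
    FROM #notes
    WHERE PATINDEX(N'%<actionId>%', note) > 0
) AS x;
-- returns id 1 with actionId 42

DROP TABLE #notes;
```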


NTEXT Datatype Problem

Oct 3, 2005

I am trying to insert more than 255 characters into an ntext datatype in SQL Server 2000.

But, string is being truncated to 255 characters.
Please help


BCP With Image, Ntext Datatype

Mar 4, 2004

Hi, I have a table with ntext and image datatypes. I did BCP out successfully, but when I try to BCP in, I am getting an error.
----------
Starting copy...
SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Invalid character value for cast specification
SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Invalid character value for cast specification
SQLState = 22005, NativeError = 0
------------------
I read somewhere that I can use BULK INSERT with a format file.
Can someone suggest how to BCP in successfully, or what the format file looks like for this kind of task? I am using SQL Server 2000.
---------
Here is my table structure..
-----------
CREATE TABLE [dbo].[NOTIFY_TEMPLATE] (
[ID] [numeric](28, 0) NOT NULL ,
[SENDER] [varchar] (128) NOT NULL ,
[SUBJECT] [varchar] (512) NOT NULL ,
[BODY] [ntext] NOT NULL ,
[PRIORITY] [numeric](28, 0) NULL DEFAULT (2),
[ENABLED] [numeric](28, 0) NULL DEFAULT (1),
[LANGID] [numeric](28, 0) NOT NULL DEFAULT (0),
[NOTIFY_TYPE] [numeric](28, 0) NULL DEFAULT (0),
[REQUEST_TYPE] [numeric](28, 0) NULL ,
[CUSTOMIZED] [numeric](28, 0) NOT NULL DEFAULT (0)
) ON [DATA1] TEXTIMAGE_ON [DATA1]


How To Find Ntext Datatype Columns?

May 30, 2007

Dear experts,
how can i find the ntext datatype columns in a database?

please guide me
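
One way to list them is to query the INFORMATION_SCHEMA.COLUMNS view in the database of interest and filter on DATA_TYPE. A minimal sketch:

```sql
-- Run in the database you want to inspect
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE = 'ntext'
ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;
```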


Question On Ntext Datatype In Sql Server

Aug 30, 2006

What is the max. number of characters in ntext?

Is there any way we can format the output of ntext, or will it just come out as one long line?

Thanks.


How To Strip A List Of Characters From A Field

Nov 8, 2001

hey,
What's the best way of stripping out a list of characters from a specified field in a table? E.g. if first name consists of ABCD'E-FSA, we want to strip the ' and the -. There are about 15-20 characters like that.
What's the best way of doing it, other than nesting the REPLACE function that many times?
thanks
zoey


Need To Strip Non-numerics From Phone Field

Jul 22, 2004

I tried searching forums for the last hour and could not find much.

Rather then doing...

REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(ISNULL(PHONE, ''), ' ', ''), '(', ''), ')', ''), '-', ''), '.', ''), ':', ''), '+', '')

I was hoping there was an equally efficient alternative that utilizes PATINDEX('%[^0-9]%', PHONE)

to some capacity (i.e. uses regular expressions to strip all non-numeric characters from the phone field)

I am certain this has been addressed before, just not sure if the "REPLACE to the nth degree" is the only solution.

THANKS!
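
A PATINDEX-based alternative can be sketched as a loop that removes one non-digit at a time. Note that PATINDEX uses LIKE-style wildcard patterns rather than full regular expressions. The variable names and the sample number are made up for illustration; whether this is as efficient as the nested REPLACE on a large table is not something this sketch establishes.

```sql
DECLARE @phone varchar(50), @pos int;
SET @phone = '+1 (555) 123-4567';

-- Position of the first character that is not a digit (0 if none)
SET @pos = PATINDEX('%[^0-9]%', @phone);
WHILE @pos > 0
BEGIN
    -- Delete that single character and search again
    SET @phone = STUFF(@phone, @pos, 1, '');
    SET @pos = PATINDEX('%[^0-9]%', @phone);
END

SELECT @phone AS digits_only;  -- 15551234567
```

The same loop with a class such as '%[''-]%' would cover a fixed list of unwanted characters.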


Strip Time Out Of A Date Field - Sql

Dec 3, 2004

Hi all,

I have a table that I've imported into SQL - there is a field in there for date that must have used now(); as its default value (access).

So the values are something like:

01/11/2004 12:16:42

I need a way to change the data so that the time element is removed and the field just holds the date. Failing that, a way to insert this from the existing field into a new field, stripping the time off en route, would be a great help.

Many thanks

Phil


SQL 2012 :: Change Datatype Of Ntext Fields

Jun 29, 2015

What would be the best process to change the datatype of ntext fields.

I assume just changing the datatype - is not the way to go?
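
For what it's worth, on SQL Server 2005 and later an ntext column can usually be changed in place with ALTER TABLE ... ALTER COLUMN. The sketch below uses hypothetical table and column names; the follow-up UPDATE is an often-suggested extra step to rewrite existing values under the new type's storage rules, and should be tested on a copy first.

```sql
-- Hypothetical names: dbo.MyTable and MyNotes
ALTER TABLE dbo.MyTable ALTER COLUMN MyNotes nvarchar(max) NULL;
GO

-- Optional: rewrite existing values so they are stored as nvarchar(max) data
UPDATE dbo.MyTable SET MyNotes = MyNotes;
GO
```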


SSIS Ntext Datatype And SQL Server Destination

Sep 11, 2007



Hi folks,

We have a nice issue here. We are running SQL 2005 Dev edition Service Pack 2 and we are trying to copy the contents of one table in a local sql server database to another table in another database on the same local sql server. We use an oledb source and a sql server destination. The table structure is exactly the same. One column is of the datatype ntext, when we try to load the contents the package will stop with the error:


OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 The attempt to send a row to SQL Server failed with error code 0x80004005.
OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "<TABLE>" (3382) failed with error code 0xC02020C7. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
OnError 11-9-2007 14:38:24 11-9-2007 14:38:24 00:00:00 SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0xC02020C7. There may be error messages posted before this with more information on why the thread has exited.
OnError 11-9-2007 14:38:26 11-9-2007 14:38:26 00:00:00 SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E07.
An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E07 Description: "Error converting data type DBTYPE_DBTIMESTAMP to datetime.".
OnError 11-9-2007 14:38:26 11-9-2007 14:38:26 00:00:00 A commit failed.

Removing the column from the sql server destination will result in loading the complete table. Using an oledb destination instead of sql server destination fixes the problem. Is this a bug in the SQL server destination component?

Thanks,

Marc


What Datatype Is Best For Storing HTML Content

Apr 16, 2008

What datatype is best for storing HTML content?
nText or Nvarchar(MAX) or ?


How To Replace Div Tags With P Tags In A Column

May 6, 2015

I want to replace div tags with p tags in a column in sql.

<div style: bold> abc </abc>
<div> efgh></div>

required output:
<p>abc</p>
<p>efgh</p>


Problem Importing Data From An Access Memo Field Into A SQL Server Ntext Field.

Jul 11, 2005

I'm using DTS to import data from an Access memo field into a SQL Server ntext field. DTS is only importing the first 255 characters of the memo field and truncating the rest.

I'd appreciate any insights into what may be causing this problem, and what I can do about it.

Thanks in advance for any help!


T-SQL (SS2K8) :: Varchar Datatype Field Will Ignore Leading Zeros When Compared With Numeric Datatype?

Jan 28, 2015

I need to know whether a varchar field will ignore leading zeros when compared with a numeric datatype.

create table #temp
(
code varchar(4) null,
id int not null
)
insert into #temp

[Code] .....


Can't Use The NTEXT Datatype In SQLCLR Scalar-valued Functions

Mar 19, 2007

From the SQL Server documentation: "The input parameters and the type returned from a SVF can be any of the scalar data types supported by SQL Server, except rowversion, text, ntext, image, timestamp, table, or cursor"

This is a problem for me. Here's what I'm trying to do:

I have an NTEXT field in one of my tables. I want to run regular expressions on this field, and return the results from a stored procedure. Since SQL Server doesn't provide facilities to perform regular expressions, I need to use an SQLCLR function. I would have no problem doing this if my field was nvarchar. However, this field needs to be variable in length - I cannot set an upper bound. This is why I'm using NTEXT and not nvarchar in the first place.

Is there a solution to this problem? I can't imagine that I'm the only person who wants to pass strings of arbitrary size to an SQLCLR function.


What Is The Preferred Datatype To Store HTML Markup

Oct 15, 2007

I need to store HTML markup in my database. Is there a preferred datatype to store this? Some of the values could be quite long.


Index In Calculated Field From A Ntext Field

Nov 8, 2005

For some reason I have to use an ntext field for both small strings like "10" and large binary files.

I need to sort the field to some extent to present the small strings in a nicely sorted way - answers to "What country are you from?" etc.

To trick the sorting I use a calculated field:

ORDER BY RSort - where RSort is:

convert(varchar(4), RD.response) as RSort

It works, but it puts a high load on the SQL server when the number of responses increases.

I thought of making a nonclustered index based on the calculated field, but am not sure that it will work as intended.

What do I do? The last resort would be to change the ntext to varchar(3800) or something like that. :confused:


NText Field In SQL??

Jan 15, 2004

Hello All,

Maybe a stupid question but I'm new to the db admin work so please bear with me.

I've imported an Access db into SQL. In the Access db the field type was 'memo' to accommodate the large amount of text (on average roughly 4100 characters, with spaces). Now in SQL I have set the field up as an ntext field, which I understood to be equivalent to a memo field in Access.

My problem is when saving data to the field the first time it saves all the data correctly with the exception of the field in question. The data in the field is '<LongText>', now when I try to update the data in the table I get a 'Data Truncated' error message and no update takes place throughout the table.

After testing this and trying different things, I've found that if I shorten this one field and try to save to the db I still get the 'Data Truncated' error message. If I shorten the data in the field AND delete the record from the SQL table then it will save just fine from there on out (which won't work for the reports).

I'm not sure what I'm missing here to get this to work the way it did in Access.


NText Update Field

Aug 17, 2004

Hi All,

I will be doing stress test for my app.

Loading thousands of records to the DB through bulk insert.
There's one field NText which I have left NULL because it will be hard to gen dummy flat file to it.

I have another table which has the Ntext Value which i will want to copy and duplicate to the other table.

what is the way to do it?

simply said i want to update a record with NULL value from one table with NText field with the value from another table..


Ntext Field From Profiler

Jul 20, 2005

I am trying to view all the ntext from a profiler trace. The data is truncated at 256 and I am not sure why... The max length is 1820 via this command:

select max(datalength(textdata)) from "monitor forms usage"
where textdata like '%gforms%'

I then issue

set textsize 8000
select (textdata) from "monitor forms usage" where textdata like '%gforms%' and datalength(textdata) > 1800

and still only 256 is returned. This is true even if I redirect the output to a file.

Any ideas on how a humble man like me can see all of the data?

Mike

--
Posted via http://dbforums.com


Parsing An Ntext Field

Jan 29, 2008



I'm trying to parse an ntext field that in my SQL View contains an invoice comment in order to be able to group on parts of the comment. I have two problems--one, the syntax to do this, and two, the best way to deal with the parts that I want.

The comment is like: "standard text ABCDE : $99.99" but can have multiple "ABCDE"s, e.g. "standard text ABCDE FGH IJKL $999.99" and I found some that had duplicates like "standard text standard text...".
I want to be able to report in SSRS 2005 by grouping the "ABCDE", "FGH", "IJKL" items.

Any ideas? Please be specific as I'm still learning.








Copyrights 2005-15 www.BigResource.com, All rights reserved