I have a table with a column that contains HTML text. The column is large (datatype varchar(max)). Does anyone have a function I can use to strip out the HTML tags? I saw a couple of versions online, but they ran too slowly.
I had a problem with the ntext datatype. I need to strip the HTML tags out of an ntext column. I have a sample query that works fine for string types, since STUFF is a string function, but what should I do for an ntext field?
=======The Process follows like this =========
--**************************************
--
-- Name: A relational technique to strip the HTML tags out of a string
-- Description: A relational technique to strip the HTML tags out of a
-- string. This solution demonstrates how to use simple tables & search
-- functions effectively in SQL Server to solve procedural / iterative
-- problems.

-- This table contains the tags to be replaced. The % in <head%>
-- will take care of any extra information in the tag that you needn't
-- worry about as a whole. In any case, this table contains all the
-- tags that need to be searched & replaced.
CREATE TABLE #html ( tag varchar(30) )
INSERT #html VALUES ( '<html>' )
INSERT #html VALUES ( '<head%>' )
INSERT #html VALUES ( '<title%>' )
INSERT #html VALUES ( '<link%>' )
INSERT #html VALUES ( '</title>' )
INSERT #html VALUES ( '</head>' )
INSERT #html VALUES ( '<body%>' )
INSERT #html VALUES ( '</html>' )
go
-- A simple table with the HTML strings
CREATE TABLE #t ( id tinyint IDENTITY , string varchar(255) )
INSERT #t VALUES ( '<HTML><HEAD><TITLE>Some Name</TITLE> <LINK REL="stylesheet" HREF="/style.css" TYPE="text/css" ></HEAD> <BODY BGCOLOR="FFFFFF" VLINK="#444444"> SOME HTML text after the body</HTML>' )
INSERT #t VALUES ( '<HTML><HEAD><TITLE>Another Name</TITLE> <LINK REL="stylesheet" HREF="/style.css"></HEAD> <BODY BGCOLOR="FFFFFF" VLINK="#444444">Another HTML text after the body</HTML>' )
go
-- This is the code to strip the tags out.
-- It finds the starting location of each tag in the HTML string,
-- then finds the length of the tag with the extra properties, if any.
-- This is done by locating the end of the tag, namely '>'. The same is
-- done in a loop till all tags are replaced.
BEGIN TRAN
WHILE exists(select * FROM #t JOIN #html on patindex('%' + tag + '%' , string ) > 0 )
    UPDATE #t
    SET string = stuff( string ,
                        patindex('%' + tag + '%' , string ) ,
                        charindex( '>' , string , patindex('%' + tag + '%' , string ) )
                          - patindex('%' + tag + '%' , string ) + 1 ,
                        '' )
    FROM #t JOIN #html ON patindex('%' + tag + '%' , string ) > 0
SELECT * FROM #t
rollback
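For the varchar(max) and ntext cases asked about above, one possible alternative is a scalar function that removes everything between each '<' and the next '>'. This is only a sketch under a few assumptions: SQL Server 2005 or later (for nvarchar(max)), a hypothetical function name dbo.ufn_StripTags, and a made-up temp table #pages standing in for the real table. It treats any '<' ... '>' pair as a tag and does not decode entities such as &amp;. Because it loops once per tag, it may still be slow on very large tables.

```sql
-- Hypothetical helper: removes every '<...>' run from the input.
CREATE FUNCTION dbo.ufn_StripTags (@html nvarchar(max))
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @start bigint, @end bigint;

    SET @start = CHARINDEX(N'<', @html);
    WHILE @start > 0
    BEGIN
        SET @end = CHARINDEX(N'>', @html, @start);
        IF @end = 0 BREAK;  -- unmatched '<': leave the remainder untouched

        -- Cut out the tag, then look for the next '<' from the same position.
        SET @html = STUFF(@html, @start, @end - @start + 1, N'');
        SET @start = CHARINDEX(N'<', @html, @start);
    END

    RETURN @html;
END
GO

-- Illustrative data only; #pages and its columns are assumptions.
CREATE TABLE #pages (id int, body ntext);
INSERT #pages VALUES (1, N'<p>Hello <b>world</b></p>');

-- STUFF does not accept ntext, so cast the column to nvarchar(max) first.
SELECT id, dbo.ufn_StripTags(CAST(body AS nvarchar(max))) AS plain_text
FROM #pages;

DROP TABLE #pages;
```

The CAST is the part that answers the ntext question: once the value is nvarchar(max), the ordinary string functions apply.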
Hi, I was wondering if any SQL Server gurus out there could help me...
I have a table which contains text resources for my application. The text resources are multi-lingual, so I've read that if I add an HTML language indicator meta tag, e.g. <META NAME="MS.LOCALE" CONTENT="ES">, and store the text in a varbinary column with a supporting Document Type column of varchar(5) containing ".html", then the full text index service should be intelligent about the language word breakers it applies when indexing the text. (I hope this is the correct technique for best multi-lingual support in a single table?)
However, when I come to query this data the results always return 0 rows (no errors are encountered), e.g.
DECLARE @SearchWord nvarchar(256)
SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.
SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)
I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).
Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):
Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:
Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.
etc.
Any pointers / suggestions would be greatly appreciated. Cheers, Gavin.
UPDATE: I have tried wrapping the text in more usual html tags and re-built the full text index but I still never get any rows returned for my query results. Example of content wrapping tried - <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>
I've also tried stripping all html tags from the content and set the Document Type column = .txt but I still get no rows returned?!?
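When CONTAINS returns no rows and no errors, a few catalog queries can help narrow down where indexing stops. The following is a diagnostic sketch only, not a confirmed fix: it assumes SQL Server 2005 or later, that the table is dbo.Resource, and that the full-text catalog is called ResourceCatalog (a hypothetical name; substitute the real one).

```sql
-- 1. Is a filter registered for the extension stored in the type column?
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type IN ('.html', '.htm', '.txt');

-- 2. Is the indexed column bound to a type column, and with which language?
--    type_column is NULL if the index was created without TYPE COLUMN.
SELECT OBJECT_NAME(ic.object_id)                 AS table_name,
       COL_NAME(ic.object_id, ic.column_id)      AS indexed_column,
       COL_NAME(ic.object_id, ic.type_column_id) AS type_column,
       ic.language_id
FROM sys.fulltext_index_columns AS ic
WHERE ic.object_id = OBJECT_ID('dbo.Resource');

-- 3. Has population finished, and was anything actually indexed?
SELECT FULLTEXTCATALOGPROPERTY('ResourceCatalog', 'PopulateStatus') AS populate_status,
       FULLTEXTCATALOGPROPERTY('ResourceCatalog', 'ItemCount')      AS item_count;
```

If the second query shows no type column for Document, the index has no way to know which filter to apply to the varbinary data, which would be worth ruling out first.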
I'm currently using an Execute SQL Task to return XML data from a query into an SSIS string variable. In my FOR XML clause in SQL I'm specifying a certain name for my root tag, called "Accounts". This works great in Management Studio, however, the Execute SQL Task appends a <ROOT> and </ROOT> tag to the start and end of the string, so now it looks like:
<ROOT><Accounts>...all my elements...</Accounts></ROOT>
I'd like to remove the ROOT tags so that the <Accounts> tags are actually the root for this doc. What would be the best way to remove the ROOT tags from the SSIS string variable?
declare @xmldoc as xml
select @xmldoc = '<Text>This is firstline<Break />This is second line<Break />This is third line</Text>'
select @xmldoc.value('(/Text)[1]','varchar(max)')
Result is: "This is firstlineThis is second lineThis is third line"
My problem is that the <Break /> tags within the text are removed in the conversion to varchar. How can I preserve such tags in the varchar output, or get the <Break /> tags "translated" to e.g. CHAR(10)?
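One way to keep the markup, sketched here under the assumption that the document always has the shape shown in the question, is to use the xml type's query() method instead of value(). query() returns the child nodes as XML, so the empty elements survive the cast to varchar(max) and can then be swapped for a line feed with REPLACE. The variable @inner is introduced only for this illustration.

```sql
DECLARE @xmldoc xml;
SET @xmldoc = '<Text>This is firstline<Break />This is second line<Break />This is third line</Text>';

-- query() keeps the child nodes (text and elements) as XML;
-- value() would return only the concatenated text.
DECLARE @inner varchar(max);
SET @inner = CAST(@xmldoc.query('/Text/node()') AS varchar(max));

-- Version with the Break elements preserved.
SELECT @inner AS with_tags;

-- Version with each Break element replaced by a line feed.
-- Both spellings are replaced in case the serializer omits the space.
SELECT REPLACE(REPLACE(@inner, '<Break />', CHAR(10)), '<Break/>', CHAR(10)) AS with_line_feeds;
```

Note that the string produced this way is still XML-escaped, so a literal ampersand in the text would appear as &amp; and may need to be decoded separately.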
I am retrieving a field from SQL and displaying that data on a web page. The data contains a mixture of text and HTML codes, like this: "<b>test</b>". But rather than displaying the word test in bold, it is displaying the entire string as text. How do I get it to treat the HTML as HTML?
Does anyone know how to get rid of RTF tags that are stored in a table? I need to filter the data and am wondering if there is a utility in SQL Server that can do it.
This algorithm can be used to strip out HTML tags too. With reference to http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=89973 and http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=90000
CREATE FUNCTION dbo.fnParseRTF
(
    @rtf VARCHAR(8000)
)
RETURNS VARCHAR(8000)
AS
BEGIN
    DECLARE @Stage TABLE
    (
        Chr CHAR(1),
        Pos INT
    )

    INSERT @Stage
    (
        Chr,
        Pos
    )
    SELECT SUBSTRING(@rtf, Number, 1),
           Number
    FROM master..spt_values
    WHERE Type = 'p'
      AND SUBSTRING(@rtf, Number, 1) IN ('{', '}')

    WHILE 1 = 1
    BEGIN
        SELECT TOP 1 @Pos1 = s1.Pos,
                     @Pos2 = s2.Pos
        FROM @Stage AS s1
        INNER JOIN @Stage AS s2 ON s2.Pos > s1.Pos
        WHERE s1.Chr = '{'
          AND s2.Chr = '}'
        ORDER BY s2.Pos - s1.Pos
I have generated a database for my website, and I intend to use software that will convert the database into static web pages. My big problem is that I am not a programmer, but I know a tiny bit about tags etc. for search engines. I want to create the meta description tag from a field in this database. The software I am about to use has a SQL builder. Is there any way it could be done by highlighting the relevant field and using SQL? PLEASE, someone help. This problem has been driving me round the twist.
There are two tables, A and B, that both contain asset tags: table A stores one tag per row, while table B stores a comma-separated list of tags in a single column.

Table A, e.g.
Asset Tag
SR-062009-00032966
SR-062009-00032962
SR-072009-00020572
SR-072009-00020571
SR-072009-00020585
HH-092009-00038342

Table B
Field 1 --> Asset Tag
Record 1 --> SR-072009-00020572,SR-072009-00020571,SR-062009-00020685,SR-072009-00001592,SR-072009-00001376,SR-062009-00020683,SR-092009-00001617
Field 2 --> Material Code
Record 1 --> 121
Record 2 --> 123

What query will match each asset tag of A against every asset tag in table B, so that the output comes out as

Asset Tag            Material Code
SR-062009-00032966
SR-062009-00032962
SR-072009-00020572   121
SR-072009-00020571   121
SR-072009-00020585
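A possible approach, assuming the tags in table B are separated by commas with no surrounding spaces, is a LEFT JOIN whose condition wraps both sides in commas and uses LIKE. The temp tables #A and #B and their column names are stand-ins invented for this sketch; only the rows given in the question are loaded.

```sql
-- Stand-in for table A: one asset tag per row.
CREATE TABLE #A (AssetTag varchar(30));
INSERT #A VALUES ('SR-062009-00032966');
INSERT #A VALUES ('SR-062009-00032962');
INSERT #A VALUES ('SR-072009-00020572');
INSERT #A VALUES ('SR-072009-00020571');
INSERT #A VALUES ('SR-072009-00020585');

-- Stand-in for table B: a comma-separated tag list plus a material code.
CREATE TABLE #B (AssetTags varchar(1000), MaterialCode int);
INSERT #B VALUES ('SR-072009-00020572,SR-072009-00020571,SR-062009-00020685,SR-072009-00001592,SR-072009-00001376,SR-062009-00020683,SR-092009-00001617', 121);

-- Adding a comma to both ends of the list means every tag, including the
-- first and last, appears as ',tag,' and partial matches are avoided.
SELECT a.AssetTag, b.MaterialCode
FROM #A AS a
LEFT JOIN #B AS b
    ON ',' + b.AssetTags + ',' LIKE '%,' + a.AssetTag + ',%';

DROP TABLE #A;
DROP TABLE #B;
```

Tags with no match in B come back with a NULL material code. A LIKE join of this kind cannot use an index on the list column, so on large tables it may be worth splitting the list into one row per tag instead.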
I tried to remove AdventureWorksDB in "Add or Remove Programs" in Control Panel and I got the following errors: (1) AdventureWorksDB Error 1326: Error getting file security: CProgram FilesMicrosoft SQL ServerMSSQL1MSSQLGetLastError: 5. |OK| and (2) Add or Remove Programs Fatal Error during installation (after I clicked the |OK| button). Please help and tell me how I can solve this problem.
I have uninstalled the CTP version of SQL Server Express so that I can install the released version, but the CTP version is still listed in the Add/Remove Programs list, without the Change/Remove button. I have been to different sites to find information on cleaning this up and I have run all the uninstall tools I can find, but the problem persists. I cannot install the released version without completely getting rid of the CTP version. Please help, anyone.
I am having a hard time removing my SQL instance from Add/Remove Programs. I select the SQL instance name and try to remove it, but it won't let me delete it, and there is no error message at all. When I try to log in to it from SQL Management Studio, the message box says that instance name does not exist. Is there any way to remove the SQL instance from my system?
I have looked far and wide and have not found anything that works to allow me to resolve this issue.
I am moving data from DB2 using the MS OLEDB Provider for DB2. The OLEDB source sees the column of data as DT_TEXT. I setup a destination to SQL Server 2005 and everything looks good until I try and run the package.
I get the error: [OLE DB Source [277]] Error: An OLE DB error has occurred. Error code: 0x80040E21. An OLE DB record is available. Source: "Microsoft DB2 OLE DB Provider" Hresult: 0x80040E21 Description: "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done.".
[OLE DB Source [277]] Error: Failed to retrieve long data for column "LIST_DATA_RCVD".
[OLE DB Source [277]] Error: There was an error with output column "LIST_DATA_RCVD" (324) on output "OLE DB Source Output" (287). The column status returned was: "DBSTATUS_UNAVAILABLE".
[OLE DB Source [277]] Error: The "output column "LIST_DATA_RCVD" (324)" failed because error code 0xC0209071 occurred, and the error row disposition on "output column "LIST_DATA_RCVD" (324)" specifies failure on error. An error occurred on the specified object of the specified component.
[DTS.Pipeline] Error: The PrimeOutput method on component "OLE DB Source" (277) returned error code 0xC0209029. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
Any suggestions on how I can get the large string data in the varchar column in DB2 into the varchar(max) column in SQL Server 2005?
I am trying to create a stored procedure in SQL Management Studio and I keep getting errors. Here's my stored procedure.
Code Block
CREATE PROCEDURE [dbo].[sqlOutlookSearch]
    -- Add the parameters for the stored procedure here
    @OLIssueID int = NULL,
    @searchString varchar(1000) = NULL
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    IF @OLIssueID <> 11111
        SELECT * FROM [OLissue], [Outlook]
        WHERE [OLissue].[issueID] = @OLIssueID
        AND [OLissue].[issueID] = [Outlook].[issueID]
        AND [Outlook].[contents] LIKE + ''%'' + @searchString + ''%''
    ELSE
        SELECT * FROM [Outlook]
        WHERE [Outlook].[contents] LIKE + ''%'' + @searchString + ''%''
END
And the error I kept getting is:
Msg 402, Level 16, State 1, Procedure sqlOutlookSearch, Line 18
The data types varchar and varchar are incompatible in the modulo operator.
Msg 402, Level 16, State 1, Procedure sqlOutlookSearch, Line 21
The data types varchar and varchar are incompatible in the modulo operator.
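The message points at the doubled single quotes. Doubling is only needed when a quote sits inside another string literal (for example in dynamic SQL). Outside a literal, ''%'' is read as an empty string, the modulo operator %, and another empty string, which is why SQL Server complains about modulo on varchar. A corrected version of the procedure, keeping the names from the post and changing only the LIKE predicates (the leading + is also dropped as unnecessary), might look like this:

```sql
CREATE PROCEDURE [dbo].[sqlOutlookSearch]
    @OLIssueID int = NULL,
    @searchString varchar(1000) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF @OLIssueID <> 11111
        SELECT *
        FROM [OLissue], [Outlook]
        WHERE [OLissue].[issueID] = @OLIssueID
          AND [OLissue].[issueID] = [Outlook].[issueID]
          -- Single quotes: '%' is a one-character string holding the wildcard.
          AND [Outlook].[contents] LIKE '%' + @searchString + '%'
    ELSE
        SELECT *
        FROM [Outlook]
        WHERE [Outlook].[contents] LIKE '%' + @searchString + '%'
END
```

One thing to be aware of: if @searchString is left NULL, the concatenated pattern is NULL and the LIKE matches no rows.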
For the life of me I cannot figure out why SSIS will not convert varchar data. Instead of using the table-to-table method, I wrote a SQL query so that I could transform the datatype from ntext to varchar(512), understanding that natively MS is going towards all-Unicode applications.
The source fields from Access are int, int, int and varchar(512). The same is true of the destination within SQL Server 2005. The field 'Answer' is the varchar field in question....
I get the following error
Validating (Error)
Messages
Error 0xc02020f6: Data Flow Task: Column "Answer" cannot convert between unicode and non-unicode string data types. (SQL Server Import and Export Wizard)
Error 0xc004706b: Data Flow Task: "component "Destination - Query" (28)" failed validation and returned validation status "VS_ISBROKEN". (SQL Server Import and Export Wizard)
Error 0xc004700c: Data Flow Task: One or more component failed validation. (SQL Server Import and Export Wizard)
Error 0xc0024107: Data Flow Task: There were errors during task validation. (SQL Server Import and Export Wizard)
DTS used to be a very strong tool, but a simple import such as this is causing me extreme grief, and I am wondering if SQL 2005 is ready for prime time. FYI, SP1 is installed. I am running this from a workstation and not on the server, if that makes a difference...
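Error 0xc02020f6 is raised when a Unicode source column is mapped to a non-Unicode destination column, and text columns read from Access are likely presented to SSIS as Unicode. Assuming that is what is happening here, one option is to make the destination column Unicode so that no conversion is needed; the other is to add a Data Conversion transformation in the data flow. The sketch below shows the first option on a temp table, #Answers, whose name and first column are made up for the illustration.

```sql
-- Stand-in for the destination table as described in the post.
CREATE TABLE #Answers (QuestionId int, Answer varchar(512) NULL);

-- Switch the column to a Unicode type so that it matches the source.
ALTER TABLE #Answers ALTER COLUMN Answer nvarchar(512) NULL;

-- Confirm the new type: the value below is now stored as 2 bytes per character.
INSERT #Answers VALUES (1, N'Sample answer');
SELECT Answer, DATALENGTH(Answer) AS bytes_used FROM #Answers;

DROP TABLE #Answers;
```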
I have a table that contains a lot of demographic information. The data is usually small (<20 chars) but occasionally needs to handle large values (250 chars). Right now it's set up as varchar(max) and I don't think I want to do this.
How does varchar(max) store info differently from varchar(250)? Either way, doesn't it have to hold the container information? So the word "Crackers" has 8 characters, plus information saying it's 8 characters long, in both cases. Does this mean it takes up the same amount of space?
Also my concern will be running queries off of it, does a varchar(max) choke up queries because the fields cannot be properly analyzed? Is varchar(250) any better?
Should I just go with char(250) and watch my db size explode?
Usually the data that is 250 characters contains a lot of blank space that is removed using a SPROC, so it's not usually 250 characters for long.
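The storage part of the question can be checked directly with DATALENGTH, which reports the number of data bytes a value occupies (it does not include per-row overhead such as the length prefix). A small sketch using the "Crackers" example; the variable names are only for illustration:

```sql
DECLARE @v250 varchar(250), @vmax varchar(max), @c250 char(250);
SET @v250 = 'Crackers';
SET @vmax = 'Crackers';
SET @c250 = 'Crackers';

SELECT DATALENGTH(@v250) AS varchar_250_bytes,  -- 8
       DATALENGTH(@vmax) AS varchar_max_bytes,  -- 8
       DATALENGTH(@c250) AS char_250_bytes;     -- 250, padded with spaces
```

So for short values the two varchar types hold the same number of data bytes, while char(250) always uses the full 250. The practical differences are elsewhere: for example, a varchar(max) column cannot be used as an index key column, whereas a varchar(250) column can.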
I was doing some research on how SQL Server stores data on disk. MSDN states that when storing a varchar, only the length of the data itself is used, plus two bytes. So, if you store "car" in a VarChar(50) it will take 5 bytes. But when you store "car" in a VarChar(500) it will also take 5 bytes.
What is the reason users should define the length parameter? Can you use VarChar(8000) all the time, without any drawback?
I've got a question regarding record inserts via a form (textbox, multiple lines) on a web page.
I'm using a textbox within a form to enter data for a record. When this box contains several lines and the data is inserted into the database (SQL2005 Ent.), each line is separated by a character that looks like a square (probably a line break symbol or something). This is all fine for storage, but when I retrieve the data and display it on a web page, it shows up as one long line, not at all how I entered it in my textbox.
Is there a way to "replace" all of the line breaks with, say, "<br>", so that it displays more correctly? Or perhaps I need to implement an HTML-based editor on my form (I've seen them out there, but should this really be necessary?).
I'm using ASP.NET (*.asp pages) and VBScript..
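The square characters are most likely carriage return (CHAR(13)) and line feed (CHAR(10)) characters, which browsers collapse into ordinary whitespace. If the replacement is to be done in the query rather than in the page code, nested REPLACE calls are one option. The variable @notes below is a made-up stand-in for the stored column.

```sql
DECLARE @notes varchar(max);
SET @notes = 'line one' + CHAR(13) + CHAR(10) + 'line two' + CHAR(10) + 'line three';

-- Replace CR+LF pairs first, then any stray CR or LF,
-- so that a pair does not turn into two <br> tags.
SELECT REPLACE(
         REPLACE(
           REPLACE(@notes, CHAR(13) + CHAR(10), '<br>'),
           CHAR(13), '<br>'),
         CHAR(10), '<br>') AS notes_html;
-- Returns: line one<br>line two<br>line three
```

The same replacement can equally be done in the page code just before output, which keeps the stored data free of markup.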
Not sure if this is the right forum for this, so please move if necessary.
I would like to know if someone has any idea how to make a "<select></select>" tag hidden. For a textbox it's simply: <input type="hidden" id="textCustom2" name="textCustom2" value>. Is there such a thing for options? Some JavaScript, perhaps?
I'm not really sure if this is the right thing to do, but I want to save a copy of the HTML from my invoice to SQL so that I can keep a history of the invoices in case there are changes made to them. Does anybody know what would be the best way to do this?
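One simple way to keep such a history, offered only as a sketch, is an append-only table that stores a new row with the rendered HTML each time the invoice is saved. The table and column names below are hypothetical, and nvarchar(max) assumes SQL Server 2005 or later.

```sql
-- Hypothetical history table: one row per saved snapshot of an invoice.
CREATE TABLE dbo.InvoiceHtmlHistory
(
    InvoiceHtmlHistoryId int IDENTITY(1,1) PRIMARY KEY,
    InvoiceId            int           NOT NULL,
    SavedAt              datetime      NOT NULL DEFAULT (GETDATE()),
    InvoiceHtml          nvarchar(max) NOT NULL
);

-- Save a snapshot (the invoice number and HTML are illustrative values).
INSERT dbo.InvoiceHtmlHistory (InvoiceId, InvoiceHtml)
VALUES (1001, N'<html><body><h1>Invoice 1001</h1></body></html>');

-- Retrieve the most recent snapshot for an invoice.
SELECT TOP 1 InvoiceHtml, SavedAt
FROM dbo.InvoiceHtmlHistory
WHERE InvoiceId = 1001
ORDER BY SavedAt DESC, InvoiceHtmlHistoryId DESC;
```

Rows are only ever inserted, never updated, so earlier versions of an invoice stay available for comparison.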