What Technique To Use To Update Production Table From Staging Table
Mar 9, 2007
Hi I have a production table in SQL Server 2005 that has approx 500,000 records---every 6 hours this table needs to be truncated and filled
The basic SSIS package uses a script compentant and imports the data into a staging table which has the same structure as the production table. I have a final Execute SQL Task that Truncates the production table and does a Insert into production-select * from stage table.
Takes around 30 seconds to run this last Execute SQL Task--problem is there is a risk that our webservice will query this table during the Execute SQL Task and return incorrect results.
Q1: In this last Execute SQL Task if I used a BEGIN TRANSACTION and COMMIT TRANSACTION; will this be any quicker ?
Q2: In this last Execute SQL Task- would it be better to use a RENAME TABLE technique in TSQL--any code examples ??
Q3:Is there any way in TSQL I can compare EVERY FIELD in two tables ie Stage and Production which have identical data structures and figure out a way to update only the records that changed? Is SQL Server Replication the best way to do this
Hi, Is this anyway to finding updated/ deleted recored using anyother data flow transformation tasks without using sql task. Can find the new records using merge join task.
Is there better way to merge master table using staging table?
We need to Insert/Update a Fact Table from staging Table. currently we are using a SP which update Fact Table for Each region. this process is schedule, every 5 min job is run and Update fact table.but time of Insert and Update too long from staging to Fact, currently we are using merge statement for Insert and update.in my sp we are looping number how many region we need to update and at a time single Region we are updating using while loop in current SP.
Hello, Maybe anyone have done that before? I have table where i store SOURCE_TABLE_NAME and DESTINATION_TABLE_NAME, there is about 120+ tables. i need make SSIS package which selects SOURCE_TABLE_NAME from source ole db, and loads it to DESTINATION_TABLE_NAME in destination ole db.
I made such SSIS package. set ole db source data access mode to table or view name variable. set ole db destination data access mode to table or view name variable. set to variables defoult values (names of existing tables) but when i loop table names is changed, it reports error, that can map columns, becouse in new tables is different columns.
What is the best way to transfer data from the staging table into the main table.
Example: Staging Table Name: TableA_satge (# of rows - millions) Main Table Name: TableA_main (# of rows - billions)
Note: Staging table may have some data same as the main table.
Currently I am doing: - Load data into staging table (TableA_stage) - Remove any duplication of rows from the staging table (TableA_stage) - Disable all indexes on main table (TableA_main) - Insert into main table (TableA_main) from staging table (TableA_stage) - Remove any duplication of rows from the main table using CTE (TableA_main) - Rebuild indexes on main_table (TableA_main)
The problem with the above method is that, it takes a lot of time and log file size grows very big.
I am trying to insert bulk data into main table from staging table in sql server 2012. If any error comes, this total activity is rollbacked. I don't want that to happen. I want to know the records where ever the problem persists, and the rest has to be inserted.
I have a database db1 on server1 and server2.The Db On server1 is a production db and the Db on server2 is a staging Db.All the new data will be coming into production Db.And i wanted to update the data and database structures on staging Db from production Db on weekly basis.So how can I reflect the data and datastructures on my staging Db from my production Db.
We are in SQL2005 . Is there anyone have SSIS package developed for automation of database refresh Say we want to do weekly refresh prod to staging. I want to use that as a model template for me to develop the package. i would need one package for refresh one specific database and one for user databases. MOST IMPORTANT dev permission should be restored after production data sync on staging. Secondly I want to create a job based on ssis package to run on schedule.
I'm trying to import the contents of a CSV file into a staging table that I've created in SQL server 2005. To perform the import I have used the BCP utility with the use of a BCP format file.
The problem I'm having is that the data in some of the fields in the csv file are longer than the length of the corresponding field in my staging table. So when I try to import I get the following error:
[Microsoft][ODBC SQL Server Driver]String data, right truncation
And the record with the error does not import.
My question would be, is there a way of telling BCP to trim the data before importing into the staging table? Could I somehow specify in the BCP file to trim the data before importing or is there a switch that i can specify in the BCP command which tells BCP to trim data to the length of the destination column.
If it helps I'm using the command below to run BCP.
I have a flat file with columns from a geographical hierarchy such as:
Country Zone State County City Store Sub Store , etc.
The file also has data columns for months to the right of the above columns such as:
Jul Aug Sept ......... basically 25 of these columns for two years' data for one product and another set of 25 columns for another kind of product. A typical record in the file looks like:
Country Zone State County City Store Substore
USA Southeast FL Hillsborough Tampa walmart Fletcher
I need to upload this data into a staging table in SQL Server 2005 using SSIS, I created a table with the geographical hierarchy columns but am trying to figure out a way to load the monthly data. I can create 50 columns for the 50 months ( 25 months for each product) but that would be very crude.
Is there a better way of inserting data from this flat file into a destination table? I need all the data in the staging table in one upload.
I am extracting source data which is in txt fille to OLE DB destination. But data of each day I want to save in different staging table. For Eg; tblProduct20081206, tblProduct20081207. How can it be done. I have seen lots of posting and script when destination is Txt. I want to use same table for staging but want to create different table for each day with adding date extension.
I have been working on this without any success. Anybody out there to help?
1. There are new files uploaded in the FTP site everyday by other regions. 3 regions and 1 file each. That means 3 files at 3 different times. The file name will be that same day's date.csv. Example; today's file will be A_20071106.csv , B_20071106.csv and C_20071106.csv. Tomorrow's will be A_20071107.csv,B_20071107.csv and C_20071107.csv. How do I do this to run everyday and take the file only for that day from FTP straight into SQL Staging table?
2. Everyday, immediately after each file is uploaded, to indicate the FTP file is loaded successfully, there will be a 20071106.dummy and so on for all three files will show up on the same folder in FTP. I must make this package runable only if the .done file arrives for that days date and then execute the .csv file. Otherwise check after 30 minutes if the .dummy is there yet or not. Do this until the .dummy file comes. Then execute the package that is on that time. Then do the same for the other two for that particular day. The times are 4pm for A, 6pm for B and 8.30pm for C
3. Then, if there is a new row, it should INSERT, any changes (based on 5 columns), it should update and if there was a row yesterday and it is not there today, DELETE.
I'm attempting to load some data into an explicit hierarchy in MDS 2012 via the staging table and struggling with the HierarchyName field. Specifically I'm loading data into stg.[Entity Name]_Consolidated and using the exact name of the explicit hierarchy I've set up in the front end web application.
Originally my hierarchy was labelled "Reporting Hierarchy" and when loading the data into staging using this name then running the batch from the Import Data screen I can see the error message "Error - The HierarchyName is missing or is not valid.". I've checked the table mdm.tblHierarchy and can see that the name there is exactly as it was in the staging table and have since renamed the hierarchy as "Reporting_Hierarchy" with the same results.
I have a large table with 100 Million records that has around 1 million duplicate records that need to be deleted.
I am running a script that creates a staging table called,DuplicateTable that collects all the duplicates and then I want to write a an effecient delete statement.
Is it possible to write something like:
delete from OrigTable O join DuplicateTable D on O.Key = D.key
Or do I have to run a loop on the DuplicateTable and run a delete statement record by record ?
I'm looking at SSIS and SqlBulkCopy as a possible method to quickly insert and process large amounts of data, my current method uses the sp_xml_preparedocument and OPENXML to parse an xml document of the data I want to process and insert into the database, however I'm noticing the performance of SqlServer parsing the xml document isn't that good.
I found the C# SqlBulkCopy method (new in .NET 2.0) and I was thinking it would be faster to use that to load my data into a staging table and then use SSIS to extract the data from the staging table, process and transform it as necessary and finally load it into the final destination tables. I was able to create an Integration Services Project that selects the data from the staging table, does a bit of processing on one of the fields (by calling a stored procedure), and finally loads the processed data into the final table.
The problem with this is I need to clean out the rows that were extracted from the staging table and I'm not sure how I can accomplish this (and I can't just issue a "delete from staging_table" because there maybe new records in the staging table that were not processed), perhaps I can either delete each record as it is proccessed or somehow get the last proccessed identity id from the staging table and delete all records less than or equal to that id.
thanks in advance for any help you can provide, maybe there is an easy way to accomplish what I'm trying to do.
We are designing a Staging layer to handle incremental load. I want to start with a simple scenario to design the staging.
In the source database There are two tables ex, tbl_Department, tbl_Employee. Both this table is loading a single table at destination database ex, tbl_EmployeRecord.
The query which is loading tbl_EmployeRecord is, SELECT EMPID,EMPNAME,DEPTNAME FROM tbl_Department D INNER JOIN tbl_Employee E ON D.DEPARTMENTID=E.DEPARTMENTID.
Now, we need to identify incremental load in tbl_Department, tbl_Employee and store it in staging and load only the incremental load to the destination.
The columns of the tables are,
tbl_Department : DEPARTMENTID,DEPTNAME
tbl_Employee : EMPID,EMPNAME,DEPARTMENTID
tbl_EmployeRecord : EMPID,EMPNAME,DEPTNAME
How to design the staging for this to handle Insert, Update and Delete.
In my ETL job I would like to truncate stg table but before truncating stging table, I want to make sure that all the records are inserted in the data model. The sample is as below
create table #stg ( CreateID int, Name nvarchar(10) ) insert into #stg select 1, 'a' union all select 2, 'b' union all
[Code] ....
How can I check among these tables and make sure that all values are loaded into the data model and the staging table can be truncated.
I'm trying to use Excel in SSIS to import the data from spreadsheet to a staging table. The package runs well from the web server using SSMS. But when I deploy and try to execute the package, I'm getting the below error. I've a question, whether I've to install the AccessDatabaseEngine driver in SQL database server or the web server where I'm executing the SSIS?
Error: The requested OLE DB provider Microsoft.Jet.OLEDB.4.0 is not registered. If the 64-bit driver is not installed, run the package in 32-bit mode.
I am having one store procedure which use to load data from flat file to staging table dynamically. everything is working fine.Staging_temp table have single column.All the data stored in that single column below is the sample Data.
where including/excluding a single column in an empty staging table would influence a resultset returning from distributed query? Both servers are SQL Server 2012. Nothing special about the staging table. It contains 12 columns with a mixture of INT and NVARCHAR(256) columns. In one case I exclude the column and the query returns in 17 seconds. When I include it the query does not return. Excluding the INSERT INTO the staging table and query returns in 17 secs with and without the column.
I am working on an HR project and I have one final component that I am stuck on.
I have an Excel File that is loaded into a folder every month.
I have built a package that captures the data from the excel file and loads it into a staging table (transforming a few bits of data).
I then combine it with another table in a view.
I have another package that loads that view into a Master table and I have added a Slowly Changing Dimension so that it only updates what has been changed. (it’s a table of all employees, positions, hire dates, term dates etc).
Our HR wants to have this data in a report (with charts and tables) and they wanted it to be in a familiar format. So I made a data connection with Excel loading the data into a series of pivot tables.
I have one final component that i cant seem to figure out. At the end of every year I need to capture a count of all Active Employees and all Termed employees for that year. Just a count.
The data is in one table labeled [EEMaster]. To test the count I have the following.
SELECT COUNT([PersNo]) AS HistoricalHC FROM [dbo].[EEMaster] WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Active'
this returns the HistoricalHC for 2013 as 418.
SELECT COUNT([PersNo]) AS NumbOfTermEE FROM [dbo].[EEMaster] WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Withdrawn' AND [TermYear] = '2013'
This returns the Number of Termed employees for 2013 as 42.
I have created a table to report from called [dbo.TORateFY] that I have manually entered previous years data into.
After the staging_temp data gets inserted into main table.my probelm is to handle such a file where number of columns are more than the actual table.
If you see the sample rows there are 4 column separated by "¯".but actual I am having only 3 columns in my main table.so how can I get only first 3 column from the satging_temp table.
Hi,I have table with three columns as belowtable name:expNo(int) name(char) refno(int)I have data as belowNo name refno1 a2 b3 cI need to update the refno with no values I write a query as belowupdate exp set refno=(select no from exp)when i run the query i got error asSubquery returned more than 1 value. This is not permitted when thesubquery follows =, !=, <, <= , >, >= or when the subquery is used asan expression.I need to update one colum with other column value.What is the correct query for this ?Thanks,Mani
My applications group wants to do a table transfer from one server to another as a means of refreshing the information on the target on a scheduled basis. That means that reports could be running against the target table during the transfer. This has introduced erroneous reports because the object transfer initially truncates the table, then transfers the data. If reports are running during the transfer, they might retrieve no data or only part of a table. I wrote a procedure to make a copy of the table to be transferred, then transfer that new table to the target machine, then drop the target table, then rename the transferred table. Now, I have the slightly smaller problem of doing a DROP TABLE when users are accessing the table. I can't set the database to 'dbo use only' because there are other tables in the database that must be available. Is this really a replication application?
- I have one production database and one test database. - In the test database i have done a few changes on the structure - mostly adding new fields in the tables. - Now i want to apply the table changes to my production database. Does anyone have any suggestions on how i should do that?
I mean, of course i could just manually check every field and update the changes - but that could take a while - i guess there are some better ways this could be done, like: - backup data - delete tables - recreate tables using structure of test database - reinsert data
Is this the correct way? Are there any easy ways to do this?
The changes i made should not make problems regarding data that is already stored (like changing a string field to int)
I am debugging one of our programs and ran the fix in Test. I would liketo compare table 1 between Production and Test. I want the query to outputcolumn 1 if Production <> Test output.What is the best way to achieve this?jeff--Message posted via http://www.sqlmonster.com
I imported a SQL Table into SQL DataBase, But I can not update this table even with SQL Server management Studio When I change any data on mentioned table above, Red exclamation sign appears left of the record . How can I correct this problem? Thanks.
This program gets the values of A and B passed in. They are for table columns DXID and CODE. The textbox GET1 is initialized to B when the page is loaded. When I type another value in GET1 and try to save it, the original initialized value gets saved and not the new value I just typed in. A literal value, like "222" saves but the new GET1.TEXT doesn't.