Integration Services :: Move Files Based On Filename?
May 12, 2015
I have a requirement to move files from a HOLD folder to an input folder. In the HOLD folder I receive multiple files starting with af, ai, and ar, i.e. af*.txt, ai*.txt, ar*.txt. I need to move one file at a time to the input folder, as each file has to be loaded into the database before the next file is processed. Of all the files, SSIS has to look at the ai*.txt files first, followed by af*.txt, and lastly ar*.txt. If there are multiple files in the same group, the file with the oldest date has to be moved first. How do I achieve this?
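One way to approach this (a minimal sketch, assuming a Script Task drives the move and that the folder paths below are placeholders for package variables) is to pick the next file in code, honoring the ai > af > ar priority and oldest-first within each group:

    // Sketch: move exactly one file per run, ai*.txt first, then af*.txt, then ar*.txt,
    // oldest file first within a group. Paths are placeholders.
    using System;
    using System.IO;
    using System.Linq;

    string holdFolder = @"C:\HOLD";    // e.g. Dts.Variables["User::HoldFolder"].Value
    string inputFolder = @"C:\Input";  // e.g. Dts.Variables["User::InputFolder"].Value

    foreach (string prefix in new[] { "ai", "af", "ar" })
    {
        string oldest = Directory.GetFiles(holdFolder, prefix + "*.txt")
                                 .OrderBy(f => File.GetCreationTime(f))
                                 .FirstOrDefault();
        if (oldest != null)
        {
            File.Move(oldest, Path.Combine(inputFolder, Path.GetFileName(oldest)));
            break; // only one file moves; the database load runs before the next pass
        }
    }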
I need to move specific files from one server to another server on a monthly basis. There are hundreds of files in the source directory and I need to move approximately 40 of those to the destination server. I would like to be able to easily add or remove entries from the file list as needed. I have seen setups where several variables were created, one for each file name (and one for the path), and a ForEach Loop would go through them. With 40 or more files I was thinking that I could make a connection to an Excel spreadsheet or text file with a record for each file name, read a record in, make that value become the content of a "FileName" variable, and then move on to the next record. Then if I wanted to add another file name I could just add or remove a record in the spreadsheet/text file and the package would handle it automatically....
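One rough illustration of the list-driven idea (a sketch only, assuming a plain text control file with one file name per line; all paths here are placeholders): a Script Task can read the list and move each file, which avoids maintaining 40 separate variables:

    // Sketch: move every file named in a control list from source to destination.
    using System.IO;

    string listPath = @"C:\Control\filelist.txt";  // one file name per line (placeholder)
    string sourceDir = @"\\sourceserver\share";    // placeholder
    string destDir = @"\\destserver\share";        // placeholder

    foreach (string name in File.ReadAllLines(listPath))
    {
        string trimmed = name.Trim();
        if (trimmed.Length == 0) continue;                      // skip blank lines
        string sourcePath = Path.Combine(sourceDir, trimmed);
        if (File.Exists(sourcePath))
            File.Move(sourcePath, Path.Combine(destDir, trimmed));
    }

Adding or removing a file then comes down to editing the control file, exactly as described above.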
I am very new to SSIS. I need a package to iterate through files in folders and subfolders, evaluate the date of each file, and delete it if it is over 90 days old. I am using the following as a guide. The reason I am not using a maintenance plan for this is that I did not see a way to use a built-in maintenance plan task to copy files. I already built a package that copies files to the destination. Now I need to delete files in the destination that are over 90 days old.
[URL] ...
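For reference, the core of that 90-day delete could be as small as this (a sketch, with a placeholder root path; whether LastWriteTime or CreationTime is the right timestamp is an assumption to confirm):

    // Sketch: delete files older than 90 days across a folder and all its subfolders.
    using System;
    using System.IO;

    string root = @"D:\DestinationFiles";              // placeholder
    DateTime cutoff = DateTime.Now.AddDays(-90);

    foreach (string file in Directory.GetFiles(root, "*.*", SearchOption.AllDirectories))
    {
        if (File.GetLastWriteTime(file) < cutoff)
            File.Delete(file);
    }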
I have a File System Task to delete files. I have a Script Task with a precedence constraint where I need to build an expression on the constraint between the Script Task and the File System Task for the delete.
I have database backups in the format of SQLInstanceName_DatabaseName_BackupType_YYYYMMDD_XXXXXXX.bak.
I created a variable to store the filename and I need an expression that will convert a file name like
MySQLInstance_MyDB_DIFF_20150803_1700000.bak into 20150803 and compare it to GETDATE() - 90 days.
I created another variable for the Max File Age. I built the below but when I click on Evaluate Expression I get an error stating, "The expression might contain an invalid token, an incomplete token, or an invalid element."
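The SSIS expression language gets awkward for this kind of parsing, which is often what produces the "invalid token" message. One hedged alternative is to do the check in a Script Task and set a Boolean variable for the precedence constraint to test (the variable names are assumptions, and the parsing assumes the date is always the second-to-last underscore-delimited token):

    // Sketch: pull yyyymmdd out of SQLInstanceName_DatabaseName_BackupType_YYYYMMDD_XXXXXXX.bak
    // and flag the file if it is older than 90 days.
    using System;
    using System.Globalization;
    using System.IO;

    string fileName = "MySQLInstance_MyDB_DIFF_20150803_1700000.bak"; // e.g. from User::FileName
    string[] parts = Path.GetFileNameWithoutExtension(fileName).Split('_');
    string dateToken = parts[parts.Length - 2];                       // "20150803"

    DateTime fileDate = DateTime.ParseExact(dateToken, "yyyyMMdd", CultureInfo.InvariantCulture);
    bool olderThan90Days = fileDate < DateTime.Now.AddDays(-90);
    // In the Script Task: Dts.Variables["User::DeleteFile"].Value = olderThan90Days;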
This is for SSIS 2008r2. I am generating flat files successfully with a datetime stamp (filename_yyyymmddhhmm). Now I need to append a MAX(FILEDATE) from the file. I have a query to do this, but am not sure about two things:
1) Is it advised to put the query in a Script Task (with db conn and so forth) or is it better to put it in an Execute SQL Task? I am thinking the latter but am not 100% sure.
2) How would I pass the result of this query (yyyymm) to the filename? The filename format would be (filename_yyyymm). I am assuming that I would probably need to pass the result into a variable/expression, but am not quite sure of the steps involved.
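On 1), an Execute SQL Task with a single-row result set mapped to a string variable is usually the simpler fit, and the Flat File connection's ConnectionString expression can then reference that variable for 2). If it does end up in a Script Task instead, a sketch (the connection string, query, table, and variable name are all assumptions) might be:

    // Sketch: fetch MAX(FILEDATE) formatted as yyyymm and hand it to an SSIS variable
    // that the flat file name expression uses.
    using System.Data.SqlClient;

    string connString = "Server=myserver;Database=mydb;Integrated Security=SSPI;"; // placeholder
    string fileDate;
    using (var conn = new SqlConnection(connString))
    using (var cmd = new SqlCommand(
        "SELECT CONVERT(char(6), MAX(FILEDATE), 112) FROM dbo.SourceFileTable", conn)) // hypothetical table
    {
        conn.Open();
        fileDate = (string)cmd.ExecuteScalar();   // e.g. "201508"
    }
    // In a Script Task: Dts.Variables["User::FileDate"].Value = fileDate;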
I have one package where, through a Foreach Loop container, I am loading all flat files into a target table and it is working fine. Now my requirement is to capture the .txt file name and load it into a SQL Server table. Please check the file name below.
CCCLOC_DDDLOC_LOC_20151203_240000_trigger.txt
From the above file name I have to derive the details below. I have a variable @File_Name which is used to store the file name at run time. Now I want to derive 5 values from the file name, as below.
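As a sketch of the parsing side (what each token actually represents is a guess here, since the list of 5 values did not come through): splitting the name on underscores, whether in a Script Task or with Derived Column expressions, gives each piece:

    // Sketch: split CCCLOC_DDDLOC_LOC_20151203_240000_trigger.txt into its parts.
    // The labels below are guesses; map them to the real 5 values required.
    using System.IO;

    string fileName = "CCCLOC_DDDLOC_LOC_20151203_240000_trigger.txt"; // e.g. from @File_Name
    string[] tokens = Path.GetFileNameWithoutExtension(fileName).Split('_');

    string part1    = tokens[0];  // "CCCLOC"
    string part2    = tokens[1];  // "DDDLOC"
    string part3    = tokens[2];  // "LOC"
    string fileDate = tokens[3];  // "20151203"
    string fileTime = tokens[4];  // "240000"
    // tokens[5] == "trigger"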
I have a scenario where I need to move a series of files from within a directory of many files. The files follow no naming convention and are more or less random. However, the file names never change from week to week. I tried various options in a File System Task; no go. Any ideas on how to move only a list of files?
or
Can I load only specific files into a Foreach Loop container from a certain directory? I tried delimiting file names in the file source; that did not work.
I am trying to move data from 8 different tables that are dependent on each other through foreign key relationships.
Basically they have millions of rows in each table and they have data for the past 5 years. I want to move the data for the past 120 days to 8 new tables in the same database. So I created the new tables along with their relationships. Now I need to move the data in order (parent table first).
The child table has 50 million rows of data to move. The intermediate tables have 10 mil, 10 mil, 10 mil, 40 mil, 50 mil, and 20 mil rows to move. The parent table has 10 mil rows to move.
If I choose to move this data through an SSIS package, what is the best way? Or is there a better way to move this data faster?
I will be doing this move only once. After that I have maintenance purge jobs that will clean up data on a daily basis.
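For a one-off move like this, batched INSERT ... SELECT statements (parents first, then children) are often faster and simpler than a row-by-row data flow. If the loop is driven from a package, a Script Task sketch of the batching might look like the following; the table names, key column, and date column are all hypothetical:

    // Sketch: copy the last 120 days of one table in 50,000-row batches.
    // Repeat per table, parent tables before child tables.
    using System.Data.SqlClient;

    string connString = "Server=myserver;Database=mydb;Integrated Security=SSPI;"; // placeholder
    string batchSql = @"
        INSERT INTO dbo.Child_New
        SELECT TOP (50000) c.*
        FROM dbo.Child AS c
        WHERE c.CreatedDate >= DATEADD(DAY, -120, GETDATE())
          AND NOT EXISTS (SELECT 1 FROM dbo.Child_New AS n WHERE n.ChildID = c.ChildID);";

    using (var conn = new SqlConnection(connString))
    using (var cmd = new SqlCommand(batchSql, conn))
    {
        cmd.CommandTimeout = 0;               // large batches can run long
        conn.Open();
        int copied;
        do
        {
            copied = cmd.ExecuteNonQuery();   // rows copied in this batch
        } while (copied > 0);
    }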
I need to know how I can get the dynamically created filename from the Flat File destination so I can insert it into a package audit table.
Scenario: I have created a package that successfully outputs dynamically named flat files { Format: 'C:\Test\' + 'Comms_File_' + 'User::FileNumber' + '_' + Date + '.txt'
E.g.: Comms_File_1_20150724.txt, Comms_File_2_20150724.txt etc. } using a Foreach Loop Container:
* Enumerator set to "Foreach ADO Enumerator", with the ADO object source variable selected to identify how many total loop iterations there are, i.e. let's say 4, thus 4 files to be created.
* Variable Mappings: added the User::FileNumber variable, which indicates which file number the current loop iteration is, i.e. 1, 2, 3, 4.
For the Data Flow task I have an OLE DB Source and a Flat File Destination, where the Flat File ConnectionString is set up as:
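The connection string expression itself did not come through above, but one hedged way to capture the generated name is to read it back after the Data Flow (from the variable or the connection manager) and write it to the audit table from an Execute SQL Task or a small Script Task. A sketch, where the audit table, connection names, and path are placeholders:

    // Sketch: log the file name the Flat File connection actually used for this iteration.
    using System.Data.SqlClient;
    using System.IO;

    // In a Script Task the real path could come from the connection manager:
    // string filePath = (string)Dts.Connections["FlatFileDest"].ConnectionString;
    string filePath = @"C:\Test\Comms_File_1_20150724.txt";   // placeholder

    using (var conn = new SqlConnection("Server=myserver;Database=AuditDB;Integrated Security=SSPI;")) // placeholder
    using (var cmd = new SqlCommand(
        "INSERT INTO dbo.PackageFileAudit (FileName, LoadDate) VALUES (@name, GETDATE())", conn))      // hypothetical table
    {
        cmd.Parameters.AddWithValue("@name", Path.GetFileName(filePath));
        conn.Open();
        cmd.ExecuteNonQuery();
    }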
I have all this gray space at the top of my tasks. When I <Ctrl>+<Alt>+<Left-click> to select all of my tasks and then try sliding them up, it creates even more gray space and actually moves them down. If instead I <Ctrl>+<Alt>+<Left-click> and then press <Ctrl>+<Arrow-up>, nothing moves. Is there any easy way to eliminate all this gray space at the top?
I attempted to use Move Directory to move the contents of one directory to another. I encountered the 'different volume' issue that others have experienced. While this error is frustrating, I can work past this particular issue. My more pressing question is: why is the Move Directory command overwriting a destination directory? When I set up the Move Directory file task I provided two vars to hold the src and dest locations:
dest var: \\testserver\output; src var: \\devserver\dev\testfiles; Set overwrite destination = TRUE
Why would Move Directory overwrite output folder at destination? Shouldn't it only overwrite if the testfiles directory exists at destination? This is very frustrating since I cannot find enough information in the official documentation to understand what is happening here.
Is it just me or does the documentation for Move directory seem.....incomplete?
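One workaround while the task's behaviour is unclear (a sketch only, with placeholder paths, and ignoring nested subfolders) is to do the move in a Script Task and only proceed when the destination subfolder does not already exist:

    // Sketch: move the folder contents only if the destination subfolder is absent,
    // using copy + delete to sidestep the 'different volume' limitation of Directory.Move.
    using System.IO;

    string src = @"\\devserver\dev\testfiles";       // placeholder source folder
    string dest = @"\\testserver\output\testfiles";  // placeholder destination subfolder

    if (!Directory.Exists(dest))
    {
        Directory.CreateDirectory(dest);
        foreach (string file in Directory.GetFiles(src))
            File.Copy(file, Path.Combine(dest, Path.GetFileName(file)));
        Directory.Delete(src, true);
    }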
I am using the following script to check for the existence of a table in the database and create it dynamically...
This works when the table does not exist; it errors when the table does exist...
I am using this script in an Execute SQL Task.....
[Execute SQL Task] Error: Executing the query "declare @ODSDB varchar(50) declare @SQLSTMT varcha..." failed with the following error: "There is already an object named 'addressTable' in the database.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
declare @ODSDB varchar(50)
declare @SQLSTMT varchar(max)
set @ODSDB = 'SampleDB'
begin
    set @SQLSTMT = ' IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(''' + @ODSDB + '.dbo.addressTable'') and Type=''U'')
In the first image, as can be seen, I have 2 different data sources and they are being joined using a Merge Inner Join. "Sort" is on the BusinessEntityID column of the Person table and "Sort1" is on the PersonID column of the Customer table. The merge join of these 2 results in 19,119 rows.
On the other hand, in the second image I use a single data source with a query that inner joins the same tables used in the first image (i.e. the 2 tables that were in 2 different data sources). Also, since Merge Join cannot operate without a SortKey, I have defined TerritoryID as the sort key in the advanced editor. The number of rows I get after this is 10,274. My select query was:
SELECT P.BusinessEntityID, P.PersonType, P.Title, P.FirstName, P.MiddleName, P.LastName, P.Suffix, C.TerritoryID FROM stg.Person AS P INNER JOIN stg.Customer AS C ON C.CustomerID = P.BusinessEntityID ORDER BY C.TerritoryID;
To my mind it should have been the same, as in the first case I am using a Merge Inner Join and in the second case I am using a SELECT query with an inner join. Upon drilling down I found that in the first case my sort keys are BusinessEntityID and PersonID; if I modify them to CustomerID and BusinessEntityID, as that is my join condition (in the inner join query shown above), I get the desired output. What I was wondering is: how does the sort order change the join condition?
I am copying files from one server to another and I have a specific format for all jpg files, which come in 3 formats:
filename_reg.jpg, filename_kat, filename_pag
and I want to copy the _reg files only, using a File System Task. I have already created a File System Task inside a Foreach Loop and it is copying files, but I want to copy only the _reg files.
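The quickest route may be to set the Foreach File enumerator's Files mask to *_reg.jpg so the loop only ever sees those files. If the filtering has to happen in code instead, a sketch (placeholder paths) would be:

    // Sketch: copy only the *_reg.jpg files from one server to another.
    using System.IO;

    string src = @"\\server1\images";   // placeholder
    string dest = @"\\server2\images";  // placeholder

    foreach (string file in Directory.GetFiles(src, "*_reg.jpg"))
        File.Copy(file, Path.Combine(dest, Path.GetFileName(file)), true); // overwrite if already there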
I have multiple folders in a directory and each folder contains multiple files of the same extension but with different formats (columns) and names (e.g. file a and file b). We have a data flow task in which we are joining (merging) both files and loading them into a table. Should I use a Foreach loop? But then it takes 1 file at a time, and I need the other file as well to join it in the data flow.
I need to add double quotes at the start and end of every field in all the records.
Source data:
col1 col2 col3 col4
1 abdul this is email it was very good, and very relative posts
Target data:
col1 col2 col3 col4
"1" "abdul" "this is email" "it was very good, and very relative posts"
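If the output is written by a Flat File destination, setting the connection manager's Text qualifier to " (with the columns marked as text-qualified) usually produces exactly this. Done by hand in a Script Task, a sketch (assuming a simple delimited file with no embedded delimiters; paths and delimiter are placeholders) would be:

    // Sketch: wrap every field of a delimited file in double quotes.
    using System.IO;
    using System.Linq;

    string inPath = @"C:\In\source.txt";    // placeholder
    string outPath = @"C:\Out\target.txt";  // placeholder
    char delimiter = '\t';                  // assumption: tab-delimited source

    var quotedLines = File.ReadAllLines(inPath)
        .Select(line => string.Join(delimiter.ToString(),
            line.Split(delimiter).Select(field => "\"" + field.Trim() + "\"")));

    File.WriteAllLines(outPath, quotedLines);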
I want to load these three files into three different destinations: the customer file should go to one destination table, the employee file to another destination table, and the student file to another destination table. Tomorrow, if I get some more files in the same folder, those files should also go to separate destinations. This should happen dynamically.
I have one small requirement: I want to load different types of files (.txt, .csv, .tsv, .xlsx).
Using one Foreach Loop container, how can I load the files into the database? I shouldn't use a Script Task to split the filenames. Is there any other way to load all the files using a Foreach Loop container, Execute SQL Task, etc.?
The client uses an Amazon S3 bucket which they load flat files to. They also expect files to be delivered there. So at the minute I have an SSIS package (SQL 2012) which I use to generate some files, but I then have to manually import files to the S3 bucket as well as export others. Now Mike Yin (for SQL 2008 R2) mentioned that you need to obtain the PostgreSQL ODBC driver so that you can use the .NET Providers\Odbc Data Provider for the ADO.NET Source component to connect to the Amazon cloud storage. After that, you can use an OLE DB Destination to load the data into a SQL Server database.
I installed both the 32 and 64 bit 9.03 drivers. New Connection Manager, ADO.NET - New, then drop the provider down to the ODBC Data Provider. Then what? Do I put the S3 bucket address within the connection string? Is there an example? Why do I need the PostgreSQL ODBC driver, as I'm not connecting to a database, just an S3 bucket?
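For what it's worth, the PostgreSQL ODBC route is about reading relational data; to move the files themselves into and out of the bucket, one alternative (a sketch only, assuming the AWS SDK for .NET is installed and referenced from a Script Task, with placeholder credentials, paths, region, and bucket name) is:

    // Sketch: upload and download flat files to/from an S3 bucket with the AWS SDK for .NET.
    using Amazon;
    using Amazon.S3;
    using Amazon.S3.Transfer;

    var client = new AmazonS3Client("ACCESS_KEY", "SECRET_KEY", RegionEndpoint.EUWest1); // placeholders
    var transfer = new TransferUtility(client);

    // Push a file the package generated out to the client's bucket.
    transfer.Upload(@"C:\Output\extract_20150724.txt", "client-bucket-name");

    // Pull a delivered file back down for the package to load.
    transfer.Download(@"C:\Inbound\delivery.txt", "client-bucket-name", "delivery.txt");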
We run Standard 2008 R2. I need to recreate flat files from their varbinary(max) equivalents in our db. I have a mix of Excel, PDF, Word etc. to recreate. Will SSIS be a good tool for doing this? I'm wondering what transform(s) would be involved.
Perhaps I need to cast to varchar first and then land the data, but if I recall correctly there is a maximum record length for SSIS flat file destination rows. And I'm thinking I would have to map the varbinary (or its cast equivalent) to a row in the destination once for each file created.
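SSIS can do this with the Export Column transformation (map the varbinary column to a column containing the target file path), and no varchar cast or flat-file row-length limit comes into play if the bytes are written directly. A Script Task sketch of the same idea, where the table, column, and folder names are assumptions:

    // Sketch: write each varbinary(max) document back out to disk as its original file.
    using System.Data.SqlClient;
    using System.IO;

    string connString = "Server=myserver;Database=mydb;Integrated Security=SSPI;"; // placeholder
    using (var conn = new SqlConnection(connString))
    using (var cmd = new SqlCommand(
        "SELECT FileName, FileContent FROM dbo.StoredDocuments", conn))            // hypothetical table
    {
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                string name = reader.GetString(0);
                byte[] bytes = (byte[])reader["FileContent"];
                File.WriteAllBytes(Path.Combine(@"C:\Restored", name), bytes);     // placeholder folder
            }
        }
    }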
We have a few customers dropping files in Amazon S3. How do we load this data into a SQL Server 2008 R2 database using SSIS? We are in a 2008 R2 BIDS environment.
I currently have a directory of csv import files, all of which have the same data structure but different header information.
For example:
File 1:
This is header info.
This is header info.
This is header info.
ID, Name, DOB, etc…
File 2:
This is header info.
This is header info.
This is header info.
This is header info.
This is header info.
ID, Name, DOB, etc…
The data starts with the column title row, i.e. ID, Name, DOB. What I need is a process that removes all the header rows above the title row so that all import file structures will be the same.
I was thinking of using a ForEach Loop container that will run a script on each of the files to remove the header.
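That plan should work; the per-file script could be as small as this sketch (assuming the title row always starts with "ID," and that rewriting each file in place is acceptable):

    // Sketch: drop every header line above the column title row so all files match.
    using System;
    using System.IO;
    using System.Linq;

    string path = @"C:\Imports\file1.csv";   // supplied by the Foreach loop at run time
    string[] lines = File.ReadAllLines(path);

    int titleIndex = Array.FindIndex(lines, l => l.StartsWith("ID,", StringComparison.OrdinalIgnoreCase));
    if (titleIndex > 0)
        File.WriteAllLines(path, lines.Skip(titleIndex));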
I have several regular reports that are produced by different offices that I need to import the data from. The challenge lies in the fact that the reports are not simply columns of data. Some cells are labels, others contain data, and some contain both. Also, the formatting of the reports isn't strictly uniform from office to office. Is it possible to read this kind of sheet? Excel data sources seem to just define everything in a column as data, and that doesn't work for me. Is there an alternative, or perhaps a more manual way of defining what cells contain data?
We are building a data load application where parameters are stored in a table, and there are multiple packages for each load. There is an IsChecked column; only if it is 1 should the child package execute. I created a master package in which I have an Execute SQL Task that stores the result in a variable, and based on that result the child package should execute. In the Execute SQL Task I selected the result set as a full result set. I am getting the error below.
[Execute SQL Task] Error: Executing the query "SELECT isnull(ID ,0) AS ID FROM DataLoadParameter..." failed with the following error: "The type of the value (DBNull) being assigned to variable "User::LoadValue" differs from the current variable type (Int32). Variables may not change type during execution. Variable types are strict, except for variables of type Object.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
I have implemented a package to load multiple files to a destination. Since the source was a txt file, I created it as a flat file source. However, now we are getting files in Excel format as well.
Is there any way the source can be changed dynamically based on the file extension output by the Foreach file enumerator? I can think of one solution: have 2 data flow tasks based on precedence constraints with an expression, one for .txt and the other for .xls.
I have to load into SS2012 hundreds of Excel files produced by an application over the last five years; over time a few columns have been added to the initial set. I created on SS2012 a table that matches the full set of columns and want to load all the files into the table, leaving the missing cells NULL. I think SSIS can do the job but every trial has failed so far.
Every day an application creates new tables and dumps static info into them.
I would like to create a package to dynamically export those database tables to raw files for long term archive, one file per table. Here is what I have so far and the issue I am having.
1) Get a list of un-archived tables.
2) For each table, do the following:
   a. Export the table into a raw file.
   b. Zip the raw file.
   c. Update the archive tracking table.
As long as the metadata for each table is the same, this package seems to work fine. However, I have many tables with different metadata. How can I dynamically get the package to update the external metadata column collection when it hits a new table? When it hits a table with different metadata I am getting warnings like this:
The column "some_column" needs to be added to the external metadata column collection.
The "external metadata column "someother_column" (103)" needs to be removed from the external metadata column collection.
Then I get this error: Error: 0xC004706B at dump the table into a raw file, DTS.Pipeline: "component "OLE DB Source" (1)" failed validation and returned validation status "VS_NEEDSNEWMETADATA"