2014年7月31日星期四

Importing Excel Data into Access with esProc

In daily work we have frequent use of text data or spreadsheets, we need to import these data into database for further statistical analytics. For this task, esProc is a very handy tool.
In the following example, we will import Excel data into an Access database, to demonstrate how to migrate text data into database with esProc.
In the directory “D:\files\BoxOffice” we stored some box office data for movies. They are Excel files with the extension of “xlsx”. The first row of the Excel files contains field names, such as:
Now we need to store each filesin Access database, for better data analytics.
Within esProc we need to establish a data connection for Access first, either through system ODBC data source or direct use of mdb or accdb files:
 
Connect to Access data source, and then we need to find the list of files to be imported:
 

Once we get the data file list, we can then run a loop to read the data from each Excel file and import them to Access:

 

When processing each file, we must first read the datasheet, and use its file name (without file extension) as the name for tables in Access database. In order to make the data and table names more in line with the standards for database, spaces in table name will be replaced by "_", and possible blank data records will be deleted. Then, according to the structure and content of data tables, table creation statements will be generated in B5, and data update statements will be generated in C5. When generating the statements, field names and data in the first row must be read, and determine the data types:


Line 17 executes table creation statements. At this point, we must consider the situation where a table with the same name could already exists in the database. If this is true, A1.rollback () must be executed to do a rollback.
Line 19 imports the data from datasheets to Access database.
When the loop is done, we need to close the data source to avoid the existence of too many connections:

From this example, we can learn how to meet complicated requirements in esProc with simple codes. In particular, the same approach works not only for Excel files and Access database, but also for txt/xml files and various databases.
There are also some other approaches to import text data into the database. For example, the most primitive method is manual data input, which is obviously time consuming, laborious, boring and error prone. Programmers often solve these problems by coding. However, coding with common high-level languages (Java, C#) or scripting languages (Perl, Python) means a lot of workload, and is quite difficult to complete. Of course, Excel and Access are all Microsoft Office products. You can import Excel data directly from inside Access, with Excel as an external source. However, every time you can only import a single file, which makes it too troublesome when there are too many files to be imported. Also, this only works for Access database. In contrast, when you need to import batch text data into database, esProc is a nice tool.

没有评论:

发表评论