N.B This project is in its alpha phase
Shif
<TODO : make sure to later add how milestone inforamtion is organised, if there is code then add a section on File structure and Usage instructions and Install Requirments>
Background
This milestone mainly deals with the installation and setup of all the necessary software and extension modules needed for the project. The outline and description of each of the services and packages used are provided below.
SQL Server is a relational database management system (RDBMS). Its primary function is storing and retrieving data requested by applications. These applications may run on the same computer or another computer across a network or the internet.
The Adventure Works sample is an OLTP (online transaction processing) database. The AdventureWorks database supports a fictitious, multinational manufacturing company called Adventure Works Cycles.
link to DB -> (https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks)
On the other hand, Visual Studio is an integrated development enviroment (IDE), a tool used to develop computer programmes, web apps, web services and mobile apps. Visual Studio comes with various workloads, a bundle of tools that aid in further productive development.
Then, SQL Server Data Tools (SSDT) is a modern development tool for building SQL Server relational databases, Azure SQL Databases, Analysis Services data models, Integration Services packages and Reporting Services reports.
Finally, SSIS, SSAS and SSRS are incorporated as part of SQL Server Data Tools. This is present as the extensions for Visual Studio, the extensions are SQL Server Integration Services Projects (SSIS), then Microsoft Reporting Services Projects (SSRS) and then finally, Microsoft Analysis Services Projects (SSAS).
Outline
In this milestone, we first setup the SQL Server, then the we download the sample database AdventureWorks dataset version 2016. The SQL Server (Developer version) 2019 is then made to connect with SQL Server Management Studio 2019. Then, the database is restored as shown below.
Next, we install Visual Studio 2019 (the Community edition). Three workloads need to be chosen, the ASP.Net Core web development workload, the .NET Framework for desktop development and cloud engineering and the Data storage and processing. Once setup, the apppropriate SSIS, SSRS and SSAS extensions need to be installed.
Background
This milestone focuses on developing an ETL pipeline using SSIS.
The SQL Server Integration Services (SSIS) is a platform for building enterprise level data integration and data transformation solutions. With SSIS, complex business problems can be solved copying and downloading files, sending email messages in response to events. Can be used to updating data warehouses, cleaning and mining data, extract, transform data from various sources and also managing SQL Server Objects and data.
ETL stands for extract, transform and load. It is a process that sets a data pipeline. First, we collect data from various soruces. Second, we transform the data according to business rules (e.g. filtering, sorting, aggregating, joining, cleaning, de-duplication and/or validating) and thirdly we load data into a destination data store or a data warehouse.
Data can be extracted data homogenous (one source) or hetergenous (several) sources. Transformation process is heavily influenced by business requirements, e.g. the right type and format of data needs to be ensured.
Outline
In this milestone we setup our ETL pipeline. First, we download source source tables as part of AdventureWorks lightweight tables (i.e. AdventureWorks LT) version 2019. This can be obtained from https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks. We download the AdventureWorksLT2019.bak and restore this into our SSMS as a seperate table. This table acts as the source table.
Secondly, we set te destination database, we create a new database. We give it the name CustomerSampling. This table will have the same schema as the source database table SalesLT.Customer hence the primary keys, the columns and the column types will mimic exactly the ones for the source so in order to prevent issues when transferring from source to destination. The script for generating the destination table is in the file Create_Customer_Sampling_DB. The process is outline below.
Thirdly, in order to transfer the database, we are going to develop our own ETL pipeline. We first setup a new SSIS project called SSIS Amazon ETL Pipeline. Next, we configure the source and destination for the SSIS project. Setup a dataflow, the source and the destination. Ensure that the Microsoft NT Integrated Scurity is selected for a robust authentication when setting the Source Assistant. We then ensure if the connection is successfull as shown below. After that we should have both the source and destination assistant setup with their tables loaded. Next we perform the appropriate transformations.
Firstly, we then setup a simple transformation process using the Derived Columns toolbox. The transformation involves setting the CompanyName column to have uppercased characters. In the properties, we set the UPPER expression from the String Functions, add the company name in the expression and set the mapping for the Destination table assistant from CompanyNameUpper to CompanyName as shown below.
Once the pipeline has been set and run, the Destination table should update as expected. We can see from below that all of the entries for the CompanyName column have been uppercased. This concludes the steps necessary for finishing Milestone 2.
Once the migration
With the success
Having setu
The migration is completed, it is time that business presents
- In the scenerio if VS 2019 is unable to install Microsoft ASP45.Net, simply enable updates and modules installer in the Services manager.
Copyright @ MIT. All rights reserved.
Licensed under the MIT license.
Mahed Javed - ksfmahed@outlook.com
For additional questions, contact me at the email provided below.
- GitHub: mahedjaved
- Email: mahed95@gmail.com






