Tuesday, May 14, 2013

ETL Project Life Cycle





Business Requirement Collection :- 
-> The business requirement gathering start by business Analyst, onsite technical lead and client business users.
-> In this phase,a Business Analyst prepares Business Requirement Document ( BRD ) (or) Business Requirement Specifications ( BRS )
-> BR collection takes place at client location.
-> The o/p from BR Analysis are ->[ BRS :- Business Analyst will gather the Business Requirement and document in BRS, SRS :- Senior technical people (or) ETL architect will prepare the SRS which contains s/w and h/w requirements.]
The SRS will includes
a) O/S to be used (  unix )
b) RDBMS required to build database ( oracle, Teradata etc )
c) ETL tools required ( Informatica,Datastage )
d) OLAP tools required ( Cognos ,BO )
The SRS is also called as Technical Requirement Specifications ( TRS )
Designing and Planning the solutions :-
------------------------------------------------
-> The o/p from design and planning phase is
a) HLD ( High Level Design ) Document
b)LLD ( Low Level Design ) Document
HLD ( High Level Design ) Document : -
An ETL Architect and DWH Architect participate in designing a solution to build a DWH.
An HLD document is prepared based on Business Requirement.
LLD ( Low Level Design ) Document : -

Based on HLD,a senior ETL developer prepare Low Level Design Document
The LLD contains more technical details of an ETL System.
An LLD contains data flow diagram ( DFD ), details of source and targets of each mapping.
An LLD also contains information about full and incremental load.
After LLD/specs  then Development Phase will start
Development Phase ( Coding ) :-
--------------------------------------------------
-> Based on LLD/, the ETL team will create mapping ( ETL Code )
-> After designing the mappings, the code ( Mappings ) will be reviewed by developers.
Code Review :-
-> Code Review will be done by developer.
-> In code review,the developer will review the code and the logic but not the data.
-> The following activities takes place in code review
-> You have to check the naming standards of transformation,mappings of data etc.
-> Source and target mapping ( Placed the correct logic or not in mapping )
Peer Review :-
-> The code will reviewed by your team member ( third party developer )
Testing:-
--------------------------------
The following various types testing carried out in testing environment.
1) Unit Testing
2) Development Integration Testing
3) System Integration Testing
4) User Acceptance Testing
Unit Testing :-
-> A unit test for the DWH is a white Box testing,It should check the ETL procedure and Mappings.
-> The following are the test cases can be executed by an ETL developer.
1) Verify data loss
2) No.of records in the source and target
3) Dataload/Insert
4) Dataload/Update
5) Incremental load
6) Data accuracy
7) verify Naming standards.
8) Verify column Mapping
-> The Unit Test will be carried by ETL developer in development phase.
-> ETL developer has to do the data validations also in this phase.
Development Integration Testing -
-> Run all the mappings in the sequence order.
-> First Run the source to stage mappings.
-> Then run the mappings related to dimensions and facts.
System Integration Testing :-
-> After development phase,we have to move our code to QA environment.
-> In this environment,we are giving read-only permission to testing people.
-> They will test all the workflows.
-> And they will test our code according to their standards.
User Acceptance Testing ( UAT ) :-
-> This test is carried out in the presence of client side technical users to verify the data migration from source to destination.
Production Environment :-
-> Migrate the code into the Go-Live environment from test environment ( QA Environment ).





Development activities:

New Enhancements will be discussed with onsite coordinator and with the internal team once the requirement is confirmed by the business/client we will start with the impact analysis on the existing system/process and will identify the stream which are getting impacted. Once analysis is done we will come up with efforts estimation that will be submitted to business/cleint then we will start with design and coding 

Unit testing will be done each individual who developed the code. We will take backup of the existing job(master) and will modify the  changes on the backed up job. Once unit testing is completed that will be shared and will go for multiple reviews/ comments / approvals

Once ur tesing done will replace the back up job with the original job( we will take back up of the existing master job again)

We will consolidate all the jobs from each individual and plan for sequence level coding. 

UAT/Intergration testing:

We dont have admin right to QA. To support QA environment we have third party vendor(cognizant). A new change request(CR) in internal tool(remedy ticket) will be raised for each deployment in QA. 
.dsx will be placed in a shared drive mentioning all the job names path and the changes modified to the particular job and jobs moved after validation of jobs are done.

We have separate  team from china who will be providing the data for QA team. We will be monitoring(Director) the jobs and get notified if any aborts/issues.

Data quality Issues will be raised thru remedy for the existing test/qa process. Developers will take it up accordingly and resolve the issues.

Once testing team give sign off we will plan for production deployment. Before deploying we will go for the documentation like modifying the existing design document, providing test cases and test results, check lists( all validations in dbs,naming conventions of jobs,existing paths present or not etc).

Once above docs are attached to the production request (CR)  we will undergo approvals based up on hierarchy. 

After approvals deployment will be planned.  it will be scheduled on weekend. 





1 comment:

KITS Technologies said...

Would you be interested in trading links or maybe guest writing a blog post or vice-versa?
P0wer bi onlinetraining from india
P0wer bi onlinetraining