By default, the Row Generator stage runs sequentially, generating data in a single partition. You can, however, configure it to run in parallel and to generate meaningful data.
a) Job Design :
b) Row Generator Stage :
- Double-click the stage.
- On the Properties tab, set the number of rows you want to generate (here I entered 50).
- Now click the Columns tab and define the columns you need in the output file: column name, type, length, etc.
-- Now comes our magic step :-) We will edit the column metadata.
-- When you double-click column 1 (Name), the column metadata window opens. As you can see, we can edit a lot of a column's metadata here.
-- We want to generate some meaningful data, so we use the Generator properties here.
Here we set the Algorithm (of data generation) to Cycle, which repeats the supplied values from start to end, and we passed in some values (names).
-- We follow the same steps for the second column (Salary).
- Then click OK; we are done with the Row Generator stage.
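To make the Cycle behaviour concrete, here is a minimal Python sketch of the idea (it is not DataStage code, and not how the stage is implemented internally). Each column walks through its own value list and starts over when it reaches the end. The function name `generate_rows` and the sample names and salaries are hypothetical; the values actually entered in the job are not reproduced here.

```python
from itertools import cycle

def generate_rows(num_rows, column_values):
    """Yield num_rows dicts, cycling through each column's value list independently."""
    cyclers = {name: cycle(values) for name, values in column_values.items()}
    for _ in range(num_rows):
        yield {name: next(c) for name, c in cyclers.items()}

# Hypothetical sample values, for illustration only.
columns = {
    "Name": ["Alice", "Bob", "Carol"],
    "Salary": [1000, 2000, 3000, 4000],
}

# 50 rows, matching the row count set on the Properties tab.
rows = list(generate_rows(50, columns))
```

Because the two lists have different lengths in this sketch, the Name and Salary cycles drift against each other, so the same name is not always paired with the same salary.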
c) Sequential File Stage :
- Define the Sequential File stage properties here, such as the output file location, delimiter, column names, and quoting in the output file.
- Leave all the other tabs as they are.
- Now save the job design, compile, and run.
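For readers who want to see what these file settings amount to, the following Python sketch writes a couple of rows as delimited text with a header line and quoted strings. It only illustrates the delimiter, column-name, and quote options; it is not what the Sequential File stage runs. The helper name `write_delimited`, the default settings, and the sample rows are assumptions made for this example.

```python
import csv
import io

def write_delimited(out, rows, columns, delimiter=",", quotechar='"', header=True):
    """Write rows (dicts) to the file-like object out as delimited text."""
    writer = csv.writer(
        out,
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_NONNUMERIC,  # quote strings, leave numbers bare
        lineterminator="\n",
    )
    if header:
        writer.writerow(columns)  # first line carries the column names
    for row in rows:
        writer.writerow([row[c] for c in columns])

# Hypothetical sample rows, for illustration only.
sample = [
    {"Name": "Alice", "Salary": 1000},
    {"Name": "Bob", "Salary": 2000},
]

buf = io.StringIO()  # stands in for the output file
write_delimited(buf, sample, ["Name", "Salary"])
text = buf.getvalue()
```

To write to a real file instead, open it with `open(path, "w", newline="")` and pass the handle in place of `buf`.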
The output file will look like this:
Here, as we can see, the Name values repeat in a cycle again and again, and the Salary column follows the same pattern. So now we have some meaningful data for our dummy job test.