The Join block enables you to combine observations from two datasets into a single working dataset.
The following example shows how to use the Join block to link information in two input datasets:
- lib_books.csv, which contains observations that describe a range of books available from a lending library.
- ddn_subjects.csv, which contains observations that link each Dewey Decimal Number to a subject description.
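Both datasets contain the Dewey Decimal Number. The variable is named Dewey_Decimal_Number in lib_books.csv and DDN in ddn_subjects.csv. To join the datasets on this variable: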
- Import the datasets lib_books.csv and ddn_subjects.csv into a Workflow using a Text File Import block for each dataset.
- Right-click the lib_books.csv dataset output, click Rename and enter Lib Books.
- Right-click the ddn_subjects.csv dataset output, click Rename and enter Book Subjects.
- Expand the Data Preparation group in the Workflow palette, then click and drag a Join block onto the Workflow canvas.
- Click the Output port of the Lib Books dataset block and drag a connection towards the Input port of the Join block. Repeat for the Book Subjects dataset.
- Double-click the Join block to display the Join Editor view.
The view displays a table for each dataset, with each table containing the dataset's variable names.
- From the Lib Books table, click the variable Dewey_Decimal_Number and, holding the left mouse button down, drag across to the DDN variable in the Book Subjects table, then release the left mouse button.
A connection is drawn between Dewey_Decimal_Number in Lib Books and DDN in Book Subjects.
- Close the Join Editor view and save the configuration when prompted.
A green execution status is displayed on the Output ports of the Join block and of the new Working Dataset. The Working Dataset contains variables from both input datasets, with observations matched on the Dewey_Decimal_Number and DDN variables.
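
For readers who want to check the result outside the Workflow, the matching performed by the steps above can be sketched in Python. This is only an illustration and not how the Join block is implemented. It assumes an inner join (only observations with a match in both datasets are kept), which may not be the Join block's default. The sample rows, the Title and Subject variables, and the helper functions read_rows and inner_join are hypothetical. Only the names Dewey_Decimal_Number and DDN come from the example.

```python
import csv
import io

# Hypothetical sample rows standing in for lib_books.csv and ddn_subjects.csv.
# The real files will contain different observations and further variables.
LIB_BOOKS_CSV = """Title,Dewey_Decimal_Number
Book A,510
Book B,520
Book C,999
"""

DDN_SUBJECTS_CSV = """DDN,Subject
510,Mathematics
520,Astronomy
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per observation."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left_rows, right_rows, left_key, right_key):
    """Return the rows of left_rows matched to right_rows on the given keys.

    Assumption: inner-join behaviour, so unmatched rows are dropped.
    """
    # Index the right-hand rows by their key value for fast lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[left_key], []):
            # Combine the variables from both rows into one observation.
            joined.append({**left, **right})
    return joined


lib_books = read_rows(LIB_BOOKS_CSV)
book_subjects = read_rows(DDN_SUBJECTS_CSV)

joined = inner_join(lib_books, book_subjects, "Dewey_Decimal_Number", "DDN")
for row in joined:
    print(row)
```

In this sketch the observation for Book C is dropped because its Dewey_Decimal_Number value has no matching DDN value. Whether the Join block keeps or drops such observations depends on the join type configured in the Join Editor view.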