Teradata warehouse introduction
A single Teradata system can support a maximum of 16,384 vprocs (virtual processors). The maximum number of vprocs per node can be as high as 128, but is typically between 6 and 12.
The AMP is the heart of the Teradata Database. The AMP is a vproc that controls the management of the Teradata Database and the disk subsystem, with each AMP being assigned to a vdisk.
Vprocs provide the parallel environment that enables the Teradata Database to run on SMP and MPP systems. Vprocs come in two types:
• Access Module Processors (AMPs): each AMP manages a portion of the disk.
• Parsing Engines (PEs): each PE decomposes SQL into steps and returns the answer; it consists of a parser, an optimizer and a dispatcher.
Queries are scattered over the BYNET to the nodes of a CLIQUE (a set of fully inter-linked nodes and disk arrays). Each AMP receives its instructions and returns data (3 steps, the first being LOCK for serialization and consistency).
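The scatter-and-gather flow described above can be sketched as a toy model. This is only an illustration under stated assumptions, not how Teradata is implemented: the names `NUM_AMPS`, `distribute` and `parsing_engine` are hypothetical, and a simple modulo on an integer key stands in for the real row-placement logic.

```python
# Toy model of a PE scattering one request to several AMPs.
# Everything here is hypothetical and simplified for illustration.

NUM_AMPS = 4  # assumed small number; real systems have many more AMPs


def distribute(rows, num_amps=NUM_AMPS):
    """Give each AMP its own portion of the rows (modulo on the key)."""
    amps = [[] for _ in range(num_amps)]
    for row in rows:
        amps[row["id"] % num_amps].append(row)
    return amps


def parsing_engine(amps, predicate):
    """Send the same step to every AMP, then merge the partial answers."""
    # Each AMP scans only the rows it owns.
    partials = [[r for r in amp_rows if predicate(r)] for amp_rows in amps]
    # The partial results are gathered into one answer set.
    merged = [r for part in partials for r in part]
    return sorted(merged, key=lambda r: r["id"])


if __name__ == "__main__":
    rows = [{"id": i, "amount": i * 10} for i in range(1, 9)]
    amps = distribute(rows)
    print(parsing_engine(amps, lambda r: r["amount"] > 50))
```

The point of the sketch is that no single AMP sees the whole table: each works on its own portion in parallel, and only the merged answer goes back to the requester.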
************************************************
Teradata Training Program
Total Training Duration: 3 PD (24 Hours)
- Basics (2 Hours)
- Architecture (3 Hours)
- SQL (4 Hours)
- ETL Utilities (3 Hours)
- Case Study (12 Hours)
Basics:
- Basic Architecture of Teradata
- What is unique about Teradata SQL, and the different functions available in Teradata.
- The different utilities in Teradata (FastLoad, FastExport, MultiLoad, TPump, Teradata Warehouse Miner).
- Why Teradata is well suited to data warehouse applications.
Architecture:
Hardware components:
The hardware that supports the Teradata RDBMS software is based on off-the-shelf Symmetric Multiprocessing (SMP) technology. The hardware components are:
• SMP Systems
• MPP Systems
• BYNET
• VPROCS
- Parsing Engine (PE)
- Access Module Processor (AMP)
Software Architecture:
• Parallel Database Extensions (PDE)
• Teradata File System
• Trusted Parallel Applications (TPA)
• Virtual Processors - PE and AMP
• PERM Space
• SPOOL Space
SQL:
- Teradata DDL (CREATE, DROP, ALTER).
- Teradata DML (SELECT, INSERT, UPDATE, DELETE).
- Teradata DCL (GRANT, REVOKE).
- Teradata extensions to SQL such as HELP, SHOW, EXPLAIN, CREATE MACRO, REPLACE MACRO.
- Primary index / secondary index.
- Set operators (UNION, INTERSECT, EXCEPT, MINUS).
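As a rough illustration of the set operators listed above, the sketch below runs UNION, INTERSECT and EXCEPT against an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in here, so Teradata syntax and behaviour may differ; the tables `store_a` and `store_b` are hypothetical, and MINUS is left out because SQLite does not accept it.

```python
import sqlite3


def run_set_operators():
    """Return the result of each set operator on two small tables."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables: customers seen in two different stores.
    conn.executescript("""
        CREATE TABLE store_a (customer_id INTEGER);
        CREATE TABLE store_b (customer_id INTEGER);
        INSERT INTO store_a VALUES (1), (2), (3);
        INSERT INTO store_b VALUES (2), (3), (4);
    """)
    results = {}
    for op in ("UNION", "INTERSECT", "EXCEPT"):
        sql = (
            "SELECT customer_id FROM store_a "
            f"{op} "
            "SELECT customer_id FROM store_b "
            "ORDER BY customer_id"
        )
        results[op] = [row[0] for row in conn.execute(sql)]
    conn.close()
    return results


if __name__ == "__main__":
    for op, ids in run_set_operators().items():
        print(op, ids)
```

UNION keeps every distinct customer from either table, INTERSECT keeps only those present in both, and EXCEPT keeps those in the first table that are absent from the second.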
ETL Utilities:
FastLoad: This utility loads data into new, empty tables in a Teradata database.
FastExport: This utility exports a large volume of data from Teradata to a host file or a user-written application.
MultiLoad: This utility supports INSERT, UPDATE, DELETE and UPSERT operations, typically with batch input from a host file.
TPump: This utility allows real-time updates from transactional systems into the warehouse.
Case Study:
Administration:
- To create a user and a database, and understand the hierarchy of users and databases.
- To create a table with columns of data types INTEGER, CHAR, DECIMAL and DATE, and with one primary index.
- To drop a table in a database or a user.
SQL:
- To use the SELECT statement with ORDER BY and GROUP BY clauses.
- To use inner join, left outer join, right outer join and full outer join on two tables.
- To use CASE, COALESCE, NULLIF, IFNULL.
- To use set operators (UNION, INTERSECT, EXCEPT).
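The kind of query these exercises build towards can be sketched as follows. This is a minimal example under stated assumptions: it uses SQLite through Python's `sqlite3` module rather than Teradata, the `customer` and `trans` tables and their data are hypothetical, and it sticks to CASE, COALESCE and NULLIF together with a left outer join, GROUP BY and ORDER BY.

```python
import sqlite3

QUERY = """
SELECT c.name,
       -- NULLIF turns an empty phone into NULL; COALESCE then supplies a default
       COALESCE(NULLIF(c.phone, ''), 'unknown') AS phone,
       COUNT(t.trans_id)                        AS n_trans,
       COALESCE(SUM(t.amount), 0)               AS total,
       CASE WHEN COALESCE(SUM(t.amount), 0) >= 100
            THEN 'high' ELSE 'low' END          AS tier
FROM customer c
LEFT OUTER JOIN trans t ON t.cust_id = c.cust_id
GROUP BY c.cust_id, c.name, c.phone
ORDER BY c.name
"""


def customer_summary():
    """Summarise transactions per customer, keeping customers with none."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (cust_id INTEGER, name TEXT, phone TEXT);
        CREATE TABLE trans (trans_id INTEGER, cust_id INTEGER, amount INTEGER);
        INSERT INTO customer VALUES (1, 'Ann', '555-0100'), (2, 'Bob', ''), (3, 'Cy', NULL);
        INSERT INTO trans VALUES (10, 1, 60), (11, 1, 50), (12, 2, 20);
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in customer_summary():
        print(row)
```

The left outer join keeps the customer with no transactions in the result, which is why the total is wrapped in COALESCE.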
Fast Load:
- To create an empty table in Teradata and prepare a flat file containing data. Prepare a script for loading the data from the flat file into the empty table.
Fast Export:
- To use FastExport to create an export file that contains one record for each transaction. Columns from two different tables have to be joined in order to create the export file.
MultiLoad:
- To use MultiLoad to delete rows from your three tables. An input file will be created which will contain a control letter (A - Accounts, C - Customer, T - Trans) followed by a primary index value for the appropriate table.
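The input file for this exercise can be pictured with a short sketch. The exact record layout is not given here, so the one below is an assumption: one record per line, a single control letter immediately followed by an integer primary index value. The names `TABLE_FOR_LETTER`, `build_records` and `route` are hypothetical helpers, not part of MultiLoad.

```python
# Hypothetical layout: "<control letter><primary index value>", one per line.
TABLE_FOR_LETTER = {"A": "accounts", "C": "customer", "T": "trans"}


def build_records(keys_by_letter):
    """Build input lines such as 'C1001' from {'C': [1001], ...}."""
    lines = []
    for letter, keys in keys_by_letter.items():
        if letter not in TABLE_FOR_LETTER:
            raise ValueError(f"unknown control letter: {letter!r}")
        for key in keys:
            lines.append(f"{letter}{key}")
    return lines


def route(lines):
    """Group the keys to delete by target table, using the control letter."""
    deletes = {table: [] for table in TABLE_FOR_LETTER.values()}
    for line in lines:
        letter, key = line[0], line[1:]
        deletes[TABLE_FOR_LETTER[letter]].append(int(key))
    return deletes


if __name__ == "__main__":
    records = build_records({"A": [20024001], "C": [1001, 1002], "T": [77]})
    print(records)
    print(route(records))
```

In the real exercise the routing is done by the MultiLoad script itself, which applies a different DELETE depending on the control letter; the sketch only shows the idea.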
TPump:
- Use FastExport to export the records into a flat file. Prepare a TPump script which performs an UPSERT operation on another table. Validate your results.
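To make the UPSERT step concrete, here is a small sketch of the semantics only: update the row when its key already exists, insert it otherwise. It is not a TPump script. The pipe-delimited file layout, the in-memory dictionary standing in for the target table, and the helper names `read_flat_file` and `upsert` are all assumptions made for illustration.

```python
import csv
import io


def read_flat_file(handle):
    """Read 'key|value' records from a pipe-delimited flat file."""
    reader = csv.reader(handle, delimiter="|")
    return [(int(row[0]), row[1]) for row in reader if row]


def upsert(target, records):
    """Apply records to target; return (rows updated, rows inserted)."""
    updated = inserted = 0
    for key, value in records:
        if key in target:
            updated += 1   # key exists: behaves like an UPDATE
        else:
            inserted += 1  # key missing: behaves like an INSERT
        target[key] = value
    return updated, inserted


if __name__ == "__main__":
    table = {1: "old", 2: "keep"}
    exported = io.StringIO("1|new\n3|added\n")
    print(upsert(table, read_flat_file(exported)))
    print(table)
```

Counting updates and inserts separately gives a simple way to validate the results against the number of exported records.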