Watch videos with subtitles in your language, upload your videos, create your own subtitles! Click here to learn more on "how to Dotsub"

Data Warehouse and Business Intelligence_ Systems Architecture and OLTP vs. OLAP

0 (0 Likes / 0 Dislikes)
This is Minder Chen; I'm a professor of Management Information Systems at Martin Business, School of Business and Economics at CSU Channel Island. In this lecture of data warehousing and business intelligence, I'm going to talk about the architecture of data warehouse and business intelligence and compare that with online transaction processing systems database Up first let's look at where business intelligence can be used. BI can be used to improve operational efficiency by tracking, reporting the, our KPI, on the product profitability, and analyzing the risk in operation, and we did mention like produce a lot of KPI for balanced scorecard. On the customer interaction side, we use business intelligence quite a bit to conduct sales analysis, sales forecasts, segmentation of the market, try to come up with cross-selling products, apply analytics to the CRN database, and analyzing our marketing campaign, and analyzing profitability across various customer segmentation. So we do ask a lot of what and why questions or even how. Some belong to the online transaction processing application, like the traditional database. But some belong to data warehousing. Business intelligence relates to the application If you ask questions about specific order, specific product items, specific customers, in terms of product inventory levels for specific products, status of a customer's order, and customer's payment record and et cetera, that's online transaction processing system area. A broader more aggregated summarize deeper analysis related question belonged to data warehousing business intelligence. For instance: what factors affect order processing time, and how did various product line or category contribute to profit last quarter, and which product has the lowest gross margin, et cetera, that belongs to business intelligence arena. This diagram shows, showed us the architecture for data warehouse and business intelligence; and kind of the process [of] how the row data migrate to the data warehouse and eventually be used by the end user for analytical purposes. Step 1: Over here in this diagram is the source, data source we have multiple sources for data: it could be a traditional transaction processing system, it could be flat file and excel workbook, can all be sources for the data which needs to be integrated and consolidated and put into a specialized database which we call data warehouse. So data warehouse conceptually is still a database but sometimes we do use specialized database technology to handle data stores in the so-called data warehouse. [We] get the data into the data warehouse it's from multiple sources, and requires a lot of effort, and the process is called ETL (Extract, Transform, and Load). You extract data from multiple sources, you will perform some kind of transformation to clean up the data, converting - for instance - the currency to a standard currency such as the U.S. dollar, and eventually in batch mail you would load the data into the data warehouse. So the data warehouse is enterprise wide in nature, but sometimes - let's say the finance department or the registrar's office in our university - needs to look at enrollment data and need to look at student payments. We may actually create a subset of the data from our data warehouse into something called data mart. So think of data marts just like a subset of a data warehouse. And we still use kind of an ETL tool (extract, transform and loading) tool to create this data mart. And from the data mart, or the data warehouse, we can create something called a cube, data cube, OLAP cube. Online Analytical Processing Cube. The cube is kind of conceptually a multidimensional data set that's derived from the data warehouse and, which are ready to be used by the front end tool. The front end tool includes simple query tools, or reporting tools, analytical analysis tools, or even data mining tools. And if you remember we talked about excel and we mentioned pivot table, pivot table is an example of an analytical tool that's feeding into excel. We can extract data from the data warehouse into excel's pivot tables which allow us to analyze it in a multidimensional way. So this is, this architecture gives you a kind of overview of what constitutes of the environment, the software/hardware environment for data warehousing business intelligence application. The reason we would want to feed a data warehouse for business intelligence application, is that the online transection - that should be - online transaction processing the online transaction processing application's database are usually fully normalized. Database normalization is one of the techniques in database design. It's particularly applicable for online transaction processing applications database. When you normalize a database you usually you have much smaller table, for instance, instead of grouping order into a single table we have order, and order detail Instead of combining category data into products we have a category table which is separate from the products. And this allows us to process transactions very efficiently if you're changing say the product's pricing, there's only one place in the product table which you need to worry about. However, when we try to use a fully normalized database allow time for reporting purposes, for analytical purpose. A lot of times we need to join multiple tables which can be very expansive and slow the process. In one of our assignments for database, for instance, asks you to to come up with total sales, annual sales, for each of the products. You actually need to go through all order details and order tables and join that with the product table in order to get all the information about, for instance, order date (you'll probably need that), you need the unit price, you need a quantity, and you probably need a product name in order to come up with the query results. So the minimum, in this case, we're joining three tables together. And if you have a thousand products you may have a hundred thousand order details in the year and then the query can be very expansive which will slow you down in terms of your operational side of transactions. In transaction processing systems people who place an order expect to get a kind of confirmation for their order to go through in two seconds, no more than two seconds. But if you conduct such an expansive query like the one we just mentioned, you may slow things down dramatically, and that's one reason we want to move the data from this normalized database design to, to some extent, a denormalized and aggregated specialized database called a data warehouse. And once we denormalize it actually the data will be kind of aggregated, combined together which makes it very easy for end users to understand what, what is in there. Later I'll show you an example of a simple design of a data warehouse and you'll understand the simplicity of the data model for a data warehouse. In this slide we have a table comparing the differences between online transaction processing systems and online analytical processing systems. An online transaction processing system is basically for handling business transactions and recording data generated during the business event. Online analytical processing systems basically rely on data warehouse and some of the data analysis tools to generate business intelligence. Let's look at one attribute at a time. The source data comes from operational data for OLTP. And for OLAP - data warehouse - its consolidated data and that data comes from a variety of OLTP databases. The purpose of data for transaction processing systems is to run and control fundamental business tasks. And for data warehouse is to help with planning, problem solving, and decision support. The data in transaction processing system reveals a snapsot about ongoing business activities or processes. For data warehouse, it's a multidimensional view of different kinds of business activities. For inserts and updates in transaction processing systems we do handle the insert, update, and delete of the data we have in the database usually they are initiated by the end-user. For data warehouse we rely on periodical, long running batch jobs to refresh the data we have in the data warehouse. Query: a typical transactional processing system are simpler and usually they return relatively few records. Queries against a data warehouse often are quite complex and involve a lot of joined operations and aggregation. I mean applying aggregation functions in the query. Processing speed for operational systems need to be really fast. I would say that the response time we expect something to be less than two seconds. For data warehouse operations it depends on the volume of the data involved. It tends to be slower, particularly for data refresh and complex queries. However, it is possible that we can speed up the query by creating indexes on some frequently used columns. We can do the same thing for operational systems, by the way, in terms of creating indexes. Space requirements in operational systems since the data size is relatively small the space requirement is relatively small. For data warehouses the amount of data is quite large therefore, the size of the data in the data warehouse tends to be much bigger. For database design in a transaction processing system databases tends to be highly normalized; much smaller data on the database table and many many more tables. For data warehouses we have denormalized tables which are fewer and usually they are in the shape of a soa, we call a star schema or snowflake schema, which is reflecting the structure of the data warehouse's design. Backup recovery - we do need to take care of backup/recovery for operational data. Operational transaction processing systems cannot afford to lose the data. And for data warehouse, you can certainly do regular backups but a lot of data can be recreated through the online transaction processing system. I hope this table gives you a kind of rough idea of the comparison between online transaction processing systems and online analytical processing systems. Once again one reason we have a separate data warehouse is because we want to organize data to make it easier for query reporting and analysis. Also, we want to separate the transaction processing system and data warehouse because we don't want someone to apply a complicated query against the operational system database and slow everything else down. Okay, so I'm going to conclude this lecture here and I hope to see you soon next time. Thank you, bye-bye.

Video Details

Duration: 17 minutes and 30 seconds
Year: 2013
Country: United States
Language: English
License: Dotsub - Standard License
Genre: None
Producer: Minder.Chen
Director: Minder.Chen
Views: 80
Posted by: christineward on Nov 17, 2015

Data Warehouse and Business Intelligence_ Systems Architecture and OLTP vs. OLAP. November 10, 2013. Retrieved from: https://youtu.be/DffOEbxr9fI. ----- No changes have been made to the video except the addition of accurate close captioning. ----- Discuss the basic archietcture for Data Warehouse and Business Intelligence; Compare OLTP vs.OLAP

Caption and Translate

    Sign In/Register for Dotsub to translate this video.