Watch videos with subtitles in your language, upload your videos, create your own subtitles! Click here to learn more on "how to Dotsub"

DAX2_001

0 (0 Likes / 0 Dislikes)
  • Embed Video

  • Embed normal player Copy to Clipboard
  • Embed a smaller player Copy to Clipboard
  • Advanced Embedding Options
  • Embed Video With Transcription

  • Embed with transcription beside video Copy to Clipboard
  • Embed with transcription below video Copy to Clipboard
  • Embed transcript

  • Embed transcript in:
    Copy to Clipboard
  • Invite a user to Dotsub
>> There are basically two things that you can create with the DAX language, either calculated columns or calculated measures. Calculated columns are part of the table in your data model, whereas calculated measures are just calculated on the fly as you use them in your visualisations. In these videos, Alberto will go through the details of each of those two things and explain some of the differences and where you would use one over the other. >> Let's move to the demo machine instead of looking at the slide. Okay, I will basically do some demo and write some code. If you want, if you think it's useful, you can replicate everything on your PC. I will try to not run to fast. If you prefer just to look at the screen and follow the lecture, that is fine for me too. You're not expected to follow a lab in a very specific way. So do merely whatever you want. But what I wanted to show you is how to create, for example, a column. In order to do that, you should have on your desktop, probably, a folder DAX Session that contains only one PBIX file, that is a Power BI desktop file that you can use, you can simply open it. Where is it? Open it and we will run all the demos and all the code only on top of this data model. This is a very simple data model. Oh, I forgot to ask you a question. Who has never used Power BI? Only a few of them. I'm not going to explain you what Power BI is and how to open Power BI desktop, so just try to follow what I will do just because all the remaining people would be bored if I start explaining all these details. So I will simply use Power BI expecting that you know where to go and what to do. We will use Power BI for all the DAX code. This model contains Date, Customer, Sales, Store, Product, Subcategory, and Category, so it contains many tables with a very simple set of relationships. The only complex one is the one starting from Sales and going to Product, and then Subcategory, and Category. And the dataset is coming from Contoso that sells goods in stores and on the internet. It does the same job that before was done by AdventureWorks, but AdventureWorks was selling bikes and this is selling electronics and other stuff. So it's a very standard sales data model. Now you can easily create your columns by writing a new column either in Data view, so when you see all the data, or straight in the Report view. I typically do write in the Data view because so I can immediately see the results of the column that I compute instead of creating the column and then using in the report. I prefer to write the code straight here. And for example, you can create a calculated column here in the sales by creating a New Column. You need to provide a name, and for example, let's use as a name, "Sales Amount," where sales amount you can click. On Unit Cost, and it provides sales unit cost. It's probably easier to zoom, both for you and me, because getting older, I'm no longer able to read smalls scripts, small things. Sales unit cost and then you typically not unit cost, but "Sales [Unit Price]" and then you multiply by Sales [Quantity]. This is going to create the calculated column and the code that you write in your formula says, "Oh, I want to multiply unit price by the quantity." The general syntax, the complete syntax of referencing a column is lightly more complex than this. You should enclose into single quotes the table name and then into square brackets the column name. So the standard syntax is single quotes, and then the table, and then a square brackets, and the column name. In reality, the single quotes can be avoided whenever your table is a good name, so it doesn't start with a number and it doesn't contains spaces. In that case, you can avoid the quotes, which are annoying when looking at the complex formula. And if you want, you can also avoid putting the table name and just write unit price multiplied by quantity. This is legal to it's simply multiplies unit price by quantity. And if the table name is not included in the specification of the column, it simply searches in the current table for that column. As it already happened, this is something that I tell you once and I ask you to forget immediately, whenever you write a reference to a column, like in this case, please always, and I really mean always put the table name in front of it. I cannot tell you the reason right now, it will come later today, but there is a very good reason for that. It's not DAX. It's not any problem with the DAX language. The formula is totally clear. The problem is you humans. When I read an expression, I expect to find the table name in front of any column. And then if you hit Enter... Sales amount is already there, so we do "Sales Amount" without space. And now you see that we created a column, a new column in the model that is basically the quantity multiplied by the unit price and contains the SalesAmount. This is very simple, it's very Excel like. So typically, all the users coming from Excel love calculated columns because they're simple. They show the calculation in a very simple and easy way. But what I wanted to tell you is that we are creating a calculated column, calculated columns are computed at refresh time, so when you refresh your model, calculated column are computed. And this is fine because they might include some very complex code that is executed only once at process time, at refresh time, and not the query time, but they have a drawback, that is, they use RAM, they use space in your model. So if you create a calculated column on a table that contains a few 100,000 rows, that is not an issue at all because the table is very small and the model works just fine. But if you have a very large table with a few hundred millions or few billions rows and you create a calculated column, they're expected to use a lot of RAM. And RAM is expensive. You don't want to waste that RAM just by creating calculated columns. So use them with care. The second point and it is the first pedantic moment of the day, is this point. You see that the SalesAmount value is computed for all the rows. And in order to get the value of sales quantity, the value which is highlighted there, I write "Sales" that is the table name, and then "[Quantity]" the column name. There is a small piece of information which is missing. So when I write Sales[Quantity], and I want to transform it into a value, that is, a number, what is the information that is missing? Or try to think it in a different way. If I ask anybody of you, can you please give me the value of sales quantity? What would you ask me? >> Which row? Sales quantity doesn't mean anything unless you have a current row. If you have a current row, then sales quantity has a different value for every row. Of course, I chose quantity, it doesn't have different values. But if you take, for example, on unit cost or other values, every row has a different value. When you write a calculated column, you don't need to specify the current row, so the row from where you want the value to come because this is somewhat implicit. It's a calculated column, it's computed row by row, so for every row, this will have a different value. As we will see later, this is not always the case. In case you want to get the value of a single row, you will always need to specify for which row you want the value to come from. Otherwise that will return you an error. Okay. Oh, I'm not used to that. Let's go to the slides again. So the standard syntax is table name in quotes, and then column name in square brackets. The square brackets and the column name are the only mandatory part, but get used to write the code in the right way. And we have just created a calculated column, probably a different one than what is here in the slide. You can create the Sales Amount column by multiplying Sales[Quantity] by Sales[Net Price]. That creates a column and the column is then visible for all the rows. Now calculated columns are basically simple DAX expression that compute values, compute the value for the current row. And whenever you write Product[Price], what you really mean is the value of the price column in the product table for the current row. Please remember that current row because it's gonna be important later from, later now. Calculated columns are easy to author, they are easy to understand, especially for Excel users. But they are not the only way of creating calculation in DAX. There is another kind of calculation which is known as measures. Let's see that with an example. And again, you can do that with me while I write the code. Now we created a calculated column that computes the sales amount. Imagine that you want to compute the gross margin. So you want to show the gross margin in a report. Let's go to the demo machine please. Thanks. Now we have the SalesAmount and we have Net Price. Imagine that we want to compute the gross margin. The gross margin can be computed as another calculated column. So I create a new column, and I call it "GrossMargin." Where the GrossMargin is basically the SalesAmount... No, sorry. The GrossMargin is the Net Price, And that is a good number. It says that for each of those products, I have $4.9 of gain. And of course, for different rows, I will have different values. I can use the gross margin in a report. So I can go on the report, what I have calendar year, I have the month, I have the quantity, and in the sales, I can take, for example, the SalesAmount, and I can put the GrossMargin. And this number has some meaning because I have the SalesAmount, which is the amount sold, and I have the GrossMargin, how much I gain from the sales that I did. Imagine that now we want to do something slightly different. I don't want to see the gross margin as a value. I want to see it as a percentage against the sales amount. So I don't want just a number, I want to show it as a percentage. I can follow the same technique. So I go here on the sales table, and I create a new column. And this time we call it "Gross Margin Pct" or whatever you prefer, and I compute it by taking the Sales[GrossMargin], and divide it by the Sales[SalesAmount]. The one without the space, because we need to use the column we created a few seconds ago. The one with the space is a different measure. I get a result, it's 0.49... Blah-blah-blah. In this way, it doesn't mean anything. It's much better if we format it as a percentage. Okay. Okay. Now it say's we sell at 49.05 percent, and that is fine. For each individual row, we computed the value and looks like useful. Now I can go to the report, I have the GrossMargin, and I simply click on the Gross Margin Pct. And the result, as you see, it's not really sexy. Because what is happening is that it's not computing the gross margin per month, it's not computing the gross margin per color or whatever you put in your report. Its sum means the gross margin percentage that happens row by row. So this number has no meaning at all. The point is that, in such a case, I needed to do a calculation that cannot be done row by row and aggregated later. What I need to do is do the calculation in a different way. First, I aggregate the values and then I perform the division on it later. To do that, I cannot use a calculated column, I need to use the other way of computing values in DAX, which are measures. Measures which if you're a user of Excel 2013, in Excel 2013 are called calculated fields, then likely they gained again some brain and they went back to measures. Measures are created instead of using a new column, you go to a new measure. Here it is. You can create a new measure from here. The user interface is the same. You don't see the measure in the columns because the measure is not computed row by row. And you can compute the measure are "Gross Margin Pct," let's add "Measure" so we know it's a measure. And this time, I can start writing the same expression which is Sales[Sales Amount] divided by sales... No, sorry. It's Sales[GrossMargin] divided by Sales[Sales Amount]. That's the very same expression that I used a few seconds ago. The point is that if you try to write it, as a result, you get an error. The error, as it happens for all the errors coming from Microsoft, it's so long that you typically don't even read it because it contains somewhere the description of the problem, but it's probably easier to look at the code and try to understand what is wrong. What is wrong here is that we are writing a measure, we are no longer writing a calculated column. And so if I write sales gross margin, you should ask me, "Oh, you want sales gross margin. For which row do you want sales gross margin?" Whereas I had a current row in a calculated column, so I could use the column name to get a value. In a measure, there is no concept of current row. The measure is computed at the report level over an aggregate of values. So you simply cannot get a value for the gross margin. And in that case what you need to do is surround your column reference with an aggregator. So you write SUM(Sales[GrossMargin]) / SUM(Sales[Sales Amount]). This is going to state something totally different, that is, I don't want to get the value of gross margin, I want to get the sum of the values of gross margin for all the rows that are in the current sale of the chart I'm using. So if I add just "SUM" around the two, then at this point, I should be a bit more happy, just because the measure at least computes correctly. I can format it as a percentage and then I can go in my report add gross margin, hopefully... Yes. It's a measure. And now the gross margin shows as a correct percentage. And of course, it will compute the value no matter how you make the report. So this is just to say that you can easily live with Power BI without either writing a single line of DAX as long as the only kind of aggregation that you want to perform is simple sum, count, or discount, the calculation that happens to be straight inside Power BI. But as soon as you want to compute something like a simple percentage, then it's time to roll up your sleeves and start writing some sort of DAX code. The difference between measures and calculated column is that calculated columns are computed at refresh time and stored in the model, whereas measures are not stored in the model, meaning that just the source code of the measure is stored in the model, but then the calculation happens at query time when the user starts to build reports based on data. Now the difference between measures and calculated column is that they are written using DAX. They do not work row by row, instead, they work at the aggregate level, and so they don't have the concept of current row, as an example, we have seen the GrossMargin and the GrossMargin as a percentage. A quick question for you. Am I running too fast or too slow or just fine? >> Just fine. >> Okay, if at some point, I'm running too fast, just slow me down. And if I'm running too slow, I typically see because start to sleep. So it's easy to understand what is happening. So the gross margin percentage that we computed was computed as a measure. And that's it, we already did it. So we can go back. This is what we were speaking a few seconds ago, so when to use what, when to use a measure and when to use a calculated column. You typically use a calculated column whenever you want the value to have a physical representation. So imagine that I want to be able to slice customers by doing ABC analysis or I want to compute a column in the customer table that divides customer by age range, then I want to use the age range in a slicer or in a report or to make filtering, in that case, a calculated column is needed because I want to be able to say, "Okay, now show me all the customers that are in the given age range. And to apply this filter, I need to have a calculated column, whereas, in all the other scenarios, typically, a measure is your best friend. Keep in mind another small detail that you might be used, you might be already used to use tabular without ever writing... Not tabular, sorry, Power BI without ever writing a measure because Power BI pre-aggregates everything for you using sum or whatever. So here when I... Oh, you don't see that. But on the demo before, you probably have still on the screen, when I put gross margin in the report, I get the sum of gross margin in an automatic way. This works as long as you use Power BI. But if you plan to use the beautiful feature that this morning you have seen, that is the ability to connect from Excel to Power BI, then Excel does not do any kind of automatic pre-aggregation, and if you want to see the sum of gross margin, then you'll need to author a measure that shows the sum of gross margin so that the next cell will be able to show you a sum.

Video Details

Duration: 20 minutes and 50 seconds
Country: Andorra
Language: English
License: Dotsub - Standard License
Genre: None
Views: 8
Posted by: csintl on Jun 9, 2016

DAX2_001

Caption and Translate

    Sign In/Register for Dotsub to translate this video.