Practical work on OLAP technology in Excel

Analytics, OLAP, and data warehousing are attracting growing interest among Russian IT specialists. The computer press and the Internet have already published many good, academically written materials on the topic, including introductory ones. In this article we deliberately try to explain OLAP in plain terms, using a concrete example. Practice shows that some IT professionals, and end users especially, need this kind of explanation.

So, as a first approximation and in plain terms, OLAP *1 can be defined as a special way of analyzing data and generating reports. Its essence is to give the user a multidimensional table that automatically summarizes data across various slices and lets the user interactively control the calculations and the layout of the report. This article describes the technology and the basic OLAP operations using the example of invoice analysis at a wholesale food trading company.

*1. OLAP stands for On-Line Analytical Processing, that is, operational (online) analysis of data.

The tool considered here is an OLAP system of the simplest and least expensive class, an OLAP client *1. As the example we have chosen the simplest of the OLAP clients, “Contour Standard” from Intersoft Lab. (For clarity, generally accepted OLAP terms are given below together with their English counterparts.)

*1. For more information about the classification of OLAP systems, see the article "OLAP Made in Russia" in PC Week/RE, No. 3/2001.

So let's get started with the system. First you need to describe the data source: the path to the table and its fields. This is a job for a user who knows the physical implementation of the database. For end users, that person translates the names of the table and its fields into domain terms. Behind a “data source” there may be a local table, a table or view on an SQL server, or a stored procedure.

Most likely, in a particular database, invoices are stored not in one table but in several. In addition, some fields or records may not be needed for analysis. Therefore a selection (result set or query) is created next, in which the following are configured: the algorithm for joining tables by key fields, the filtering conditions, and the set of returned fields. Let's call our selection “Invoices” and put all the fields of the “Invoices” data source into it. By creating this semantic layer, the IT specialist hides the physical implementation of the database from the end user.

Then the OLAP report is configured. This can be done by a subject matter expert. First, the fields of the flat data selection are divided into two groups: facts (also called measures) and dimensions. Facts are numbers, and dimensions are the “slices” along which the facts are summarized. In our example the dimensions are “Region”, “City”, “Customer”, “Product”, and “Date”, and there is a single fact, the “Amount” field of the invoice. One or more aggregation algorithms must be selected for each fact. OLAP can not only sum up totals but also perform more complex calculations, up to statistical analysis. Selecting several aggregation algorithms creates virtual, calculated facts. In the example, one aggregation algorithm is selected, “Sum”.
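The split into facts and dimensions can be illustrated with a minimal Python sketch. It is not how “Contour Standard” works internally; the sample rows, the `INVOICES` name, and the `summarize` function are assumptions made up for illustration. The single fact “Amount” is summed over whichever dimensions are chosen.

```python
from collections import defaultdict

# Hypothetical flat selection: each row is one invoice line.
INVOICES = [
    {"Region": "Ural", "City": "Yekaterinburg", "Product": "Coffee", "Amount": 120.0},
    {"Region": "Ural", "City": "Yekaterinburg", "Product": "Candy", "Amount": 80.0},
    {"Region": "Central", "City": "Moscow", "Product": "Coffee", "Amount": 200.0},
    {"Region": "Central", "City": "Moscow", "Product": "Coffee", "Amount": 50.0},
]


def summarize(rows, dimensions, fact="Amount"):
    """Sum one fact over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in dimensions)
        totals[key] += row[fact]
    return dict(totals)


# The same flat data, summarized along two different sets of dimensions.
by_product = summarize(INVOICES, ["Product"])
by_region_product = summarize(INVOICES, ["Region", "Product"])
```

Changing the list of dimensions is all it takes to get a different report from the same rows, which is the idea the article builds on.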

A special property of OLAP systems is that they generate dimensions and data for larger time periods from a date and automatically calculate totals for those periods. Let's select the periods “Year”, “Quarter”, and “Month”. The report will then contain no data for individual days, but the generated dimensions “Year”, “Quarter”, and “Month” will appear. Let's name the report “Sales Analysis” and save it. The work of creating the interface of the analytical application is complete.
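As a rough sketch of how period dimensions can be generated from a date, the Python below derives “Year”, “Quarter”, and “Month” and totals the amounts per period. The function names and sample rows are assumptions for illustration only, not part of any OLAP product.

```python
from collections import defaultdict
from datetime import date


def time_dimensions(day):
    """Derive the generated dimensions Year, Quarter, and Month from a date."""
    return {
        "Year": day.year,
        "Quarter": (day.month - 1) // 3 + 1,
        "Month": day.month,
    }


def totals_by(rows, periods):
    """Sum amounts per combination of the selected period dimensions."""
    totals = defaultdict(float)
    for day, amount in rows:
        dims = time_dimensions(day)
        totals[tuple(dims[name] for name in periods)] += amount
    return dict(totals)


# Hypothetical invoice lines: (invoice date, amount).
ROWS = [
    (date(2001, 1, 15), 100.0),
    (date(2001, 1, 28), 50.0),
    (date(2001, 5, 3), 70.0),
    (date(2002, 11, 9), 30.0),
]

by_month = totals_by(ROWS, ["Year", "Quarter", "Month"])
by_year = totals_by(ROWS, ["Year"])
```

Individual days disappear from the result, while totals for the generated periods appear, as described above.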

Now, when running this interface daily or monthly, the user sees a table and a chart in which invoices are summarized by product, customer, and period.

To make data manipulation intuitive, the controls of the dynamic table are the elements of the table itself, its columns and rows. The user can move them, delete them, filter them, and perform other OLAP operations. The table automatically recalculates subtotals and grand totals.


For example, by dragging (the “move” operation) the “Product” column to the first position, we get a comparison report, “Comparison of product sales volumes for the year”. To aggregate the data for a year, just drag the “Quarter” and “Month” columns to the top of the table, into the “inactive dimensions area”. The “Quarter” and “Month” dimensions moved to this area are closed (the “close dimension” operation), that is, excluded from the report, and the facts are summed for the year. Even though these dimensions are closed, you can still select specific years, quarters, and months for filtering the data (the “filter” operation).

For greater clarity, let's change the type of graph illustrating the OLAP table and its location on the screen.

Drilling down into the data (the “drill down” operation) gives more detailed information about the sales of a product of interest. By clicking the “+” sign next to the product “Coffee”, we see its sales volumes by region. Opening the Ural region shows sales volumes by city within that region, and drilling into Yekaterinburg shows the data on wholesale buyers in that city.
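The drill-down sequence described here (product, then region, then city, then customer) can be sketched in Python as repeated filtering on the members already opened, followed by a total over the next dimension. The `SALES` rows, the buyer and city data, and the `drill_down` function are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical invoice lines with a Product > Region > City > Customer path.
SALES = [
    {"Product": "Coffee", "Region": "Ural", "City": "Yekaterinburg", "Customer": "Buyer A", "Amount": 120.0},
    {"Product": "Coffee", "Region": "Ural", "City": "Yekaterinburg", "Customer": "Buyer B", "Amount": 30.0},
    {"Product": "Coffee", "Region": "Ural", "City": "Chelyabinsk", "Customer": "Buyer C", "Amount": 60.0},
    {"Product": "Coffee", "Region": "Central", "City": "Moscow", "Customer": "Buyer D", "Amount": 200.0},
    {"Product": "Candy", "Region": "Ural", "City": "Yekaterinburg", "Customer": "Buyer A", "Amount": 45.0},
]


def drill_down(rows, opened, next_dimension, fact="Amount"):
    """Keep rows matching every opened member, then total by the next dimension."""
    totals = defaultdict(float)
    for row in rows:
        if all(row[dim] == member for dim, member in opened.items()):
            totals[row[next_dimension]] += row[fact]
    return dict(totals)


# Click "+" on Coffee, then on Ural, then on Yekaterinburg.
by_region = drill_down(SALES, {"Product": "Coffee"}, "Region")
by_city = drill_down(SALES, {"Product": "Coffee", "Region": "Ural"}, "City")
by_customer = drill_down(
    SALES,
    {"Product": "Coffee", "Region": "Ural", "City": "Yekaterinburg"},
    "Customer",
)
```

Each step narrows the rows and exposes one more level of detail, while the totals at every level stay consistent with the level above.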

You can also use open dimensions to set filters. To compare the dynamics of candy sales in Moscow and Yekaterinburg, let's set filters on the “Product” and “City” dimensions.

Close the unnecessary dimensions and select the “Line” chart type. On the resulting chart you can trace the dynamics of sales, assess seasonal fluctuations, and see how drops and growth in sales relate across cities.

Thus we have seen that OLAP technology lets the user produce dozens of different reports from a single interface by controlling a dynamic OLAP table with the mouse. The job of a programmer who has such a tool is not routine coding of report forms but setting up the OLAP client for the databases. At the same time, the ways of controlling the report are intuitive for the end user.

Indeed, OLAP is a natural extension and development of the idea of spreadsheets. In essence, the OLAP visual interface is also a spreadsheet, but one equipped with a powerful calculation engine and a special standard for presenting and managing data. Moreover, some OLAP clients are implemented as add-ins for MS Excel. That is why the army of millions of “white-collar” workers who are confident with spreadsheets master OLAP tools quickly too. For them this is a “velvet revolution” that opens up new opportunities without requiring retraining.

Readers who have not lost interest in OLAP after reading this article can turn to the materials mentioned at the beginning. Collections of such materials are posted on a number of websites, including the Intersoft Lab site, www.iso.ru. From there you can also download a demo version of the Contour Standard system with the example described in this article.

Online Analytical Processing (OLAP) is a technology that is used to organize large business databases and support business intelligence. OLAP databases are divided into one or more cubes, and each cube is organized and designed by a cube administrator to fit the way you retrieve and analyze data, so that it is easier to create and use the PivotTable and PivotChart reports that you need.

What is business intelligence?

A business analyst often wants to get a bigger picture of the business, to view broader trends based on aggregated data, and to view trends broken down into any number of variables. Business intelligence is the process of extracting data from an OLAP database and analyzing that data to provide insights that can be used to make informed business decisions and take action. For example, with OLAP and business intelligence, you can answer the following questions about business data.

    How do the total sales of all products in 2007 compare with the total sales in 2006?

    How does our profitability to date compare with the same period over the past five years?

    How much money did customers over the age of 35 spend last year, and how has that behavior changed over time?

    How many products were sold in two specific countries/regions this month as opposed to the same month last year?

    For each customer age group, what is the breakdown of profitability (both margin percentage and total) by product category?

    Who are the top and bottom salespeople, distributors, suppliers, clients, partners, and customers?

What is Online Analytical Processing (OLAP)?

OLAP (Online Analytical Processing) databases simplify business intelligence queries. OLAP is a database technology that is optimized for query and reporting, not transaction processing. The data source for OLAP is online transaction processing (OLTP) databases, which are typically stored in data warehouses. OLAP data is extracted from this historical data and combined into structures that allow complex analysis. OLAP data is also organized hierarchically and stored in cubes rather than tables. It is a complex technology that uses multidimensional structures to provide fast access to data for analysis. In this organization, for a PivotTable report or a PivotChart report, you can easily display high-level summary data, such as sales totals for an entire country or region, and also display details about sites where sales are particularly strong or weak.

OLAP databases are designed to speed up the retrieval of data. Because the OLAP server, rather than Microsoft Office Excel, calculates the summarized values, less data needs to be sent to Excel when you create or change a report. This approach lets you work with much larger amounts of source data than you could if the data were organized in a traditional database, where Excel retrieves all of the individual records and then calculates the summarized values.

OLAP databases contain two main types of data: measures, which are numeric data, quantities, and averages, which are used to make informed business decisions, and dimensions, which are categories used to organize these measures. OLAP databases help you organize your data with many levels of detail using the same categories you know for data analysis.

The following sections describe each component in more detail.

Cube: A data structure that aggregates the measures by the levels and hierarchies of each of the dimensions that you want to analyze. Cubes combine several dimensions, such as time, geography, and product lines, with summarized data, such as sales or inventory figures. Cubes are not "cubes" in the strict mathematical sense, because they do not necessarily have equal sides. However, they are an apt metaphor for a complex concept.

Measure: A set of values in a cube that are based on a column in the cube's fact table and that are usually numeric. Measures are the central values in the cube that are preprocessed, aggregated, and analyzed. Common examples include sales, profit, revenue, and costs.

Member: An item in a hierarchy that represents one or more occurrences of data. A member can be either unique or non-unique. For example, 2007 and 2008 are unique members at the year level of a time dimension, whereas January is a non-unique member at the month level, because the time dimension can contain more than one January if it holds data for more than one year.

Calculated member: A member of a dimension whose value is calculated at run time by using an expression. Calculated member values can be derived from the values of other members. For example, a calculated member Profit can be determined by subtracting the value of the member Costs from the value of the member Sales.
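A calculated member can be pictured with a small Python sketch: Profit is not stored anywhere and is evaluated from Sales and Costs only when it is requested. The `CUBE` contents, the figures, and the function name are assumptions made up for illustration; a real OLAP server would define such a member with its own expression language.

```python
# Hypothetical stored measures for two members of a time dimension.
CUBE = {
    "2007": {"Sales": 500.0, "Costs": 320.0},
    "2008": {"Sales": 640.0, "Costs": 410.0},
}

# A calculated member is defined by an expression, not by stored data.
CALCULATED_MEMBERS = {
    "Profit": lambda cell: cell["Sales"] - cell["Costs"],
}


def value_of(period, member):
    """Return a stored value, or evaluate the member's expression on request."""
    cell = CUBE[period]
    if member in CALCULATED_MEMBERS:
        return CALCULATED_MEMBERS[member](cell)
    return cell[member]


profit_2007 = value_of("2007", "Profit")
```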

Dimension: A set of one or more organized hierarchies of levels in a cube that a user understands and uses as the basis for data analysis. For example, a geography dimension might include levels for country/region, state/province, and city. A time dimension might include a hierarchy with levels for year, quarter, month, and day. In a PivotTable or PivotChart report, each hierarchy becomes a set of fields that you can expand and collapse to show lower or higher levels.

Hierarchy: A logical tree structure that organizes the members of a dimension so that each member has one parent and zero or more children. A child is a member at the next lower level of the hierarchy that is directly related to the current member. For example, in a time hierarchy containing the levels quarter, month, and day, January is a child of Qtr1. A parent is a member at the next higher level of the hierarchy that is directly related to the current member. The parent value is usually a consolidation of the values of all of its children. For example, in a time hierarchy containing the levels quarter, month, and day, Qtr1 is the parent of January.
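The statement that a parent value is usually a consolidation of its children's values can be sketched in Python as a recursive sum over a tree. The member names follow the Qtr1 and January example; the `HIERARCHY` and `LEAF_VALUES` dictionaries and the figures in them are hypothetical.

```python
# Hypothetical time hierarchy: each parent maps to its children.
HIERARCHY = {
    "2007": ["Qtr1", "Qtr2"],
    "Qtr1": ["January", "February", "March"],
    "Qtr2": ["April"],
}

# Hypothetical values stored only at the lowest level.
LEAF_VALUES = {"January": 10, "February": 20, "March": 30, "April": 5}


def consolidated(member):
    """Return a leaf's stored value, or the sum of a parent's children."""
    children = HIERARCHY.get(member)
    if not children:
        return LEAF_VALUES[member]
    return sum(consolidated(child) for child in children)


qtr1_total = consolidated("Qtr1")
year_total = consolidated("2007")
```

Summation is only one possible consolidation; it is used here because it is the most common case.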

Level: Within a hierarchy, data can be organized into lower and higher levels of detail, such as the year, quarter, month, and day levels in a time hierarchy.

OLAP features in Excel

Retrieving OLAP data: You can connect to OLAP data sources in the same way as you connect to other external data sources. You can work with databases created with the Microsoft OLAP server products: Microsoft SQL Server OLAP Services version 7.0, Microsoft SQL Server Analysis Services version 2000, and Microsoft SQL Server Analysis Services version 2005. Excel can also work with third-party OLAP products that are compatible with OLE-DB for OLAP.

OLAP data can be displayed only as a PivotTable or PivotChart report, or in worksheet functions converted from a PivotTable report, but not as an external data range. You can save OLAP PivotTable and PivotChart reports in report templates, and you can create Office Data Connection (.odc) files to connect to OLAP databases for OLAP queries. When you open an .odc file, Excel displays a blank PivotTable report that is ready for you to lay out.

Creating cube files for offline use: You can create an offline cube (.cub) file with a subset of the data from an OLAP server database. Offline cube files let you work with OLAP data when you are not connected to your network. With a cube you can work with larger amounts of data in a PivotTable or PivotChart report than you otherwise could, and retrieve the data faster. You can create cube files only if you use an OLAP provider that supports this feature, such as Microsoft SQL Server Analysis Services version 2005.

Server actions: A server action is an optional feature that an OLAP cube administrator can define on the server. It uses a cube member or measure as a parameter in a query to obtain details in the cube, or to start another application, such as a browser. Excel supports URL, Report, Rowset, Drill Through, and Expand to Detail server actions, but it does not support Proprietary, Statement, and Dataset actions.

KPI: A KPI is a special calculated measure defined on the server that lets you track "key performance indicators", including status (does the current value meet a specific number?) and trend (how does the value change over time?). When KPIs are displayed, the server can send related icons, similar to the Excel icon sets, to indicate status levels above or below a target (for example, a traffic-light icon) or whether a value is trending up or down (for example, a directional arrow icon).

Server formatting: Cube administrators can create measures and calculated members with color formatting, font formatting, and conditional formatting rules, which can be designated as a corporate standard business rule. For example, a server format for profit might be a currency number format, a green cell color if the value is greater than or equal to 30,000 and red if the value is less than 30,000, and a bold font style if the value is less than 30,000 and regular if it is greater than or equal to 30,000.
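The example rule (currency format; green and regular at 30,000 or more; red and bold below 30,000) can be written down as a small Python function. This is only a sketch of the rule's logic: the function name and the dictionary keys are assumptions, and real server-side formatting is defined by the cube administrator, not in Python.

```python
def profit_format(value, threshold=30_000):
    """Return the display format for a profit value under the example rule."""
    meets_target = value >= threshold
    return {
        "number_format": "currency",
        "cell_color": "green" if meets_target else "red",
        "font_style": "regular" if meets_target else "bold",
    }


on_target = profit_format(30_000)
below_target = profit_format(29_999.99)
```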

Office display language: A cube administrator can define translations for data and errors on the server for users who need to view PivotTable information in another language. This feature is defined as a file connection property, and the country/region setting of the user's computer must correspond to the display language.

Software components required to access OLAP data sources

OLAP provider: To set up OLAP data sources for Excel, you need one of the following OLAP providers.

    Microsoft OLAP provider: Excel includes the data source driver and client software for accessing databases created with Microsoft SQL Server OLAP Services version 7.0, Microsoft SQL Server OLAP Services version 2000 (8.0), and Microsoft SQL Server Analysis Services version 2005 (9.0).

    Third-party OLAP providers: Other OLAP products require additional drivers and client software to be installed. To use Excel's features for working with OLAP data, a third-party product must comply with the OLE-DB for OLAP standard and be compatible with Microsoft Office. For information about installing and using a third-party OLAP provider, contact your system administrator or the vendor of the OLAP product.

Server databases and cube files: The Excel OLAP client software supports connections to two types of OLAP databases. If a database on an OLAP server is available on your network, you can retrieve source data from it directly. If you have an offline cube file that contains OLAP data, or a cube definition file, you can connect to that file and retrieve source data from it.

Data sources: A data source gives you access to all of the data in an OLAP database or offline cube file. After you create an OLAP data source, you can base reports on it and return the OLAP data to Excel as a PivotTable or PivotChart report, or in worksheet functions converted from a PivotTable report.

Microsoft Query: You can use Query to retrieve data from an external database such as Microsoft SQL Server or Microsoft Access. You do not need to use Query to retrieve data from an OLAP PivotTable that is connected to a cube file.

Feature differences between OLAP and non-OLAP source data

If you work with PivotTable and PivotChart reports based both on OLAP source data and on other types of source data, you will notice some feature differences.

Data retrieval: An OLAP server returns new data to Excel every time you change the layout of the report. With other types of external source data, you query for all of the source data at once, or you can set options to query only when you display different report filter field items. You also have several other options for refreshing the report.

In reports based on OLAP source data, report filter field options are not available, background query is not available, and the memory optimization option is not available.

Note: The memory optimization option is also not available for OLEDB data sources and PivotTable reports based on a range of cells.

Field types: With OLAP source data, dimension fields can be used only as row (series), column (category), or page fields. Measure fields can be used only as value fields. With other types of source data, all fields can be used in any part of the report.

Access to detail data: For OLAP source data, the server determines which levels of detail are available and calculates the summary values, so the detail records that make up the summary values may not be available. However, the server may provide property fields that you can display. Other types of source data do not have property fields, but you can display the underlying detail for data field values and for items, and you can show items that have no data.

OLAP report filter fields may not have an "All" item, and the Show Report Filter Pages command is not available.

Initial sort order: For OLAP source data, items first appear in the order in which the OLAP server returns them. You can then sort or manually rearrange the items. For other types of source data, the items in a new report first appear sorted in ascending order by item name.

Calculations: OLAP servers provide summarized values directly to the report, so you cannot change the summary functions for value fields. For other types of source data, you can change the summary function for a value field and use several summary functions for the same value field. You cannot create calculated fields or calculated items in reports with OLAP source data.

Subtotals: In reports with OLAP source data, you cannot change the summary function for subtotals. With other types of source data, you can change the summary functions for subtotals and show or hide subtotals for all row and column fields.

For OLAP source data, you can include or exclude hidden items when calculating subtotals and grand totals. For other types of source data, you can include hidden items in report filter fields in subtotals, but hidden items in other fields will be excluded by default.

The amazing is close at hand...

In my work I often had to produce complex reports, and I always looked for what they had in common so that I could build them more simply and universally. I even wrote and published an article on the subject, “The Osipov Tree”. However, the article was criticized: I was told that all the problems I raised had long been solved in MOLAP.RU v.2.4 (www.molap.rgtu.ru), and I was advised to look at pivot tables in EXCEL.
It turned out to be so simple that, after putting my own skilled hands to it, I ended up with a very simple scheme for exporting data from 1C7 or any other database (hereinafter 1C stands for any database) and analyzing it in OLAP.
I think many OLAP export schemes are too complicated; I choose simplicity.

Characteristics:

1. Only EXCEL 2000 is required.
2. Users can design reports themselves, without programming.
3. Data is exported from 1C7 in a simple text file format.
4. For accounting entries there is already a universal export processing that works in any configuration. For exporting other data there are sample processings.
5. You can design report forms in advance and then apply them to different data without redesigning them.
6. Fairly good performance. In the first, long stage the data is imported into EXCEL from a text file and an OLAP cube is built; after that any report can be built fairly quickly from this cube. For example, three months of sales data for a store with an assortment of 6000 goods loads into EXCEL in 8 minutes on a Cel600-128M machine, and the rating by goods and groups (an OLAP report) is recalculated in 1 minute.
7. Data is exported from 1C7 in full for the specified period (all movements, for all warehouses, firms, and accounts). When importing into EXCEL you can use filters that load only the data needed for analysis (for example, only sales out of all movements).
8. At present, methods exist for analyzing either movements or balances, but not both together, although this is possible in principle.

What is OLAP (www.molap.rgtu.ru):

Suppose you have a trading network. Let the data on trading operations be uploaded to a text file or a table of the form:

Date - transaction date
Month - month of operation
Week - week of operation
Type - purchase, sale, return, write-off
Counterparty - an external organization participating in the operation
Author - the person who issued the invoice

In 1C, for example, one row of this table corresponds to one line of an invoice, and some fields (Counterparty, Date) are taken from the invoice header.

Data for analysis is usually exported to the OLAP system for a certain period, from which a narrower period can in principle be selected by using load filters.

This table is the source for OLAP analysis.

Report: How many goods were sold per day, and for what amount?
Dimensions: Date, Product
Data: Quantity, Amount
Filter: Type="sale"

Report: Which counterparties supplied which goods, and for what amount, per month?
Dimensions: Month, Counterparty, Product
Data: Amount
Filter: Type="purchase"

Report: For what amount did the operators issue invoices of each type over the whole report period?
Data: Amount

The user decides which fields of the table will be dimensions, which will be data, and which filters to apply. The system then builds the report itself in a visual tabular form. Dimensions can be placed in the row or column headings of the report table.
As you can see, from one simple table, you can get a lot of data in the form of various reports.
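A report of this kind is fully described by three things: its dimensions, its data fields, and a filter. The Python below is a minimal sketch under that reading, not the actual Excel pivot-table mechanism. The `OPERATIONS` rows and the `build_report` function are hypothetical, and the Product, Quantity, and Amount fields are assumed to be part of the table because the example reports use them.

```python
# Hypothetical rows of the flat source table (one row per invoice line).
OPERATIONS = [
    {"Date": "01.03.2011", "Month": 3, "Type": "sale", "Counterparty": "Shop A",
     "Author": "Ivanova", "Product": "Tea", "Quantity": 2, "Amount": 40.0},
    {"Date": "01.03.2011", "Month": 3, "Type": "sale", "Counterparty": "Shop B",
     "Author": "Petrov", "Product": "Tea", "Quantity": 3, "Amount": 60.0},
    {"Date": "02.03.2011", "Month": 3, "Type": "purchase", "Counterparty": "Supplier C",
     "Author": "Petrov", "Product": "Tea", "Quantity": 10, "Amount": 150.0},
]


def build_report(rows, dimensions, data, row_filter=None):
    """Total the data fields for each combination of dimension values."""
    totals = {}
    for row in rows:
        if row_filter is not None and not row_filter(row):
            continue
        key = tuple(row[name] for name in dimensions)
        sums = totals.setdefault(key, [0] * len(data))
        for index, field in enumerate(data):
            sums[index] += row[field]
    return {key: tuple(sums) for key, sums in totals.items()}


# "How many goods were sold per day, and for what amount?"
daily_sales = build_report(
    OPERATIONS, ["Date", "Product"], ["Quantity", "Amount"],
    row_filter=lambda row: row["Type"] == "sale",
)

# "Which counterparties supplied which goods, and for what amount, per month?"
purchases = build_report(
    OPERATIONS, ["Month", "Counterparty", "Product"], ["Amount"],
    row_filter=lambda row: row["Type"] == "purchase",
)
```

Both reports come from the same three rows; only the report definition changes.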


How to use it yourself:

Unpack the distribution package strictly into the c:\fixin directory (for a trading system, c:\reports is also possible). Read readme.txt and follow all the instructions in it.

First you must write a processing that exports data from 1C to a text file (table). You need to decide which fields will be exported.
For example, the ready-made universal processing, which works in any configuration and exports postings for a period for OLAP analysis, exports the following fields:

Date|Day of the Week|Week|Year|Quarter|Month|Document|Company|Debit|DtNomenclature
|DtGroupNomenclature|DtSectionNomenclature|Credit|Amount|ValAmount|Quantity
|Currency|DtContractors|DtGroupContractors|KtContractors|KtGroupContractors
|KtMiscellaneousObjects

Here the prefixes Dt and Kt mark the subcontos (analytic attributes) of the debit and credit sides, Group is the group of that subconto (if any), Section is the group of the group, and Class is the group of the section.

For a trading system, the fields can be as follows:

Direction|Type of Movement|For Cash|Product|Quantity|Price|Amount|Date|Company
|Warehouse|Currency|Document|WeekDay|Week|Year|Quarter|Month|Author
|ProductCategory|MovementCategory|CounterpartyCategory|ProductGroup
|ValAmount|Cost price|Contractor

Data analysis uses the workbooks "Analysis of movements.xls" and "Analysis of accounting.xls". When opening them, do not disable macros, otherwise you will not be able to refresh the reports (refreshing is done by VBA macros). The workbooks take their source data from C:\fixin\motions.txt and C:\fixin\buh.txt respectively; otherwise they are identical. You may therefore need to copy your data into one of these files.
To load your data into EXCEL, select a filter or write your own, and click the "Generate" button on the "Conditions" sheet.
Report sheet names begin with the prefix "Otch". Go to a report sheet and click "Refresh", and the report will be updated from the most recently loaded data.
If the standard reports do not suit you, there is a sheet named OtchTemplate. Copy it to a new sheet and customize the report by working with the pivot table on that sheet (for more about working with pivot tables, see any book on EXCEL 2000). I recommend setting up reports on a small data set and then running them on a large one, because there is no way to turn off redrawing of the table each time the report layout changes.

Technical notes:

When exporting data from 1C, the user selects the folder in which to save the file. I did it this way because several files (balances and movements) will probably be exported in the near future. Then, by choosing "Send To" --> "To OLAP analysis in EXCEL 2000" in Explorer, the data is copied from the selected folder to the C:\fixin folder. (For this command to appear in the "Send To" list, copy the file "For OLAP analysis in EXCEL 2000.bat" to the C:\Windows\SendTo directory.) Therefore, name the exported files motions.txt or buh.txt from the start.

Text file format:
The first line of the text file contains the column headings separated by "|"; the remaining lines contain the values of these columns, also separated by "|".
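For readers who want to inspect such an export file outside Excel, here is a minimal Python sketch that reads the format described above with the standard `csv` module. The `SAMPLE` text and its column names are made up for illustration; a real export will have the fields chosen in the 1C processing.

```python
import csv
import io

# Hypothetical export: a header line, then one line per record, "|"-separated.
SAMPLE = (
    "Date|Type|Product|Quantity|Amount\n"
    "01.03.2011|sale|Tea|2|40,50\n"
    "02.03.2011|purchase|Coffee|5|300,00\n"
)


def read_export(stream):
    """Read a '|'-delimited export whose first line holds the column headings."""
    # QUOTE_NONE keeps any quotation marks in the data as ordinary characters.
    reader = csv.DictReader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_export(io.StringIO(SAMPLE))
columns = list(rows[0])
```

All values come back as strings; converting numbers and dates is a separate step that depends on the regional formats used in the export.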

Text files are imported into Excel with Microsoft Query (part of EXCEL). For it to work, the import directory (C:\fixin) must contain a schema.ini file with the following contents:


[motions.txt]
ColNameHeader=True
Format=Delimited(|)
MaxScanRows=3
CharacterSet=ANSI

[buh.txt]
ColNameHeader=True
Format=Delimited(|)
MaxScanRows=3
CharacterSet=ANSI

Explanation: motions.txt and buh.txt are the section names. Each section name matches the name of an imported file, and the section describes how that text file is imported into Excel. The remaining parameters mean that the first line contains the column names, the column separator is "|", the driver scans three rows to determine the column types, and the character set is Windows ANSI (OEM for DOS).
The field type is determined automatically from the data in the column (date, number, or string).
The list of fields does not need to be described anywhere: EXCEL and OLAP work out which fields the file contains from the headings in the first line.

Attention: check your regional settings ("Control Panel" --> "Regional Settings"). My processing exports numbers with a comma as the decimal separator and dates in the "DD.MM.YYYY" format.
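To show why these formats matter, here is a small Python sketch that interprets exported text the way the description suggests: first as a "DD.MM.YYYY" date, then as a number with a comma as the decimal separator, and otherwise as a string. It is an illustration only and does not reproduce the type detection that Excel and Microsoft Query actually perform; `parse_value` is a hypothetical helper.

```python
from datetime import datetime


def parse_value(text):
    """Interpret exported text as a date, a number, or a plain string."""
    text = text.strip()
    try:
        # Dates are assumed to be exported as DD.MM.YYYY.
        return datetime.strptime(text, "%d.%m.%Y").date()
    except ValueError:
        pass
    try:
        # Numbers are assumed to use a comma as the decimal separator.
        return float(text.replace(",", "."))
    except ValueError:
        return text


parsed = [parse_value(item) for item in ["17.05.2011", "40,50", "Tea"]]
```

If the regional settings expect a different decimal separator or date order, the same text would be read as a string or as a wrong date, which is why the settings need checking.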

When you click the "Generate" button, the data is loaded into the pivot table on the "Base" sheet, and all reports on the "Otch" sheets take their data from this pivot table.

I understand that fans of MS SQL Server and powerful databases will grumble that I have oversimplified everything and that my processing will choke on a year's worth of data, but above all I want to bring the benefits of OLAP analysis to medium-sized organizations. I would position this product as a tool for yearly analysis for wholesalers, quarterly analysis for retailers, and operational analysis for any organization.

I had to tinker with VBA so that data could be taken from a file with any list of fields and report forms could be prepared in advance.

Description of work in EXCEL (for users):

Instructions for using reports:
1. Send the exported data for analysis (check with your administrator). To do this, right-click the folder into which you exported the data from 1C and choose "Send To", then "To OLAP analysis in EXCEL 2000".
2. Open the "Analysis of movements.xls" file.
3. Select the Filter value; the filters you need can be added on the "Values" tab.
4. Click the "Generate" button, and the exported data will be loaded into EXCEL.
5. Once the data is loaded into EXCEL, you can view the various reports. To do this, click the "Refresh" button on the selected report. Report sheet names begin with "Otch".
Attention! After changing the filter value, click "Generate" again so that the data in EXCEL is reloaded from the export file in accordance with the filters.

Processings from the demo:

The motionsbuh2011.ert processing is the latest version of the export of postings from Accounting 7.7 for analysis in Excel. It has an “Append to file” checkbox, which lets you export data in parts by period, appending each part to the same file instead of exporting the whole file again.

The motionswork.ert processing exports sales data for analysis in Excel.
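The idea behind an “Append to file” option can be sketched in Python: the heading line is written only when the file is new, and each later period is added to the end. This is not the code of the 1C processing; `append_export` is a hypothetical helper, and the cp1251 encoding is an assumption standing in for "Windows ANSI" on a Russian system.

```python
import os


def append_export(path, header, rows, encoding="cp1251"):
    """Append '|'-separated rows, writing the heading line only to a new file."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", encoding=encoding) as out:
        if is_new:
            out.write("|".join(header) + "\n")
        for row in rows:
            out.write("|".join(row) + "\n")
```

Calling the function once per period builds up a single file with one heading line, which is the shape the import step expects.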

Report examples:

Checkerboard (cross-tab) report by postings:

Workload of operators by types of invoices:

P.S.:

Clearly, data export from 1C8 can be organized along the same lines.
In 2011 I was contacted by a user who needed this processing for 1C7 reworked so that it could export large amounts of data. I found an outsourcer and got the job done. So the development is still quite relevant.

Motionsbuh2011.ert processing has been improved to handle large data uploads.

The first pivot table interface (pivot tables are also called pivot reports) was included in Excel back in 1993 (Excel 5.0). Despite its many useful features, most Excel users hardly use it in their work. Even experienced users often take "pivot report" to mean something built with complex formulas. Let's try to popularize pivot tables in the daily work of economists. This article discusses the theoretical basis for creating pivot reports, gives practical recommendations for using them, and provides an example of accessing data based on several tables.

Terms of multivariate data analysis

Most economists have heard the terms "multidimensional data", "virtual cube", "OLAP technologies", etc. But with a detailed conversation, it usually turns out that almost everyone does not really understand what is at stake. That is, people mean something complex and usually not relevant to their daily activities. Actually it is not.

Multidimensional data, dimensions

It's safe to say that economists deal with multidimensional data almost all the time, but they try to present it in a predefined form using spreadsheets. Multidimensionality here means the ability to enter, view, or analyze the same information while changing its appearance and applying various groupings and sort orders. For example, a sales plan can be analyzed according to the following criteria:

  • types or groups of goods;
  • brands or product categories;
  • periods (month, quarter, year);
  • buyers or groups of buyers;
  • sales regions;
  • etc.

In terms of multidimensional data analysis, each of the above criteria is called a "dimension". A dimension can be said to characterize information over a certain set of values. A special kind of dimension of multidimensional information is "data". In our example, the sales plan data can be:

  • volume of sales;
  • selling price;
  • individual discount;
  • etc.

Theoretically, data can also serve as an ordinary dimension of multidimensional information (for example, you can group records by selling price), but data is usually a special type of value.

Thus, we can say that in practical work economists use two types of information: multidimensional data (actual and planned figures with many attributes) and directories (the characteristics, or dimensions, of the data).
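
The difference between dimensions and data can be illustrated with a small Python sketch. The records, the field names, and the function `total_by` are hypothetical and serve only as an example: the same "data" value (sales volume) is totaled over whichever dimension is chosen.

```python
from collections import defaultdict

# Hypothetical sales-plan records: three dimensions and one data value ("volume").
PLAN = [
    {"product": "Tea", "region": "North", "month": "2011-01", "volume": 120},
    {"product": "Tea", "region": "South", "month": "2011-01", "volume": 80},
    {"product": "Coffee", "region": "North", "month": "2011-02", "volume": 50},
    {"product": "Coffee", "region": "South", "month": "2011-02", "volume": 70},
]

def total_by(records, dimension, measure="volume"):
    """Sum the data value for each distinct member of one dimension."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[dimension]] += rec[measure]
    return dict(totals)
```

Here `total_by(PLAN, "product")` and `total_by(PLAN, "region")` present the same figures in two different sections without re-entering anything.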

OLAP

The abbreviation OLAP (online analytical processing) literally translates as "real-time analytical processing". The definition is not very specific; almost any report of any software product can be made to fit it. In essence, OLAP means a technology for working with special reports, including the software, for obtaining and analyzing multidimensional structured data. One of the popular software products that implement OLAP technologies is SQL Server Analysis Services. Some even mistakenly consider it the only software implementation of this concept.

Virtual Data Cube

"Virtual cube" (multidimensional cube, OLAP cube) is a technical term coined by some specialized software vendors. OLAP systems usually prepare and store data in their own structures, and special analysis interfaces (such as Excel summary reports) access the data from these virtual cubes. At the same time, such dedicated storage is not at all necessary for processing multidimensional information. In general, a virtual cube is an array of specially optimized multidimensional data that is used to create summary reports. It can be obtained both through specialized software tools and through simple access to database tables or any other source, such as an Excel spreadsheet.

Pivot table

"Pivot Report" (pivot table, Pivot Table) is a user interface for displaying multidimensional data. Using this interface, you can group, sort, filter, and rearrange data in order to obtain various analytical selections. The report is updated using simple user interface tools, the data is automatically aggregated according to the specified rules, and no additional entry or re-entry of information is required. The Excel pivot table interface is perhaps the most popular software product for working with multidimensional data. It supports both external data sources (OLAP cubes and relational databases) and internal spreadsheet ranges as a data source. Starting with version 2000 (9.0), Excel also supports a graphical form of displaying multidimensional data, the Pivot Chart.
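
As a rough illustration of "specially optimized multidimensional data", the Python sketch below pre-aggregates a flat table for every combination of dimensions. It is a toy model under stated assumptions, not how any particular OLAP server stores its cubes; the function `build_cube` and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_cube(records, dimensions, measure):
    """Pre-aggregate the measure for every subset of the given dimensions.

    Returns {dimension_names: {member_values: total}}.
    The empty tuple of dimensions holds the grand total.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for rec in records:
                key = tuple(rec[d] for d in dims)
                totals[key] += rec[measure]
            cube[dims] = dict(totals)
    return cube

# Hypothetical flat source table (for example, rows of an Excel sheet).
SALES = [
    {"product": "Tea", "buyer": "Shop A", "carrier": "Fast", "kg": 10.0},
    {"product": "Tea", "buyer": "Shop B", "carrier": "Fast", "kg": 5.0},
    {"product": "Coffee", "buyer": "Shop A", "carrier": "Slow", "kg": 2.5},
]

cube = build_cube(SALES, ("product", "buyer", "carrier"), "kg")
```

A report then reads ready-made totals, for example `cube[("product",)][("Tea",)]`, instead of rescanning the source rows. With three dimensions the sketch stores 8 groupings; the number doubles with each added dimension.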

The PivotTable interface implemented in Excel allows you to arrange dimensions of multidimensional data in the worksheet area. For simplicity, you can think of a pivot table as a report lying on top of a range of cells (in fact, cell formats are bound in a certain way to the fields of the pivot table). An Excel PivotTable has four display areas: filter, columns, rows, and data. Data dimensions are called pivot table fields. These fields have their own properties and display format.

Once again, I would like to draw your attention to the fact that an Excel pivot table is intended solely for data analysis, without the possibility of editing information. The term "pivot report" (Pivot Report) would be closer in meaning, and that is what this interface was called until 2000. But for some reason the developers abandoned it in subsequent versions.

Editing pivot tables

By definition, OLAP technology does not, in principle, imply the possibility of changing the source data when working with reports. Nevertheless, a whole class of software systems has emerged on the market that implements both analysis and direct editing of data in multidimensional tables. Such systems are mostly aimed at budgeting tasks.

Using Excel's built-in automation tools, you can solve many non-standard tasks. An example implementation of editing for Excel pivot tables based on worksheet data can be found on our website.

Preparing multidimensional data

Let's move on to the practical application of pivot tables. Let's try to analyze sales data along different dimensions. The file pivottableexample.xls consists of several sheets. The Example sheet contains basic information about sales for a certain period. To keep the example simple, we will analyze a single numerical indicator, the sales volume in kg. The following key data dimensions are available: product, buyer, and carrier (transport company). In addition, there are several additional data dimensions that are attributes of the product (type, brand, category, supplier) and of the buyer (type). This data is collected on the Directories sheet. In practice, there may be many more such dimensions.

The Example sheet contains a standard data analysis tool, the autofilter. Looking at how the table is filled in, it is obvious that sales data by date (the dates are arranged in columns) lends itself to ordinary analysis. In addition, using the autofilter, you can try to total the data by combinations of one or more key criteria. But there is no information at all about brands, categories, and types. It is not possible to group data with automatic totals by a specific key (for example, by buyer). In addition, the set of dates is fixed, and summary information for an arbitrary period, for example 3 days, cannot be viewed by automatic means.

In general, the predefined placement of dates in this example is the main drawback of the table. By arranging the dates in columns, we have in effect fixed the dimensionality of this table in advance, thus depriving ourselves of the opportunity to analyze it with pivot tables.

First, we need to get rid of this shortcoming, i.e. remove the predefined placement of one of the dimensions of the source data. An example of a valid table is the Sales sheet.

The table has the form of a data entry log. Here, the date is a data dimension on an equal footing with the others. It should also be noted that for subsequent analysis in pivot tables, the position of rows relative to each other (in other words, the sort order) does not matter at all. Records in relational databases have the same properties. The pivot table interface is aimed primarily at analyzing large volumes of database data. Therefore, you must adhere to these rules when working with a data source in the form of cell ranges. At the same time, nothing prevents you from using Excel interface tools in your work: pivot tables analyze only the data, while formatting, filters, grouping, and sorting of the source cells can be arbitrary.
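
The conversion from a table with dates in columns to a log with one row per date can be sketched in Python as follows. The sample rows and the function `unpivot` are hypothetical; the layout of the actual example workbook may differ.

```python
def unpivot(rows, key_fields, column_name="date", value_name="kg"):
    """Turn a cross-tab (one column per date) into log records (one row per date)."""
    log = []
    for row in rows:
        keys = {k: row[k] for k in key_fields}
        for column, value in row.items():
            if column in key_fields or value in (None, ""):
                continue  # skip the key columns and empty cells
            log.append({**keys, column_name: column, value_name: value})
    return log

# Hypothetical cross-tab: the date dimension is fixed in the column headers.
CROSSTAB = [
    {"product": "Tea", "buyer": "Shop A", "carrier": "Fast",
     "2011-01-10": 10, "2011-01-11": None},
    {"product": "Coffee", "buyer": "Shop B", "carrier": "Slow",
     "2011-01-10": 3, "2011-01-11": 4},
]

log = unpivot(CROSSTAB, ("product", "buyer", "carrier"))
```

In the result, the date becomes an ordinary field of each record, so it can be grouped and filtered like any other dimension.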

From autofilter to summary report

Theoretically, the data on the Sales sheet can already be analyzed in three dimensions: products, buyers, and carriers. This sheet has no data on the properties of products and buyers, so they cannot be shown in the pivot table. In the normal mode of creating a pivot table from source data, Excel does not allow you to link data from multiple tables by certain fields. You can get around this limitation programmatically - see the example supplementing this article on our website. To avoid resorting to programmatic methods of processing information (especially since they are not universal), add the additional characteristics directly to the log table - see the SalesAnalysis sheet.

The VLOOKUP function makes it easy to supplement the original data with the missing characteristics. Now, by applying the autofilter, you can analyze the data in different dimensions. But the problem of groupings remains unresolved. For example, tracking totals by brand only for certain dates is quite problematic. If you limit yourself to Excel formulas, you need to build additional selections using the SUMIF function.

Now let's see what the pivot table interface offers. On the CodeAnalysis sheet, several reports are built from the cell range containing the SalesAnalysis sheet data.

The first analysis table is built through the Excel 2007 interface Ribbon \ Insert \ PivotTable (in Excel 2000-2003, the menu Data \ PivotTable).

The second and third tables are created by copying and then customizing the first. The data source for all tables is the same. You can check this by changing the source data and then refreshing the summary reports.
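
What the formulas do here can be mirrored in a short Python sketch: a dictionary lookup plays the role of VLOOKUP, and a conditional sum plays the role of SUMIF. The directory, the sales records, and the functions `enrich` and `sum_if` are hypothetical illustrations, not the contents of the example workbook.

```python
# Hypothetical directory, as on a "Directories" sheet: product -> attributes.
PRODUCTS = {
    "Green tea": {"brand": "Leaf", "category": "Tea"},
    "Black tea": {"brand": "Leaf", "category": "Tea"},
    "Espresso": {"brand": "Bean", "category": "Coffee"},
}

# Hypothetical sales log without product attributes.
SALES = [
    {"date": "2011-01-10", "product": "Green tea", "kg": 10},
    {"date": "2011-01-10", "product": "Espresso", "kg": 4},
    {"date": "2011-01-11", "product": "Black tea", "kg": 6},
]

def enrich(sales, directory, fields):
    """VLOOKUP analogue: copy directory attributes into each sales record."""
    result = []
    for rec in sales:
        attrs = directory.get(rec["product"], {})
        result.append({**rec, **{f: attrs.get(f) for f in fields}})
    return result

def sum_if(records, measure, **criteria):
    """SUMIF analogue: total the measure over records matching all criteria."""
    return sum(
        r[measure] for r in records
        if all(r[field] == value for field, value in criteria.items())
    )

enriched = enrich(SALES, PRODUCTS, ("brand", "category"))
```

Every new question, such as `sum_if(enriched, "kg", brand="Leaf", date="2011-01-10")`, needs its own call, just as every new selection in Excel needs its own SUMIF formula; this is the grouping problem that pivot tables remove.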

From our point of view, the gain in clarity is obvious. You can swap filters, columns, and rows, hide certain groups of values from any dimension, and use manual drag and drop as well as automatic sorting.
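
The way a pivot table lets you swap rows, columns, and filters over one data source can be modeled with a minimal Python sketch. The function `pivot` and the sample records are hypothetical; Excel's own implementation is of course far richer.

```python
from collections import defaultdict

def pivot(records, row_field, column_field, measure, filters=None):
    """Total the measure into a rows-by-columns table, after applying filters."""
    filters = filters or {}
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        if any(rec[f] != v for f, v in filters.items()):
            continue  # the record is excluded by the filter area
        table[rec[row_field]][rec[column_field]] += rec[measure]
    return {row: dict(cols) for row, cols in table.items()}

# Hypothetical log records with attributes already added.
SALES = [
    {"brand": "Leaf", "buyer": "Shop A", "carrier": "Fast", "kg": 10.0},
    {"brand": "Leaf", "buyer": "Shop B", "carrier": "Fast", "kg": 5.0},
    {"brand": "Bean", "buyer": "Shop A", "carrier": "Slow", "kg": 2.0},
    {"brand": "Leaf", "buyer": "Shop A", "carrier": "Slow", "kg": 1.0},
]
```

Swapping the layout is just a change of arguments: `pivot(SALES, "brand", "buyer", "kg")` and `pivot(SALES, "buyer", "brand", "kg")` use the same source, and `filters={"carrier": "Fast"}` corresponds to a field placed in the filter area.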

Properties and Formatting

In addition to displaying the data itself, there is a large set of options for the appearance of pivot tables. Extra data can be hidden using filters. For a single element or field, it is easier to use the context menu item Delete (Hide in versions 2000-2003).

It is also advisable to set the display of other pivot table elements not through cell formatting, but through the settings of the pivot table field or element. To do this, move the mouse pointer to the desired element, wait for the special cursor shape (an arrow) to appear, then select the element with a single click. After selecting it, you can change its appearance through the ribbon or the context menu, or call the standard cell format dialog:

In addition, Excel 2007 introduced many predefined PivotTable display styles:

Notice that the control filters and drag areas are active in the chart.

Access to external data

As already noted, perhaps the greatest effect from the use of pivot tables can be obtained when accessing data from external sources - OLAP cubes and database queries. Such sources usually store large amounts of information, and also have a predefined relational structure, which makes it easy to define dimensions of multidimensional data (pivot table fields).

Excel supports many types of external data sources:

The greatest effect from the use of external sources of information can be achieved by using automation tools (VBA programs) both for obtaining data and for their preliminary processing in pivot tables.

Working with an OLAP cube in MS Excel

1. Get permission to access the SQL Server Analysis Services (SSAS) OLAP cube.
2. MS Excel 2016 / 2013 / 2010 must be installed on your computer (MS Excel 2007 is also possible, but it is inconvenient to work in, and MS Excel 2003 has very limited functionality).
3. Open MS Excel and launch the wizard for setting up a connection to the analytical service:


3.1 Specify the name or IP address of the current OLAP server (sometimes you need to specify an open port number, for example, 192.25.25.102:80); domain authentication is used:


3.2 Select a multidimensional database and an analytical cube (if you have access rights to the cube):


3.3 The settings for connecting to the analytical service will be saved in an odc file on your computer:


3.4 Select the type of report (pivot table/graph) and specify the location for its placement:


If a connection has already been created in the Excel workbook, then it can be used again: the main menu "Data" -> "Existing connections" -> select the connection in this workbook -> insert the pivot table into the specified cell.

4. Once you have successfully connected to the cube, you can start interactive data analysis:


When starting interactive data analysis, you need to determine which fields will form the rows, columns, and filters (pages) of the pivot table. In general, the PivotTable is three-dimensional: the third dimension can be thought of as perpendicular to the screen, and we see sections parallel to the plane of the screen, determined by which "page" is selected for display. Filtering is done by dragging the appropriate dimension attributes to the report's filters area. Filtering limits the space of the cube and reduces the load on the OLAP server, so it is preferable to set the necessary filters first. You then place dimension attributes in the row and column areas and measures in the PivotTable data area.


Every time the PivotTable changes, an MDX statement is automatically sent to the OLAP server, which executes it and returns the data. The larger and more complex the data and calculated indicators being processed, the longer the query takes. You can cancel a running query by pressing the Escape key. The most recent operations can be undone (Ctrl+Z) or redone (Ctrl+Y).


As a rule, for the most frequently used combinations of dimension attributes, the cube stores pre-calculated aggregated data, so the response time of such queries is several seconds. However, it is impossible to pre-calculate all possible combinations of aggregations, since that would require too much time and storage space. Bulk requests for data at the detail level can require significant server computing resources, so they can take a long time. After reading data from disk, the server places it in a RAM cache, which allows subsequent requests of the same kind to be executed almost instantly, since the data is retrieved from the cache.
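
The effect of the cache can be pictured with a small Python sketch. It is only an analogy built on `functools.lru_cache`, not a description of how the OLAP server is implemented; the fact rows, the function `total_for`, and the `STATS` counter are hypothetical.

```python
from functools import lru_cache

# Hypothetical detail rows: (product, region, kg).
FACTS = (
    ("Tea", "North", 10),
    ("Tea", "South", 5),
    ("Coffee", "North", 3),
)

# Counts how many times the detail rows were actually scanned.
STATS = {"scans": 0}

@lru_cache(maxsize=128)
def total_for(product=None, region=None):
    """Aggregate kg for a product/region combination; None means "all members"."""
    STATS["scans"] += 1  # stands in for an expensive read from disk
    return sum(
        kg for p, r, kg in FACTS
        if (product is None or p == product) and (region is None or r == region)
    )
```

The first call with a given combination scans the detail rows; repeating the same call returns the stored result without another scan, which is why a repeated query comes back much faster.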


If you think that your request will be used frequently and its execution time is unsatisfactory, you can contact the analytical development support service to optimize the execution of the request.


After placing the hierarchy in the rows/columns area, it is possible to hide the individual levels:


Key attributes of dimensions (less often, attributes higher in the hierarchy) can have properties - descriptive characteristics that can be displayed both in tooltips and as fields:


If you want to display several field properties at once, you can use the corresponding dialog list:


User Defined Sets

Excel 2010 introduces the ability to interactively create custom (user-defined) sets from dimension members:


Unlike sets created and stored centrally on the side of the cube, custom sets are stored locally in an Excel workbook and can be used later:


Advanced users can create sets using MDX constructs:


Setting PivotTable Properties

The "PivotTable Options..." item of the context menu (right-click within the pivot table) lets you customize the pivot table, for example:
- "Display" tab, "Classic PivotTable layout" option - the pivot table becomes interactive, and you can drag and drop fields (Drag&Drop);
- "Display" tab, "Show items with no data on rows" option - the pivot table will display empty rows that contain no indicator values for the corresponding dimension members;
- "Layout & Format" tab, "Preserve cell formatting on update" option - cell formats that you override in the pivot table are kept when the data is refreshed.


Creating PivotCharts

For an existing OLAP pivot table, you can create a pivot chart - pie, bar, column, line, scatter, and other types of charts:


In this case, the pivot chart is synchronized with the pivot table: when you change the composition of indicators, filters, or dimensions in the pivot table, the pivot chart is updated as well.

Creating Dashboards

Select the original pivot table, copy it to the clipboard (Ctrl+C), and paste a copy (Ctrl+V), in which we will change the set of indicators:


To manage multiple pivot tables at the same time, let's insert a slicer (functionality available starting with MS Excel 2010). Let's connect the slicer to the pivot tables: right-click within the slicer and select "Connections to Pivot Table ..." from the context menu. Note that there can be several slicer panels, and they can serve PivotTables on different sheets at the same time, which lets you create coordinated dashboards.


Slicer panels can be configured: select the panel, then see "Size and properties...", "Slicer settings", and "Assign macro" in the right-click context menu or in the "Options" item of the main menu. You can set the number of columns for the slicer elements (buttons) and the sizes of the buttons and the panel, choose a color scheme and design style for the slicer from the existing set (or create your own style), define your own panel title, and assign a macro that extends the panel's functionality.


Executing an MDX query from Excel

  1. First of all, perform the DRILLTHROUGH operation on some indicator, i.e. go down to the detailed data (the detailed data is displayed on a separate sheet), and open the list of connections;
  2. Open the connection properties and go to the "Definition" tab;
  3. Select the default command type and place a pre-prepared MDX query in the command text field;
  4. When the button is pressed, the query syntax and your access rights are checked, after which the query is executed on the server and the result is presented on the current sheet as a regular flat table.
    You can view the text of the MDX query generated by Excel by installing a free add-on, which also provides other additional functionality.
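
The DRILLTHROUGH operation from step 1 can be pictured as going from one aggregated cell back to the detail rows behind it. The Python sketch below is a conceptual model only; the function `drillthrough` and the sample rows are hypothetical, and the real operation is executed by the OLAP server.

```python
def drillthrough(records, cell):
    """Return the detail records behind one aggregated cell.

    `cell` maps dimension names to the member values that identify the cell.
    """
    return [r for r in records if all(r[d] == v for d, v in cell.items())]

# Hypothetical detail rows stored behind the cube.
DETAILS = [
    {"invoice": 101, "product": "Tea", "buyer": "Shop A", "kg": 10},
    {"invoice": 102, "product": "Tea", "buyer": "Shop A", "kg": 1},
    {"invoice": 103, "product": "Tea", "buyer": "Shop B", "kg": 5},
    {"invoice": 104, "product": "Coffee", "buyer": "Shop A", "kg": 2},
]

rows = drillthrough(DETAILS, {"product": "Tea", "buyer": "Shop A"})
```

The result is a flat table of the rows whose total is shown in the pivot cell, which matches the form in which Excel places drillthrough results on a separate sheet.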

Translation into other languages

The analytical cube supports localization into Russian and English (localization into other languages is possible if necessary). Translations apply to the names of dimensions, hierarchies, attributes, folders, and measures, as well as to members of individual hierarchies if translations for them exist on the side of the accounting systems / data warehouse. To change the language, open the connection properties and add the following option to the connection string:
Extended Properties="Locale=1033"
where 1033 - English localization
1049 - Russian localization
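
If the connection string has to be changed for many workbooks, the edit can be scripted. The Python sketch below only manipulates the text of a connection string; the function `set_locale` is hypothetical, and it assumes that Locale is the only value kept in Extended Properties, because any existing Extended Properties entry is replaced.

```python
import re

def set_locale(connection_string, lcid):
    """Return the connection string with Extended Properties="Locale=<lcid>".

    Assumption: Locale is the only extended property in use, so an existing
    Extended Properties entry is dropped and written again.
    """
    cleaned = re.sub(
        r';?\s*Extended Properties="[^"]*"',
        "",
        connection_string,
        flags=re.IGNORECASE,
    )
    cleaned = cleaned.strip(" ;")
    return f'{cleaned};Extended Properties="Locale={lcid}"'
```

For example, with a made-up string such as `Provider=MSOLAP;Data Source=olap-server;Initial Catalog=SalesDB`, calling `set_locale(..., 1033)` appends the English locale option, and calling it again with 1049 switches the same string to Russian.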


Additional Excel Extensions for Microsoft OLAP

The capabilities for working with Microsoft OLAP cubes expand if you use additional extensions, for example OLAP PivotTable Extensions, which adds fast search within a dimension:


website 2011-01-11 16:57:00Z Last modified: 2017-10-15 16:33:59Z Age audience: 14-70
