Vlamis to Present at BIWA Summit January 31 - February 2, 2017!

Vlamis Software Solutions has multiple sessions accepted for BIWA Summit 2017. BIWA Summit takes place from January 31 - February 2, 2017 at Oracle headquarters in Redwood Shores, CA.

The following sessions will be presented this year:

  • Data Visualization for Oracle BI 12c and Visual Analyzer - Tim Vlamis and Dan Vlamis
  • Business Analytics in the Oracle 12.2 Database: Analytic Views - Dan Vlamis and Cathye Pendley
  • Data Visualization at SoundExchange - A Case Study - Cathye Pendley, Olly Wray, and Luis Bonilla
  • From Zero to Oracle BI Cloud Administrator Hero - Jonathan Clark and Arthur Dayton
  • Moving Virtual Machines to the Cloud (and Back) Painlessly? - Arthur Dayton and Dan Vlamis
  • Clustering Data with Oracle Data Mining and Oracle Business Intelligence - Tim Vlamis
  • Forecasting and Time Series Analysis in Oracle Business Intelligence - Tim Vlamis and Dan Vlamis
  • Oracle BI at Jet Propulsion Laboratory - Dan Vlamis and Cathye Pendley
  • Deploy Custom Maps in OBIEE for Free - Arthur Dayton and Dan Vlamis
  • Learn Predictive Analytics in 2 hours!! Oracle Data Miner Hands on Lab - Charlie Berger, Tim Vlamis, Brendan Tierney, and Karl Rexer

Visit www.biwasummit.org to view the preliminary session schedule and register for BIWA Summit 2017. Abstracts for our presentations are available on our presentations page.

We are very excited to have the opportunity to present at BIWA Summit 2017! Looking forward to seeing you at Oracle headquarters!

Using the Oracle Database for an Analytic Warehouse

Dan and Tim Vlamis, Vlamis Software Solutions, Inc.

November 10, 2016

Introduction:

Data Warehouses have evolved since their heydays in the 1980s. Many people have given up on the dream of capturing and codifying into one place all the internal data for an organization and consider the task to be too difficult. In the quest to capture everything, they have focused almost exclusively on how to get data into a data warehouse through ETL processes. Instead, we believe the value lies not in storing data, but rather in organizing it, using it to calculate new measures, and facilitating its presentation to users. By focusing on results and uses, you can achieve what organizations truly care about: fast, deep, meaningful analyses.

Comparison between Data Warehouse and Analytic Warehouse:

A traditional data warehouse focuses on storing as much business data as possible. We propose instead identifying the data deemed most useful for affecting business decisions and focusing on analyzing that data. We call a purpose-built warehouse of data for analysis an analytic warehouse. Most processing for a traditional data warehouse is done externally, in ETL (Extract, Transform, Load) processes. Most processing in an analytic warehouse is done internally, by processes that:

  • define relationships and structures to organize data elements
  • create new measures, fields, and structures through analytical processing

Additional differences between traditional data warehouses and analytic warehouses sometimes include the following:

  • In traditional data warehouses, one of the most important metadata facts about a particular data element is its source or provenance. A data element’s importance is directly related to how recently it was stored and, sometimes, to how much it was replicated. In analytic warehouses, the most important information about a particular data element is its use, either in a subsequent analytic process or in a call as a raw data element. The more times an element is used in a calculation or is called for presentation to a user, the more important it is.
  • When traditional data warehouses are significantly larger than their sources, it’s almost always because they replicate data and store it multiple times. Analytic warehouses can also be larger than their sources, but when they are, it’s almost always because they create new data elements and structures through descriptive, diagnostic, predictive, and prescriptive analytic workflows and machine learning techniques.
  • Traditional data warehouses were defined by the inclusion of data. Analytic warehouses often make calls on external data stores for analytic workflows. There is often no need to store the data in the warehouse; just use it to produce new data elements and structures. This emphasis on data federation and analytic processing is a key difference between traditional data warehouses and analytic warehouses.

In short, traditional data warehouses are about getting all available data in for storage; analytic warehouses are about getting valuable data out for presentation.

Description of Analytic Warehouse:

Analytic warehouses are characterized by the following:

  • Organization around logical structures designed for analysis
  • A distinction between the processing/query engine (which can reach out to many sources) and the storage layer (which contains the most valuable and frequently accessed data)
  • Lots of derived measures, comparative values, and the generation of new data elements and structures
  • Emphasis on relationships, hierarchies, and structures (both discovered and assigned) within and between data elements
  • Emphasis on the fast processing and delivery of queries
  • Ability to federate data and execute queries and analytic processes in external data storage systems
  • Ability to perform complex statistical, graphical, and advanced mathematical processes in parallel

Analytic warehouses often contain dimensional structures that organize data into hierarchical taxonomies and include aggregations at different levels. Facilitating the fast presentation of data at different levels of aggregation is a primary purpose of analytic warehouses. While counting and aggregation are important internal processes, newer machine learning, statistical analytics, and data mining functions are now critical to delivering value to business users.
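To make the idea of aggregation at different levels concrete, here is a minimal Python sketch that pre-computes the same measure at three levels of a Region-to-City hierarchy. The fact rows, level names, and values are all invented for illustration; a real analytic warehouse would do this inside the database.

```python
from collections import defaultdict

# Toy fact table: (region, city, sales amount). All values are invented.
facts = [
    ("West", "Denver", 120.0),
    ("West", "Phoenix", 80.0),
    ("East", "Boston", 200.0),
    ("East", "Atlanta", 50.0),
]

# Pre-aggregate the measure at three levels of the hierarchy, the way an
# analytic warehouse might, so queries at each level answer instantly.
by_city = defaultdict(float)    # leaf level
by_region = defaultdict(float)  # intermediate level
grand_total = 0.0               # top level

for region, city, amount in facts:
    by_city[(region, city)] += amount
    by_region[region] += amount
    grand_total += amount

print(by_region["West"], by_region["East"], grand_total)
# prints: 200.0 250.0 450.0
```

A query for "West" sales never has to touch the leaf rows; that is the essence of facilitating fast presentation at different levels of aggregation.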

By focusing on a subset of an organization’s data, analytic warehouses can be surprisingly easier to build than full data warehouses. Roughly 80% of the value may be obtained with only 20% of the data. If data elements are not often used for analysis, it’s better to leave them in their source systems and access them through an external process on the rare occasions they are needed.

Use of Analytic Warehouse:

Whereas a traditional data warehouse supports historic reporting, an analytic warehouse focuses on predictive and prescriptive analytics. Analytic warehouses may also be used for historic reporting, but they are designed to predict the future. Often the emphasis is on additional calculated measures, forecasts, or predictions. Depending on the technology used, this may mean that measures in an analytic warehouse do not have the same precision as data in a data warehouse. For example, a data warehouse may report that a customer bought $23,753.03 worth of gadgets in the past year, while an analytic warehouse may compute that next year’s gadget sales will fall between $24,000 and $28,000, with a 95% probability associated with that assertion. Since this value is not truly known, we may end up calculating it on the fly instead of storing it. Even though the value is an estimate, the fact that it is directionally higher than last year’s may be very valuable.
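As a sketch of the kind of on-the-fly estimate described above, the following Python snippet projects a range for next year's sales from historical growth rates. This is a deliberately simple illustrative method, not what any particular Oracle feature computes, and the sales history is invented (the last value echoes the $23,753.03 example).

```python
import statistics

# Invented yearly gadget sales for one customer.
history = [18_100.00, 19_850.50, 21_400.75, 23_753.03]

# Year-over-year growth rates.
growth = [b / a for a, b in zip(history, history[1:])]
mean_g = statistics.mean(growth)
sd_g = statistics.stdev(growth)

# Point forecast plus a rough ~95% range (mean growth +/- 2 std devs).
point = history[-1] * mean_g
low = history[-1] * (mean_g - 2 * sd_g)
high = history[-1] * (mean_g + 2 * sd_g)
print(f"next year: about {point:,.0f} (range {low:,.0f} to {high:,.0f})")
```

The exact interval matters less than the direction: the whole range sits above last year's actual, which is the directional insight the paragraph describes.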

An important use for an analytic warehouse is comparison. Historic reporting often emphasizes precision and absolute values in isolation. Analytic warehouses emphasize comparisons within a context. Sometimes the context is historical (e.g., sales are trending up) and sometimes it is in the future (e.g., sales for California are likely to exceed those for Texas in two years).

Oracle Database and Options for the Analytic Warehouse

An Oracle database is an excellent foundation for an analytic warehouse. In addition to the traditional relational data store, modern Oracle databases can store spatial data, dimensions, and cubes, and can calculate analytics directly in the database, eliminating the need to move data between servers. Specifically, the Advanced Analytics option of Oracle Database Enterprise Edition has a host of embedded analytic algorithms, such as clustering, regression, attribute importance, and more. Moreover, Oracle R Enterprise brings the power of the R language to the Oracle Database. With the Database In-Memory option, tables can be placed in memory, and special memory-optimized algorithms can aggregate data as it is being read, for extremely fast aggregation on the fly. Analytic Views (new in release 12.2) enable OLAP-style calculations on fact tables without having to store data in a multi-dimensional format. Oracle Database can process XML and JSON data and, through Big Data SQL, can interact with external data stores such as Hadoop and NoSQL databases and integrate Apache Spark processes. This also allows Oracle Database’s advanced security features (e.g., VPD and redaction) to be used with these data sources. All of these capabilities make the Oracle Database an optimal vehicle for realizing the analytic warehouse.

Analytic Warehouses in the Oracle Cloud

The movement towards storing and processing data in the cloud makes analytic warehouses even more compelling.  The High Performance Package of the Oracle Database Cloud Service contains all of the options necessary for an analytic warehouse.  The Extreme Performance Package adds the Database In Memory option for additional performance.  An Oracle Database in the cloud can reach out to other cloud data sources, as well as external databases, without having to replicate data.  In addition, cloud databases enable solutions that can scale to any size without having to pre-determine storage and compute requirements for an analytic warehouse.

Formatting SQL with Notepad++

Tidbits

If you have ever had to format SQL in various editors of various flavors, you have most likely experienced severe frustration at the inability to apply your personal formatting preferences to the SQL.

I experienced just that issue not long ago while trying to format T-SQL to line break at every 3rd comma. Thankfully, with the help of every programmer's best friend (Notepad++) and most programmers' source of utter befuddlement (regular expressions), I was able to solve the issue easily.

Imagine you have a SQL statement that looks like this -
select [Field1],[Field2],[Field3],[Field4],[Field5],[Field6]
    from [Some_Table]

but imagine it's more like 65 fields, with a preceding INSERT statement and a trailing GROUP BY clause, and the editor you are using doesn't support formatting in any useful way. You could format it manually, then do it all again when the next statement arrives and jam a pen in your eyeball, or you could use the dark and mysterious art of regular expressions.

So to insert a line break after every third comma, put your SQL into Notepad++, use the TextFX "Unwrap text" option to get it all on a single line, sacrifice a chicken while dancing backwards around 13 tuples chanting "Chamberlin Boyce Codd", and enter the following in the Find / Replace dialog in Notepad++ (with the "Regular expression" search mode selected):

Find what: ((.*?,.*?)(.*?,.*?)[,])
Replace with: \1\r\n

The ((.*?,.*?)(.*?,.*?)[,]) pattern finds and selects everything from the starting point up to and including the 3rd comma, and \1\r\n replaces that match with itself followed by a newline.
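For anyone who'd rather script it than click through Notepad++, the same substitution can be sketched with Python's re module. The pattern writes the lazy quantifiers out explicitly, and the field names are just placeholders:

```python
import re

# Match everything up to and including the 3rd comma (the lazy .*?
# quantifiers keep each piece from swallowing extra commas), then
# replace the match with itself plus a newline. Notepad++ uses \r\n.
pattern = r"((.*?,.*?)(.*?,.*?)[,])"

sql = "select [Field1],[Field2],[Field3],[Field4],[Field5],[Field6] from [Some_Table]"
wrapped = re.sub(pattern, r"\1\n", sql)
print(wrapped)
# select [Field1],[Field2],[Field3],
# [Field4],[Field5],[Field6] from [Some_Table]
```

Note that the trailing fields stay on one line: fewer than three commas remain, so the pattern simply stops matching, which is exactly the behavior you want.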

Why does it work? Magic of course. Hopefully this helps some poor soul.

If you want to learn more I borrowed my logic from -

Notepad++: A guide to using regular expressions and extended search mode

Replace every nth instance of a character with a new line in Notepad++

DVD 12.2.2 and OBIEE 12.2.1.2 New Features Webinar

Join Dan Vlamis, Oracle ACE Director, as he breaks down the exciting new features in Data Visualization Desktop 2.0 (internally known as 12.2.2) and the 12.2.1.2 release of Oracle Business Intelligence Enterprise Edition, available from Oracle starting October 2016.

Expect a fast-paced presentation with live demos of DVD and OBIEE on a new SampleApp image, covering as many of the new features as time allows, including:

DVD 2.0
• Improved data prep with Data Flows
• New, more intuitive pages for creating data sources
• 25+ new data sources
• New smart insights feature to help you better understand data
• 8 new visualizations
• New BI Ask feature
• Multiple canvases with an easier-to-use interface
• Printing, email, and export-to-PDF and PowerPoint options
• New SDK for developing 1st Class Custom Visualizations

OBIEE 12.2.1.2
• Export/Import Visual Analyzer projects
• Single sign-on to Visual Analyzer

Just like our previous demonstrations of new BI releases (see www.vlamis.com/obiwebinars for a list), we will give a live demonstration of as many of these features as we have time for.

The DVD 12.2.2 and OBIEE 12.2.1.2 New Features Webinar is scheduled for Tuesday, November 22, 2016, at 11:00 AM Central. The replay of this webinar is now available.