
Wednesday, March 07, 2012

BI Vendor QlikTech Reveals QlikView Pricing: I Modestly Help to Clarify

Business Intelligence software vendor QlikTech* officially published its price list last month, after years of keeping it a not-very-closely-held secret. I was personally pleased, since people occasionally ask me what QlikView costs.  But then I looked more closely at the list and realized I wasn’t quite certain what it meant. Happily, it didn’t take long to set up a briefing and clarify matters. Just in case anyone else is also confused, here’s what they told me:

- The lowest entry price for a fully functional version is $1,350. Although this is called a “Named User License”, it does NOT require connection to a server—the specific point I wasn’t sure of. What distinguishes this from the free Personal Edition is that the Named User License can read QlikView files created on other machines, while the Personal Edition cannot. Thus, a company could buy two Named User Licenses for $2,700 and those systems could share files back and forth.  Let me state clearly that as a confirmed QlikView fan, I think this is a terrifically low entry price.

- Companies with many infrequent users can purchase a “Concurrent License” that allows one user at a time for $15,000. This figure is so high that I thought it might be a typographic error, but QlikTech assures me it’s correct. In fact, they say it’s a bargain because they’ve found many clients share one license among more than 11 users – roughly the point at which it beats buying everyone a $1,350 Named User License. These would be salespeople or other non-analysts who want to occasionally view a report. It seems to me that Murphy’s Law would ensure they all access the system during the same five minutes – presumably the evening before their monthly reports are all due – but I’ll take QlikTech’s word that this isn’t the case. After all, they're the analytical experts, eh?

- Companies who don’t want to spend $1,350 for casual users have some other options. These include a $350 “Document License” for one user for a single document, a $70,000 “Information Access Server” allowing unlimited users to access a single document (usually over a public Web site), and a $3,000 “Extranet Server Concurrent License” that allows one external user at a time to read documents on an $18,000 “Extranet Server”.

There are various other licenses for larger systems and special purposes. The descriptions are more or less self-explanatory, but of course you’d want to talk to QlikTech itself for detailed explanations.

One thing you definitely won’t find is a free or low-cost “reader” license that lets users view but not change a QlikView document. This is a disappointment to me personally, since we at Left Brain DGA use similar readers to send reports to our clients today. I can’t see those clients paying for Named User Licenses or Concurrent Licenses. But QlikTech is philosophically opposed to the idea of a limited-function reader, which it argues “goes against the current trend toward the democratization of software — in which line of business users can become as adept with an analytics tool as any business analyst or developer.” I can’t say I agree: QlikView takes considerable effort to learn, and many business users don’t have the time, need, or inclination to bother. They would be perfectly happy to consume existing reports without drilling any deeper, but are unlikely to pay $1,350 for the privilege.

I can’t judge how much business, other than mine, the lack of a reader is costing QlikTech. Surely some people end up buying the full Named User License and using it as a reader, which makes up for some of the people who don’t buy at all. QlikTech also has a strong argument that QlikView’s total cost of ownership is lower than competitors’, even at current pricing levels. The company grew 40% year-on-year as of its last financial report, so they’re clearly doing just fine with their existing approach.

_________________________________________________
* QlikTech is the company, QlikView is the product

Monday, December 06, 2010

QlikView's New Release Focuses on Enterprise Deployment

I haven’t written much about QlikView recently, partly because my own work hasn’t required using it and partly because it’s now well enough known that other people cover it in depth. But it remains my personal go-to tool for data analysis and I do keep an eye on it. The company released QlikView 10 in October and Senior Director of Product Marketing Erica Driver briefed me on it a couple of weeks ago. Here’s what’s up.

- Business is good. If you follow the industry at all, you already know that QlikView had a successful initial public stock offering in July. Driver said the purpose was less to raise money than to gain the credibility that comes from being a public company. (The share price has nearly doubled since launch, incidentally.) The company has continued its rapid growth, exceeding 15,000 clients and showing 40% higher revenue vs. the prior year in its most recent quarter. Total revenues will easily exceed $200 million for 2010. Most clients are still mid-sized businesses, which is QlikView’s traditional stronghold. But more big enterprises are signing on as well.

- Features are stable. Driver walked me through the major changes in QlikView 10. From an end-user perspective, none were especially exciting -- which simply confirms that QlikView already had pretty much all the features it needed.

Even the most intriguing user-facing improvements are pretty subtle. For example, there’s now an “associative search” feature that means I can enter client names in a sales rep selection box and the system will find the reps who serve those clients. Very clever and quite useful if you think about it, but I’m guessing you didn’t fall off your chair when you heard the news.

The other big enhancement was a “mekko” chart, which is a bar chart where the width of the bar reflects a data dimension. So, you could have a bar chart where the height represents revenue and the width represents profitability. Again, kinda neat but not earth-shattering.

Let me stress again that I’m not complaining: QlikView didn’t need a lot of new end-user features because the existing set was already terrific.

- Development is focused on integration and enterprise support. With features under control, developers have been spending their time on improving performance, integration and scalability. This involves geeky things like a documented data format for faster loads, simpler embedding of QlikView as an app within external Web sites, faster repainting of pages in the AJAX client, more multithreading, centralized user management and section access controls, better audit logging, and prebuilt connectors for products including SAP and Salesforce.com.

There’s also a new API that lets external objects display data from QlikView charts. That means a developer can, say, put QlikView data in a Gantt chart even though QlikView itself doesn’t support Gantt charts. The company has also made it easier to merge QlikView with other systems like Google Maps and SharePoint.

These open up some great opportunities for QlikView deployments, but they depend on sophisticated developers to take advantage of them. In other words, they are not capabilities that a business analyst -- even a power user who's mastered QlikView scripts -- will be able to handle. They mark the extension of QlikView from stand-alone dashboards to a system that is managed by an IT department and integrated with the rest of the corporate infrastructure.

This is exactly the "pervasive business intelligence" that industry gurus currently tout as the future of BI. QlikView has correctly figured out that it must move in this direction to continue growing, and in particular to compete against traditional BI vendors at large enterprises. That said, I think QlikView still has plenty of room to grow within the traditional business intelligence market as well.

- Mobile interface. This actually came out in April and it’s just not that important in the grand scheme of things. But if you’re as superficial as I am, you’ll think it’s the most exciting news of all. Yes, you can access QlikView reports on iPad, Android and Blackberry smartphones, including those touchscreen features you’ve wanted since seeing Minority Report. The iPad version will even use the embedded GPS to automatically select localized information. How cool is that?

Friday, September 11, 2009

A Heartwarming Story of Social Media, Family and QlikView

My son works as a sports researcher at a cable television network. His job seems mainly to be looking things up in online databases and, on broadcast days, watching several games simultaneously. It's nice work if you can get it.

In terms of technology, Microsoft Word and Excel meet most of his needs. But I did introduce him to QlikView several years back, and he learned enough to analyze statistics for his college basketball team. When QlikView introduced its free Personal Edition, he decided to use it at work to track a database of college recruiting prospects. Despite (or because of) his lack of technical background, and without any formal QlikView training, he created a very nice system to find prospects based on different characteristics and create ad hoc statistical summaries.

The centerpiece is a map that displays the number of recruits by state. Because this is QlikView, the map is automatically redrawn each time he makes a selection: so he can see recruits for a certain position, or going to a particular school, or whatever. This is the sort of thing that gets sports people excited. In fact, his colleagues were so pleased that there’s talk of using a version of the map on-air.

The only fly in this ointment was that neither he nor I could find a way to get the map to show the numbers for all the states simultaneously. We could get different sized bubbles reflecting the state counts, and we could see the actual figure for each state by hovering over it. Recognizing my own limits as a QlikView developer, I asked for help on the QlikView user forum and from a friend on the QlikView consulting staff. The consultant didn't think it was possible, so we let the matter drop.

Fast-forward one month, to yesterday, when I received a notification that someone had responded to my forum query with a solution. It took a couple of tries, and some additional help from forum members, to get it to work on my son’s map. But you can imagine how pleased we were when we finally saw the map as originally envisioned.

This story illustrates quite a bit about QlikView. Building the original map was easy – my son was able to do it with little help, even though QlikView was doing some very sophisticated processing under the hood. (Specifically, on-the-fly data aggregation along user-defined calculated dimensions, without touching the underlying database). But getting the system to do exactly what he needed did take some special knowledge. (He had to use the number of students by state as his primary dimension, not the X/Y map coordinates.) The adjustment took just a few minutes, but only a QlikView expert would realize that’s how you do it.
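For the technically curious, here is my best reconstruction of the chart pieces involved. These are QlikView chart expressions rather than load script, the field names (RecruitID, State, StateLatitude, StateLongitude) are made up, and I may not have the forum solution exactly right, so treat it as a sketch:

// Calculated dimension – the non-obvious part: count per state, not the map coordinates
=Aggr(Count(DISTINCT RecruitID), State)

// Position expressions for each bubble on the scatter chart laid over the map image
=Only(StateLongitude)
=Only(StateLatitude)

// With “Values on Data Points” switched on, every state’s count is drawn at once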

To generalize a bit more broadly, then, QlikView really does enable non-technical users to do amazing things, and it really is as powerful as its advocates (myself included) like to claim. But users do need some training to be effective – something that advocates are sometimes reluctant to admit.

The story also illustrates the value of social media. QlikView’s forum is an amazing source of help for users of all skill levels. It works because QlikView has a community of highly engaged advocates who are both expert in the product and willing to help each other.

The forum provides several strategic benefits for QlikView: it helps users become successful (thus driving wider adoption); it lets users succeed even if they don’t receive proper training (which many will not, particularly among users of the free Personal Edition); it reduces the need for paid support staff; and it provides a window into common problems and requirements. It also reinforces the commitment of the engaged users themselves, by publicly rewarding their contributions. Although I’ve never discussed the forum with QlikView management, they obviously understand these benefits well enough to justify their continued investments in it.

This isn’t to say that social media would provide the same value to everyone. QlikView fits several specific conditions – enthusiastic expert users, problems that can be solved fairly easily, etc. – that won’t always apply. But as an example of what social media can sometimes accomplish, QlikView is a great case study waiting to be written.

Monday, June 15, 2009

Cloud-Based QlikView Still Isn't Available as a Service

Summary: Pay-as-you-go pricing would make QlikView easier to buy, but the company doesn't offer this option. To make a stronger business case for the purchase, include the value of shifting work from IT to business users, and of producing results faster.

Last week’s post about QlikView 9.0 prompted an inquiry from a manager who has been trying for a year to convince his company to consider the product. Having run into this issue many times, I easily felt his pain and we speculated a bit on what might help things along.

One obvious tactic would be to purchase QlikView on a pay-as-you-go basis, presumably cloud-based. But a quick check with QlikView confirmed that they don’t allow this and have no plans to change.

The closest they come is to let their partners offer QlikView-based applications as a service. For example, they pointed me to SportsDataHub, which lets users analyze football statistics for $40 per year. But the key point about this and similar QlikView services is that you can only access data loaded by the partner. You can't define and load your own data sources directly. At best, you might be able to create your own reports based on the loaded data. (See QlikTech Marketing SVP Anthony Deighton's comment on this post for a little more on the subject.)

I don’t understand QlikView’s reluctance to adopt a Software-as-a-Service model. It has proven viable for many other software companies, including other business intelligence vendors. To me, it seems a natural extension of the company’s “seeing is believing” sales approach as well as a good way to sidestep the barriers raised by corporate IT.

In fact, QlikView’s tremendous ease-of-use makes it an excellent fit for the SaaS model, because business users can deploy it for themselves with minimal technical support. In our conversation last week, QlikTech's Deighton said the majority of clients already implement the system without purchasing any external services. If there was ever a piece of software suited to SaaS, this is it.

Be that as it may. The lack of a proper SaaS offering left my correspondent with several avenues to pursue:

- find a QlikView partner who would build an appropriate application and sell it to him on a services basis. This doesn’t seem very plausible because he probably won’t be able to commit enough funding to make the project worthwhile for the partner. I mean, if he had that much money, he could just buy the software outright in the first place.

- use an alternative system that costs less. Yes, QlikView is unique and wonderful, but products from ADVIZOR Solutions, Lyzasoft, Tableau Software and TIBCO Spotfire offer some of the same advantages at a much lower entry price. Again, this is far from ideal, and it might not work at all because I didn’t explore precisely which aspect of QlikView my correspondent found attractive. Still, it’s better than nothing.

(Vaguely related aside: today, people often cite author Jim Collins’ phrase “good is the enemy of great” as a reason to avoid compromise. Previously, I was more likely to see Voltaire’s “the best is the enemy of the good,” which means that compromise is better than nothing. I’m sure this reversal says something important about our society, although I can’t say what. You're welcome.)

- Find a way to sell QlikView internally. Of course, my correspondent had already been trying, so his question was whether I had any new ideas for how. This actually prompted some very deep thinking over the weekend, which will show up in my Information Management magazine column over the next several months. To summarize four pages in 100 words, there are two approaches to consider:

- do a cost of ownership analysis showing the savings from letting business users perform tasks currently done by IT. Traditional cost analysis compares the time it takes IT to do the work with one tool vs. another. This hides rather than highlights the advantages of QlikView and similar products.

- do a “time to result” analysis that measures the time spent waiting for IT to deliver solutions through multiple iterations. This applies to many analytical databases, not just QlikView, because their flexibility reduces the time spent building conventional BI structures like star schemas and data cubes.
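To make those two calculations concrete, here is a tiny sketch in QlikView script syntax, with every number invented purely for illustration:

// Hypothetical inputs
LET vITHoursPerIteration = 40;    // hours IT spends on each reporting iteration
LET vUserHoursPerIteration = 8;   // hours a business analyst needs doing it directly in QlikView
LET vIterations = 5;              // question-answer cycles before the result is right
LET vHourlyRate = 120;            // loaded cost per hour, either way

// Cost-of-ownership view: work shifted from IT to the business user
LET vCostSaved = ($(vITHoursPerIteration) - $(vUserHoursPerIteration)) * $(vIterations) * $(vHourlyRate);  // = 19,200

// Time-to-result view: calendar time not spent waiting in the IT queue
LET vWeeksSaved = $(vIterations) * 2;  // assuming each IT iteration waits about two weeks

Again, the point is the structure of the argument, not these particular numbers.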


Perhaps one of these will work. I hope so, because we could all benefit from finding ways to take advantage of what new technologies like QlikView have to offer.

Wednesday, June 10, 2009

QlikView 9.0 Reaches for Broader Business Intelligence Market

QlikTech released version 9 of its QlikView business intelligence software today. The product has been in public beta for several months, so the general features are well known to people who care about such things.

Probably the item that attracted the most advance attention is an iPhone version that supports interactive analysis; this also works for other Java Mobile clients like Blackberry. It's cool (or ‘qool’, if you must) but not so important in the grand scheme of things. More significant changes include:

- availability through the Amazon Elastic Compute Cloud (EC2), which lets companies order up a QlikView-equipped server in minutes. (Of course, they still have to purchase a QlikView license.) Users can also expand or reduce the number of servers to match fluctuating needs. Advantages include avoiding the wait for new hardware, no need to physically install a server, and the ability to meet peak demands without making a fixed investment.

- API for real-time updates of in-memory data. This is an extension of previous changes that allowed incremental batch updates and manual data entry. But it still marks a major step towards letting QlikView run time-critical applications such as stock trade analysis, pricing and inventory management. No one will be processing orders on QlikView (hmm, never say never), but the line between analytical and transaction databases just got that much thinner.

- enhanced support for enterprise-level deployments. This includes centralized control panels for multiple servers; load balancing and fail-over; better thin-client support; multi-billion-row data sets; and more efficient calculations. These are critical as QlikView moves from being a departmental solution run primarily by business analysts to a mission-critical system backed by corporate IT.

- free Personal Edition with full development capabilities. The main limit vs. the licensed version is that Personal Edition cannot read QlikView files developed on any other copy of the software, and no one else can read files that Personal Edition generates. The goal is to make it easier for users to try the system on their own – a continuation of the company's long-standing "seeing is believing" strategy.

- functional enhancements including improved visualization, search and automation functions. These are nice but none seemed especially exciting. Changes in previous recent releases, such as set analysis (simultaneously comparing two sets of selected records) were more fundamental. Remember, we're talking about version 9: the system is already quite polished.

Of all these items, the one I found most thought-provoking was the free Personal Edition, which replaces a 15-day free trial. Removing the time limit lets users build QlikView into their regular work. The strategy makes sense, but it doesn’t lower the $30,000 - $50,000 investment required for the smallest licensed QlikView installation. Few analysts, who are the most likely users for Personal Edition, have the clout to sponsor so large an investment. Competing analyst tools such as LyzaSoft, ADVIZOR Solutions and Tableau can generally provide a 5-10 user departmental deployment for under $10,000. Although QlikView is vastly more powerful than the others, the lower cost will give them an initial advantage. And once they’re in place, it’s hard to get a company to switch.

On the other hand, maybe QlikView is really moving to compete with traditional business intelligence tools like Cognos, Business Objects and MicroStrategy. QlikView’s entry cost is vastly lower than those products, especially once you consider the savings in labor. But most enterprises have a BI tool already in place, so it’s not a matter of comparing entry costs. Rather, the choice is entry cost for QlikView vs. incremental deployment cost on the incumbent. The labor savings with QlikView are so great that it will still be cheaper for many projects. But QlikView will remain a tough sell because IT departments are reluctant to invest in the staff training needed to support an additional tool.

QlikView will never fully replace the traditional data warehouse and BI tools because its in-memory approach limits the size of its databases. With 64 bit systems, the product can easily handle dozens of gigabytes of data. This is quite a lot, but even the smallest enterprise data warehouses now hold multiple terabytes. QlikView works with such systems by executing SQL queries against them, pulling down limited data sets, loading these into memory, and analyzing them. That’s an excellent and perfectly viable approach, but it does rely on the warehouse being there in the first place.

None of this is to suggest that QlikView has anything but a very bright future. When I first spoke with the company in 2005, it had just reached 2,000 clients; at last count, it had over 11,000. Revenue for 2008 was $120 million and had risen 50% from the previous year. The product has finally attracted attention from analyst firms like Gartner and Aberdeen and is very well rated in Nigel Pendse’s latest BI Survey. My brief fling as a VAR ended two years ago, but I still use it personally for any non-trivial data analysis work and remain absurdly pleased with the results. I won’t say QlikView is better than sex, but its pleasures are equally difficult to describe to the uninitiated. Anyone interested in BI software who hasn’t given it a try (QlikView, not sex) should download a copy and see what they’ve been missing.

Thursday, April 09, 2009

Good Look at QlikView from a Microstrategy Consultant's Viewpoint

I noticed some visitors this morning from the blog of Microstrategy consultancy Aellament, and found they have published a nice look at QlikView there. It's worth a read, particularly for its appreciation of the advantages that QlikView offers over the product they know best. They fairly point out some disadvantages too, of which I think the lack of a unified metadata view is probably the most significant.

What really resonated was Aellament's sense (in an earlier blog post, which contains the link back to this blog) that QlikView is empowering departmental users to do work that otherwise takes support from technical specialists. That's exactly what I've seen as QlikView's advantage and I think it's fundamentally reshaping the industry.

As I see the future, BI specialists will still be needed to assemble source data into usable forms (i.e., build the data warehouses). This has always been the heavy lifting. But the huge army of people who then essentially reformat that data into cubes, reports, dashboards, etc. will dwindle as business analysts do that for themselves using tools like QlikView. Bad news for Microstrategy consultants (presumably why Aellament is hedging its bets with QlikView training) but good news for business users.

Wednesday, August 06, 2008

More on QlikView - Curt Monash Blog

I somehow ended up posting some comments on QlikView technology on Curt Monash's DBMS2 blog. This is actually a more detailed description than I've ever posted here about how I think QlikView works. If you're interested in that sort of thing, do take a look.

Thursday, July 17, 2008

QlikView 8.5 Does More, Costs Less

I haven’t been working much with QlikView recently, which is why I haven’t been writing about it. But I did receive news of their latest release, 8.5, which was noteworthy for at least two reasons.

The first is new pricing. Without going into the details, I can say that QlikView significantly lowered the cost of an entry level system, while also making that system more scalable. This should make it much easier for organizations that find QlikView intriguing to actually give it a try.

The second bit of news was an enhancement that allows comparisons of different selections within the same report. This is admittedly esoteric, but it does address an issue that came up fairly often.

To backtrack a bit: the fundamental operation of QlikView is that users select sets of records by clicking (or ‘qliking’, if you insist) on lists of values. For example, the interface for an application might have lists of regions, years and products, plus a chart showing revenues and costs. Without any selections, the chart would show data for all regions, years and products combined. To drill into the details, users would click on a particular combination of regions, years and products. The system would then show the data for the selected items only. (I know this doesn’t sound revolutionary, and as a functionality, it isn’t. What makes QlikView great is how easily you, or I, or a clever eight-year-old, could set up that application. But that’s not the point just now.)
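In case that ease claim sounds like hand-waving, the entire back end of that hypothetical application could be a load script along these lines (file and field names invented for illustration):

Sales:
LOAD Region, Year, Product, Revenue, Cost
FROM SalesHistory.csv (txt, embedded labels, delimiter is ',');

That one LOAD statement is essentially it. Add list boxes for Region, Year and Product, plus a chart with dimension Year and expressions Sum(Revenue) and Sum(Cost), and every click filters everything else automatically.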

The problem was that sometimes people wanted to compare different selections. If these could be treated as dimensions, it was not a problem: a few clicks could add a ‘year’ dimension to the report I just described, and year-to-year comparisons would appear automatically. What was happening technically was that the records within a single selection were being separated for reporting.

But sometimes things are more complicated. If you wanted to compare this year’s results for Product A against last year’s results for Product B, it took some fairly fancy coding. (Not all that fancy, actually, but more work than QlikView usually requires.) The new set features let users simply create and save one selection, then create another, totally independent selection, and compare them directly. In fact, you can bookmark as many selections as you like, and compare any pair you wish. This will be very helpful in many situations.

But wait: there’s more. The new version also supports set operations, which can find records that belong to both, either or only one of the pair of sets. So you could easily find customers who bought last year but not this year, or people who bought either of two products but not both. (Again, this was possible before, but is now much simpler.) You can also do still more elaborate selections, but it gives me a headache to even think about describing them.
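For anyone who wants to see what this looks like in an expression, here is a hedged sketch, assuming two selections saved as bookmarks BM01 and BM02 and a field called Sales (all names hypothetical):

// Totals for each saved selection, side by side in the same chart
Sum({BM01} Sales)
Sum({BM02} Sales)

// Set operators on the bookmarks: records in both selections, or in the first only
Sum({BM01 * BM02} Sales)
Sum({BM01 - BM02} Sales)

The operators act on the sets of selected records, so comparisons that previously needed clever coding become single chart expressions.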

Now, I’m quite certain that no one is going to buy or not buy QlikView because of these particular features. In fact, the new pricing makes it even more likely that the product will be purchased by business users outside of IT departments, who are unlikely to drill into this level of technical detail. Those users see QlikView essentially as a productivity tool—Excel on steroids. This greatly understates what QlikView can actually do, but it doesn’t matter: the users will discover its real capabilities once they get started. What’s important is getting QlikView into companies despite the usual resistance from IT organizations, who often (and correctly, from the IT perspective) don’t see much benefit.

Wednesday, April 02, 2008

illuminate Systems' iLuminate May Be the Most Flexible Analytical Database Ever

OK, I freely admit I’m fascinated by offbeat database engines. Maybe there is a support group for this. In any event, the highlight of my brief visit to the DAMA International Symposium and Wilshire Meta-Data Conference last month was a presentation by Joe Foley of illuminate Solutions, which marked the U.S. launch of his company’s iLuminate analytical database.

(Yes, the company name is “illuminate” and the product is “iLuminate”. And if you look inside the HTML tag, you’ll see the Internet domain is “i-lluminate.com”. Marketing genius or marketing madness? You decide.)

illuminate calls iLuminate a “correlation database”, a term they apparently invented to distinguish it from everything else. It does appear to be unique, although somewhat similar to other products I’ve seen over the years: Digital Archaeology (now deceased), Omnidex and even QlikView come to mind. Like iLuminate, these systems store links among data values rather than conventional database tables or columnar structures. iLuminate is the purest expression I’ve seen of this approach: it literally stores each value once, regardless of how many tables, columns or rows contain it in the source data. Indexes and algorithms capture the context of each original occurrence so the system can recreate the original records when necessary.

The company is rather vague on the details, perhaps on purpose. They do state that each value is tied to a conventional b-tree index that makes it easily and quickly accessible. What I imagine—and let me stress I may have this wrong—is that each value is then itself tied to a hierarchical list of the tables, columns and rows in which it appears. There would be a count associated with each level, so a query that asked how many times a value appears in each table would simply look at the pre-calculated value counts; a query of how many times the value appeared in a particular column could look down one level to get the column counts. A query that needed to know about individual rows would retrieve the row numbers. A query that involved multiple values would retrieve multiple sets of row numbers and compare them: so, say, a query looking for state = “NY” and eye color = “blue” might find that “NY” appears in the state column for records 3, 21 and 42, while “blue” appears in the eye color for records 5, 21 and 56. It would then return row=21 as the only intersection of the two sets. Another set of indexes would make it simple to retrieve the other components of row 21.

Whether or not that’s what actually happens under the hood, this does illustrate the main advantages of iLuminate. Specifically, it can import data in any structure and access it without formal database design; it can identify instances of the same value across different tables or columns; it can provide instant table and column level value distributions; and it can execute incremental queries against a previously selected set of records. The company also claims high speed and terabyte scalability, although some qualification is needed: initial results from a query appear very quickly, but calculations involving a large result set must wait for the system to assemble and process the full set of data. Foley also said that although the system has been tested with a terabyte of data, the largest production implementation is a much less impressive 50 gigabytes. Still, the largest production row count is 200 million rows – nothing to sneeze at.

The system avoids some of the pitfalls that sometimes trap analytical databases: load times are comparable to load times for comparable relational databases (once you include time for indexing, etc.); total storage (including the indexes, which take up most of the space) is about the same as relational source data; and users can write queries in standard SQL via ODBC. This final point is particularly important, because many past analytical systems were not SQL-compatible. This deterred many potential buyers. The new crop of analytical database vendors has learned this lesson: nearly all of the new analytical systems are SQL-accessible. Just to be clear, iLuminate is not an in-memory database, although it will make intelligent use of what memory is available, often loading the data values and b-tree indexes into memory when possible.

Still, at least from my perspective, the most important feature of iLuminate is its ability to work with any structure of input data—including structures that SQL would handle poorly or not at all. This is where users gain huge time savings, because they need not predict the queries they will write and then design a data structure to support them. In this regard, the system is even more flexible than QlikView, which it in many ways resembles: while QlikView links tables with fixed keys during the data load, iLuminate does not. Instead, like a regular SQL system, iLuminate can apply different relationships to one set of data by defining the relationships within different queries. (On the other hand, QlikView’s powerful scripting language allows much more data manipulation during the load process.)

Part of the reason I mention QlikView is that iLuminate itself uses QlikView as a front-end tool under the label of iAnalyze. This extracts data from iLuminate using ODBC and then loads it into QlikView. Naturally, the data structure at that point must include fixed relationships. In addition to QlikView, iAnalyze also includes integrated mapping. A separate illuminate product, called iCorrelated, allows ad hoc, incremental queries directly against iLuminate and takes full advantage of its special capabilities.

illuminate, which is based in Spain, has been in business for nearly three years. It has more than 40 iLuminate installations, mostly in Europe. Pricing is based on several factors but the entry cost is very reasonable: somewhere in the $80,000 to $100,000 range, including iAnalyze. As part of its U.S. launch, the company is offering no-cost proof of concept projects to qualified customers.

Thursday, March 27, 2008

The Limits of On-Demand Business Intelligence

I had an email yesterday from Blink Logic, which offers on-demand business intelligence. That could mean quite a few things but the most likely definition is indeed what Blink Logic provides: remote access to business intelligence software loaded with your own data. I looked a bit further and it appears Blink Logic does this with conventional technologies, primarily Microsoft SQL Server Analysis Services and Cognos Series 7.

At that point I pretty much lost interest because (a) there’s no exotic technology, (b) quite a few vendors offer similar services*, and (c) the real issue with business intelligence is the work required to prepare the data for analysis, which doesn’t change just because the system is hosted.

Now, this might be unfair to Blink Logic, which could have some technology of its own for data loading or the user interface. It does claim that at least one collaboration feature, direct annotation of data in reports, is unique. But the major point remains: Blink Logic and other “on-demand business intelligence” vendors are simply offering a hosted version of standard business intelligence systems. Does anyone truly think the location of the data center is the chief reason that business intelligence has so few users?

As I see it, the real obstacle is that most source data must be integrated and restructured before business intelligence systems can use it. It may be literally true that hosted business intelligence systems can be deployed in days and users can build dashboards in minutes, but this only applies given the heroic assumption that the proper data is already available. Under those conditions, on-premise systems can be deployed and used just as quickly. Hosting per se has little benefit when it comes to speed of deployment. (Well, maybe some: it can take days or even a week or two to set up a new server in some corporate data centers. Still, that is a tiny fraction of the typical project schedule.)

If hosting isn't the answer, what can make true “business intelligence on demand” a reality? Since the major obstacle is data preparation, anything that allows less preparation will help. This brings us back to the analytical databases and appliances I’ve been writing about recently: Alterian, Vertica, ParAccel, QlikView, Netezza and so on. At least some of them do reduce the need for preparation because they let users query raw data without restructuring it or aggregating it. This isn’t because they avoid SQL queries, but because they offer a great enough performance boost over conventional databases that aggregation or denormalization are not necessary to return results quickly.

Of course, performance alone can’t solve all data preparation problems. The really knotty challenges like customer data integration and data quality still remain. Perhaps some of those will be addressed by making data accessible as a service (see last week’s post). But services themselves do not appear automatically, so a business intelligence application that requires a new service will still need advance planning. Where services will help is when business intelligence users can take advantage of services created for operational purposes.

“On demand business intelligence” also requires that end-users be able to do more for themselves. I actually feel this is one area where conventional technology is largely adequate: although systems could always be easier, end-users willing to invest a bit of time can already create useful dashboards, reports and analyses without deep technical skills. There are still substantial limits to what can be done – this is where QlikView’s scripting and macro capabilities really add value by giving still more power to non-technical users (or, more precisely, to power users outside the IT department). Still, I’d say that when the necessary data is available, existing business intelligence tools let users accomplish most of what they want.

If there is an issue in this area, it’s that SQL-based analytical databases don’t usually include an end-user access tool. (Non-SQL systems do provide such tools, since users have no alternatives.) This is a reasonable business decision on their part, both because many firms have already selected a standard access tool and because the vendors don’t want to invest in a peripheral technology. But not having an integrated access tool means clients must take time to connect the database to another product, which does slow things down. Apparently I'm not the only person to notice this: some of the analytical vendors are now developing partnerships with access tool vendors. If they can automate the relationship so that data sources become visible in the access tool as soon as they are added to the analytical system, this will move “on demand business intelligence” one step closer to reality.

* results of a quick Google search: OnDemandIQ, LucidEra, PivotLink (an in-memory columnar database), oco, VisualSmart, GoodData and Autometrics.

Wednesday, February 27, 2008

ParAccel Enters the Analytical Database Race

As I’ve now written more times than I care to admit, specialized analytical databases are very much in style. In addition to my beloved QlikView, market entrants include Alterian, SmartFocus, QD Technology, Vertica, 1010data, Kognitio, Advizor and Polyhedra, not to mention established standbys including Teradata and Sybase IQ. Plus you have to add appliances like Netezza, Calpont, Greenplum and DATAllegro. Many of these run on massively parallel hardware platforms; several use columnar data structures and in-memory data access. It’s all quite fascinating, but after a while even I tend to lose interest in the details.

None of which dimmed my enthusiasm when I learned about yet another analytical database vendor, ParAccel. Sure enough, ParAccel is a massively parallel, in-memory-capable, SQL-compatible columnar database, which pretty much hits all the tick boxes on my list. Run by industry veterans, the company seems to have refined many of the details that will let it scale linearly with large numbers of processors and extreme data volumes. One point that seemed particularly noteworthy was that the standard data loader can handle 700 GB per hour, which is vastly faster than many columnar systems and can be a major stumbling block. And that’s just the standard loader, which passes all data through a single node: for really large volumes, the work can be shared among multiple nodes.

Still, if ParAccel had one particularly memorable claim to my attention, it was having blown past previous records for several of the TPC-H analytical query benchmarks run by the Transaction Processing Council. The TPC process is grueling and many vendors don’t bother with it, but it still carries some weight as one of the few objective performance standards available. While other winners had beaten the previous marks by a few percentage points, ParAccel's improvement was on the order of 500%.

When I looked at the TPC-H Website for details, it turned out that ParAccel’s winning results have since been bested by yet another massively parallel database vendor, EXASOL, based in Nuremberg, Germany. (Actually, ParAccel is still listed by TPC as best in the 300 GB category, but that’s apparently only because EXASOL has only run the 100 GB and 1 TB tests.) Still, none of the other analytic database vendors seem to have attempted the TPC-H process, so I’m not sure how impressed to be by ParAccel’s performance. Sure it clearly beats the pants off Oracle, DB2 and SQL Server, but any columnar database should be able to do that.

One insight I did gain from my look at ParAccel was that in-memory doesn’t need to mean small. I’ll admit to being used to conventional PC servers, where 16 GB of memory is a lot and 64 GB is definitely pushing it. The massively parallel systems are a whole other ballgame: ParAccel’s 1 TB test ran on a 48 node system. At a cost of maybe $10,000 per node, that’s some pretty serious hardware, so this is not something that will replace QlikView under my desk any time soon. And bear in mind that even a terabyte isn’t really that much these days: as a point of reference, the TPC-H goes up to 30 TB. Try paying for that much memory, massively parallel or not. The good news is that ParAccel can work with on-disk as well as in-memory data, although the performance won’t be quite as exciting. Hence the term "in-memory-capable".

Hardware aside, ParAccel itself is not especially cheap either. The entry price is $210,000, which buys licenses for five nodes and a terabyte of data. Licenses cost $40,000 for each additional node and $10,000 for each additional terabyte. An alternative pricing scheme doesn’t charge for nodes but costs $1,000 per GB, which is also a good bit of money. Subscription pricing is available, but any way you slice it, this is not a system for small businesses.

So is ParAccel the cat’s meow of analytical databases? Well, maybe, but only because I’m not sure what “the cat’s meow” really means. It’s surely an alternative worth considering for anyone in the market. Perhaps more significant, the company raised $20 million in December 2007, which may make it more commercially viable than most. Even in a market as refined as this one, commercial considerations will ultimately be more important than pure technical excellence.

Thursday, January 31, 2008

QlikView Scripts Are Powerful, Not Sexy

I spent some time recently delving into QlikView’s automation functions, which allow users to write macros to control various activities. These are an important and powerful part of QlikView, since they let it function as a real business application rather than a passive reporting system. But what the experience really did was clarify why QlikView is so much easier to use than traditional software.

Specifically, it highlighted the difference between QlikView’s own scripting language and the VBScript used to write QlikView macros.

I was going to label QlikView scripting as a “procedural” language and contrast it with VBScript as an “object-oriented” language, but a quick bit of Wikipedia research suggests those may not be quite the right labels. Still, whatever the nomenclature, the difference is clear when you look at the simple task of assigning a value to a variable. With QlikView scripts, I use a statement like:

Set Variable1 = '123';

With VBScript using the QlikView API, I need something like:

set v = ActiveDocument.GetVariable("Variable1")
v.SetContent "123",true

That the first option is considerably easier may not be an especially brilliant insight. But the implications are significant, because they mean vastly less training is needed to write QlikView scripts than to write similar programs in a language like VBScript, let alone Visual Basic itself. This in turn means that vastly less technical people can do useful things in QlikView than with other tools. And that gets back to the core advantage I’ve associated with QlikView previously: that it lets non-IT people like business analysts do things that normally require IT assistance. The benefit isn’t simply that the business analysts are happier or that IT gets a reduced workload. It's that the entire development cycle is accelerated because analysts can develop and refine applications for themselves. Otherwise, they'd be writing specifications, handing these to IT, waiting for IT to produce something, reviewing the results, and then repeating the cycle to make changes. This is why we can realistically say that QlikView cuts development time to hours or days instead of weeks or months.

Of course, any end-user tool cuts the development cycle. Excel reduces development time in exactly the same way. The difference lies in the power of QlikView scripts. They can do very complicated things, giving users the ability to create truly powerful systems. These capabilities include all kinds of file manipulation—loading data, manipulating it, splitting or merging files, comparing individual records, and saving the results.
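To give a flavor of that power without pretending this is anyone’s production code, here is a small, hypothetical load script (file and field names invented) that does several of those things at once:

// load one year of orders
Orders:
LOAD OrderID, CustomerID, OrderDate, Amount
FROM Orders_2008.csv (txt, embedded labels, delimiter is ',');

// merge in a second file
Concatenate (Orders)
LOAD OrderID, CustomerID, OrderDate, Amount
FROM Orders_2007.csv (txt, embedded labels, delimiter is ',');

// add a customer attribute from a spreadsheet
LEFT JOIN (Orders)
LOAD CustomerID, Region
FROM Customers.xls (biff, embedded labels, table is Sheet1$);

// compare adjacent records to flag repeat orders, then save the result
History:
LOAD *, If(CustomerID = Previous(CustomerID), 1, 0) as IsRepeatOrder
RESIDENT Orders
ORDER BY CustomerID, OrderDate;

DROP TABLE Orders;
STORE History INTO OrderHistory.qvd (qvd);

Each statement is close to plain English, which is exactly the point: a business analyst can read it, and with a little practice can write it.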

The reason it’s taken me so long to recognize that this is important is that database management is not built into today's standard programming languages. We’ve simply become so used to the division between SQL queries and programs that the distinction feels normal. But reflecting on QlikView script brought me back to the old days of FoxPro and dBase database languages, which did combine database management with procedural coding. They were tremendously useful tools. Indeed, I still use FoxPro for certain tasks. (And not that crazy new-fangled Visual FoxPro, either. It’s especially good after a brisk ride on the motor-way in my Model T. You got a problem with that, sonny?)

Come to think of it FoxPro and dBase played a similar role in their day to what QlikView offers now: bringing hugely expanded data management power to the desktops of lightly trained users. Their fate was essentially to be overwhelmed by Microsoft Access and SQL Server, which made reasonably priced SQL databases available to end-users and data centers. Although I don’t think QlikView is threatened from that direction, the analogy is worth considering.

Back to my main point, which is that QlikView scripts are both powerful and easy to use. I think they’re an underreported part of the QlikView story, which tends to be dominated by the sexy technology of the in-memory database and the pretty graphics of QlikView reports. Compared with those, scripting seems pedestrian and probably a bit scary to the non-IT users whom I consider QlikView’s core market. I know I myself was put off when I first realized how dependent QlikView was on scripts, because I thought it meant only serious technicians could take advantage of the system. Now that I see how much easier the scripts are than today’s standard programming languages, I consider them a major QlikView advantage.

(Standard disclaimer: although my firm is a reseller for QlikView, opinions expressed in this blog are my own.)

Thursday, January 10, 2008

One More Chart on QlikTech

Appearances to the contrary, I do have work to do. But in reflecting on yesterday's post, I did think of one more way to present the impact of QlikTech (or any other software) on an existing environment. This version shows the net change in percentage of answers provided by each user role for each activity type. It definitely shows which roles gain capacity and which have their workload reduced. What I particularly like here is that the detail by task is clearly visible based on the size and colors of segments within the stacked bars, while the combined change is equally visible in the total height of the bars themselves.



In case you were wondering, all these tables and charts have been generated in QlikView. I did the original data entry and some calculations in Excel, where they are simplest. But tables and charts are vastly easier in QlikView, which also has very nice export features to save them as images.

Wednesday, January 09, 2008

Visualizing the Value of QlikTech (and Any Others)

As anyone who knows me would have expected, I couldn't resist figuring out how to draw and post the chart I described last week to illustrate the benefits of QlikTech.

The mechanics are no big deal, but getting it to look right took some doing. I started with a simple version of the table I described in the earlier post, a matrix comparing business intelligence questions (tasks) vs. the roles of the people who can answer them.

Per Friday's post, I listed four roles: business managers, business analysts, statisticians, and IT specialists. The fundamental assumption is that each question will be answered by the person closest to business managers, who are the ultimate consumers. In other words, starting with the business manager, each person answers a question if she can, or passes it on to the next person on the list.

I defined five types of tasks, based on the technical resources needed to get an answer. These are:

- Read an Existing Report: no work is involved; business managers can answer all these questions themselves.

- Drilldown in Existing Report: this requires accessing data that has already been loaded into a business intelligence system and prepared for access. Business managers can do some of this for themselves, but most will be done by business analysts who are more fluent with the business intelligence product.

- Calculate from Existing Report: this requires copying data from an existing report and manipulating it in Excel or something more substantial. Business managers can do some of this, but more complex analyses are performed by business analysts or sometimes statisticians.

- Ad Hoc Analysis of Existing Data: this requires accessing data that has already been made available for analysis, but is not part of an existing report or business intelligence output. Usually this means it resides in a data warehouse or data mart. Business managers don't have the technical skills to get at this data. Some may be accessible to analysts, more will be available to statisticians, and a remainder will require help from IT.

- Add New Data: this requires adding data to the underlying business intelligence environment, such as putting a new source or field in a data warehouse. Statisticians can do some of this but most of the time it must be done by IT.

The table below shows the numbers I assigned to each part of this matrix. (Sorry the tables are so small. I haven't figured out how to make them bigger in Blogger. You can click on them for a full-screen view.) Per the preceding definitions, the numbers represent the percentage of questions of each type that can be answered by each sort of user. Each row adds to 100%.


I'm sure you recognize that these numbers are VERY scientific.

I then did a similar matrix representing a situation where QlikTech was available. Essentially, things move to the left, because less technically skilled users gain more capabilities. Specifically,

- There is no change for reading reports, since the managers could already do that for themselves.

- Pretty much any drilldown now becomes accessible to business managers or analysts, because QlikTech makes it so easy to add new drill paths.

- Calculations on existing data don't change for business managers, since they won't learn the finer points of QlikTech or even have the licenses needed to do really complicated things. But business analysts can do a lot more than with Excel. There are still some things that need a statistician's help.

- A little ad hoc analysis becomes possible for business managers, but business analysts gain a great deal of capability. Again, some work still needs a statistician.

- Adding new data now becomes possible in many cases for business analysts, since they can connect directly to data sources that would otherwise have needed IT support or preparation. Statisticians can also pick up some of the work. The rest remains with IT.

Here is the revised matrix:


So far so good. But now for the graph itself. Just creating a bar chart of the raw data didn't give the effect I wanted.


Yes, if you look closely, you can see that business managers and business analysts (blue and red bars) gain the most. But it definitely takes more concentration than I'd like.

What I really had in mind was a single set of four bars, one for each role, showing how much capability it gained when QlikTech was added. This took some thinking. I could add the scores for each role to get a total value. The change in that value is what I want to show, presumably as a stacked bar chart. But you can't see that when the value goes down. I ultimately decided that the height of the bar should represent the total capability of each role: after all, just because statisticians don't have to read reports for business managers, they still can do it for themselves. This meant adding the values for each row across, so each role accumulated the capabilities of less technical rows to their left. So, the base table now looked like:



The sum of each column now shows the total capability available to each role. A similar calculation and sum for the QlikTech case shows the capability after QlikTech is added. The Change between the two is each role's gain in capability.


Now we have what I wanted. Start with the base value and stack the change on top of it, and you see very clearly how the capabilities shift after adding QlikTech.



I did find one other approach that I may even like better. This is to plot the sums of the two cases for each group in a three-way bar chart, as below. I usually avoid such charts because they're so hard to read. But in this case it does show both the shift of capability to the left, and the change in workload for the individual roles. It's a little harder to read but perhaps the extra information is worth it.



Obviously this approach to understanding software value can be applied to anything, not just QlikTech. I'd be interested to hear whether anybody else finds it useful.

Fred, this means you.

Friday, January 04, 2008

Fitting QlikTech into the Business Intelligence Universe

I’ve been planning for about a month to write about the position of QlikTech in the larger market for business intelligence systems. The topic has come up twice in the past week, so I guess I should do it already.

First, some context. I’m using “business intelligence” in the broad sense of “how companies get information to run their businesses”. This encompasses everything from standard operational reports to dashboards to advanced data analysis. Since these are all important, you can think of business intelligence as providing a complete solution to a large but finite list of requirements.

For each item on the list, the answer will have two components: the tool used, and the person doing the work. That is, I’m assuming a single tool will not meet all needs, and that different tasks will be performed by different people. This all seems reasonable enough. It means that a complete solution will have multiple components.

It also means that you have to look at any single business intelligence tool in the context of other tools that are also available. A tool which seems impressive by itself may turn out to add little real value if its features are already available elsewhere. For example, a visualization engine is useless without a database. If the company already owns a database that also includes an equally-powerful visualization engine, then there’s no reason to buy the stand-alone visualization product. This is why vendors strive to expand their product functionality and why it is so hard for specialized systems to survive. It’s also why nobody buys desk calculators: everyone has a computer spreadsheet that does the same and more. But I digress.

Back to the notion of a complete solution. The “best” solution is the one that meets the complete set of requirements at the lowest cost. Here, “cost” is broadly defined to include not just money, but also time and quality. That is, a quicker answer is better than a slower one, and a quality answer is better than a poor one. “Quality” raises its own issues of definition, but let’s view this from the business manager’s perspective, in which case “quality” means something along the lines of “producing the information I really need”. Since understanding what’s “really needed” often takes several cycles of questions, answers, and more questions, a solution that speeds up the question-answer cycle is better. This means that solutions offering more power to end-users are inherently better (assuming the same cost and speed), since they let users ask and answer more questions without getting other people involved. And talking to yourself is always easier than talking to someone else: you’re always available, and rarely lose an argument.

In short: the way to evaluate a business intelligence solution is to build a complete list of requirements and then, for each requirement, look at what tool will meet it, who will use that tool, what the tool will cost, and how quickly the work will get done.

We can put cost aside for the moment, because the out-of-pocket expense of most business intelligence solutions is insignificant compared with the value of getting the information they provide. So even though cheaper is better and prices do vary widely, price shouldn’t be the determining factor unless all else is truly equal.

The remaining criteria are who will use the tool and how quickly the work will get done. These come down to pretty much the same thing, for the reasons already described: a tool that can be used by a business manager will give the quickest results. More grandly, think of a hierarchy of users: business managers; business analysts (staff members who report to the business managers); statisticians (specialized analysts who are typically part of a central service organization); and IT staff. Essentially, questions are asked by business managers, and work their way through the hierarchy until they get to somebody who can answer them. Who that person is depends on what tools each person can use. So, if the business manager can answer her own question with her own tools, it goes no further; if the business analyst can answer the question, he does and sends it back to his boss; if not, he asks for help from a statistician; and if the statistician can’t get the answer, she goes to the IT department for more data or processing.

Bear in mind that different users can do different things with the same tool. A business manager may be able to use a spreadsheet only for basic calculations, while a business analyst may also know how to do complex formulas, graphics, pivot tables, macros, data imports and more. Similarly, the business analyst may be limited to simple SQL queries in a relational database, while the IT department has experts who can use that same relational database to create complex queries, do advanced reporting, load data, add new tables, set up recurring processes, and more.

Since a given tool does different things for different users, one way to assess a business intelligence product is to build a matrix showing which requirements each user type can meet with it. Whether a tool “meets” a requirement could be indicated by a binary measure (yes/no), or, preferably, by a utility score that shows how well the requirement is met. Results could be displayed in a bar chart with four columns, one for each user group, where the height of each bar represents the percentage of all requirements those users can meet with that tool. Tools that are easy but limited (e.g. Excel) would have short bars that get slightly taller as they move across the chart. Tools that are hard but powerful (e.g. SQL databases) would have low bars for business users and tall bars for technical ones. (This discussion cries out for pictures, but I haven’t figured out how to add them to this blog. Sorry.)
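
To make the bar heights concrete, here is a minimal sketch of how they might be computed, assuming a hypothetical Scores table with one row per requirement, user type, and tool, and a Utility value of zero when that user type can’t meet the requirement with that tool at all. The names are invented purely for illustration.

-- Hypothetical Scores table: (Requirement, UserType, Tool, Utility).
-- Bar height = percentage of all requirements each user type can meet with a given tool.
SELECT UserType,
       Tool,
       100.0 * SUM(CASE WHEN Utility > 0 THEN 1 ELSE 0 END) / COUNT(*) AS PctRequirementsMet
FROM Scores
GROUP BY UserType, Tool;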

Things get even more interesting if you plot the charts for two tools on top of each other. Just sticking with Excel vs. SQL, the Excel bars would be higher than the SQL bars for business managers and analysts, and lower than the SQL bars for statisticians and IT staff. The over-all height of the bars would be higher for the statisticians and IT, since they can do more things in total. Generally this suggests that Excel would be of primary use to business managers and analysts, but pretty much redundant for the statisticians and IT staff.

Of course, in practice, statisticians and IT people still do use Excel, because there are some things it does better than SQL. This comes back to the matrices: if each cell has utility scores, comparing the scores for different tools would show which tool is better for each situation. The number of cells won by each tool could create a stacked bar chart showing the incremental value of each tool to each user group. (Yes, I did spend today creating graphs. Why do you ask?)

Now that we’ve come this far, it’s easy to see that assessing different combinations of tools is just a matter of combining their matrices. That is, you compare the matrices for all the tools in a given combination and identify the “winning” product in each cell. The number of cells won by each tool shows its incremental value. If you want to get really fancy, you can also consider how much each tool is better than the next-best alternative, and incorporate the incremental cost of deploying an additional tool.
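
Expressed the same way, a rough sketch of the combination step might look like the query below, using the same hypothetical Scores table: for each requirement/user-type cell, keep the tool with the highest utility score, then count the cells each tool wins. This is just one way to write it, not a prescribed method, and ties would need some tie-breaking rule.

-- Winning tool per cell, and the number of cells each tool wins.
SELECT UserType, Tool, COUNT(*) AS CellsWon
FROM Scores s
WHERE s.Utility = (SELECT MAX(b.Utility)
                   FROM Scores b
                   WHERE b.Requirement = s.Requirement
                     AND b.UserType = s.UserType)
GROUP BY UserType, Tool;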

Which, at long last, brings us back to QlikTech. I see four general classes of business intelligence tools: legacy systems (e.g. standard reports out of operational systems); relational databases (e.g. in a data warehouse); traditional business intelligence tools (e.g. Cognos or Business Objects; we’ll also add statistical tools like SAS); and Excel (where so much of the actual work gets done). Most companies already own at least one product in each category. This means you could build a single utility matrix, taking the highest score in each cell from all the existing systems. Then you would compare this to a matrix for QlikTech and find cells where the QlikView score is higher. Count the number of those cells, highlight them in a stacked bar chart, and you have a nice visual of where QlikTech adds value.

If you actually did this, you’d probably find that QlikTech is most useful to business analysts. Business managers might benefit some from QlikView dashboards, but those aren’t all that different from other kinds of dashboards (although building them in QlikView is much easier). Statisticians and IT people already have powerful tools that do much of what QlikTech does, so they won’t see much benefit. (Again, it may be easier to do some things in QlikView, but the cost of learning a new tool will weigh against it.) The situation for business analysts is quite different: QlikTech lets them do many things that other tools do not. (To be clear: some other tools can do those things, but it takes more skill than the business analysts possess.)

This is very important because it means those functions can now be performed by the business analysts, instead of passed on to statisticians or IT. Remember that the definition of a “best” solution boils down to whichever solution meets business requirements closest to the business manager. By allowing business analysts to perform many functions that would otherwise be passed through to statisticians or IT, QlikTech generates a huge improvement in total solution quality.

Tuesday, November 27, 2007

Just How Scalable Is QlikTech?

A few days ago, I replied to a question regarding QlikTech scalability. (See What Makes QlikTech So Good?, August 3, 2007) I asked QlikTech itself for more information on the topic but haven’t learned anything new. So let me simply discuss this based on my own experience (and, once again, remind readers that while my firm is a QlikTech reseller, comments in this blog are strictly my own.)

The first thing I want to make clear is that QlikView is a wonderful product, so it would be a great pity if this discussion were to be taken as a criticism. Like any product, QlikView works within limits that must be understood to use it appropriately. No one benefits from unrealistic expectations, even if fans like me sometimes create them.

That said, let’s talk about what QlikTech is good at. I find two fundamental benefits from the product. The first is flexibility: it lets you analyze data in pretty much any way you want, without first building a data structure to accommodate your queries. By contrast, most business intelligence tools must pre-aggregate large data sets to deliver fast response. Often, users can’t even formulate a particular query if the dimensions or calculated measures were not specified in advance. Much of the development time and cost of conventional solutions, whether based in standard relational databases or specialized analytical structures, is spent on this sort of work. Avoiding it is the main reason QlikTech is able to deliver applications so quickly.

The other big benefit of QlikTech is scalability. I can work with millions of records on my desktop with the 32-bit version of the system (maximum memory 4 GB if your hardware allows it) and still get subsecond response. This is much more power than I’ve ever had before. A 64-bit server can work with tens or hundreds of millions of rows: the current limit for a single data set is apparently 2 billion rows, although I don’t know how close anyone has come to that in the field. I have personally worked with tables larger than 60 million rows, and QlikTech literature mentions an installation of 300 million rows. I strongly suspect that larger ones exist.

So far so good. But here’s the rub: there is a trade-off in QlikView between really big files and really great flexibility. The specific reason is that the more interesting types of flexibility often involve on-the-fly calculations, and those calculations require resources that slow down response. This is more a law of nature (there’s no free lunch) than a weakness in the product, but it does exist.

Let me give an example. One of the most powerful features of QlikView is a “calculated dimension”. This lets reports construct aggregates by grouping records according to ad hoc formulas. You might want to define ranges for a value such as age, income or unit price, or create categories using if/then/else statements. These formulas can get very complex, which is generally a good thing. But each formula must be calculated for each record every time it is used in a report. On a few thousand rows, this can happen in an instant, but on tens of millions of rows, it can take several minutes (or much longer if the formula is very demanding, such as on-the-fly ranking). At some point, the wait becomes unacceptable, particularly for users who have become accustomed to QlikView’s typically-immediate response.

As problems go, this isn’t a bad one because it often has a simple solution: instead of on-the-fly calculations, precalculate the required values in QlikView scripts and store the results on each record. There’s little or no performance cost to this strategy since expanding the record size doesn’t seem to slow things down. The calculations do add time to the data load, but that happens only once, typically in an unattended batch process. (Another option is to increase the number and/or speed of processors on the server. QlikTech makes excellent use of multiple processors.)

The really good news is you can still get the best of both worlds: work out design details with ad hoc reports on small data sets; then, once the design is stabilized, add precalculations to handle large data volumes. This is vastly quicker than prebuilding everything before you can see even a sample. It’s also something that’s done by business analysts with a bit of QlikView training, not database administrators or architects.

Other aspects of formulas and database design also become more important in QlikView as data volumes grow larger. The general solution is the same: make the application more efficient through tighter database and report design. So even though it’s true that you can often just load data into QlikView and work with it immediately, it’s equally true that very large or sophisticated applications may take some tuning to work effectively. In other words, QlikView is not pure magic (any result you want for absolutely no work), but it does deliver much more value for a given amount of work than conventional business intelligence systems. That’s more than enough to justify the system.

Interestingly, I haven’t found that the complexity or over-all size of a particular data set impacts QlikView performance. That is, removing tables which are not used in a particular query doesn’t seem to speed up that query, nor does removing fields from tables within the query. This probably has to do with QlikTech’s “associative” database design, which treats each field independently and connects related fields directly to each other. But whatever the reason, most of the performance slow-downs I’ve encountered seem related to processing requirements.

And, yes, there are some upper limits to the absolute size of a QlikView implementation. Two billion rows is one, although my impression (I could be wrong) is that could be expanded if necessary. The need to load data into memory is another limit: even though the 64-bit address space is effectively infinite, there are physical limits to the amount of memory that can be attached to Windows servers. (A quick scan of the Dell site finds a maximum of 128 GB.) This could translate into somewhat more input data than that, since QlikView does some compression. At very large scales, processing speed will also impose a limit. Whatever the exact upper boundary, it’s clear that no one will be loading dozens of terabytes into QlikView any time soon. It can certainly be attached to a multi-terabyte warehouse, but would have to work with multi-gigabyte extracts. For most purposes, that’s plenty.

While I’m on the topic of scalability, let me repeat a couple of points I made in the comments on the August post. One addresses the notion that QlikTech can replace a data warehouse. This is true in the sense that QlikView can indeed load and join data directly from operational systems. But a data warehouse is usually more than a federated view of current operational tables. Most warehouses include data integration to link otherwise-disconnected operational data. For example, customer records from different systems often can only be linked through complex matching techniques because there is no shared key such as a universal customer ID. QlikView doesn’t offer that kind of matching. You might be able to build some of it using QlikView scripts, but you’d get better results at a lower cost from software designed for the purpose.

In addition, most warehouses store historical information that is not retained in operational systems. A typical example is end-of-month account balance. Some of these values can be recreated from transaction details but it’s usually much easier just to take and store a snapshot. Other data may simply be removed from operational systems after a relatively brief period. QlikView can act as a repository for such data: in fact, it’s quite well suited for this. Yet in such cases, it’s probably more accurate to say that QlikView is acting as the data warehouse than to say a warehouse is not required.

I hope this clarifies matters without discouraging anyone from considering QlikTech. Yes, QlikView is a fabulous product. No, it won’t replace your multi-terabyte data warehouse. Yes, it will complement that warehouse, or possibly substitute for a much smaller one, by providing a tremendously flexible and efficient business intelligence system. No, it won’t run itself: you’ll still need some technical skills to do complicated things on large data volumes. But for a combination of speed, power, flexibility and cost, QlikTech can’t be beat.

Monday, August 06, 2007

What Makes QlikTech So Good: A Concrete Example

Continuing with Friday’s thought, it’s worth giving a concrete example of what QlikTech makes easy. Let’s look at the cross-sell report I mentioned on Thursday.

This report answers a common marketing question: which products do customers tend to purchase together, and how do customers who purchase particular combinations of products behave? (Ok, two questions.)

This report begins with a set of transaction records coded with a Customer ID, Product ID, and Revenue. The trick is to identify all pairs among these records that have the same Customer ID. Physically, the resulting report is a matrix with products as both column and row headings. Each cell will report on customers who purchased the pair of products indicated by the row and column headers. Cell contents will be the number of customers, number of purchases of the product in the column header, and revenue of those purchases. (We also want row and column totals, but that’s a little complicated so let’s get back to that later.)

Since each record relates to the purchase of a single product, a simple cross tab of the input data won’t provide the information we want. Rather, we need to first identify all customers who purchased a particular product and group them on the same row. Columns will then report on all the other products they purchased.

Conceptually, QlikView and SQL do this in roughly the same way: build a list of existing Customer ID / Product ID combinations, use this list to select customers for each row, and then find all transactions associated with those customers. But the mechanics are quite different.

In QlikView, all that’s required is to extract a copy of the original records. This keeps the same field name for Customer ID so it can act as a key relating to the original data, but renames Product ID as Master Product so it can be treated as an independent dimension. The extract is done in a brief script that loads the original data and creates the other table from it:

Columns: // this is the table name
load
Customer_ID,
Product_ID,
Revenue
from input_data.csv (ansi, txt, delimiter is ',', embedded labels); // this code will be generated by a wizard

Rows: // this is the table name
load
Customer_ID,
Product_ID as Master_Product
resident Columns;

After that, all that’s needed is to create a pivot table report in the QlikView interface by specifying the two dimensions and defining expressions for the cell contents: count(distinct Customer_ID), count(Product_ID), and sum(Revenue). QlikView automatically limits the counts to the records qualified for each cell by the dimension definitions.

SQL takes substantially more work. The original extract is similar, creating a table with Customer ID and Master Product. But more technical skill is needed: the user must know to use a “select distinct” command to avoid creating multiple records with the same Customer ID / Product ID combination. Multiple records would result in duplicate rows, and thus double-counting, when the list is later joined back to the original transactions. (QlikView gives the same, non-double-counted results whether or not “select distinct” is used to create its extract.)

Once the extract is created, SQL requires the user to create a table with records for the report. This must contain two records for each transaction: one where the original product is the Master Product, and the other where it is the Product ID. This requires a left join (I think) of the extract table against the original transaction table: again, the user needs enough SQL skill to know which kind of join is needed and how to set it up.

Next, the SQL user must create the report values themselves. We’ve now reached the limits of my own SQL skills, but I think you need two selections. The first is a “group by” on the Master Product, Product ID and Customer ID fields for the customer counts. The second is another “group by” on just the Master Product and Product ID for the product counts and revenue. Then you need to join the customer counts back to the more summarized records. Perhaps this could all be done in a single pass, but, either way, it’s pretty tricky.
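
For anyone who wants it spelled out, here is a rough sketch of the whole SQL sequence in one statement. The table and field names (a Transactions table with Customer_ID, Product_ID and Revenue) are assumptions for illustration, and a database that supports count(distinct ...) can collapse the two “group by” passes into a single one; without it you would need the separate steps just described.

-- Step 1: the "select distinct" extract of Customer_ID / Master_Product pairs.
-- Step 2: join the extract back to the transactions on Customer_ID.
-- Step 3: aggregate to one row per Master_Product / Product_ID cell.
SELECT m.Master_Product,
       t.Product_ID,
       COUNT(DISTINCT t.Customer_ID) AS Customers,
       COUNT(*)                      AS Purchases,
       SUM(t.Revenue)                AS Revenue
FROM (SELECT DISTINCT Customer_ID, Product_ID AS Master_Product
      FROM Transactions) m
JOIN Transactions t ON t.Customer_ID = m.Customer_ID
GROUP BY m.Master_Product, t.Product_ID;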

Finally, the SQL user must display the final results in a report. Presumably this would be done in a report writer that hides the technical details from the user. But somebody skilled will still need to set things up the first time around.

I trust it’s clear how much easier it will be to create this report in QlikView than SQL. QlikView required one table load and one extract. SQL required one table load, one extract, one join to create the report records, and one to three additional selects to create the final summaries. Anybody wanna race?

But this is a very simple example that barely scratches the surface of what users really want. For example, they’ll almost certainly ask to calculate Revenue per Customer. This will be simple for QlikTech: just add a report expression of sum(Revenue) / count(distinct Customer_ID). (Actually, since QlikView lets you name the expressions and then use the names in other expressions, the formula would probably be something simpler still, like “Revenue / CustomerCount”.) SQL will probably need another data pass after the totals are created to do the calculation. Perhaps a good reporting tool will avoid this or at least hide it from the user. But the point is that QlikTech lets you add calculations without any changes to the files, and thus without any advance planning.

Another thing users are likely to want is row and column totals. These are conceptually tricky because you can’t simply add up the cell values. For the row totals, the same customer may appear in multiple columns, so you need to eliminate those duplicates to get correct values for customer count and revenue per customer. For the column totals, you need to remove transactions that appear on two rows (one where they are the Master Product, and the other where they are the Product_ID). QlikTech automatically handles both situations because it is dynamically calculating the totals from the original data. But SQL created several intermediate tables, so the connection to the original data is lost. Most likely, SQL will need another set of selections and joins to get the correct totals.

QlikTech’s approach becomes even more of an advantage when users start drilling into the data. For example, they’re likely to select transactions related to particular products or on unrelated dimensions such as customer type. Again, since it works directly from the transaction details, QlikView will instantly give correct values (including totals) for these subsets. SQL must rerun at least some of its selections and aggregations.

But there’s more. When we built the cross sell report for our client, we split results based on the number of total purchases made by each customer. We did this without any file manipulation, by adding a “calculated dimension” to the report: aggr(count(Product_ID), Customer_ID). Admittedly, this isn’t something you’d expect a casual user to know, but I personally figured it out just by looking at the help files. It’s certainly simpler than how you’d do it in SQL, which is probably to count the transactions for each customer, post the resulting value on the transaction records or a customer-level extract file, and rebuild the report.
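
For what it’s worth, one plausible SQL counterpart (again with assumed table and field names) is a window function that attaches each customer’s total purchase count to every transaction row; on a database without window functions, you would be back to the extra aggregate-and-join pass just described.

-- Attach each customer's total purchase count to every transaction row,
-- so results can be split by that count (roughly what the aggr() dimension does on the fly).
SELECT t.*,
       COUNT(*) OVER (PARTITION BY t.Customer_ID) AS Total_Purchases
FROM Transactions t;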

I could go on, but hope I’ve made the point: the more you want to do, the greater the advantage of doing it in QlikView. Since people in the real world want to do lots of things, the real world advantage of QlikTech is tremendous. Quod Erat Demonstrandum.

(disclaimer: although Client X Client is a QlikTech reseller, contents of this blog are solely the responsibility of the author.)

Friday, August 03, 2007

What Makes QlikTech So Good?

To carry on a bit with yesterday’s topic—QlikTech fascinates me on two levels: first, because it is such a powerful technology, and second because it’s a real-time case study in how a superior technology penetrates an established market. The general topic of diffusion of innovation has always intrigued me, and it would be fun to map QlikView against the usual models (hype curve, chasm crossing, tipping point, etc.) in a future post. Perhaps I shall.

But I think it’s important to first explain exactly what makes QlikView so good. General statements about speed and ease of development are discounted by most IT professionals because they’ve heard them all before. Benchmark tests, while slightly more concrete, are also suspect because they can be designed to favor whoever sponsors them. User case studies may be the most convincing evidence, but resemble the testimonials for weight-loss programs: they are obviously selected by the vendor and may represent atypical cases. Plus, you don’t know what else was going on that contributed to the results.

QlikTech itself has recognized all this and adopted “seeing is believing” as their strategy: rather than try to convince people how good they are, they show them with Webinars, pre-built demonstrations, detailed tutorials, documentation, and, most important, a fully-functional trial version. What they barely do is discuss the technology itself.

This is an effective strategy with early adopters, who like to get their hands dirty and are seeking a “game changing” improvement in capabilities. But while it creates evangelists, it doesn’t give them anything beyond their own personal experience to testify to the product’s value. So most QlikTech users find themselves making exactly the sort of generic claims about speed and ease of use that are so easily discounted by those unfamiliar with the product. If the individual making the claims has personal credibility, or better still independent decision-making authority, this is good enough to sell the product. But if QlikTech is competing against other solutions that are better known and perhaps more compatible with existing staff skills, a single enthusiastic advocate may not win out—even though they happen to be backed by the truth.

What they need is a story: a convincing explanation of WHY QlikTech is better. Maybe this is only important for certain types of decision-makers—call them skeptics or analytical or rationalists or whatever. But this is a pretty common sort of person in IT departments. Some of them are almost physically uncomfortable with the raving enthusiasm that QlikView can produce.

So let me try to articulate exactly what makes QlikView so good. The underlying technology is what QlikTech calls an “associative” database, meaning data values are directly linked with related values, rather than using the traditional table-and-row organization of a relational database. (Yes, that’s pretty vague—as I say, the company doesn’t explain it in detail. Perhaps their U.S. Patent [number 6,236,986 B1, issued in 2001] would help but I haven’t looked. I don’t think QlikTech uses “associative” in the same way as Simon Williams of LazySoft, which is where Google and Wikipedia point when you query the term.)

Whatever the technical details, the result of QlikTech’s method is that users can select any value of any data element and get a list of all other values on records associated with that element. So, to take a trivial example, selecting a date could give a list of products ordered on that date. You could do that in SQL too, but let’s say the date is on a header record while the product ID is in a detail record. You’d have to set up a join between the two—easy if you know SQL, but otherwise inaccessible. And if you had a longer trail of relations the SQL gets uglier: let’s say the order headers were linked to customer IDs which were linked to customer accounts which were linked to addresses, and you wanted to find products sold in New Jersey. That’s a whole lot of joining going on. Or if you wanted to go the other way: find people in New Jersey who bought a particular product. In QlikTech, you simply select the state or the product ID, and that’s that.
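
Just to make the contrast vivid, here is roughly what that chain might look like in SQL. Every table and key name below is invented for illustration; the point is only the amount of joining required.

-- Products sold to customers with New Jersey addresses.
SELECT DISTINCT d.Product_ID
FROM OrderHeaders h
JOIN OrderDetails d      ON d.Order_ID    = h.Order_ID
JOIN Customers c         ON c.Customer_ID = h.Customer_ID
JOIN CustomerAccounts a  ON a.Customer_ID = c.Customer_ID
JOIN Addresses ad        ON ad.Account_ID = a.Account_ID
WHERE ad.State = 'NJ';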

Why is this a big deal? After all, plenty of SQL-based tools can generate that query for non-technical users who don’t know SQL. But those tools have to be set up by somebody, who has to design the database tables, define the joins, and very likely specify which data elements are available and how they’re presented. That somebody is a skilled technician, or probably several technicians (data architects, database administrators, query builders, etc.). QlikTech needs none of that because it’s not generating SQL code to begin with. Instead, users just load the data and the system automatically (and immediately) makes it available. Where multiple tables are involved, the system automatically joins them on fields with matching names. So, okay, somebody does need to know enough to name the fields correctly – but that’s just about all the skill required.

The advantages really become apparent when you think about the work needed to set up a serious business intelligence system. The real work in deploying a Cognos or BusinessObjects is defining the dimensions, measures, drill paths, and so on, so the system can generate SQL queries or the prebuilt cubes needed to avoid those queries. Even minor changes like adding a new dimension are a big deal. All that effort simply goes away in QlikTech. Basically, you load the raw data and start building reports, drawing graphs, or doing whatever you need to extract the information you want. This is why development time is cut so dramatically and why developers need so little training.

Of course, QlikView’s tools for building reports and charts are important, and they’re very easy to use as well (basically all point-and-click). But that’s just icing on the cake—they’re not really so different from similar tools that sit on top of SQL or multi-dimensional databases.

The other advantages cited by QlikTech users are speed and scalability. These are simpler to explain: the database sits in memory. The associative approach provides some help here, too, since it reduces storage requirements by removing redundant occurrences of each data value and by storing the data as binary codes. But the main reason QlikView is incredibly fast is that the data is held in memory. The scalability part comes in with 64-bit processors, which can address pretty much any amount of memory. It’s still necessary to stress that QlikView isn’t just putting SQL tables into memory: it’s storing the associative structures, with all their ease of use advantages. This is an important distinction between QlikTech and other in-memory systems.

I’ve skipped over other benefits of QlikView; it really is a very rich and well thought out system. Perhaps I’ll write about them some other time. The key point for now is that people need to understand that QlikView uses a fundamentally different database technology, one that hugely simplifies application development by making the normal database design tasks unnecessary. The fantastic claims for QlikTech only become plausible once you recognize that this difference is what makes them possible.

(disclaimer: although Client X Client is a QlikTech reseller, they have no responsibility for the contents of this blog.)

Thursday, August 02, 2007

Notes from the QlikTech Underground

You may have noticed that I haven’t been posting recently. The reason is almost silly: I got to thinking about the suggestion in The Power Performance Grid that each person should identify a single measure most important to their success, and recognized that the number of blog posts certainly isn’t mine. (That may actually be a misinterpretation of the book’s message, but the damage is done.)

Plus, I’ve been busy with other things—in particular, a pilot QlikTech implementation at a Very Large Company that shall remain nameless. Results have been astonishing—we were able to deliver a cross sell analysis in hours that the client had been working on for years using conventional business intelligence technology. A client analyst, with no training beyond a written tutorial, was then able to extend that analysis with new reports, data views and drill-downs in an afternoon. Of course, it helped that the source data itself was already available, but QlikTech still removes a huge amount of effort from the delivery part of the process.

The IT world hasn’t quite recognized how revolutionary QlikTech is, but it’s starting to see the light: Gartner has begun covering them and there was a recent piece in InformationWeek. I’ll brag a bit and point out that my own coverage began much sooner: see my DM News review of July 2005 (written before we became resellers).

It will be interesting to watch the QlikTech story play out. There’s a theory that the big system integration consultancies won’t adopt QlikTech because it is too efficient: since projects that would have involved hundreds of billable hours can be completed in a day or two, the integrators won’t want to give up all that revenue. But I disagree for a few reasons: first of all, competitors (including internal IT) will start using QlikTech and the big firms will have to do the same to compete. Second, there is such a huge backlog of unmet needs for reporting systems that companies will still buy hundreds of hours of time; they’ll just get a lot more done for their money. Third, QlikTech will drive demand for technically demanding data integration projects to feed it information, and for distribution infrastructures to use the results. These will still be big revenue generators for the integrators. So while the big integrators’ first reaction may be that QlikTech is a threat to their revenue, I’m pretty confident they’ll eventually see it gives them a way to deliver greater value to their clients and thus ultimately maintain or increase business volume.

I might post again tomorrow, but then I’ll be on vacation for two weeks. Enjoy the rest of the summer.