Web 2.0 Expo speaker and serial entrepreneur Hjalmar Gislason today launched an international version of the previously Iceland-only DataMarket.com. This means users can now use the service to find, visualize and access data from around the world, including data from organizations such as the World Bank, UN, Eurostat, and Gapminder.
Quick summary of what they offer: DataMarket has “15,000 data sets from over 40 providers holding tens of millions of time series of statistics ranging from World population and temperature anomalies to the yield of oranges in Cyprus – to name an example,” Hjalmar said.
This spring at Web 2.0 Expo San Francisco, Hjalmar will speak on “The Business of Open Data,” and how businesses can make money by connecting people more easily with Open Data. Previously, he said, the only opportunities businesses saw in Open Data were to provide the public sector with products and services to open up its data.
“This is all good and well, but the real value in Open Data lies in helping people discover all the available data, see its potential and realize how they can make use of it to run their businesses better, make better decisions and identify new opportunities.
“I think that business models that use Open Data today can largely be divided into two categories: suppliers to government Open Data initiatives and specialized applications that use Open Data to provide highly relevant services to niche audiences. There are great companies already creating a lot of value in both categories. My favorite examples are Socrata on the supplier end and EveryBlock on the niche audiences side.”
Hjalmar recently spoke to us about his upcoming session and the launch of DataMarket.com; you can read the interview below. To see him in person, register for Web 2.0 Expo with discount code websf11bl20 and save 20%.
Q&A with Hjalmar Gislason of DataMarket.com
Kaitlin: Businesses already use all sorts of data in marketing and business decisions. What is your elevator pitch to someone new to Open Data and how it differs from what we had in the past?
Hjalmar: Most businesses have realized how important good data is to their decisions and planning, and many have gone to great lengths to measure key performance indicators and set up business intelligence systems so that they can make truly data-driven decisions. What has surprised me is that in most cases, this thinking is limited to internal data: data about sales, customer churn, web traffic, call center activity, employee satisfaction and so on. The fact of the matter is that external data is no less important to a business’ success. You could run a company perfectly based on all the internal indicators and still go belly up because you didn’t account for some externalities.
With the wave of Open Data, an enormous amount of valuable secondary data is becoming easily and freely available all over the world. Some of this data has never been made public before and certainly not this accessible. In gathering this data lie thousands – no – probably hundreds of thousands of man-years, whose output has until now only been used to a fraction of its potential. Imagine the value in all that work, in all that data!
And this data has real, practical, monetary value to virtually every business, household and individual. But only a tiny fraction of them know that yet.
In a low-rise building the elevator pitch would be: There’s more data than you can imagine out there to help you run your business, easily available and for free.
Kaitlin: If you want Open Data to “be kept open and free of charge,” as your session description says, how does a company make money?
Hjalmar: It’s possible to add all sorts of value on top of open data without locking the data itself behind a paywall. Our approach is the following: We want to make as much public and open data as possible available on DataMarket.com for free, enabling powerful search, visualizations and access for anyone. On top of that, we sell additional features and services. Pro subscribers get additional features such as file downloads in a wider variety of formats and the ability to create automated reports from the data that is most important to them and have those reports sent to them by email at scheduled times.
Developers can buy API access to integrate data into their own applications, giving them uniform access to data whose origins are as different as Excel sheets, PDF documents, proprietary APIs and web services.
Finally, users searching on DataMarket.com will also find various premium data sets in our collection. This data – coming from market research companies, analysts, financial markets and other premium data sources – is only available to those that have paid for access. Having done so, all the same functionality is open to them on top of that data as any data from the open collection. Thereby we’re building an active marketplace for data, and while a lot of the products are free of charge, you’ll also see what kind of insight is available from premium sources if you have a budget to spend on it.
Kaitlin: In your session description, you say the business opportunity lies in “helping people discover all this data”. To me this sounds like you mean we need good curators of data. What qualities make a company a good curator of Open Data? Or put another way, what standards of quality curation does DataMarket follow?
Hjalmar: In fact, we’re neutral to the data that we distribute. Our responsibility is to keep the data intact and up to date. We do not judge the quality of data, but enable and encourage our users to go straight to the data source to see the data in its full context. That said, we still try to make sure that the search returns the best possible results for any query.
So you could say that our role in data curation is similar to that of Google in curation of web content: Trying to infer the importance and relevance of data sets from any meta-data, user activity and quality indicators that we can get our hands on.
Kaitlin: What are the security issues you and other businesses like you face in dealing with Open Data? How do you deal with these?
Hjalmar: In the case of Open Data, we’re dealing with data that “wants to be distributed”, so the distribution is largely free of security issues.
We’re more concerned with security of the premium data, making sure that only paying subscribers have access to data sets that sometimes are enormously valuable. For the same reason we’re staying away from internal and private data for now. Not because we can’t deal with security, but because it complicates the sale, invites a lot of due diligence and there are plenty of opportunities without those complications. If people want to mix their internal data with secondary data we suggest that they do that in their own BI systems inside their own firewalls. Our API provides them with a perfect tool to retrieve the statistics they need for exactly that purpose.
Kaitlin: Do you have good examples of government opening data up? How about http://www.data.gov/? Is there anything you wish government would do more of?
Hjalmar: Data.gov and Data.gov.uk are the most prominent examples, each with its pros and cons. The UK approach seems to be “deeper” and to involve more of a long-term, technical strategy with its focus on Linked Data. Data.gov is a more lightweight approach, and I tend to agree with that approach for now. In fact I believe that is perfectly in line with Tim Berners-Lee’s “raw data now!” mantra. The first step is to make the data available in ANY format, then think about how you can make it better.
And that’s exactly what I miss the most – more raw data. Despite all the good intentions, the fact of the matter is that both collections are still somewhat disappointing. There is still a lot of low-hanging public data that has not been registered on these data portals or even made available publicly in any format. The public sector is still infected with what Hans Rosling called the “Database hugging disease” (DbHD). I think that is largely because they want their data to be “perfect” before they allow others to see it and use it. That will never happen. People just have to let go and understand that the data will usually improve faster after it’s been made available in its current state than it will on their own hard drives with a thousand other projects waiting for them to finish.
Our approach has – from day one – been that we’ll adapt to the World as it is instead of waiting for the World to adapt to some unified standards or methodologies. This means that we have to read almost as many different data formats as we have data providers, but it also means that we have a lot of data – now.
Kaitlin: Have you run into issues of sources complaining about your use of their data? How do businesses handle such situations?
Hjalmar: Actually we haven’t. We’ve been well received pretty much everywhere, and I think that our strategy of keeping open data open and free of charge has played a big role in that. Most of the data we distribute is already available under open licenses, and when it isn’t we make sure we seek the approval of the data providers before we make the data available. The same is obviously true of the premium data providers, but in that case we’re establishing a reseller agreement in addition to the license.
Kaitlin: How are business owners making money by opening up their data?
Hjalmar: The airline industry is a perfect example of how opening up data can be beneficial to a business and in fact an industry as a whole. A few years back, you would go directly to an airline’s web site to search for a flight, and then you’d repeat that search on a few other airlines’ web sites as well. Then along came specialized flight search engines, such as Kayak and Dohop.com. They started scraping the airlines’ sites for timetables, prices and search results in order to relieve the passengers’ headaches in finding the most appropriate flights. To begin with, the airlines fought this and many blocked and/or banned such activity. Gradually – however – they realized that making their flights appear on more sites meant more business. Now, many airlines take pride in providing as good data as they possibly can to such engines and almost none of them block these activities any more.
I think there are parallels to this in many industries that have yet to realize the opportunities that lie in opening up their data. However – while I firmly believe that the public sector provides the most value by opening up any data that is not sensitive due to issues such as privacy or security, the private sector should obviously be a lot more selective and strategic. But there are definitely a lot of opportunities for businesses in opening up their data, and we’ve yet to see even the tip of that iceberg.
Kaitlin: What are some lessons you’ve seen from people that have learned to make better use of Open Data to run their businesses better and make better decisions?
Hjalmar: Two quick examples from our own experience:
1. We helped a local business measure their market share in “real time”, by combining their internal sales numbers with the latest population, car registration and household numbers broken down by post code. Seeing how their market share differed from one town to another and how it changed from month to month, they could learn from their best representatives and implement their strategies across the market, tell each representative how well they were doing etc. Previously their only indications came from an annual industry report and ad-hoc calculations done with partial data for important meetings and moments.
2. A start-up company in the financial industry created a brand new index for government bond prices. By naming this index after their company and distributing it openly and avidly – among other things through DataMarket.com – they’ve been able to make a name for themselves, leading to new opportunities and more business. Our role has been to distribute this index automatically to media, web portals and others that want to publish it. They just have to update the index in our systems at the end of the day and it is immediately available on our site and on several other portals and online media through our API. It is another great example of a private company making money from opening up their data – or in this case actually creating new open data.
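To make the first example more concrete, here is a minimal Python sketch of combining internal sales numbers with open household counts per post code. It is not DataMarket's method or API: the post codes, figures and the function name `penetration_by_post_code` are all hypothetical, and units sold per 1,000 households is used only as a simple stand-in for market share.

```python
# All figures below are invented purely for illustration.
# Internal data: units sold per post code.
internal_sales = {"101": 120, "105": 45, "200": 30}
# Open data: number of households per post code.
households = {"101": 4000, "105": 3000, "200": 1500, "210": 800}


def penetration_by_post_code(sales, households):
    """Return units sold per 1,000 households for each post code.

    Post codes with no recorded sales get a rate of 0.0; post codes
    with a non-positive household count are skipped.
    """
    rates = {}
    for code, count in households.items():
        if count <= 0:
            continue
        rates[code] = 1000 * sales.get(code, 0) / count
    return rates


rates = penetration_by_post_code(internal_sales, households)

# List post codes from strongest to weakest penetration.
for code, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(code, round(rate, 1))
```

Running the same calculation each month against refreshed external numbers is what would turn an annual report figure into something closer to a "real time" indicator.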