David Cancel’s Data-driven Startup: “Does Anyone Give a Shit About Your Dumb Idea?”

I saw David Cancel present his data-driven startup thesis the other day at Dogpatch Labs New York. That guy goes straight to the point, no time lost on details. The whole presentation is really a must-read: how you need, from the start, to instill a data-driven culture in your startup (or in any project you’re starting, for that matter). Test everything, get data for everything, and then iterate, iterate and iterate.

The three stages you need to go through (and in that order, this is critical) are: first, get yourself operational dashboards (simple, please! The simpler the better; otherwise nobody pays attention to them).

Once these are working reports used throughout the organization, you can move on to funnel analysis and then to cohort analysis.
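As a toy illustration of the funnel-analysis stage, here’s a minimal sketch; the step names and counts are invented for the example, not anything from David’s deck:

```python
# Minimal funnel analysis: conversion rate between consecutive steps.
# Step names and counts are hypothetical, for illustration only.
funnel = [
    ("visited_homepage", 10000),
    ("signed_up", 1200),
    ("activated", 600),
    ("paid", 90),
]

def funnel_report(steps):
    """Return (step, count, conversion-from-previous-step) tuples."""
    report = []
    prev = None
    for name, count in steps:
        rate = count / prev if prev else 1.0
        report.append((name, count, round(rate, 3)))
        prev = count
    return report

for name, count, rate in funnel_report(funnel):
    print(f"{name:20s} {count:6d}  {rate:.1%}")
```

The point of the exercise is spotting the weakest step (here, the hypothetical 7.5% activated-to-paid conversion) and iterating on that first.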

It’s all pretty clear in his presentation, which I’ve embedded below. Direct link to David’s original post here.

Great presentation on the advertising ecosystem

30 slides of explanation on where we stand in the ad ecosystem and at the intersection of data, display, search and ad optimization.

Presented at the IAB, this is from Terence Kawaja, an MD at GCA Savvian.


Where Do I Go?

At last night’s NY Tech Meetup, the show was kicked off by demos from university students. I particularly liked Steve Lehrburger’s mashup, which leverages the Foursquare API to show a heat map of your checkins.

Here’s the heat map of my 759 checkins on Foursquare. As I expected, the two big heat points are Midtown, where my office is based, and Soho / Nolita / Bowery, where I spend most of my time outside of the office. I couldn’t include Brooklyn, where I live (it doesn’t fit on the map). Anyway, a little but neat app…
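For the curious, the binning that sits behind a check-in heat map can be sketched in a few lines; the coordinates below are made-up points around Midtown and Soho, not my actual data:

```python
from collections import Counter

def heat_bins(checkins):
    """Bucket (lat, lon) pairs into roughly 0.01-degree grid cells and
    count check-ins per cell -- the intensity a heat map would render."""
    counts = Counter()
    for lat, lon in checkins:
        counts[(int(lat * 100), int(lon * 100))] += 1
    return counts

# Hypothetical sample: two points near Midtown, one near Soho.
sample = [(40.754, -73.984), (40.755, -73.985), (40.723, -73.998)]
print(heat_bins(sample).most_common(1))  # densest cell first
```

A real renderer would then map each cell’s count to a color intensity on top of the city map.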

Florent Peyre Heatmap Foursquare


MSN, Hachette Filipacchi Media U.S. and BermanBraun Unveil Glo, a Premium Online Lifestyle Destination for Women

This was the past year of my life. Thrilled to see it launched today and amazed by the look and feel of the site! Check it out for yourself on Glo.

It’s all about bringing personal “me-time” to women on the web, presented in a visually very aspirational design package.

Full press release here.

Using cellphone data to study human behaviors

I was listening to the NPR Technology Podcast, which covered the results of a study by László Barabási, a human-behavior researcher at Northeastern University. Barabási negotiated access to fully anonymized (blind) data from 50,000 cellphone subscribers to study their travels and movements over a defined period (all cellphone signals transit through nearby cell towers, enabling the tracking).

The key finding is the extremely high predictability of day-to-day patterns. On average, he saw a 93% rate of predictability, meaning that in 93% of cases you could, in theory, predict where a specific user would be. A lot of us might like to believe that we’re fairly unpredictable creatures, but when it comes to daily patterns, you’re pretty much the same as the person sitting next to you on the subway.

But what caught my attention was this quote from Barabási:

“We were seeing an average of 93 percent predictability across the user base. What does it mean? That means that for the vast majority of the people, you could, in principle, write an algorithm that could predict 93 percent of the time, correctly, their present location.”

Now imagine what you could do with that, once your algorithm is built and you no longer have to rely on actual data (hence getting rid of the immediate issues of privacy, data collection and storage, and other Big Brother drift). Think of the services you could bring to any organization managing large infrastructure, be it roads, trains, subways, local development, etc. If somebody could convince carriers to open all of that anonymous, blind data through an API and let the hordes of developers build applications on top of it, it would probably spur a great deal of innovative services.
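To make the “in principle, write an algorithm” idea concrete, here’s a toy frequency-based location predictor, nothing like Barabási’s actual method; the tower names and the trace are invented:

```python
from collections import Counter, defaultdict

def train(trace):
    """trace: iterable of (hour_of_day, tower_id) observations.
    Returns, for each hour, the most frequently seen tower -- the
    simplest possible 'where will this user be at hour h?' model."""
    by_hour = defaultdict(Counter)
    for hour, tower in trace:
        by_hour[hour][tower] += 1
    return {h: c.most_common(1)[0][0] for h, c in by_hour.items()}

# Hypothetical trace: mornings mostly in Midtown, evenings in Brooklyn.
trace = [(9, "midtown"), (9, "midtown"), (9, "soho"),
         (20, "brooklyn"), (20, "brooklyn")]
model = train(trace)
print(model[9], model[20])  # prints: midtown brooklyn
```

The predictability figure in the study is essentially how often a (far more sophisticated) model of this kind guesses right.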

The full four-minute interview is here.

The new definitions of privacy on the web

My Flavors.me Homepage with my Tumblr Blog Feed opened

Thanks to a friend’s share on Facebook (ahhh… the power of social recommendation…), I discovered a new service called Flavors.me, which lets anyone build a personalized page about himself (or anyone else) and link it to the main social content production factories. They currently support 14 services, including Facebook and Tumblr, but also your DVD queue from Netflix, your checkins from Foursquare, and the last tracks you’ve been listening to on Last.fm.
Once you’ve added your services, a visitor to your page can click on any of them, and a window will display whatever stream of activity you’ve had on that specific service.
Testing it yesterday, I mechanically added all the services I use, including Netflix and Foursquare. Once I realized that everybody could then follow my physical traces around NY through Foursquare, or see all the photos I posted to Facebook (which are normally only displayed to a selected list of people), I freaked out and decided to limit the page to the safer LinkedIn and Twitter feeds.

Well, boy, it was easy to add services, but it was a nightmare to remove them. Flavors.me doesn’t include a “remove this service” option… Sure, they’re in beta, but given the nature of their business, that should probably be part of the MVP feature set… So I went to all the services I wanted off my Flavors.me page and revoked Flavors.me’s authorization to access that data. But even then, Flavors.me kept the latest stream of imported data. Sure, nothing new was going to get published, but all of the previously imported content was still visible.

I ended up terminating my Flavors.me account to clean it out. Don’t get me wrong, I think the service is pretty neat (I’m rebuilding a page right now), but I was a little taken aback by how hard it is to keep track of all your social traces. That feeds into a fairly large debate, initially provoked by the launch of PleaseRobMe, which lists empty homes by tapping into the Twitter API. While Foursquare is a closed network (you need to approve your friends), more and more people link it automatically to their Twitter account, which is an open network, all of a sudden revealing to anyone who cares to look whether you’re at home or not. Foursquare responded on that issue, but this is just the beginning of more and more debates around open systems.

One key improvement would be for the main companies that offer to link your accounts to open systems like Twitter to state clearly that you are pushing data out into the open. I also think that the permissions given to third-party services should be much more granular on networks like Facebook and Twitter. You should basically have the same level of detail about what you’re authorizing, and to whom, as you have in your Facebook privacy settings. For instance, I might be pretty OK (actually, I know I would be…) with displaying my Foursquare badges on Flavors.me, but not necessarily all my checkins…
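The per-item granularity argued for here could look something like the sketch below; the scope names and services are hypothetical, not any real API:

```python
# Hypothetical per-item permission grants: instead of an all-or-nothing
# "connect Foursquare" switch, each data type gets its own flag.
GRANTS = {
    "flavors.me": {
        "foursquare:badges": True,    # OK to display
        "foursquare:checkins": False, # keep my physical traces private
    },
}

def may_display(app, scope):
    """Deny by default: an app can only show what was explicitly granted."""
    return GRANTS.get(app, {}).get(scope, False)

print(may_display("flavors.me", "foursquare:badges"))    # True
print(may_display("flavors.me", "foursquare:checkins"))  # False
```

The deny-by-default lookup is the important design choice: an unknown app or an ungranted scope gets nothing.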

The photo above shows my welcome screen on Flavors.me with my Tumblr blog feed open


Why pushing only abstracts through RSS feeds is an absurdity

I’m the happy new owner of an iPhone. One of the first things I did was look for the right application to bring back all of the feeds I carefully manage in my Google RSS reader. Because living in Brooklyn sometimes means fairly long commutes, reading trade news and the bloggers who matter to me while traveling was critical.
And the app I found works fairly well (it seamlessly downloads all of the latest articles when I’m connected, which I can then “comfortably read” while commuting).

But the problem I have is that a bunch of publishers (never bloggers, I have to say, either for lack of tech knowledge or maybe because they’re just more open to the social web) think it is smart to truncate their feeds and only include abstracts. Here are the usual reasons brought up, and why they don’t make sense:

– “We need to have the user on our site; otherwise, people stop coming to our site…”
Who cares where the media is consumed? The key thing is to make sure that you aggregate all of the analytics (not only your site’s analytics). Most analytics packages now include that as standard. And you might even learn interesting things about your audience (where the media is consumed, through what device or platform, etc.). In the end, what matters is that the user is in contact with your brand, whether connected or disconnected.

– “We’re losing money since we can’t serve ads…”
While that was true for a long time, there are more and more solutions coming down the road for publishers to monetize their RSS audience (see the good article from Dosh Dosh on that). And even if it’s not perfect yet, you can actually come up with interesting new packages for your advertisers that would, for example, include location-based services / promotions / coupons, etc.

– “Our content is beautiful and should really be consumed on a full screen rather than a micro device…”
Also true for a long time, but the arrival of smartphones and e-readers is going to radically change media consumption habits while potentially improving the rendering of your content (you might even stretch that argument to say that on some large color e-readers, the rendering will ultimately be better than in the current web experience).

There are also a couple of downsides to the abstract method. The main one (realized from my own use) is that I tend to skip the feeds that are just a couple of lines long. It’s very, very frustrating to start reading the abstract, get excited, and then be unable to finish the article. Sure, I can always save it for later, but unless that three-line abstract was crazy interesting, I’ll never go back to it. So, first effect: I don’t use these feeds anymore (and therefore I actually stop reading that specific publication, relying on the rest of my feeds to keep me informed; Silicon Alley Insider is a good example).
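For what it’s worth, spotting abstract-only feeds programmatically is straightforward. Here’s a rough sketch using Python’s standard library, with an invented sample feed; the heuristic (a short description and no content:encoded element) is my assumption, not a standard:

```python
import xml.etree.ElementTree as ET

# Invented two-item RSS feed: one full-text post, one teaser-only post.
RSS = """<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item><title>Full post</title>
      <description>Short teaser</description>
      <content:encoded>The entire article body goes here...</content:encoded>
    </item>
    <item><title>Teaser only</title>
      <description>Read more on our site...</description>
    </item>
  </channel>
</rss>"""

CONTENT = "{http://purl.org/rss/1.0/modules/content/}encoded"

def truncated_items(rss_text, min_chars=200):
    """Flag items that look abstract-only: no full-content element
    and a description shorter than min_chars."""
    flagged = []
    for item in ET.fromstring(rss_text).iter("item"):
        desc = item.findtext("description") or ""
        if item.find(CONTENT) is None and len(desc) < min_chars:
            flagged.append(item.findtext("title"))
    return flagged

print(truncated_items(RSS))  # ['Teaser only']
```

A reader app could use something like this to warn you, at subscription time, that a feed will leave you hanging mid-article on the subway.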

You have to follow your users instead of trying to shoehorn them into what you believe is good for you.
In the long run, I’ve realized that with the help of an e-reader and/or a smartphone, I actually consume more media than before. It’s an exciting feeling to board a plane with no internet connection (that’s getting rare, though) and know that you’ll be able to catch up on all the great articles you’ve been saving for a moment like that, a moment when you’re not connected. And that’s probably also the moment when you, as a media company, want to be on the mind of a user who is finally free of distractions, and actually available to connect with your brand.


Towards creating richer data streams between Internet users

I’ve been thinking about this for a while: how painful, and how much multi-tasking it takes, to accomplish the day-to-day tasks you perform every day on the web.
Let me be concrete and consider these situations:

  • You’re talking with a friend on Google Talk and you want to send him a couple of pictures from Flickr while talking about a new restaurant, and he wants to paste a map into the conversation
  • You’re tweeting about a great place you went last night that you want to recommend, and you want to include in your post a shortened URL back to the place’s website as well as a location map
  • You’re blogging on a specific subject and want to include videos, some related links and tags, and some pictures that are currently hosted on an outside service
  • You’re reading an article and want to tweet about it, including a shortened URL back to it and mentioning some other Twitter users in it

These operations are all part of creating “enriched web streams” that mix and match photos, text, video, social network signals, etc. But to make that happen, most of the time the average user will have to open multiple browser windows, cut and paste, log in to various accounts, struggle with the limitations of each format, etc.
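Even the simplest of the scenarios above, fitting a comment, a shortened link and mentions into a 140-character tweet, involves fiddly plumbing. A minimal sketch (the shortened URL is a fixed placeholder, not a real shortener call):

```python
def compose_tweet(text, short_url, mentions=(), limit=140):
    """Append the link and mentions, then truncate the comment so the
    whole post fits inside the character limit."""
    suffix = " " + short_url + "".join(" @" + m for m in mentions)
    room = limit - len(suffix)
    if len(text) > room:
        text = text[: room - 1].rstrip() + "\u2026"  # trailing ellipsis
    return text + suffix

# Hypothetical placeholder link and handle, for illustration only.
t = compose_tweet("Great read on data-driven startups",
                  "http://bit.ly/xxxxxx", mentions=["dcancel"])
print(t)
```

Every client, blog editor and chat window reimplements little glue routines like this one, which is exactly the friction the projects below try to remove.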

So obviously, when I got a demo of Google Wave from Google, I was very excited by the project’s underlying motto: to easily enrich the online conversations you might have, add or remove participants, and facilitate sharing and communication (even across different languages). And the demo really works: you start to understand how this richer conversation could change the way we communicate with each other on the web (and even keep track of these enriched online streams). Here’s the full video of the product demo. It’s an hour-and-20-minute video, but it’s really worth it:

The other very interesting project underway is the Mozilla Labs Ubiquity project. It has a slightly different scope and objective: the goal is to make your life easier when you’re using the Internet, like having a little robot at your side facilitating your most common tasks. The project is totally open to anyone to contribute, faithful in that to the spirit of the Mozilla Foundation.

The reason I’m linking the two projects is that they’re both, in their own ways, going to help us enrich the way we use the Internet. They both also contribute to blurring the classic frontiers between desktop apps, software and online apps. Here’s a quick video explaining the Ubiquity project (which is just getting started…).


“Reading Mr Market’s Mind” -> A positive view


From Fred Wilson’s blog today:

“My head is in the same place it was last October and November when “the world was coming to an end”. I think we are in for a bad 2009 and a weak 2010 and maybe a better 2011. I also think we are going to see many large industries changed fundamentally by this downturn.” (A VC, Apr 2009)

The most cynical analysts tend to say that we’re just going through a classic pendulum swing and that it’ll all go back to how it used to be. I actually agree totally with Fred’s take: for a lot of industry segments, we will not get back to the way things used to be.

A couple of examples:

– Retail: high-end luxury brands, outside of the very high end where prices never went down (Cartier will always prefer destroying old inventory to discounting it), will have a tough time getting back to pre-crisis prices. People are getting used to getting some of these goods cheaper, or they just switch to more affordable products. The perpetual lookout for bargains should mean a lot of potential for all the ecommerce startups surfing that trend (I’m thinking Stylefeeder, Shopittome, etc.)

– Advertising: most advertisers (when they’re buying) are tasting with great pleasure the extra mile that publishers and media companies will now go to land deals. The bonuses here are more transparency, more efficiency, optimized performance, shared metrics… Overall: any media platform now needs to prove its validity as an advertising platform to the advertiser. I don’t think these guys will go back to the black box that was used before. That’s also good news for the internet (which is probably the most metrics-driven ad platform) and all of the startups behind it, such as BlueKai, Simulmedia, Lookery, etc. Behavioral targeting, because of its data-driven efficiency, should also take off if it’s not blocked by privacy-rights advocates.


How to value a blog?

A very good attempt by Douglas A. McIntyre at 24/7 Wall St to put a valuation number on the most-read and most famous blogs, even if I arrive at a very different result.

Here’s the list and their corresponding valuations:

1. Gawker Properties – $170m

2. Huffington Post – $90m

3. The Drudge Report  – $48m

4. Perez Hilton – $32m

5. Sugar Inc. – $27m

6. TechCrunch – $25m

7. MacRumors – $21m

8. SeekingAlpha – $11m

9. GigaOm – $9.5m

10. Politico – $8.7m

… the other 15 blogs are there.

The calculation produces strange results in some cases. Sugar at $27m seems like a bargain, especially when it’s based on the fact that the company has 70 employees. To me, some of those employees actually work on the Coutorture blog network and, more importantly, on Shopstyle, which brings in its own share of CPA / CPC revenues. Also, Gawker at $170m is probably much too high, because Doug counts 100% of the traffic (23m worldwide UVs) whereas only the US portion is monetizable (i.e. 14.4m UVs).

Let’s redo the exercise!

I’ve tried to redo the calculations on the Top 10 based on 24/7’s info on employees and cost structures, but using US traffic numbers (an average of Quantcast / Compete / ComScore) with revised rates for remnant vs. premium advertising (closer to what we see in the industry right now). The basic assumptions are:

  • Traffic: Only US traffic is monetizable. In most cases, Doug uses worldwide traffic. Even if in certain cases you can monetize it through international remnant networks, it’s going to be the bottom of the barrel.
  • Premium rates can go from $4-5 to $15. I don’t believe in anything above that right now, looking at advertisers’ coyness. On top of that, you need to take into account that most blog templates include up to six or seven ads per page, with maybe only one or two of them above the fold (commanding higher CPMs). The ads below the fold, because of their very low CTR, are usually heavily discounted.
  • Remnant rates are lucky to reach around $1 (and that’s when fully optimized across a dozen-plus partners).
  • When blogs use remnant partners, I assume a portion of premium sell-through, with most of the remaining inventory sold through remnants.
  • For the blogs that are growing quickly, I’ve also assumed a 20% premium on revenue, since I did all my calculations based on January figures. That’s limiting, but sufficient for the exercise.
  • For the costs, I’ve kept Doug’s estimates since I don’t have any insight there. From time to time, I’ve adjusted them based on revenues.


  • Traffic: I used an average of ComScore, Compete and Quantcast. The data diverge pretty strongly between sites at times, but it’s better than nothing. As I said, I also only used the US portion of the traffic (24/7 used worldwide traffic data). Pretty big variances here: over the Top 10 sites, I came out -49.7% on UVs and -38.6% on PVs (the real driver when we get to the revenue calculation). See the detailed list below; you can find the Excel spreadsheet here.

Difference on traffic between 24/7 and me

  • Revenues: Obviously, because of the above, revenues vary dramatically, especially for the small sites. Combined revenues for the Top 10 sites come to $60m for me vs. $100m for 24/7 Wall St. That’s a big drop. The key reason is that I’m very pessimistic about these sites’ ability to achieve high premium ad sell-through. On top of that, some of them are part of networks (Federated Media, etc.), further reducing their net revenue.
  • Margins: Well, as expected, this is where it gets ugly. Four sites out of 10 are losing money (and I do think 24/7 was soft on its cost estimates for most sites), and two of them are barely breakeven.
  • Valuations: I took a 2x multiple on revenues for all sites, except when a site was either the leader in its segment (HuffPo) or particularly influential (TechCrunch), etc. On operating margins, I mostly kept a 6 to 8x multiple. When companies were losing money, I only kept the revenue multiple (which is generous…).
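Putting the assumptions above together, a back-of-the-envelope version of the model might look like this; all the inputs here are illustrative placeholders, not figures from my spreadsheet:

```python
def annual_revenue(us_pvs_per_month, premium_cpm, remnant_cpm,
                   premium_sell_through, ads_per_page=6):
    """Yearly ad revenue: premium CPM on the sold-through share of
    impressions, remnant CPM (~$1) on the rest."""
    impressions = us_pvs_per_month * ads_per_page * 12
    premium = impressions * premium_sell_through * premium_cpm / 1000
    remnant = impressions * (1 - premium_sell_through) * remnant_cpm / 1000
    return premium + remnant

def valuation(revenue, operating_margin, rev_multiple=2, margin_multiple=7):
    """2x revenue for money-losing sites; ~6-8x operating margin otherwise
    (7x used here as a midpoint)."""
    if operating_margin <= 0:
        return revenue * rev_multiple
    return operating_margin * margin_multiple

# Hypothetical mid-size blog: 20m US PVs/month, $8 premium CPM,
# 20% premium sell-through.
rev = annual_revenue(20e6, premium_cpm=8, remnant_cpm=1,
                     premium_sell_through=0.2)
print(round(rev / 1e6, 1), "m/yr")  # prints: 3.5 m/yr
```

Plugging each site’s US page views and cost estimates into something like this is essentially the whole exercise; the spread between my totals and 24/7’s comes almost entirely from the traffic and sell-through inputs.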

So, the total valuation for the Top 10 properties comes to $159m on my end vs. $442m for 24/7. That doesn’t mean they would sell this low, since between the intrinsic value of a company and its purchase price lies an infinity of variables (how much money was sunk in by VCs, how much those guys are expecting, what synergies can be expected, how badly the purchaser wants the company, etc.).

You can find the full spreadsheet detailing the calculations here.