I am a big fan of LinkedIn’s. I even PAY for the service from time to time, just to be a power user for a couple of months. There are many ways in which the LinkedIn experience could be improved: their search capability is appalling and exploration is exhausting (we have a solution for THAT), not to mention the lack of tools for managing and making sense of a large group of connections, as well as their streams of updates of all sorts (we have a solution for THAT too!). But I had a surprise of a different kind this morning. In fact, I had two surprises, which are outlined by the two black squares on the screen snapshot (below).
The first one is funny and highlights the difficulty of managing a complex system: the black square on the left shows a Yahoo News story recommended to me by LinkedIn. The story is about how LinkedIn has surreptitiously coerced 100 million users into sharing private information. Not exactly a positive for LinkedIn, and an upsetting story for many users; their tactics may even be unlawful in some countries.
This leads me to the second surprise, which is not funny and rather worrisome (and which in some ways relates to the first surprise, though not in a good way). Look at the square on the right and you’ll see the “People You May Know” suggestions from LinkedIn. These are usually people who are second- or third-order connections, and going through this list of suggestions has become part of my daily routine: about half of all my connections have come from going through the list and finding people I know and want to connect with. Here, Matt Flannery, Kiva.org’s CEO, is a second-order connection, which means that we have several connections in common. I do not know Matt personally, but I am not surprised to see him on the list.
The other two suggestions are very spooky. Here is why: we are not second-, third-, or even fourth-order connections, but I do know them, sort of. They are both managers of hotels I stayed at in the last two months.
I booked one through Expedia and the other through Tablet Hotels (if you like design hotels and don’t know Tablet Hotels, go now!). How does LinkedIn know about the hotels I booked through different providers? Are Tablet Hotels and Expedia both providing my private information to LinkedIn? I haven’t found any indication of it, but I am still exploring. If the data is not coming from Expedia or Tablet Hotels, then the only way that LinkedIn could have found out about these hotels is by spying on me and following my web surfing. So in a way I am hoping that Expedia and Tablet Hotels did share my data with LinkedIn, though that would really upset me too. Let’s assume for a minute that LinkedIn did what I suspect they did (which is not totally unlikely given the story in the left square). Why would they risk alienating their users? They recently beat the street with their revenues and earnings, which might be a short-term reward for tactics that might backfire.
Complexity Risk at LinkedIn
Building Accurate Predictive Models “without Data”
When developing any type of predictive model, it is normal to think of “data” as a set of numbers, typically numerical measurements based on observations. It is a commonly held opinion that the accuracy of a model depends entirely on the quality and quantity of available data, as reflected in the use of statements such as “garbage in, garbage out.”
There is no doubt that the more information is used in building a model, the more accurate the model is likely to be. However, the notion that quantitative, numerical data are the only type of information needed to build an accurate model is flawed. In fact, I believe that the typical business obsession with numeric data can do more harm than good.