Reputation, Recommendation and Influence

This article is more than 10 years old.

Defining the logic -- e.g., for "I follow her because one time she drew an analogy I liked between a candidate's debate response and Willem Dafoe's Big Tuna scene in 'Wild at Heart'" -- is highly complex. We make the connection in an instant.

Once you get past the lowest-hanging fruit (for example: "You follow Jay Rosen. You may also like Clay Shirky."), this kind of inference is incredibly difficult to do purely in code.

It's Complicated

As an illustration of how difficult it is to measure individuals' influence across every topic, internet-wide, consider the complexity of measuring influence in a small group of people across a limited number of factors.

Forbes' Most Powerful People works to define, measure and weight a set of attributes that collectively represent "Power."

Even within a finite set of people -- the culling begins from a few thousand candidates -- and using just four determining factors:

  • How many people does the candidate have power over?
  • What financial resources do they control?
  • Do they have influence in more than one sphere?
  • How actively do they wield their power?

it still takes collective person-months and hundreds of fuzzily-defined editorial decisions -- not the least being agreement on those primary determining factors -- to arrive at the final ranked list of 70.

Accomplishing similar results with machines requires lots of inputs and significant structure throughout those inputs. Machines do well at processing vast amounts of data and surfacing trends. When that data comes from people making explicit, qualitative decisions ("curation"), machines are great at showing commonalities. Inference -- connecting the dots -- is the endeavor of people.

Adding massive amounts of curation data to code -- for example, Netflix user reviews plus collective rental/streaming history -- helps Netflix recommend your next stream. That comes down to clever(**) pairing of decisions people have made -- so-called "collaborative filtering" -- more so than the machine's ability to infer preference or quality based on you, the content, or the creators themselves.

(** - taking nothing at all away from mining vast amounts of usage data to surface trends and similarities - it's incredibly elegant work)
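To make "pairing of decisions people have made" concrete, here is a minimal item-based collaborative-filtering sketch. The users, titles, and ratings are invented, and this is a textbook toy, not Netflix's actual system: it ranks unseen titles purely by how similarly other people rated them alongside titles you liked.

```python
from math import sqrt

# Toy ratings matrix (hypothetical users x titles); values are 1-5 stars.
ratings = {
    "ann":  {"Heat": 5, "Drive": 4, "Amelie": 1},
    "ben":  {"Heat": 4, "Drive": 5},
    "cara": {"Amelie": 5, "Drive": 2, "Heat": 1},
    "dev":  {"Heat": 5, "Amelie": 2},
}

def item_vector(title):
    """All ratings for one title, keyed by user."""
    return {u: r[title] for u, r in ratings.items() if title in r}

def cosine(a, b):
    """Cosine similarity between two items, over users who rated both."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user):
    """Rank titles the user hasn't seen by similarity to titles they rated."""
    seen = ratings[user]
    all_titles = {t for r in ratings.values() for t in r}
    scores = {}
    for candidate in all_titles - set(seen):
        cand_vec = item_vector(candidate)
        scores[candidate] = sum(
            my_rating * cosine(item_vector(liked), cand_vec)
            for liked, my_rating in seen.items()
        )
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ben"))
```

Note that nothing here "understands" the movies -- the machine only pairs up human decisions, which is exactly the point above.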

Gathering lots of data -- your page views, comments, blog posts, how long you've been active, sites visited, frequent sources cited (all of which have a lot of "structure" to them) -- can inform a formula that frames you, but it's not nearly a complete picture of you.
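A "formula that frames you" can be as simple as a weighted sum over those structured counts. Everything in this sketch is hypothetical -- the signals, the weights, and the idea that any such linear score is meaningful:

```python
# Hypothetical activity signals for one user (all numbers invented).
signals = {
    "page_views": 480,
    "comments": 35,
    "blog_posts": 12,
    "months_active": 26,
}

# Arbitrary illustrative weights; real systems would tune these.
weights = {
    "page_views": 0.01,
    "comments": 0.5,
    "blog_posts": 2.0,
    "months_active": 0.25,
}

def activity_score(signals, weights):
    """A linear 'frame' of the user: a weighted sum of structured counts.

    It summarizes behavior, but says nothing about what any page view,
    comment, or post actually meant.
    """
    return sum(weights[k] * v for k, v in signals.items())

print(activity_score(signals, weights))
```

The score compresses you into one number -- useful for ranking, and exactly as incomplete as the paragraph above suggests.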

(There is most definitely significant work being done in machine learning and deeper understanding of content and context -- the Music Genome Project is a great example -- and tremendous progress in semantic analysis, artificial intelligence and natural language processing. Still, the current state of the art centers on curation and collaborative filtering.)

For example:

You've been to HuffPo 5 times this month and the NYT 7.

You have 16 recommendations on your LinkedIn profile.

You've added 213 Facebook friends in the past 18 months.

The machine knows where you've been, but can't speak to what you've read. It can count the 16 recommendations, but won't dig much into their basis. It sees the 213, but doesn't know if they're high school chums, new co-workers, or folks you met at a convention without you explicitly saying so. We'll have data on you while knowing very little about you.
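The gap between data and understanding can be shown as a data structure. The counted fields below come straight from the examples above; the semantic fields are ones no machine can fill without someone explicitly saying so (the field names themselves are invented for illustration):

```python
# What the machine records: counts, fully populated.
profile = {
    "site_visits": {"HuffPo": 5, "NYT": 7},   # where you've been
    "linkedin_recommendations": 16,            # how many, not why
    "new_facebook_friends_18mo": 213,          # how many, not who
}

# What the machine would need to be told: meaning, all empty.
unknown = {
    "articles_actually_read": None,            # visits != reading
    "basis_of_recommendations": None,          # the machine won't dig in
    "relationship_to_new_friends": None,       # chums? co-workers? strangers?
}

def coverage(profile, unknown):
    """(fields of data we have, semantic fields actually filled in)."""
    return len(profile), sum(v is not None for v in unknown.values())

print(coverage(profile, unknown))  # plenty of data, zero understanding
```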

Once you create a reputation and rewards system, it will be abused.

Whether the rewards are money, ego boosts, or simply the opportunity to be featured next to other people who are seen as "desirable", some people will work to game the system.

Amazon's product reviews are another great collaborative-filtering example: helpful reviewers rise to the top in ranking and reputation. But some people, as a shortcut to that success, wind up copying sentiments -- sometimes plagiarizing passages or entire reviews from reviewers who've already done well with the community. So people-powered systems carry biases, just as individuals come with their own slants.
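Catching that kind of copying is itself a (partly) mechanical job. One standard technique -- not anything Amazon has disclosed using -- is comparing word-shingle sets with Jaccard similarity; the reviews below are made up:

```python
def shingles(text, k=3):
    """Set of overlapping k-word windows from the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Overlap between two shingle sets: 1.0 means identical texts."""
    return len(a & b) / len(a | b) if a | b else 0.0

original = "This lens is sharp wide open and the autofocus is fast and silent"
copycat  = "This lens is sharp wide open and the autofocus is quick and silent"
fresh    = "Battery life was disappointing but the screen is gorgeous outdoors"

# A near-copy scores high; an unrelated review scores near zero.
print(jaccard(shingles(original), shingles(copycat)))
print(jaccard(shingles(original), shingles(fresh)))
```

The machine can flag the copy, but deciding whether it's plagiarism or coincidence is, once again, a call for people.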

Deriving reputation, making personal recommendations or calculating influence based on a collection of loosely-structured information is an inexact, and incredibly interesting, set of challenges. Working to quantify individuals' influence based on activities across the web and on more than just a few topics is a problem that will see significant innovation.

For community members, meaningful input on whom to follow, listen to, and trust is a valuable service. For creators, accurately growing a reputation and sphere of influence has real implications. It's a space that continues to get more interesting as it develops, even if sometimes awkwardly.

--

Follow me on Twitter.