Thoughts on EveryBlock and context

* Note: This post came from a version of this blog that got lost in a server failure. It's been restored from old RSS feeds, Google caches and other sources. As such, the comments, links and associated media have been lost.

First, the grains of salt:


  • I don’t live in an EveryBlock city. I think a key part of EveryBlock is the visceral connection you have with your neighborhood. I don’t have that, so view my comments with that in mind.

  • Huge Adrian Holovaty fan. Huge Django fan. Big believer in breaking out of the story-centric worldview.

And now let me get this out of the way: EveryBlock is not something newspapers should fear. Here it is. I was wrong. I wrote that last year. Now, I don’t think newspapers need to fear this. Study it? Emulate parts of it? Learn from it? Absolutely. Fear it? No. Even if EveryBlock becomes crazy popular — and I think it will — it points to your content. If anything, EveryBlock will help people get to your content that interests them.

Okay. I feel better now. On to the meat of the post.

Is EveryBlock a data ghetto? This is tough. I can see why some would read my data ghetto post and think yep, EveryBlock is a data ghetto. I’ve honestly held off writing this post until I could answer the question for myself. It’s not completely clear-cut, but I don’t think EveryBlock, as it stands now, is a data ghetto.

Why?

I’ve seen several people say that EveryBlock is data without context, but that’s not entirely true. The context comes from the user through geography. The block is the context. The value you put into that context is based entirely on the fact that you live there. And, from that powerful context, the user provides further context by choosing what’s next, what is interesting to them.

What also gets EveryBlock out of the data ghetto, beyond the geographic construct that almost no newspaper.com sites use, is that in some places EveryBlock provides its own context by graphing and counting instances of a thing. Compare that to data ghettos as I classified them: a couple of search boxes and a results page. I’ll add another data ghetto indicator to the pile: single-subject searches. On most newspaper.com data ghettos, you can search real estate transactions or crime, but you can’t do both at the same time. Few have taken the next step of putting data together in one place.

But, but, not that context, this context: I think there are some legit complaints about context in EveryBlock. But I also think calling it data without context, perspective or meaning is wrong. The question is what context? And whose? That is a deeply complicated issue.

Mark Schaver points to the pothole paradox (short version: your pothole is interesting to you, the one not on your commute could not be any less interesting). So distance is important — the closer, the more you’re interested. But even that’s not that simple. If my neighbor has his lawn mower stolen, I care a lot. Two blocks away? Meh. But if someone in my neighborhood is murdered 5 or even 10 blocks away, I care a lot. It doesn’t have to be all that close. Same with building permits. I kinda care if my neighbors are pulling permits to build a garage or a bathroom. Much beyond that, I don’t. But if Wal-Mart is building a Super Center one or two neighborhoods over, I really care.

So I think the journalist’s complaint about context in EveryBlock — and it’s valid if you’re thinking like a journalist — is that there’s no mechanism to make one data point stand out from another. The example Schaver pointed to was the death of Heath Ledger. In New York’s EveryBlock? Yep. But on the map view, deaths are just another dot. Hollywood stars, homeless people, crack dealers — just another dot.

Not all dots are created equal. And there are legions of factors as to what makes one more interesting than another, and another legion of factors as to why some things are more interesting to different people. Interestingness — or News Value — is an exponentially complicated equation.

To my mind, that EveryBlock doesn’t try to set some subjective one-size-fits-all standard of importance for data is proof that Adrian isn’t lying when he says EveryBlock is a supplement, not a competitor, to traditional news sources. Pointing out Important Stuff and making a Big Deal Out Of It is what newspapers do. EveryBlock doesn’t even try.

That said, I think the next step — for EveryBlock, for newspaper data ghettos, whoever — is personalization. Imagine if EveryBlock took your home address and a list of the things you cared about and displayed data with some kind of distance/importance weighting algorithm. Not everything, like now. Just a good guess at what’s important to you based on the distance from your home and how important you said certain things were to you. And done in such a way that you can tweak your own settings. Just have a kid? Might be time to move the schools slider up. And while you’re there, move the restaurant review sliders down. Trust me. With a new baby, you won’t be going out for a while.
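To make the slider idea concrete, here is a rough sketch of what a distance/importance weighting could look like. To be clear, this is my own guess at one way to do it, not anything EveryBlock actually does. The names (`Item`, `score`, `personalized_feed`), the 0-to-1 slider values and the "importance halves every kilometer" decay are all assumptions picked for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


@dataclass
class Item:
    """One dot on the map (hypothetical structure)."""
    title: str
    category: str
    lat: float
    lon: float


def score(item, home, sliders, half_distance_km=1.0):
    """Slider weight for the item's category, halved every half_distance_km from home.

    The exponential decay is an assumption; any function that shrinks
    with distance would illustrate the same idea.
    """
    weight = sliders.get(item.category, 0.0)  # categories without a slider count as 0
    distance = haversine_km(home[0], home[1], item.lat, item.lon)
    return weight * 0.5 ** (distance / half_distance_km)


def personalized_feed(items, home, sliders, limit=10):
    """Return the highest-scoring items, dropping anything that scores zero."""
    scored = [(score(item, home, sliders), item) for item in items]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:limit]]
```

With sliders like `{"homicide": 1.0, "theft": 0.3, "restaurant_review": 0.0}`, a murder a kilometer away scores 0.5 and outranks a stolen lawn mower two blocks away (about 0.26), and restaurant reviews drop out entirely. New baby? Change one number.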

Even switching to some kind of straight distance algorithm solves one problem with EveryBlock: Just because it didn’t happen in your neighborhood/block/zip code doesn’t mean it wasn’t near you. What if you live on the edge of a neighborhood — one block over is Whispering Pines but you’re in Whispering Oaks? If someone is murdered one block away, but they’re in Whispering Pines, do you care? Of course you do. It’s one block away. Using a distance algorithm, you aren’t hemmed in by boundaries or forced to check more than one neighborhood — no matter how easy it is — to be sure you’ve seen all you want to see.
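Here is a small sketch of that boundary problem, comparing a lookup by neighborhood name against a lookup by straight distance from home. The coordinates and events are made up for illustration, and the function names (`by_neighborhood`, `within_radius`) are hypothetical, not part of any EveryBlock API.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) pairs, in kilometers."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlam = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Invented sample data: home sits on the edge of Whispering Oaks.
home = (41.8800, -87.6300)
events = [
    # About a block away, but across the neighborhood line.
    {"what": "homicide", "neighborhood": "Whispering Pines", "where": (41.8810, -87.6300)},
    # Same neighborhood as home, but several kilometers away.
    {"what": "burglary", "neighborhood": "Whispering Oaks", "where": (41.9200, -87.6300)},
]


def by_neighborhood(events, name):
    """Boundary-based lookup: only events tagged with this neighborhood."""
    return [e for e in events if e["neighborhood"] == name]


def within_radius(events, home, radius_km):
    """Distance-based lookup: everything within radius_km of home, nearest first."""
    nearby = [e for e in events if distance_km(home, e["where"]) <= radius_km]
    return sorted(nearby, key=lambda e: distance_km(home, e["where"]))
```

Asking for "Whispering Oaks" returns only the far-off burglary and misses the homicide a block away. Asking for everything within half a kilometer returns the homicide and nothing else, which is closer to what someone living on that block would want to see.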

Don’t take anything I’ve written to mean I don’t think EveryBlock is amazing. It is. I’m blown away by it. I’m humbled by it — first thing I thought when I was going through it was “crap, I need to work harder/sleep less/take brain steroids if this is the bar being set.” I just want to be clear that for EveryBlock and for anyone doing news data apps, context is an extremely important factor … and extremely complicated.

By: Matt Waite | Posted: Jan. 27, 2008 | Tags: Journalism, Databases