Posts Tagged ‘metadata’
Photo Booth And Canons With Cheap Lenses Dominate Tumblr
The Tumblr staff has done an interesting little breakdown of the metadata on Tumblr blog photos. I’m sure you guys have seen Flickr’s equally-interesting Camera Finder page, which is used as a sort of talking point by Apple fans due to the iPhone dominance; this was a similar examination, though with seriously different results. Tumblr’s analysis also takes a look at the lenses being used by the Canon users, a metric more interesting to gearheads than tech buffs.
This kind of information is a dream come true for people who like to transmute raw data into conclusions. They call themselves analysts, but it’s more alchemical than analytical, isn’t it? At any rate, the data are interesting to anyone interested in photography or blogging, so take a look.
Facebook Gets Three Times More Efficient At Finding Photos In Its Humungous Haystack
With more than 15 billion photos (and 60 billion image files with replication for different sizes), Facebook eats up a lot of storage with its photo application alone. Members add 220 million new photos every week. Facebook currently has more than 1.5 petabytes of storage for its photos, and that is growing at a rate of 25 terabytes a week. Last year, Facebook spent an estimated $30 million on NetApp storage appliances alone just to keep up with the growth of photos and other uploaded content. To reduce some of these costs, Facebook decided to engineer its own storage architecture called Haystack.
Now more details have emerged about how that system actually looks and works. In a nutshell, Haystack will allow Facebook to switch from expensive, commercial storage appliances to commodity off-the-shelf hardware. It is going from a traditional network file system to something more akin to a stripped-down network application that does only what it needs to do. Not only will Facebook get the cost savings of going commodity, but it also gets a 3X improvement in storage capacity. In other words, what used to take 30 disks to store will now take only 10.
With so many billions of images, serving the right one is like finding the proverbial needle in the haystack. With a traditional network file system, a lot of metadata goes flying around detailing when files were last modified, what directories they are listed in, and so on. All of this metadata creates a bottleneck as it is passed back and forth. So the three Facebook engineers who built Haystack (Doug Beaver, Peter Vajgel, and Jason Sobel) decided to get rid of much of this metadata. As they explain on Facebook’s engineering blog:
The new photo infrastructure merges the photo serving tier and storage tier into one physical tier. It implements a HTTP based photo server which stores photos in a generic object store called Haystack. The main requirement for the new tier was to eliminate any unnecessary metadata overhead for photo read operations, so that each read I/O operation was only reading actual photo data (instead of filesystem metadata)
All of that metadata is stored in what Facebook is calling a “needle.” Each needle pulls together the metadata for hundreds of thousands of images. The needles are paired with an index to make up the Haystack object store. You can read all the technical details on the Facebook engineering blog. The company will keep its existing network file system for the 15 billion photos already uploaded (after all, those NetApp boxes are sunk costs). But going forward, all new photo uploads will be handled by Haystack. And in the future, Facebook may even open-source the architecture so other companies can benefit from it. Not bad, for something that was built by three engineers.
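To make the idea a little more concrete, here is a toy sketch of an object store that pairs one big append-only blob with a small in-memory index, so that a read touches only photo bytes and never any per-file filesystem metadata. This is an illustration under my own assumptions, not Facebook's code: the `TinyHaystack` class, its record layout (an 8-byte photo ID and a 4-byte length in front of each payload) and the method names are all hypothetical.

```python
import io
import struct


class TinyHaystack:
    """Toy append-only object store: one blob plus an in-memory index.

    Hypothetical illustration only; the record layout is an assumption.
    """

    # Record header: photo id (8 bytes) followed by payload length (4 bytes).
    HEADER = struct.Struct("<QI")

    def __init__(self, store=None):
        # Any seekable binary file object works; BytesIO keeps the demo in memory.
        self.store = store if store is not None else io.BytesIO()
        self.index = {}  # photo_id -> (payload offset, payload size)

    def put(self, photo_id, data):
        """Append a record to the end of the store and remember where it lives."""
        self.store.seek(0, io.SEEK_END)
        self.store.write(self.HEADER.pack(photo_id, len(data)))
        offset = self.store.tell()
        self.store.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        """One seek and one read: no directory walk, no per-file metadata."""
        offset, size = self.index[photo_id]
        self.store.seek(offset)
        return self.store.read(size)

    def rebuild_index(self):
        """Recover the index by scanning the record headers in the store."""
        self.index.clear()
        self.store.seek(0)
        while True:
            header = self.store.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                break
            photo_id, size = self.HEADER.unpack(header)
            self.index[photo_id] = (self.store.tell(), size)
            self.store.seek(size, io.SEEK_CUR)


hs = TinyHaystack()
hs.put(1, b"cat")
hs.put(2, b"sunset!")
print(hs.get(2))
```

The point of the design is that the index is small enough to sit in memory, so the only disk I/O on a read is the photo itself. A real system would also need deletes, checksums and replication, none of which this sketch attempts.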
(Photo credit: Flickr/Vitor Antunes)
Crunch Network: CrunchBoard because it’s time for you to find a new Job2.0
BuddyPress Launches: May A Thousand Social Networks Bloom (Someday)
BuddyPress, the side project of blogging powerhouse WordPress, has just hit version 1.0 and has officially launched. It’s basically a social layer that you can lay on top of your WordPress (MU — more on that below) blog to give it some of the social network features that you’re already familiar with from larger social networking sites.
Here’s what version 1.0 features: Extended profile, private messaging, friends, groups, “the wire,” activity stream, blog tracking and forums. Yes, that’s a lot of stuff in a first version — and it looks great (see the screenshots below). All of these features should be relatively straightforward from their names, except “the wire,” which is basically like your Wall on Facebook. People can go to that area and leave messages.
And slated for release in 2009 are yet more features, including: Status updates and photo albums. Sound familiar?
While WordPress founder Matt Mullenweg is quick to point out that BuddyPress is not meant to be yet another stand-alone social network in your life, his post about it seems to poke directly at the larger networks like Facebook and MySpace. “I mean all your friends are already on Myspace, but if you wanted to start something new maybe with more control, friendlier terms of service, or just something customized and tweaked to fit exactly into your existing site, then BuddyPress is a great framework to use. Maybe even someday you’ll be able to connect your BuddyPresses to each other and to the existing monolithic social networks,” Mullenweg writes.
That reads a lot like, “hey, a lot of people are pissed off by the big social networks’ terms of service issues, and their set ways of thinking, why not use BuddyPress?” And depending on how well this impressive feature set works, some people just might. It’s also worth noting that in an interview he did with us a couple of weeks ago, Mullenweg described BuddyPress as “Facebook-in-a-box.”
But there’s also a catch to BuddyPress for the time being: To install it, you have to be using WordPress MU, the multiple-user variety of the blogging software that is a bit more complicated to set up and is used much less than traditional WordPress. But Mullenweg’s comment that BuddyPress “currently requires” WordPress MU would seem to indicate that eventually it will roll out to the larger WordPress community as well.
BuddyPress has been in development for over a year, and was originally called “ChickSpeak.” This name is much better.
Live: George Zachary Interviews Tesla CEO Elon Musk
This afternoon Charles River Ventures partner George Zachary is sitting down for a one-on-one interview with Elon Musk, the CEO of Tesla whose other credentials include cofounding SpaceX and PayPal. The interview promises to be an interesting one - Musk hasn’t been known to pull any punches. Michael, who is attending the event, is streaming it live using Qik.
Crunch Network: CrunchGear drool over the sexiest new gadgets and hardware.
Ad.com Sells For $1.4 Million

The domain Ad.com sold for $1.4 million yesterday at domain name registration company Moniker’s TRAFFIC conference in Silicon Valley. The winning bidder was Divyank Turakhia of Directi.com, who is also CEO of Skenzo, a domain parking company.
Moniker made more than $2 million in domain name sales at the TRAFFIC auction, with Ad.com taking the highest bid. Bottledwater.com took the no. 2 spot at $45,000 and Athletic.com received the third highest amount, selling for $40,000.
$1.4 million may sound like a lot to spend on a domain, especially given the current state of the economy. But Ad.com is a two-letter domain that is easily pronounceable and actually means something, so it’s definitely valuable in the domain market. And a recession doesn’t seem to be stopping companies from spending the big bucks for desirable domain names, so Turakhia may be able to flip Ad.com for a profit. Travelzoo bought Fly.com for $1.8 million in January. Vibrators.com was sold for $1 million back in November and AT&T’s YellowPages.com paid $3.85 million for YP.com in December.
Ruba: An Online Travel Guide Where Photos Speak Louder Than Words
Travel guides are a dime a dozen on the web. But for the most part, they’re not very conducive to really exploring - it’s not much fun to click through various guides to get a feel for where you’d like to visit, because each guide is loaded with a wall of text. Ruba, a new travel site that launches today, is looking to offer users a way to visually browse through cities and their attractions around the world, offering photo-rich guides and an emphasis on making it easy to quickly discover new locations.
The site is headed by Mike Cassidy, who has founded a number of successful companies, including Xfire, which sold to Viacom in 2006 for $102 million. Cassidy says that his team has worked to create a very clean site that is very snappy and easy to casually browse through, with a strong technical emphasis placed on search. Rather than ask users to tag the guides they create, the search engine identifies keywords in their descriptions and titles. And while this can be prone to false positives, in the demo I saw it seemed to work quite well (a search for ‘kids london’ resulted in guides like “Top 5 Things To Do With Kids In London” and “The London Aquarium”).
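For the curious, tag-free search of this kind can be approximated in a few lines. The sketch below is only a guess at the general approach, not Ruba's actual engine: it pulls keywords out of each guide's title and description and returns the guides that contain every word in the query. The `keywords` and `search` helpers, the guide descriptions and the third guide are all made up for illustration; only the two London titles come from the demo.

```python
import re


def keywords(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(guides, query):
    """Return titles of guides whose title plus description contain every query word."""
    terms = keywords(query)
    return [
        guide["title"]
        for guide in guides
        if terms <= keywords(guide["title"] + " " + guide["description"])
    ]


# Hypothetical sample data; the descriptions are invented for this example.
guides = [
    {"title": "Top 5 Things To Do With Kids In London",
     "description": "Family-friendly stops around the city."},
    {"title": "The London Aquarium",
     "description": "Sharks and rays that kids love."},
    {"title": "A Weekend In Paris",
     "description": "Cafes and museums."},
]

print(search(guides, "kids london"))
```

Because the match is purely on words, a guide described as "not for kids" would also come back for that query, which is exactly the kind of false positive mentioned above.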
Guides are all written and submitted by users, with Ruba pulling from Google and Flickr APIs to help pinpoint locations and provide some sample photos (users can submit their own, too). Cassidy says that users have been building guides to both serve as references for others, and also as a way to chronicle their own trips (which other users can in turn benefit from).

Travel is a tough space to break into, with well established sites like TripAdvisor dominating user review submissions and countless smaller sites looking to find their niche. From the get-go, Ruba is looking to spread virally. The site features integration with Twitter and Facebook Connect, allowing users to broadcast where they’re headed and ask friends for input (I think this will work especially well on Facebook, as items will appear in users’ News Feeds).
The biggest question about the site’s viability is the fact that everything is user-submitted, including the blurbs about each location. If the site can build a strong community as Yelp and Flickr have, then this won’t be an issue. But if people frequently visit the site and can’t find the city or attraction they’re looking for, they won’t be likely to submit their own guides (it’s the chicken and the egg problem all over again). That said, I suspect that many people will enjoy building their own guides. TripAdvisor may have a larger community, but there’s a certain appeal to being able to stamp your name on your guide, embedding it in your blog and sharing it with friends so your thoughts don’t get lost in the shuffle.




