Yahoo Pipes as seen by a UFO

Around four years ago, content syndication was still in its infancy. There was a growing number of bloggers and readers, and many blogs generated a web feed for syndication, but few other websites offered feeds for public consumption. Around that time, like many other bloggers, I started subscribing to web feeds for aggregated personal reading, but the lack of syndication support on the majority of websites left much to be desired. That prompted me to build a tool which could automatically generate a web feed from virtually any webpage that doesn’t have a native form of syndication (like RSS). The idea morphed into a prototype called FeedTier, and now into a functional service called Feedity.

Feedity didn’t bake bread (yet), but it broadened the scope of web content that could be syndicated from any source and subscribed to by anyone. Overall, it made content syndication simpler and more flexible. So far, so good. Everything was normal in the “syndicated world”, just like any other day, until the clever guys at Yahoo! launched a new online service called “Pipes”. Yahoo Pipes is much more than its trivial definition: “an interactive feed aggregator and manipulator.”

It lets you create data mashups in an AJAX-based interactive visual environment (a pipeline manager). Yahoo Pipes is a fine example of a visual interface to an XML pipeline engine. My first reaction to Yahoo Pipes was one of being overwhelmed. As much as I hate to say it, I expected something this clever from Google, but Yahoo has done a great job with this one. A visual canvas and wiring tool is not new, nor is the XML pipeline concept; what impressed me most is that it’s web-based.

The AJAX-driven interface of Pipes is quite interactive, although it only worked in Firefox and didn’t load for me in IE6 (on WinXP). In my opinion, the initial usability of Pipes might still be limited to developers and the tech-savvy.

The pipeline toolbox features several kinds of modules, such as sources, inputs, linear and non-linear operators, and sub-pipes, which can be “plugged in” together to channel the flow of data and manipulate it on the way. You make a Pipe by dragging pre-configured modules onto a canvas and wiring them together. Each module accepts a source stream (RSS/RDF/Atom) and, after some processing, outputs the same (to a connected module or the console). Once all the modules are wired up, the pipeline’s output (RSS/Atom/JSON) can be shared with other users or reused in other applications (your newsreader or a data integration project).

But hey, what about those data sources which don’t provide a syndication format like RSS to act as a Pipe’s input? Voilà! Feedity to the rescue: it lets you generate a custom web feed from any feedless webpage, which can then be used as the source of the Fetch module in the pipeline. This external data sourcing opens up a lot of potential for designing and using custom data mashup solutions. Here’s a demo pipe I made in less than 10 minutes, which displays photos from Flickr for all past sale items at Woot.

Yahoo Pipes - Woot Flickr Feedity Sample

This sample pipe demonstrates the usage of the Webpage-to-RSS sub-pipe I’ve built for fellow “plumbers”. The Webpage-to-RSS sub-pipe generates RSS from any webpage which doesn’t have native syndication. It utilizes Feedity for data mining. This pipe is meant to be used as an integrated sub-pipe in your own pipeline. Clone and reuse!
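For readers who think better in code than on a canvas, the fetch, filter, and output flow of a Pipe can be roughly sketched in Python. This is only an illustration of the idea, not how Yahoo Pipes or Feedity work internally: the sample feed, the function names (fetch, filter_by_keyword, sort_by_title, to_json, run_pipe) and the example.com links are all made up for the sketch.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for what a Fetch module would receive.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample deals</title>
    <item><title>Laptop stand</title><link>http://example.com/1</link></item>
    <item><title>USB hub</title><link>http://example.com/2</link></item>
    <item><title>Laptop bag</title><link>http://example.com/3</link></item>
  </channel>
</rss>"""


def fetch(rss_text):
    """Source module: yield one dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        yield {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }


def filter_by_keyword(items, keyword):
    """Operator module: keep items whose title contains the keyword."""
    keyword = keyword.lower()
    return (item for item in items if keyword in item["title"].lower())


def sort_by_title(items):
    """Operator module: order items alphabetically by title."""
    return sorted(items, key=lambda item: item["title"])


def to_json(items):
    """Output module: serialize the remaining items as JSON."""
    return json.dumps(list(items), indent=2)


def run_pipe(rss_text, keyword):
    """Wire the modules together, each one feeding the next."""
    return to_json(sort_by_title(filter_by_keyword(fetch(rss_text), keyword)))


if __name__ == "__main__":
    print(run_pipe(SAMPLE_RSS, "laptop"))
```

Each function plays the role of one module on the canvas, and run_pipe is the wiring. Swapping the source for a feed generated from a feedless webpage would leave the rest of the chain untouched, which is the whole point of the sub-pipe.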

In the software world, the term pipe originated during the early evolution of Unix. Programs normally have some input and some output. A “pipe” connected the standard output of one program to the standard input of another. The output of one module could thus be fed in as the input of the next, creating a “pipeline” of chained modules which produced a filtered output for ad-hoc data analysis. The use of “pipes” meant that programs could be very modular, each one performing some narrow but useful task.
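The Unix idea is easy to mimic with Python generators, where each small stage reads from the previous one. The sketch below is only an analogy for a shell pipeline; the stage names borrow from familiar Unix tools, but the functions and the sample log lines are hypothetical and made up for illustration.

```python
from collections import Counter

# Made-up input, standing in for lines arriving on standard input.
LOG = [
    "GET /feed.xml 200",
    "GET /index.html 200",
    "GET /feed.xml 304",
    "POST /comment 200",
]


def grep(lines, needle):
    """Pass through only the lines that contain the given text."""
    return (line for line in lines if needle in line)


def cut(lines, field, sep=" "):
    """Extract a single field (0-based) from each line."""
    return (line.split(sep)[field] for line in lines)


def count(values):
    """Tally how often each value occurs."""
    return Counter(values)


# Each stage consumes the output of the one before it, like stages joined by "|".
hits = count(cut(grep(LOG, "GET"), field=1))

if __name__ == "__main__":
    for path, n in hits.most_common():
        print(n, path)
```

None of the stages knows about the others; each does one narrow job, and the useful behaviour comes from the chaining.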

In the Internet world, the standard output is typically webpage content or a syndicated web feed. In some business processes, this web output is required as input to a further step that processes or manipulates the data. With the pipeline management of Yahoo Pipes combined with the data sourcing capabilities of Feedity, such business processes can be streamlined or reengineered. The technical concept of an XML pipeline (and the XPL language) is not new to the world of computer science, but it lacked a usable front-end for data manipulation (let alone a visual, web-based one), a gap which Yahoo Pipes finally fills. Apart from some rough edges, Yahoo Pipes works like a charm to solve a growing problem in the world of content syndication: multi-source data mashups.

Web-based applications like Yahoo Pipes, CogHead, Ning, Dapper etc. are just the beginning (some wink at Web 3.0), and a very good one, mind you. The day is not far off when software consumers will be able to build an application online themselves (in a user-friendly visual editor) to fit their needs, or maybe rent one.
