
Chapter 1

No price signals! No managerial command! Little need of coordination. Incremental and granular. Diffusion of the former function of the editor. Given a sufficiently large number of contributors the incentives question becomes trivial, as the level of bait offered need only be scaled to that small increment. Peer production is limited not by costs but by modularity (how many can participate, and with what variation of investment by the actors: can the task be broken down?) and granularity (how small an increment can be contributed, and at what minimal cost of integration).

Decrease in communication costs; increase in human salience.

Information opportunity costs: how do you decide how to act, and how do you reduce uncertainty about different forms of action? Given a vast set of resources and many agents, peer production is a better way of identifying the right person for any given task at a given moment; economies of scale and scope, boundless rather than bounded as in the firm.

Other capital inputs are now cheaper, except humans ('I'm not a capital input!'). Defection restraints: CPRs, endogenous technologies, social norms; iterative peer production of integration; redundancies evened out.

Together, the FastTrack and Gnutella protocols currently boast an outstanding 2.9 million simultaneous users (www.slyck.com, July 2, 2002).

When Morpheus first joined the Gnutella network, its population exploded to over 500,000 users. Now we're witnessing its population hover at only 160,000. We talked to a LimeWire representative and discovered several key reasons for its decline.

Intro. We estimate that if Napster were built on a client-server architecture, for the number of songs "on" its network at its peak, Napster would have had to purchase over 5,000 NetApp F840 Enterprise Filers to host all the songs shared among its users.

Using an estimate of 65 million users at its peak, allegedly holding 171 MP3s apiece for a total of 33,345 TB (costing $666,900,000), and a Webnoize estimate of three billion downloads a month, with each MP3 conservatively estimated at 3 MB, Napster would have required 27,778 Mbits of bandwidth per second (45 OC-12 lines, for a further total of $6,698,821 per month). Alternatively, to purchase the necessary PCs and ISP accounts would cost $1,495,000,000.
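A back-of-the-envelope sketch reproducing the figures above, using only the assumptions stated in the text (65 million users, 171 MP3s each, 3 MB per MP3, three billion downloads a month) plus the standard ~622 Mbit/s capacity of an OC-12 line; the dollar figures are not recomputed here.

```python
# Reproduces the storage and bandwidth estimates quoted above.
USERS = 65_000_000
MP3S_PER_USER = 171
MB_PER_MP3 = 3
DOWNLOADS_PER_MONTH = 3_000_000_000
OC12_MBITS = 622                      # approximate capacity of one OC-12 line
SECONDS_PER_MONTH = 30 * 24 * 3600

total_storage_tb = USERS * MP3S_PER_USER * MB_PER_MP3 / 1_000_000
transfer_mb_per_s = DOWNLOADS_PER_MONTH * MB_PER_MP3 / SECONDS_PER_MONTH
bandwidth_mbit_per_s = transfer_mb_per_s * 8
oc12_lines = bandwidth_mbit_per_s / OC12_MBITS

print(f"Total storage: {total_storage_tb:,.0f} TB")                # ~33,345 TB
print(f"Sustained bandwidth: {bandwidth_mbit_per_s:,.0f} Mbit/s")  # ~27,778 Mbit/s
print(f"OC-12 lines required: {oc12_lines:.0f}")                   # ~45
```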

The availability of large amounts of unused bandwidth and storage space on users' computers has facilitated the emergence of wide-scale peer-to-peer networks dedicated to sharing content. The attention focussed on file-sharing, due to the contested nature of its legality, has impeded consideration of other applications such as the virtualisation of storage space across networks (already a significant industry) and bandwidth pooling for the authorised dissemination of content lying outside the ownership and control of the major media and communications conglomerates. These peer-to-peer systems are typically established at an application level, employ their own routing mechanisms and are either independent of, or only ephemerally dependent on, dedicated servers.

Note that the decentralized nature of pure P2P systems means that these properties are emergent properties, determined by entirely local decisions made by individual resources, based only on local information: we are dealing with a self-organized network of independent entities.

File-sharing devices behave like cache clusters, keeping traffic local. Description of (1) Content Distribution Networks. Peer networks can be used to deliver the services known as Content Distribution Networks (CDNs), essentially comprising the storage, retrieval and dissemination of information. Companies such as Akamai and Digital Harbour have already achieved significant success through installing their own proprietary models of this function on a global network level, yet the same functions can be delivered by networks of users even where they have only a dial-up connection. Napster constituted the first instantiation of this potential, and subsequent generations of file-sharing technology have delivered important advances in terms of increasing the robustness and efficiency of such networks. In order to understand the role that peers can play in this context we must first examine the factors which determine data flow rates in the network in general.

(2) Content Storage Systems. Fibre Channel Storage Area Networking currently dominates the storage market. A market in storage space on end-user equipment is easy to imagine and would provide a real competitor to the incumbent market players. The principal inhibitor of transfer speed is geographically determined latency. As storage space continues to be cheaper than bandwidth, local storage options are attractive. OceanStore is assembling a network of untrusted storage nodes for basic data. "Fragmentation and distribution yield redundancy. Distributed autonomous devices connected on a network create massive redundancy even on less-than-reliable PC hard drives. Redundancy on a massive scale yields near-perfect reliability. Redundancy of this scope and reach necessarily utilizes resources that lead to a network topology of implicitly “untrusted” nodes. In an implicitly untrusted network, one assumes that a single node is most likely unreliable, but that sheer scale of the redundancy forms a virtuous fail-over network. Enough “backups” create a near-perfect storage network."
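A minimal sketch of the redundancy argument quoted above, under the simplifying assumption (not drawn from the text) that each untrusted node is independently online with some probability p: with n replicas, the chance that at least one copy is reachable is 1 - (1 - p)^n.

```python
# Illustration of "redundancy on a massive scale yields near-perfect
# reliability": availability of at least one replica among n independent,
# individually unreliable nodes.

def availability(p_node: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is reachable."""
    return 1 - (1 - p_node) ** replicas

for replicas in (1, 3, 5, 10, 20):
    print(f"{replicas:>2} replicas on 30%-available nodes: "
          f"{availability(0.3, replicas):.4%}")
# Even if each PC is online only 30% of the time, 20 replicas give
# better than 99.9% availability.
```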

These companies, like the file-sharing networks, function through virtualising resources, uniting them across the network in an object-oriented manner. IDC estimated the market for storage service providers at $2,379 million in 2002 (Internet 3.0, at p.80).

Unused Storage Capacity. "We looked at over 275 PCs in an enterprise environment to see how much of a desktop’s hard drive was utilized. We discovered that on average, roughly 15% of the total hard drive capacity was actually utilized by enterprise users." (Internet 3.0, at p.81) The economic incentive derives not from the value of the memory itself, but rather from the potential to integrate the space into a distributed system that can obviate expensive network transfers.

Akamai now operates in the storage business as well, in combination with Scale Eight, who have four storage centers and provide access for clients through a customised browser.

1(a) Breakdown of congestion points on networks. The slow roll-out of broadband connections to home users has concentrated much attention on the problem of the so-called 'last mile' of connectivity. Yet the connection between the user and their ISP is but one of four crucial variables deciding the rate at which we access the data sought. Problems of capacity exist at multiple other points in the network, and as the penetration of high-speed lines into the 'consumer' population increases, these other bottlenecks will become more apparent. If the desired information is stored at a central server, the first shackle on speed is the nature of the connection between that server and the internet backbone. Inadequate bandwidth, or attempts at access by an unexpected number of clients making simultaneous requests, will handicap transfer rates. This factor is known as the 'first mile' problem and is highlighted by instances such as the difficulty in accessing documentation released during the Clinton impeachment hearings and, more frequently, by the 'Slashdot effect'. In order to reach its destination the data must flow across several networks, which are connected on the basis of what are known as 'peering' arrangements between the networks and facilitated by routers which serve as the interface. Link capacity tends to be underprovided relative to traffic, leading to router queuing delays. As the number of ISPs continues to grow this problem is anticipated to persist, since whether links are established is essentially an economic question. The third point of congestion is located at the level of the internet backbone, through which almost all traffic currently passes at some point. The backbone's capacity is a function of its cables and, more problematically, its routers. There is a mismatch between the growth of traffic and the pace of technological advance in router hardware and software packet forwarding. As more data-intensive transfers proliferate, this discrepancy between demand and capacity is further exacerbated, leading to delays. Only after negotiating these three congestion points do we arrive at the delay imposed at the last mile.
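The four congestion points just described act like links in series, so the end-to-end transfer rate is bounded by the slowest of them. The sketch below illustrates this with entirely hypothetical capacities; the point is only that relieving the last mile does nothing if an earlier link is the bottleneck, and that a cache on the user's own network removes the first three terms from the path.

```python
# Toy model: end-to-end rate is the minimum capacity along the path.
# All capacities (in Mbit/s) are invented for illustration.
path = {
    "first mile (server uplink)": 1.5,
    "peering point":              8.0,
    "backbone routing":          20.0,
    "last mile (user modem)":     0.056,
}

bottleneck = min(path, key=path.get)
print(f"Effective rate: {path[bottleneck]} Mbit/s, limited by the {bottleneck}")
# Serving the same file from a cache on the user's own ISP network removes the
# first-mile, peering and backbone terms from the path entirely.
```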

Assessing Quality of Service. What are the benchmarks used to evaluate Quality of Service? ("Typically, QoS is characterized by packet loss, packet delay, time to first packet (time elapsed between a subscribe request send and the start of stream), and jitter. Jitter is effectively eliminated by a huge client side buffer [SJ95]." Deshpande, Hrishikesh; Bawa, Mayank; Garcia-Molina, Hector, Streaming Live Media over a Peer-to-Peer Network.) For those who can deliver on such benchmarks the rewards can be substantial, as Akamai demonstrates: some 13,000 edge servers in network providers' data centre locations; click-through rates of 20%; [10-15% abandonment rates]; [15%+ order completion].
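A minimal sketch of how the metrics named in the Deshpande et al. quotation might be computed from per-packet timestamps. The timestamps are invented, and jitter is taken here simply as the mean variation between consecutive packet delays; this is one plausible reading, not the authors' exact definition.

```python
# Computes packet loss, time to first packet, mean delay and jitter
# from hypothetical per-packet send/arrival times (in seconds).
request_sent  = 0.00
send_times    = [0.50, 0.60, 0.70, 0.80, 0.90]   # when packets left the source
arrival_times = [0.62, 0.74, 0.81, None, 1.05]   # None marks a lost packet

received = [(s, a) for s, a in zip(send_times, arrival_times) if a is not None]
delays = [a - s for s, a in received]

packet_loss = 1 - len(received) / len(send_times)
time_to_first_packet = received[0][1] - request_sent
mean_delay = sum(delays) / len(delays)
jitter = sum(abs(d2 - d1) for d1, d2 in zip(delays, delays[1:])) / (len(delays) - 1)

print(f"loss={packet_loss:.0%} ttfp={time_to_first_packet:.2f}s "
      f"delay={mean_delay:.3f}s jitter={jitter:.3f}s")
```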

Although unable to ensure the presence of any specific peer in the network at a given time, virtualised CDNs function by possessing a necessary level of redundancy, so that the absence or departure of a given peer does not undermine the functioning of the network as a whole. In brief, individual hosts are unreliable and thus must be made subject to easy substitution. From a technical vantage point the challenge then becomes how to smooth the transfer to replacement nodes (sometimes referred to as the problem of the 'transient web').

To facilitate this switching between peers, the distribution-level applications must be able to identify alternative sources for the same content, which requires a consistent identification mechanism so as to generate a 'content-addressable web', a problem currently absorbing the efforts of commercial and standard-setting initiatives [Bitzi, Magnet, OpenContentNetwork].
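The usual way to obtain such a consistent identifier is to hash the file's bytes rather than trust its name or location, so that byte-identical copies on different peers resolve to the same address. The sketch below uses SHA-1 and a urn-style prefix purely as illustrative choices; the actual hash algorithms and link formats of Bitzi, magnet links and the Open Content Network vary.

```python
# Content addressing sketch: identify a file by a hash of its contents.
import hashlib

def content_id(path: str) -> str:
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return "urn:sha1:" + h.hexdigest()

# Two peers holding byte-identical copies under different filenames will
# compute the same identifier, so either can serve the request.
```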

In short, the effectiveness of any given peer network will be determined by: a) its Connectivity Structure, and b) the efficiency with which it utilises the underlying physical topology of the network, which ultimately has limited resources.

Techniques used for managing network congestion: (b) load balancing / routing algorithms. "Load balancing is a technique used to scale an Internet or other service by spreading the load of multiple requests over a large number of servers. Often load balancing is done transparently, using a so-called layer 4 router." [wikipedia] LB appliances, software, intelligent switches, traffic distributors: Cisco (DistributedDirector), GTE Internetworking (which acquired BBN and with it Genuity's Hopscotch), and Resonate (Central Dispatch) have been selling such solutions as installable software or hardware. Digex and GTE Internetworking (Web Advantage) offer hosting that uses intelligent load balancing and routing within a single ISP. These work like Akamai's and Sandpiper's services, but with a narrower focus. (Wired)
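A minimal sketch of the load-balancing idea just described: spread incoming requests across a pool of servers. The "least connections" policy shown is one common strategy among several; real layer-4 switches add health checks, weighting and session persistence. The server addresses are invented for illustration.

```python
# Least-connections load balancing over a small pool of servers.
from collections import defaultdict

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
active = defaultdict(int)            # current connection count per server

def dispatch() -> str:
    """Send the next request to the least-loaded server."""
    target = min(servers, key=lambda s: active[s])
    active[target] += 1
    return target

for _ in range(6):
    dispatch()
print(dict(active))                  # the six requests are spread evenly, 2 each
```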

- caching. NAT (Network Address Translation): destination NAT can be used to redirect connections pointed at some server to randomly chosen servers to do load balancing. Transparent proxying: NAT can be used to redirect HTTP connections targeted at the Internet to a special HTTP proxy which is able to cache content and filter requests. This technique is used by some ISPs to reduce bandwidth usage without requiring their clients to configure their browsers for proxy support, using a layer 4 router. [wikipedia] See Inktomi. Caching servers intercept requests for data and check whether the data is present locally. If it is not, the caching server forwards the request to the originator and passes the response back to the requester, having made a copy so as to serve the next query for the same file more quickly.
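A sketch of the caching-server behaviour just described: answer from the local store when possible, otherwise fetch from the origin, keep a copy, and serve the next query for the same object locally. The fetch_from_origin function is a stand-in for the real HTTP transfer, not an actual library call.

```python
# Minimal cache-then-forward logic of a caching proxy.
cache: dict[str, bytes] = {}

def fetch_from_origin(url: str) -> bytes:
    # Placeholder for the actual request to the originating server.
    return f"<content of {url}>".encode()

def get(url: str) -> bytes:
    if url in cache:                  # cache hit: no upstream bandwidth used
        return cache[url]
    data = fetch_from_origin(url)     # cache miss: forward to the originator
    cache[url] = data                 # keep a copy for subsequent requests
    return data

get("http://example.org/big-file")    # fetched once from the origin
get("http://example.org/big-file")    # served from the local cache
```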

Local Internet storage caching is less expensive than network retransmission and, according to market research firm IDC, becomes more attractive by about 40% per year. Caches are particularly efficient for international traffic and traffic that otherwise moves across large network distances. (Internet 3.0, at p.83)

content delivery network/streaming media network

- Akamai. Akamai FreeFlow hardware/software mix: algorithms plus machines; a mapping server (fast check of hops to a region) and content servers. http://www.wired.com/wired/archive/7.08/akamai_pr.html Sandpiper applications. Data providers concerned to provide optimal delivery to end users are increasingly opting to use specialist services such as Akamai to overcome these problems. Akamai delivers content faster through a combination of proprietary load-balancing and distribution algorithms and a network of machines installed across hundreds of networks, where popularly requested data is cached (11,689 servers across 821 networks in 62 countries). This spread of servers allows the obviation of much congestion, as the data is provided from a server cache either on the requesting network itself (bypassing the peering and backbone router problems and mitigating that of the first mile) or on the most efficient available network given load-balancing requirements.

(c) Evolution of filesharing networks

Popular filesharing utilities arose to satisfy a more worldly demand than the need to ameliorate infrastructural shortfalls.

- Napster. When Shawn Fanning released his Napster client the intention was to allow end-users to share MP3 files by providing a centralised index of all songs available on the network at a given moment, together with the ability for users to connect to one another directly to receive the desired file. Thus Napster controlled the gate to the inventory but was not burdened with execution of the actual file transfer, which occurred over HTTP (insert note on the speculative valuation of the system provided by financial analysts, with qualification). Essentially, popular file-sharing utilities enable content pooling. As is well known, the centralised directory look-up made Napster the subject of legal action, injunction and ultimately decline.
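A minimal sketch of the division of labour just described: a central index records which peers hold which titles and answers searches, while the transfer itself happens directly between peers. The names, addresses and structures here are illustrative, not Napster's actual protocol.

```python
# Centralised index, decentralised transfer (Napster-style), as a toy model.
from collections import defaultdict

index: dict[str, set[str]] = defaultdict(set)   # title -> peer addresses

def register(peer: str, titles: list[str]) -> None:
    for title in titles:
        index[title].add(peer)                  # peer announces its library

def search(title: str) -> set[str]:
    return index.get(title, set())              # server only answers "who has it"

register("192.168.1.5:6699", ["song_a.mp3", "song_b.mp3"])
print(search("song_a.mp3"))   # the requester then connects to a listed peer directly
```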

Nonetheless, Napster's legal woes generated the necessary publicity to encourage user adoption and for new competitors to enter the market and to innovate further. In the following section I describe some of the later generations of file-sharing software and chart the innovations which have brought them into a space of competition with Akamai et al.

- Gnutella. The original implementation has been credited to Justin Frankel and Tom Pepper of Nullsoft Inc., then a recently purchased programming division of AOL, in 2000. On March 14th, the program was made available for download on Nullsoft's servers. The source code was to be released later, supposedly under the GPL license. The event was announced on Slashdot, and thousands downloaded the program that day. The next day, AOL stopped the availability of the program over legal concerns and restrained the Nullsoft division from doing any further work on the project. This did not stop Gnutella; after a few days the protocol had been reverse engineered and compatible open source clones started showing up. (from Wikipedia) The Gnutella network (BearShare/LimeWire) represents the first decentralised design, in which each node acts as both client and server. This allows a much more robust network in the sense that connectivity is not dependent on the legal health of a single operator. The trade-off is inefficiency in locating files and the problem of free-riding users, who actually impede the functionality of the system beyond simply failing to contribute material. LimeWire addresses this problem to some degree by providing the option to refuse uploads to users who do not share a threshold number of files. Unfortunately this cannot attenuate the problem of inefficient searches per se, merely offering a disciplinary instrument to force users to contribute. In order to sharpen search capacities in the context of a problematic network design, these networks have taken recourse to nominating certain nodes as super-peers, by virtue of the large number of files they are serving themselves. While essentially efficacious, the consequence is to undermine the legal robustness of the network. The threat is made clear in a paper published last year by researchers at Xerox PARC which analyzed traffic patterns over the Gnutella network and found that one per cent of nodes were supplying over ninety per cent of the files. These users are vulnerable to criminal prosecution under the No Electronic Theft Act and the Digital Millennium Copyright Act. The music industry has been reluctant to invoke this form of action thus far, principally because of its confidence that the scaling problems of the Gnutella community reduce the potential commercial harm it can inflict. As super-peering etc. becomes more effective this may change.
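The inefficiency described above stems from Gnutella's original broadcast ("flooding") search, sketched below: each node forwards a query to its neighbours, decrementing a time-to-live so the flood eventually dies out. The tiny topology is invented purely to illustrate how quickly the number of forwarded messages grows relative to the number of hits.

```python
# TTL-limited flood search over a toy Gnutella-style topology.
topology = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
shared_files = {"E": {"track.mp3"}}   # only E shares anything

def flood_search(start: str, filename: str, ttl: int, seen=None) -> set[str]:
    seen = seen if seen is not None else {start}
    hits = set()
    if filename in shared_files.get(start, set()):
        hits.add(start)
    if ttl == 0:
        return hits
    for neighbour in topology[start]:
        if neighbour not in seen:                 # avoid re-flooding the same node
            seen.add(neighbour)
            hits |= flood_search(neighbour, filename, ttl - 1, seen)
    return hits

print(flood_search("A", "track.mp3", ttl=3))      # every node is queried for one hit
```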

Incompatibilities between Gnutella supernodes/ultrapeers. - FastTrack/Kazaa. Similar systems are now being offered by these companies to commercial media distributors, such as Cloudcast (FastTrack) and Swarmcast, using technical devices to allow distributed downloads that automate transfer from other nodes when one user logs off. The intention here is clearly the development of software-based alternatives to the hardware offered by Akamai, the principal player in delivering accelerated downloads, used by CNN, Apple and ABC amongst others.

- eDonkey/Overnet. eDonkey and Freenet distinguish themselves from the other utilities by their use of hashing to identify and authenticate files. As data blocks are entered into a shared directory a hash block is generated (on which more below). Freenet introduced the idea of power-law searches into the p2p landscape, partially inspired by the speculation that the Gnutella network would not scale due to a combination of its broadcast search model, the large number of users on low-speed data connections, and the failure of many users to share. eDonkey became the first to popularise p2p web links and to employ the Multisource File Transfer Protocol so as to maximise download speed by exploiting multiple sources simultaneously and allowing each user to become a source of data blocks as they were downloaded. In addition, eDonkey allows partial downloads held by other peers to form part of the pool from which the download is sourced, dramatically improving availability.
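A sketch of the block-hashing scheme described above: the file is cut into fixed-size blocks, each block gets its own hash, and a block fetched from any peer (including one holding only a partial copy) can be verified on its own. The block size and hash function here are illustrative choices, not eDonkey's actual parameters.

```python
# Per-block hashing so that blocks from multiple sources can be verified
# independently.
import hashlib

BLOCK_SIZE = 9_500_000   # bytes per block (assumed for illustration)

def block_hashes(data: bytes) -> list[str]:
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [hashlib.md5(b).hexdigest() for b in blocks]

def verify_block(block: bytes, expected: str) -> bool:
    # A block downloaded from any source can be checked in isolation,
    # so partial copies held by other peers become usable immediately.
    return hashlib.md5(block).hexdigest() == expected
```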

One drawback of eDonkey is its proprietary character. Happily, recent months have seen the appearance of an open source eDonkey client called eMule (http://www.emule-project.net/). MLdonkey does something similar (http://www.infoanarchy.org/story/2002/8/7/45415/23698).

Freenet "Each node maintains its own local datastore which it makes available to the network for reading and writing, as well as a dynamic routing table containing addresses of other nodes and the keys that they are thought to hold." (Hong, Theodore (et al.) (2001). Freenet: A Distributed Anonymous Information Storage and Retrieval System. In Federrath, H. (ed.) Designing Privacy Enhancing Technologies: International Workshop on Design Issues in Anonymity and Unobservability, LNCS 2009. New York: Springer ) The information retained by Freenet nodes distinguishes it from the gnutell network. Given the presence of more information at query routing level less bandwidth is spent on redundant simultaneous searches. In addition, copies of documents requested are deposited at each hop in the return route to the requestor. Freenet effectively reproduces the caching mechanism of the web at a peer to peer level so as to respond to the actual demand on the network. If there are no furher requests for the document, it will eventually be replaced by other transient data. All locally stored data is encrypted and sourced through hash tables. Any node will maintain knowledge of is own hash tables and those of several other nodes.

Having overcome the need for scattershot searches, Freenet theoretically manages bandwidth resources in a much more efficient manner than Gnutella. On the other hand, the importance allocated to maintaining anonymity through encryption detracts from its potential to become a mass-installation file-sharing program: specifically, in order to cloak the identity of the requestor, the file is conveyed backwards through the same nodes that resolved the query, utilising bandwidth unnecessarily in transit. Little surprise, then, that in the last eighteen months Freenet's inventor, Ian Clarke, has founded a company called Uprizer that is porting the Freenet design concept into the commercial arena whilst jettisoning the privacy/anonymity aspect.

(ii) Connectivity Structure. Traffic volume derived from search requests has become a significant problem. The Gnutella client Xolox, for example, by introducing a requery option to its search, produced what was described in Salon as a low-level denial-of-service attack (http://www.salon.com/tech/feature/2002/08/08/gnutella_developers/index.html). Search methods. Centralised look-up. The most efficient way to search decentralised and transient content is through a centralised directory look-up. Napster functioned in this way and economised on bandwidth as a result. Alas, it also left the company vulnerable to litigation, and it is safe to say that any p2p company providing such a service will meet the same fate.

Broadcast. In this case queries are sent to all nodes connected to the requestor. The queries are then forwarded to the nodes connected to those nodes. This leads to massive volume, often sufficient to saturate a dial-up connection. It is also extremely inefficient, as the search continues even after a successful solution has been achieved. Most searches have a 'time to live' to limit the extent of the search, and where there are many weaker links the search can die without ever reaching large parts or even the majority of the network. This is the search method initially used by Gnutella. A bandwidth-based tragedy of the commons effectively obliged the creation of super-peers to centralise knowledge about their local networks so as to allow better look-up. Such a step is a deviation from the pure p2p model and raises the spectre of attractive litigation targets once again.

Milgram. A 1967 experiment by Stanley Milgram on the structure of social networks yielded surprising results. A random sample of 160 people in the US Midwest were asked to convey a letter to a stockbroker in Boston using only intermediaries known on a first-name basis. 42 of the letters arrived, in a median of 5.5 hops, and fully one third of the successfully delivered letters passed through the same shopkeeper. The evidence drawn from this experiment was that whilst most people's social networks are narrow and incestuous, each group contains individuals who act as spokes to other groups. The conclusions drawn were dubbed the 'small world effect' for obvious reasons. Power law. Hess, eDonkey bots.