PhD Thesis (download in French)

Cooperative Cache System for Large-Scale Distributed Information Systems

Summary 

Over the last few years, the emergence of new technologies has given rise to a new class of services, called on-line services. The main feature of these services is their interaction with users, who request information through a shared communication medium. The overall architecture, composed of accessible on-line services, communication networks and users, forms an information system. Such a distributed system is further qualified as a Large scAle Distributed Information System (LADIS) when the number of available services is unbounded. The most common example of a LADIS is the World Wide Web.

In this thesis, we are concerned with improving the access latency of LADISs, which are victims of their own success and consequently offer poor response times to users. After a thorough study of access patterns over LADISs, we show that this objective can be reached by dynamically distributing and replicating information at the level of the communication network, a method also known as caching. A cache saves, in a local space, the information requested by its users. By intercepting user requests and serving the corresponding information itself whenever it is present in the local space, the cache decreases the LADIS's response time, reduces network usage, and decreases the load on the on-line services. Unfortunately, cache effectiveness is quite limited with existing caching strategies.
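The interception behaviour described above can be illustrated with a minimal sketch. It assumes a plain least-recently-used (LRU) eviction policy and a capacity counted in entries rather than bytes; the class name `LRUCache` and the `fetch` callback (standing in for a request to the origin on-line service) are hypothetical names chosen for illustration and are not taken from the thesis.

```python
from collections import OrderedDict
from typing import Callable


class LRUCache:
    """Entry-count-bounded cache with least-recently-used eviction (illustrative only)."""

    def __init__(self, capacity: int, fetch: Callable[[str], str]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.fetch = fetch            # hypothetical call to the origin on-line service
        self.store = OrderedDict()    # url -> body, ordered from oldest to most recent use
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        """Serve the request locally when possible, otherwise contact the origin."""
        if url in self.store:
            # Hit: no network round trip and no load on the origin service.
            self.store.move_to_end(url)
            self.hits += 1
            return self.store[url]
        # Miss: fetch from the origin and keep a local copy for later requests.
        self.misses += 1
        body = self.fetch(url)
        self.store[url] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return body
```

Every hit avoids one exchange with the origin service, which is where the latency, network and load savings come from; the hit ratio is bounded by the cache size, the replacement policy and the overlap between the users' requests.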

To improve caching effectiveness, our proposal consists of: (i) virtually increasing the cache size so that more information can be stored, (ii) refining the cache replacement algorithm so as to keep in the cache the pieces of information that are most frequently requested, and (iii) virtually increasing the number of cache clients so as to increase the number of user profiles and hence the probability of common profiles. Virtually increasing the cache size is achieved through the LRU-QOS algorithm, which degrades the information stored in the cache so that it occupies less space. Refining the cache's replacement algorithm leads us to introduce the LRU-QOS algorithm, which accurately determines, among the pieces of information stored in the cache, those that are most likely to be re-accessed. Better replacement decisions are obtained by taking into account the access frequencies of the pieces of information, as delivered by the on-line services from which they originate. Finally, we increase the number of clients of a cache by building a cooperation system over a set of caches. The main difficulty lies in proposing a scalable cooperation protocol, since the benefit of this approach is reached only for a very large number of cooperating caches. The resulting protocol is named SCOOPS and relies on a fair distribution of the state of the cooperating cache system. Our overall solution is named SCOOPS++ and integrates the three aforementioned cache management techniques. Experiments show that, compared to alternative solutions, SCOOPS++ decreases the LADIS's response time perceived by users by about 30%.
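The idea of using access frequencies delivered by the origin services to guide replacement can be sketched as follows. This is not the thesis's algorithm, only a simple illustration under stated assumptions: each stored item carries a frequency value reported by its origin service (the `server_freq` field is a hypothetical name), the victim is the item with the lowest reported frequency, and ties are broken by least recent use.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Entry:
    body: str
    server_freq: float  # access frequency reported by the origin service (assumed field)
    last_used: int      # logical clock value of the most recent access


class FrequencyAwareCache:
    """Evicts the entry with the lowest server-reported frequency (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = {}      # url -> Entry
        self._clock = count()  # logical clock used to order accesses

    def get(self, url: str):
        """Return the cached body, or None on a miss."""
        entry = self.entries.get(url)
        if entry is None:
            return None
        entry.last_used = next(self._clock)
        return entry.body

    def put(self, url: str, body: str, server_freq: float) -> None:
        """Store an item, evicting a victim first if the cache is full."""
        if url not in self.entries and len(self.entries) >= self.capacity:
            # Victim: lowest server-side frequency, then least recently used.
            victim = min(
                self.entries,
                key=lambda u: (self.entries[u].server_freq, self.entries[u].last_used),
            )
            del self.entries[victim]
        self.entries[url] = Entry(body, server_freq, next(self._clock))
```

In this sketch an item that one local user has just accessed can still be evicted if the origin service reports that it is rarely requested overall, whereas a pure LRU policy would keep it. The frequency seen at the origin reflects the whole user population, while a single cache observes only its own clients.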