Distributed Computing Among Independent Web Browsers Applied to Text and Image Processing


Branislava Šandrih, Vladimir Filipović, Saša Malkov, Aleksandar Kartelj




Distributed computing implies the presence of unused software resources available on multiple computers that work together as a single system. Such a system has a parallel architecture and nodes of varying reliability, so an adequate programming paradigm has to be used. The Web application described in this paper is designed with such a paradigm in mind and is developed using popular technologies. The proposed approach can attract two types of users: those who need additional computing resources (hereafter, seekers) and those who are willing to contribute by making their computing resources available (hereafter, helpers). A seeker must first share their data, which is then divided into equal segments; the number of segments is defined by the seeker in advance. Second, the seeker has to define the processing procedure, i.e. the code that processes each segment separately. Finally, the seeker defines how the processed segments are reduced into the final result. This programming paradigm is known as MapReduce. The data can be in an arbitrary format (so far, the system has been evaluated on text and images), as long as the map function handles it appropriately. Each helper is assigned a segment of the input data. The map function defined by the seeker is then executed within the helper’s Web browser, and its result is returned to the system when processing finishes. The Web application’s efficiency depends on the number and configuration of computing nodes. Four use cases are demonstrated in this paper: 1) counting words in a text file, 2) finding the largest number in a text file that contains numbers, 3) sharpening a corrupted image and 4) applying a blur effect to an image file.
Owing to its simplicity and universality, the system has potential for other, more complex computations and could, in the future, be applied to distributed content digitization, the analysis of data obtained from telescopes, etc.
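
The seeker's three inputs (segment count, map function, reduce function) can be illustrated with a minimal sketch for the two text use cases. It is written in Python for brevity and runs the map step sequentially in one process, whereas in the described system each map call would execute in a helper's Web browser. All names here (split_into_segments, run_job, map_word_count, map_max_number and so on) are hypothetical and are not taken from the application itself; splitting by lines into nearly equal segments is also an assumption.

```python
from collections import Counter
from functools import reduce


def split_into_segments(items, n):
    """Divide items into n contiguous segments of nearly equal size."""
    size, extra = divmod(len(items), n)
    segments, start = [], 0
    for i in range(n):
        # The first `extra` segments receive one additional item.
        end = start + size + (1 if i < extra else 0)
        segments.append(items[start:end])
        start = end
    return segments


def run_job(items, n, map_fn, reduce_fn):
    """Apply map_fn to every segment, then fold the partial results."""
    # Sequential stand-in for helpers processing segments independently.
    partials = [map_fn(segment) for segment in split_into_segments(items, n)]
    return reduce(reduce_fn, partials)


# Use case 1: word counting in a text file (one list entry per line).
def map_word_count(segment):
    return Counter(word for line in segment for word in line.split())


def reduce_word_count(left, right):
    return left + right


# Use case 2: largest number in a text file that contains numbers.
def map_max_number(segment):
    numbers = [int(token) for line in segment for token in line.split()]
    # An empty segment contributes a neutral value for max.
    return max(numbers, default=float("-inf"))


def reduce_max_number(left, right):
    return max(left, right)


if __name__ == "__main__":
    text = ["a b a", "b c", "a"]
    print(run_job(text, 2, map_word_count, reduce_word_count))
    print(run_job(["3 17", "42 8", "5"], 2, map_max_number, reduce_max_number))
```

Because each map call depends only on its own segment, the segments can be handed to helpers in any order, and a segment whose helper disappears can simply be reassigned; only the reduce step needs all partial results.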