dc.contributor.author | Moturi, Christopher A | |
dc.contributor.author | Maiyo, Silas K. | |
dc.date.accessioned | 2013-02-13T08:58:37Z | |
dc.date.available | 2013-02-13T08:58:37Z | |
dc.date.issued | 2012-12 | |
dc.identifier.citation | International Journal of Computer Applications (0975 – 8887) Volume 56– No.7 | en |
dc.identifier.uri | http://erepository.uonbi.ac.ke:8080/xmlui/handle/123456789/9734 | |
dc.description.abstract | This paper studied the design, implementation and evaluation
of a MapReduce tool targeting distributed systems and multicore
system architectures. MapReduce is a distributed
programming model originally proposed by Google to
ease the development of web search applications on a large
number of clusters of computers. We addressed the issues of
limited resources for data optimization for efficiency,
reliability, scalability and security of data in distributed
cluster systems with huge datasets. The study’s experimental
results showed that the MapReduce tool developed
improved data optimization. The system exhibits undesirable
speedup with smaller datasets, but reasonable speedup is
achieved with sufficiently large datasets that complement the
number of computing nodes, reducing the execution time by
30% compared to normal data mining and processing. The
MapReduce tool is able to handle data growth trends,
especially with a larger number of computing nodes. Scaleup
grows gracefully as the data and the number of computing nodes
increase. Security of data is guaranteed at all computing
nodes since data is replicated at various nodes on the cluster
system, which also makes it reliable. Our implementation of MapReduce
runs on the distributed cluster computing environment of a
national education web portal and is highly scalable. | en |
dc.language.iso | en | en |
dc.subject | MapReduce, Hadoop, Scalability | en |
dc.title | Use of MapReduce for Data Mining and Data Optimization on a Web Portal | en |
dc.type | Other | en |
local.publisher | School of Computing and Informatics | en |