Publication year : [2021]
Paper information |
|
Paper title (Korean) |
[Vol.16, No.6] A Study on Big Data Processing-based Data Concentrated Computation |
|
Authors |
Leesang Cho, Jinhong Kim |
|
Abstract |
Over the last decades, the generation and storage of data have increased drastically in both industry and science. While the field of data analysis is not new, it now faces the challenge of coping with the increasing size, bandwidth, and complexity of data, which renders traditional analysis methods and algorithms ineffective. This problem has been termed the Big Data challenge. In science specifically, the major data producers are large-scale monolithic experiments and the outputs of domain simulations. Until now, most of this data has not been completely analyzed but has instead been stored in data repositories for later consideration, owing to the lack of efficient means of processing. We propose a design and prototypical realization of a framework for processing such data, based on experience collected from empirical applications, which we call BDP (Big Data Processing). To this end, selected scientific use cases, with an emphasis on earth sciences, were studied: object segmentation in point cloud data and biological imagery, outlier detection in oceanographic time-series data, and land cover type classification in remote sensing images. To deal with the data volumes, two analysis algorithms have been parallelized for shared- and distributed-memory systems. The presented parallelization strategies have been abstracted into a generalized paradigm, enabling the formulation of scalable algorithms for other, similar analysis methods. Moreover, this paradigm permits a large-scale data analysis framework and algorithm library for heterogeneous, distributed high-performance computing systems. |
|