Automatic detection of source code similarities using machine learning techniques

Authors

  • Marina Elizabeth Cardenas, PhD candidate, Grupo de Investigación, Desarrollo y Transferencia en Aprendizaje Automático, Lenguajes y Autómatas, Centro de Investigación y Desarrollo de Software, Facultad Regional Córdoba, Universidad Tecnológica Nacional, Argentina
  • Julio Javier Castillo, Thesis Director

DOI:

https://doi.org/10.33414/ajea.5.745.2020

Keywords:

source code, similarities, reuse, machine learning, text, analysis

Abstract

This thesis proposal presents the development of a model for detecting source code similarities in order to determine whether reuse practices exist, applying techniques from computational linguistics such as text data mining and natural language processing. Identifying code similarities serves several purposes, including studying the evolution of a project's source code, detecting reuse practices, extracting code fragments for refactoring, and tracking defects for correction.


Published

2020-10-05

How to Cite

Cardenas, M. E., & Castillo, J. J. (2020). Automatic detection of source code similarities using machine learning techniques. AJEA (Proceedings of UTN Academic Conferences and Events), (5). https://doi.org/10.33414/ajea.5.745.2020