Automatic detection of source code similarities using machine learning techniques

Authors

  • Marina Elizabeth Cardenas, Doctoral candidate, Grupo de Investigación, Desarrollo y Transferencia en Aprendizaje Automático, Lenguajes y Autómatas - Centro de Investigación y Desarrollo de Software - Facultad Regional Córdoba - Universidad Tecnológica Nacional – Argentina
  • Julio Javier Castillo, Director

DOI:

https://doi.org/10.33414/ajea.5.745.2020

Keywords:

source code, similarities, reuse, machine learning, text, analysis

Abstract

This thesis proposal describes the development of a model for detecting source code similarities in order to determine whether code has been reused. The model applies techniques from computational linguistics, such as text data mining and natural language processing. Identifying code similarities serves several purposes, including studying the evolution of a project's source code, detecting reuse practices, extracting code fragments for refactoring, and tracking defects for correction.

Published

2020-10-05

How to Cite

Cardenas, M. E., & Castillo, J. J. (2020). Automatic detection of source code similarities using machine learning techniques. AJEA (Proceedings of UTN Academic Conferences and Events), (5). https://doi.org/10.33414/ajea.5.745.2020