| dc.contributor.advisor | CATRUC, Mariana | |
| dc.contributor.author | CEBAN, Dan | |
| dc.date.accessioned | 2026-02-26T09:04:44Z | |
| dc.date.available | 2026-02-26T09:04:44Z | |
| dc.date.issued | 2026 | |
| dc.identifier.citation | CEBAN, Dan. Applying GNN for Source Code Analysis: Code Smell Identification. Master's thesis. Study programme: Software Engineering. Scientific supervisor: CATRUC Mariana, university lecturer. Universitatea Tehnică a Moldovei. Chișinău, 2026. | en_US |
| dc.identifier.uri | https://repository.utm.md/handle/5014/35484 | |
| dc.description | The attached file contains: Abstract, Contents, Introduction, Bibliography. | en_US |
| dc.description.abstract | The central premise of this study is that source code is inherently graph-structured, so automated analysis must work with representations that preserve its syntactic hierarchy, control-flow semantics, and data dependencies. Linear or token-based representations flatten these dimensions, which causes significant information loss and an inability to analyze the non-local interactions that are crucial to understanding software quality. To address this gap, the thesis uses graph-based representations informed by compiler theory, in particular the Code Property Graph, which integrates Abstract Syntax Trees, Control-Flow Graphs, and Data-Flow Graphs into a single cohesive and expressive structure. This representation models programs as interconnected systems rather than as separate sequences of instructions, giving a more accurate picture of how software is designed and executed. Building on this structural foundation, the research proposes a Graph Neural Network (GNN) learning framework that detects code smells as anomalies at the graph or node level in software systems. Whereas vulnerabilities are usually localized logical flaws, code smells are broader design problems such as excessive coupling, low cohesion, overly complex control structures, or an excessive concentration of responsibilities. These traits are inherently relational and topological, which makes them well suited to graph-based analysis. The proposed method treats code smell detection as a structural learning problem in which the GNN learns to associate particular graph patterns and neighborhood configurations with known design anti-patterns. The results show that GNNs can learn design principles without those principles being written down explicitly as rules, by learning representations that align with established software engineering intuition. In addition to raw speed, the study examines the interpretability of the models. It shows that the learned representations can be used to identify the important nodes and edges in the code graph. This capability matters for practical use because it lets developers trace predictions back to concrete design problems in the source code. In summary, this thesis presents empirical evidence that Graph Neural Networks, combined with comprehensive graph representations of source code, deliver a robust and scalable solution for automated code smell detection. The proposed method advances the state of the art in software quality assessment by bringing together traditional static analysis and modern machine learning. The findings indicate that graph-based learning models may underpin next-generation developer tools designed to proactively address technical debt, enhance maintainability, and ultimately improve the long-term sustainability of complex software systems. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Universitatea Tehnică a Moldovei | en_US |
| dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
| dc.subject | source code | en_US |
| dc.subject | Code Property Graph | en_US |
| dc.subject | Graph Neural Network | en_US |
| dc.title | Applying GNN for Source Code Analysis: Code Smell Identification | en_US |
| dc.type | Thesis | en_US |