HierarchyNet: Learning to Summarize Source Code with Heterogeneous Representations

EACL 2024

Abstract:

We propose a novel method for code summarization utilizing Heterogeneous Code Representations (HCRs) and our specially designed HierarchyNet. HCRs effectively capture essential code features at lexical, syntactic, and semantic levels by abstracting coarse-grained code elements and incorporating fine-grained program elements in a hierarchical structure. Our HierarchyNet method processes each layer of the HCR separately through a unique combination of the Heterogeneous Graph Transformer, a Tree-based CNN, and a Transformer Encoder. This approach preserves dependencies between code elements and captures relations through a novel Hierarchical-Aware Cross Attention layer. Our method surpasses current state-of-the-art techniques, such as PA-Former, CAST, and NeuralCodeSum.
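To make the idea of a layered, hierarchical representation more concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation and does not include any of the neural encoders named in the abstract. The class names (`FineNode`, `CoarseNode`, `ToyHCR`), the edge-type strings, and the mapping of each helper function to a lexical, syntactic, or semantic "view" are all assumptions made for illustration only; the actual HCR construction is described in the paper and the linked repository.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FineNode:
    """Hypothetical fine-grained element: one AST node inside a statement."""
    kind: str
    children: list[FineNode] = field(default_factory=list)


@dataclass
class CoarseNode:
    """Hypothetical coarse-grained element: one statement with its tokens and subtree."""
    label: str
    tokens: list[str]
    subtree: FineNode


@dataclass
class ToyHCR:
    """Toy hierarchical representation: statements plus typed edges between them."""
    statements: list[CoarseNode]
    edges: list[tuple[int, int, str]]  # (source index, target index, edge type)


def token_sequence(hcr: ToyHCR) -> list[str]:
    """Lexical view (assumed): all tokens in source order."""
    return [tok for stmt in hcr.statements for tok in stmt.tokens]


def subtree_size(node: FineNode) -> int:
    """Count the nodes in one statement's subtree, including the root."""
    return 1 + sum(subtree_size(child) for child in node.children)


def subtree_sizes(hcr: ToyHCR) -> list[int]:
    """Syntactic view (assumed): one subtree per statement, summarized by size."""
    return [subtree_size(stmt.subtree) for stmt in hcr.statements]


def edges_by_type(hcr: ToyHCR) -> dict[str, list[tuple[int, int]]]:
    """Semantic view (assumed): statement-level edges grouped by edge type."""
    grouped: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for src, dst, edge_type in hcr.edges:
        grouped[edge_type].append((src, dst))
    return dict(grouped)


def build_example() -> ToyHCR:
    """Hand-built toy instance for the snippet `int total = a + b; return total;`."""
    declaration = CoarseNode(
        label="declaration",
        tokens=["int", "total", "=", "a", "+", "b"],
        subtree=FineNode("LocalVar", [FineNode("BinaryOp", [FineNode("Name"), FineNode("Name")])]),
    )
    ret = CoarseNode(
        label="return",
        tokens=["return", "total"],
        subtree=FineNode("Return", [FineNode("Name")]),
    )
    # Edge type names are illustrative, not taken from the paper.
    edges = [(0, 1, "next_statement"), (0, 1, "data_flow")]
    return ToyHCR(statements=[declaration, ret], edges=edges)
```

In a sketch like this, each view could be handed to a different kind of encoder (a sequence model for tokens, a tree model for subtrees, a graph model for typed edges), which is the general shape the abstract describes; how HierarchyNet actually combines them is specified in the paper.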

Paper: https://arxiv.org/pdf/2205.15479.pdf

Code: https://github.com/FSoft-AI4Code/HierarchyNet

Authors

Minh Nguyen Huynh (Batch-2 AI Resident), Nghi Bui (mentor), Truong Son Hy (mentor), Long Tran-Thanh (mentor), Tien N. Nguyen (mentor)
