AsifTraynor  
#1 Posted : 13 March 2020 14:46:46(UTC)

The dataset described in this article pairs source code with three kinds of artifacts, drawn from 108,568 projects downloaded from GitHub that have a redistributable license and at least 10 stars. The first set of pairs connects snippets of source code in C, C++, Java, and Python with their corresponding comments, which are extracted using Doxygen. The second set connects raw C and C++ source code repositories with the build artifacts of that code, which are obtained by running the make command. The third set connects raw C and C++ source code repositories with potential code vulnerabilities, which are identified by running the Infer static analyzer. The code and comment pairs can be used for tasks such as predicting comments or generating natural language descriptions of code. The code and build artifact pairs can be used for tasks such as reverse engineering or improving intermediate representations of code recovered from decompiled binaries. The code and static analyzer pairs can be used for tasks such as machine learning approaches to vulnerability discovery.
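
To give a rough idea of what a code and comment pair looks like, here is a minimal Python sketch. It is only an illustration: the dataset itself was built with Doxygen, whereas this snippet uses a simple regular expression as a stand-in, and the names DOXY_PAIR, extract_pairs, and SAMPLE are made up for this example. It assumes C-style `/** ... */` comments placed directly before a declaration.

```python
import re

# Assumption: a documented declaration is a /** ... */ block followed by the
# text up to the next '{' or ';'. Real Doxygen parsing is far more thorough.
DOXY_PAIR = re.compile(r"/\*\*(.*?)\*/\s*([^{;]+)", re.DOTALL)


def extract_pairs(source: str) -> list[tuple[str, str]]:
    """Return (comment, declaration) pairs found in C-like source text."""
    pairs = []
    for match in DOXY_PAIR.finditer(source):
        raw_comment, declaration = match.groups()
        # Drop the leading " * " decoration from each comment line.
        lines = [re.sub(r"^\s*\*\s?", "", ln).strip()
                 for ln in raw_comment.splitlines()]
        comment = " ".join(ln for ln in lines if ln)
        # Collapse whitespace in the declaration so it fits on one line.
        pairs.append((comment, " ".join(declaration.split())))
    return pairs


# Hypothetical input, not taken from the dataset.
SAMPLE = """
/**
 * Add two integers.
 * @return the sum of a and b
 */
int add(int a, int b) {
    return a + b;
}
"""

pairs = extract_pairs(SAMPLE)
for comment, declaration in pairs:
    print(declaration, "<-", comment)
```

Pairs in this shape (comment text on one side, code on the other) are the kind of input a comment prediction model would train on, though the actual files in the dataset may be laid out differently.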

Ref: https://www.sciencedirec...le/pii/S2352340919310674