
Communication Linking

jordanell edited this page Jul 3, 2012 · 1 revision

This page describes our approach to linking communication artifacts to git repository commit IDs. We describe a two-pass approach, which we believe will give the most accurate links.

First Pass

The first pass to link communication artifacts to commit IDs is the most straightforward. Depending on the project's issue tracker (Bugzilla or Jira), we use a tracker-specific regex to parse git commit comments. Most projects reference bugs in their commit comments, either by giving the bug ID number or by following some other recognizable convention that ties the commit to a specific bug in their issue tracking database. We find these references and use them as a starting point.
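As a rough illustration of this step, the sketch below extracts bug IDs from a commit comment. The two patterns and the `find_bug_ids` helper are assumptions made for the example, not the regexes actually used; each project would need patterns that match its own commit conventions.

```python
import re

# Hypothetical patterns, chosen only for illustration.
# Bugzilla-style: "bug 1234", "Bug #1234"
# Jira-style: "HADOOP-42" (upper-case project key, dash, number)
PATTERNS = {
    "bugzilla": re.compile(r"\bbug\s*#?\s*(\d+)\b", re.IGNORECASE),
    "jira": re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b"),
}

def find_bug_ids(commit_comment, project_type):
    """Return the bug IDs referenced in a commit comment, in order of appearance."""
    return PATTERNS[project_type].findall(commit_comment)
```

For example, `find_bug_ids("Fix NPE in parser (Bug 1234)", "bugzilla")` yields the ID `1234`, which can then be looked up in the issue tracking database.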

Once the regex finds a commit-to-bug link, we take that bug report (thread) and link every communication item posted on that thread before the commit's date to the commit. That is, a communication item on a bug report is linked to a commit when the item's date is earlier than the commit's date.
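The date rule can be sketched as follows. The data shapes are assumptions for the example: a thread is taken to be a list of `(item_id, posted_at)` pairs, and `items_before_commit` is a hypothetical helper name.

```python
from datetime import datetime

def items_before_commit(thread_items, commit_date):
    """Return the IDs of thread items posted strictly before the commit date.

    thread_items: iterable of (item_id, posted_at) pairs (assumed shape).
    """
    return [item_id for item_id, posted_at in thread_items
            if posted_at < commit_date]

thread = [
    ("comment-1", datetime(2012, 6, 1, 9, 0)),
    ("comment-2", datetime(2012, 6, 3, 14, 30)),
    ("comment-3", datetime(2012, 6, 10, 8, 15)),  # posted after the commit
]
linked = items_before_commit(thread, datetime(2012, 6, 5, 12, 0))
```

Here only the first two comments would be linked, since the third was posted after the commit.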

Second Pass

If a communication artifact is not linked by the first pass, we attempt to link it in this second pass. We have five types of natural language processing we can use to link artifacts, based on code snippets, stack traces, diffs, mentioned files, and mentioned bug numbers.

For each of these types, we use a regex to determine which files are mentioned. However, we do not get full paths to the files. So if the repository contains both src/A/Foo.java and src/B/Foo.java, a mention of Foo.java could link to both files.
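The ambiguity can be made concrete with a small sketch. The file-name pattern, the extension list, and both helper names are assumptions for illustration; the point is only that matching on a bare file name returns every repository path that shares it.

```python
import re
from collections import defaultdict
from posixpath import basename

# Hypothetical pattern: a bare file name with one of a few known extensions.
FILE_RE = re.compile(r"\b[\w-]+\.(?:java|py|c|h|cpp|xml)\b")

def index_by_basename(repo_paths):
    """Map each bare file name to every repository path that ends with it."""
    index = defaultdict(list)
    for path in repo_paths:
        index[basename(path)].append(path)
    return index

def candidate_paths(text, index):
    """Return all repository paths whose file name is mentioned in the text."""
    found = set()
    for name in FILE_RE.findall(text):
        found.update(index.get(name, []))
    return found

index = index_by_basename(["src/A/Foo.java", "src/B/Foo.java", "src/Bar.java"])
matches = candidate_paths("NullPointerException at Foo.java:12", index)
```

With this input, `matches` contains both `src/A/Foo.java` and `src/B/Foo.java`, because the mention carries no directory information to tell them apart.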

After all files are found, we search for commits that were made around the date the artifact was created and that change any of the mentioned files. We then do a simple calculation to get the weight of the link: the number of mentioned files that the commit changes, divided by the total number of files mentioned. If an artifact mentions files A, B, C and D, and a commit changes files A and B, then 2 / 4 files are linked, which gives a weight of 0.5. This link is then stored as a result.
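A minimal sketch of the weight calculation follows. The page does not say how wide the date window "around" the artifact is, so the 7-day `window` default is purely an assumption, as are the function names and the `(commit_id, commit_date, changed_files)` tuple shape.

```python
from datetime import datetime, timedelta

def link_weight(mentioned_files, changed_files):
    """Fraction of the mentioned files that the commit changes."""
    mentioned = set(mentioned_files)
    if not mentioned:
        return 0.0
    return len(mentioned & set(changed_files)) / len(mentioned)

def weighted_links(artifact_date, mentioned_files, commits,
                   window=timedelta(days=7)):  # window size is an assumption
    """Return (commit_id, weight) for nearby commits touching a mentioned file."""
    links = []
    for commit_id, commit_date, changed_files in commits:
        if abs(commit_date - artifact_date) > window:
            continue  # commit is too far from the artifact's creation date
        weight = link_weight(mentioned_files, changed_files)
        if weight > 0:
            links.append((commit_id, weight))
    return links

commits = [
    ("c1", datetime(2012, 7, 2), ["A", "B"]),  # nearby, changes 2 of 4 files
    ("c2", datetime(2012, 1, 1), ["A"]),       # outside the date window
    ("c3", datetime(2012, 7, 4), ["Z"]),       # nearby, no mentioned files
]
links = weighted_links(datetime(2012, 7, 3), ["A", "B", "C", "D"], commits)
```

This reproduces the example from the text: the artifact mentions A, B, C and D, commit `c1` changes A and B, and the resulting weight is 0.5.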

If we want, we can apply a minimum-weight threshold to weed out very weak links.
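Filtering by threshold could look like the sketch below. The 0.25 cutoff and the `(artifact_id, commit_id, weight)` tuple shape are assumptions for the example; no particular value is prescribed here.

```python
def filter_links(links, threshold=0.25):  # cutoff value is an assumption
    """Keep only links whose weight is at least the threshold.

    links: iterable of (artifact_id, commit_id, weight) tuples (assumed shape).
    """
    return [link for link in links if link[2] >= threshold]

stored = [
    ("artifact-1", "c1", 0.5),
    ("artifact-1", "c2", 0.1),   # weak link, dropped
    ("artifact-2", "c3", 0.25),  # exactly at the threshold, kept
]
strong = filter_links(stored)
```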
