Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Improving API Caveats Accessibility by Mining API Caveats Knowledge Graph

Published in ICSME 2018, 2018

We construct an API caveats knowledge graph for Android APIs from the API documentation on the Android Developers website. We study the abundance of different subcategories of API caveats and use a sampling method to manually evaluate the quality of the API caveats knowledge graph. We also conduct a user study to validate whether and how the API caveats knowledge graph may improve the accessibility of API caveats in API documentation. This paper received an ICSME 2018 IEEE TCSE Distinguished Paper Award.

Recommended citation: Hongwei Li, Sirui Li, Jiamou Sun, Zhenchang Xing, Xin Peng, Mingwei Liu, Xuejiao Zhao: Improving API Caveats Accessibility by Mining API Caveats Knowledge Graph. In 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME 2018) http://mingwei-liu.github.io/files/icsme2018_apicaveatskg.pdf

Automatic Generation of API Documentations for Open-Source Projects

Published in ICSME 2018 (Workshop), 2018

Open-source projects often have incomplete and insufficient API documentation. To improve development efficiency and ensure correct API usage, developers should be supported with automatically generated documentation that combines knowledge from different sources. In this paper, we describe OpenAPIDocGen, a system that automatically generates API documentation for open-source projects, including an overview of the system and the data sources and techniques used to generate different parts of the documentation.

Recommended citation: Xin Peng, Yifan Zhao, Mingwei Liu, Fengyi Zhang, Yang Liu, Xin Wang, Zhenchang Xing: Automatic Generation of API Documentations for Open-Source Projects. In 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME 2018 Workshop) http://mingwei-liu.github.io/files/ICSME2018DocGen.pdf

Searching StackOverflow Questions with Multi-Faceted Categorization

Published in Internetware 2018, 2018

We propose a multi-faceted and interactive approach for searching StackOverflow questions (called MFISSO), which leverages these attributes of the questions.

Recommended citation: Mingwei Liu, Xin Peng, Qingtao Jiang, Andrian Marcus, Junwen Yang, Wenyun Zhao: Searching StackOverflow Questions with Multi-Faceted Categorization. Internetware 2018: 10:1-10:10 http://mingwei-liu.github.io/files/Internetware18SO.pdf

A learning-based approach for automatic construction of domain glossary from source code and documentation

Published in ESEC/FSE 2019, 2019

In this paper, we propose a learning-based approach for automatic construction of domain glossary from source code and software documentation. The approach uses a set of high-quality seed terms identified from code identifiers and natural language concept definitions to train a domain-specific prediction model to recognize glossary terms based on the lexical and semantic context of the sentences mentioning domain-specific concepts.

Recommended citation: Chong Wang, Xin Peng, Mingwei Liu, Zhenchang Xing, Xuefang Bai, Bing Xie, Tuo Wang: A learning-based approach for automatic construction of domain glossary from source code and documentation. ESEC/SIGSOFT FSE 2019: 97-108 http://mingwei-liu.github.io/files/fse19-wang-glossary.pdf

Generating query-specific class API summaries

Published in ESEC/FSE 2019, 2019

We propose an approach for generating on-demand, extrinsic hybrid summaries for API classes, relevant to a programming task, formulated as a natural language query. The summaries include the most relevant sentences extracted from the API reference documentation and the most relevant methods.

Recommended citation: Mingwei Liu, Xin Peng, Andrian Marcus, Zhenchang Xing, Wenkai Xie, Shuangshuang Xing, Yang Liu: Generating query-specific class API summaries. ESEC/SIGSOFT FSE 2019: 120-130 http://mingwei-liu.github.io/files/fse19-liu-APISummary.pdf

API Method Recommendation via Explicit Matching of Functionality Verb Phrases

Published in ESEC/FSE 2020, 2020

We identified 356 different functionality verbs from the descriptions, which were grouped into 87 functionality categories, and we extracted 523 phrase patterns from the verb phrases of the descriptions. Building on these findings, we propose an API method recommendation approach based on explicit matching of functionality verb phrases in functionality descriptions and user queries, called PreMA.

Recommended citation: Wenkai Xie, Xin Peng, Mingwei Liu, Christoph Treude, Zhenchang Xing, Xiaoxin Zhang, Wenyun Zhao: API Method Recommendation via Explicit Matching of Functionality Verb Phrases. Proceedings of the 2020 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020) http://mingwei-liu.github.io/files/fse2020-FuncVerb.pdf

Generating Concept based API Element Comparison Using a Knowledge Graph

Published in ASE 2020, 2020

We propose a knowledge graph based approach, APIComp, that automatically extracts API knowledge from API reference documentation to support the comparison of a pair of API classes or methods from different aspects.

Recommended citation: Yang Liu, Mingwei Liu, Xin Peng, Zhenchang Xing, and Xiaoxin Zhang: Generating Concept based API Element Comparison Using a Knowledge Graph. 35th IEEE/ACM International Conference on Automated Software Engineering (ASE 2020) http://mingwei-liu.github.io/files/ase2020-APIComp.pdf

Source Code based On-demand Class Documentation Generation

Published in ICSME 2020 (Workshop), 2020

In this paper, we present OpenAPIDocGen2, a tool that generates on-demand class documentation based on source code and documentation analysis. For a given class, OpenAPIDocGen2 generates a combined documentation for it, which includes functionality descriptions, directives, domain concepts, usage examples, class/method roles, key methods, relevant classes/methods, characteristics and concepts classification, and usage scenarios.

Recommended citation: Mingwei Liu, Xin Peng, Xiujie Meng, Huanjun Xu, Shuangshuang Xing, Xin Wang, Yang Liu, Gang Lv: Source Code based On-demand Class Documentation Generation. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME 2020 Workshop) http://mingwei-liu.github.io/files/icsme2020-document-generate.pdf

Learning based and Context Aware Non-Informative Comment Detection

Published in ICSME 2020 (Workshop), 2020

This report introduces the approach that we have designed and implemented for the DeClutter challenge of DocGen2, which detects non-informative code comments. The approach combines both comment based text classification and code context based prediction. Based on the approach, our “fduse” team achieved the best F1 score (0.847) in the competition.

Recommended citation: Mingwei Liu, Yanjun Yang, Xin Peng, Chong Wang, Chengyuan Zhao, Xin Wang, Shuangshuang Xing: Learning based and Context Aware Non-Informative Comment Detection. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME 2020 Workshop) http://mingwei-liu.github.io/files/icsme2020-comment.pdf

Learning-Based Extraction of First-Order Logic Representations of API Directives

Published in ESEC/FSE 2021, 2021

In this paper, we propose LeadFOL, a learning based approach for extracting first-order logic representations of API directives (FOL directives for short).

Recommended citation: Mingwei Liu, Xin Peng, Andrian Marcus, Christoph Treude, Xuefang Bai, Gang Lyu, Jiazhan Xie, Xiaoxin Zhang: Learning-Based Extraction of First-Order Logic Representations of API Directives. Proceedings of the 2021 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021) http://mingwei-liu.github.io/files/fse2021-Directive.pdf

Automatic Code Semantic Tag Generation based on Software Knowledge Graph

Published in Journal of Software 2021, 2021

Code snippets in open-source and enterprise software projects and posted on various software development websites are important software development resources. However, developer needs for code search often reflect high-level intentions and topics, which are difficult to satisfy with information retrieval based code search techniques. It is thus highly desirable that code snippets be accompanied by semantic tags reflecting their high-level intentions and topics to facilitate code search and understanding. Existing tag generation technologies are mainly oriented to text content or rely on historical data, and cannot meet the needs of large-scale code semantic annotation and auxiliary code search and understanding. To address this problem, this paper proposes a software knowledge graph based approach (called KGCodeTagger) that automatically generates semantic tags for code snippets. KGCodeTagger constructs a software knowledge graph based on concepts and relations extracted from API documentation and software development Q&A text and uses the knowledge graph as the basis of code semantic tag generation. Given a code snippet, KGCodeTagger identifies and extracts API invocations and concept mentions, and then links them to the corresponding concepts in the software knowledge graph. On this basis, the approach further identifies other concepts related to the linked concepts as candidates and selects semantic tags from relevant concepts based on diversity and representativeness.

Recommended citation: Shuangshuang Xing, Mingwei Liu, Xin Peng: Automatic Code Semantic Tag Generation based on Software Knowledge Graph. Journal of Software, 2021 (in Chinese) http://mingwei-liu.github.io/files/JournalOfSoftware2021-KGCodeTagger.pdf

API-Related Developer Information Needs in Stack Overflow

Published in IEEE Transactions on Software Engineering 2021, 2021

Stack Overflow (SO) provides informal documentation for APIs in response to questions that express API related developer needs. Navigating the information available on SO and getting information related to a particular API and need is challenging due to the vast amount of questions and answers and the tag-driven structure of SO. In this paper we focus on identifying and classifying fine-grained developer needs expressed in sentences of API-related SO questions, as well as the specific information types used to express such needs, and the different roles APIs play in these questions and their answers. We derive a taxonomy, complementing existing ones, through an empirical study of 266 SO posts. We then develop and evaluate an approach for the automated identification of the fine-grained developer needs in SO threads, which takes a thread as input and outputs the corresponding developer needs, the types of information expressing them, and the roles of API elements relevant to the needs.

Recommended citation: Mingwei Liu, Xin Peng, Andrian Marcus, Shuangshuang Xing, Christoph Treude, and Chengyuan Zhao: API-Related Developer Information Needs in Stack Overflow. IEEE Transactions on Software Engineering (TSE) 2021 January http://mingwei-liu.github.io/files/TSE2021-DeveloperNeed.pdf

How to Formulate Specific How-To Questions in Software Development?

Published in ESEC/FSE 2022, 2022

We propose an approach (TaskKG4Q) that interactively helps developers formulate a programming related how-to question. TaskKG4Q uses a programming task knowledge graph (task KG for short) mined from Stack Overflow questions, which provides a hierarchical conceptual structure for tasks in terms of [actions], [objects], and [constraints].

Recommended citation: Mingwei Liu, Xin Peng, Andrian Marcus, Christoph Treude, Jiazhan Xie, Huanjun Xu, Yanjun Yang: How to Formulate Specific How-To Questions in Software Development? Proceedings of the 2022 30th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022) http://mingwei-liu.github.io/files/fse22-taskkg.pdf