Teddy: Automatic Recommendation For Pythonic Idiom Usage
Purit Phan-udom, Naruedon Wattanakul, Tattiya Sakulniwat, Chaiyong Ragkhitwetsagul∗, Thanwadee Sunetnanta∗, Morakot Choetkiertikul∗, Faculty of Information and Communication Technology (ICT), Mahidol University
Pythonic code is idiomatic code that follows the guiding principles and practices of the Python community. Because it offers performance and readability benefits, Pythonic code is claimed to be widely adopted by experienced Python developers, but it can present a steep learning curve for novice programmers. To aid the learning of Pythonic idioms, we created an automated tool, called Teddy, that helps check Pythonic idiom usage. The tool offers a prevention mode, which uses Just-In-Time analysis to recommend Pythonic idioms during code review, and a detection mode, which uses historical analysis to run a thorough scan of idiomatic and non-idiomatic code. In this paper, we first describe the tool and an evaluation of its performance. We then present a case study that demonstrates how to use Teddy in a real-life scenario on an open-source project. The evaluation shows that Teddy has high precision in detecting Pythonic idioms and non-Pythonic code. Using interactive visualizations, we demonstrate how novice programmers can navigate and identify Pythonic idioms and non-Pythonic code in their projects.
Pythonic and Non-Pythonic Idiom Database
Pythonic Database Set
|Pythonic dict comprehension||Declaration of
|Pythonic list comprehension||Declaration of
|Pythonic enumerate||For-loop iteration using
|Pythonic if statement||Using implicit truthfulness for
|Pythonic file reading statement||Using
|Pythonic tuple||Unpacking data for multiple assignment at once|
|Pythonic variable swapping||Using a tuple to swap values between two or more variables|
|Pythonic string formatting||Concatenation of multiple string formatting statements, use of
|Pythonic code formatting||Proper use of indentation for code blocks and writing one statement per line|
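To make the idiom categories above concrete, the following is a minimal sketch of what several of them commonly look like in Python. The function names and data are hypothetical illustrations, not entries taken from Teddy's database, and the exact patterns Teddy matches may differ.

```python
# Illustrative sketch only: common forms of several Pythonic idioms.
# All names below are hypothetical and are not taken from Teddy's database.


def squares_by_number(numbers):
    # dict comprehension: build the dict in a single expression
    return {n: n * n for n in numbers}


def even_numbers(numbers):
    # list comprehension: build the list in a single expression
    return [n for n in numbers if n % 2 == 0]


def label_items(items):
    # enumerate: iterate over index and element together
    return [f"{index}: {item}" for index, item in enumerate(items)]


def describe(items):
    # implicit truthiness: an empty collection evaluates as false
    if items:
        return "non-empty"
    return "empty"


def read_text(path):
    # context manager: the file is closed when the block exits
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def unpack_point(point):
    # tuple unpacking: assign several variables at once
    x, y = point
    return x, y


def swap(a, b):
    # tuple-based swap: no temporary variable needed
    a, b = b, a
    return a, b
```

Each function isolates one idiom so that it can be compared against a non-idiomatic version of the same task.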
Non-Pythonic Database Set
|Non-Pythonic dict comprehension||Separate declaration and for-loop element assignment of a
|Non-Pythonic list comprehension||Separate declaration and for-loop element assignment of a
|Non-Pythonic enumerate||For-loop iteration without the use of
|Non-Pythonic if statement||Direct comparison of variable with
|Non-Pythonic file reading statement||Opening a file without using
|Non-Pythonic set||Using a for-loop to create a unique collection of items|
|Non-Pythonic tuple||Explicitly assigning variables with elements in a collection|
|Non-Pythonic variable swapping||Using a
|Non-Pythonic string formatting||A sequence of string formatting commands written one per line, using ‘+’ to concatenate static strings and variable(s) together, or using ‘%’ as a string variable placeholder|
|Non-Pythonic code formatting||Using ‘;’ to put more than one statement in a single line|
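As a rough illustration of the fully described non-Pythonic rows above, the sketch below writes a few of them out in Python. The function names and data are hypothetical and are not drawn from Teddy's database. The code runs correctly, which is the point: these patterns are not bugs, only less idiomatic ways of expressing the same logic.

```python
# Illustrative sketch only: non-idiomatic forms that still run correctly.
# All names below are hypothetical and are not taken from Teddy's database.


def squares_by_number_loop(numbers):
    # separate declaration, then for-loop element assignment into a dict
    squares = {}
    for n in numbers:
        squares[n] = n * n
    return squares


def even_numbers_loop(numbers):
    # separate declaration, then for-loop element assignment into a list
    evens = []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)
    return evens


def unique_items_loop(items):
    # for-loop used to build a collection of unique items
    unique = []
    for item in items:
        if item not in unique:
            unique.append(item)
    return unique


def unpack_point_by_index(point):
    # variables assigned explicitly from elements of a collection
    x = point[0]
    y = point[1]
    return x, y


def greet_concat(name, count):
    # '+' used to join static strings and variables
    return "Hello " + name + ", you have " + str(count) + " new messages"


def greet_percent(name, count):
    # '%' used as a string variable placeholder
    return "Hello %s, you have %d new messages" % (name, count)


def add_with_semicolons(a, b):
    # ';' used to put more than one statement on a single line
    total = a; total = total + b; return total
```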
The full source code of the database can be accessed via this link.
The ground truth data used to evaluate the configuration of Siamese is available at this link.
Case Study on Project Flask
Full details of the project are available in this GitHub repository.