The rise of GenAI: A Software Engineering Perspective.
"What is your opinion about using GenAI in the field of Software Development?", is what I asked to ChatGPT when I thought about writing this article. Its response was interesting and quiet balanced to be honest. Being a skeptic by nature, I expected it to be biased towards being pro GenAI, but it was honest and balanced in highlighting the advantages and limitations of using GenAI for Software Engineering. I guess that's the keyword here, BALANCE. Before I go any further, I would like to highlight that this article was not written using ChatGPT or any other LLM chatbots. Maybe the ones I write in future might, but not this one. And No, there is no biasness towards not using ChatGPT or any other LLMs based chatbots for me to write this article.
Anyways, let's get down to the crux of the matter. GenAI has seen an exponential rise in experimentation and usage during late 2021 and early 2022, shortly after the release of ChatGPT 3.5. This rise has also translated into companies coming up with different products in different spaces including Software Engineering that use or leverage these LLMs. In this article we will be talking about a few of those popular products developed using GenAI in the space of Software Engineering. We will talk about their features and limitations that might help you explore a product further to see how it benefits you or your organization and in case you are completely brand new to the idea of GenAI in Software Engineering, this article might help trigger that interest to explore this area.
GitHub Copilot:
GitHub Copilot is probably the most popular one, when we talk about AI powered pair programming assistants. Launched in middle of 2021, it is powered by an AI model developed by GitHub, Microsoft and OpenAI. Unlike ChatGPT and some of the other LLMs, this has been designed specifically for Programmers or Developers and works directly out of your IDE which makes it quiet easier to use.
Features:
1. Programming Languages & IDE Support: Having been trained on all the public repositories available on GitHub, Copilot supports almost all programming languages including but not limiting to C#, Python, Javscript, Java, Go and many more. The accuracy of the code suggestions may vary for some of the legacy languages as compared to say something like Javascript which has the most public repos thereby providing the most training data. It integrates well with almost all of the major IDEs including but not limiting to vscode, Visual Studion, JetBrains IntelliJ, Vim and many more.
2. Code Generation and Suggestions: There are multiple ways in which you can use Copilot to generate code. You could write comments which can generate code suggestions in your IDE window. It also allows you to toggle between between code suggestions to select the most appropriate one. It can also suggest code while you start typing in your IDE, it will use things like project context, conventions you use and your cursor's position to suggest code for you. You can also use the Copilot's chat window in your IDE to give it a file name as context to generate more specific suggestions. It also has an inline chat where you could give it a specific prompt while you are editing a code file to get more specific results.
3. Troubleshooting & Debugging: The newer version of Copilot has gotten a lot better at troubleshooting errors in your code. There is an option for you now to go to your IDE's output window, look for the Copilot icon against an error and Copilot will look at your output window for the latest error and accordingly generate suggestions on what the is error and how you can troubleshoot the error. This is especially helpful if you are starting to learn a new language or a new framework and run into setup or configuration issues.
4. Voice Assistant: Microsoft has this new extension for vs code called VS Code Speech, which will help you give context and prompts to Copilot using your voice. This will save a lot of typing and can help the developers get more productive. You could also use the VS Code Speech to do things such as write up commit messages or write your PR descriptions.
5. Code Security: GitHub Copilot uses a vulnerability prevention system which does a vulnerability check on the code suggestions before they make their way to the IDE. Copilot can also detect vulnerabilities while you write your code, to the level that it can also detect vulnerabilities in incomplete code fragments. Although there are measures that GitHub Copilot takes to make sure you have a code that is free from vulnerabilities, it does mention that this system is not perfect and is not a replacement for your existing DevSecOps pipeline and processes.
6. Open Source License Consideration: Using licensed open source code can have legal implications and the developer will need to adhere to the license requirements in a said code. Copilot has a code referencing system which is in preview stage which can flag such code references from licensed open source repositories in developers IDE which will help the developer to take necessary steps to mitigate the risk of using such code. Copilot also has a code referencing filter which can help mitigate this risk by not suggesting code that comes from such open source licensed repositories when this filter is turned on.
7. Pricing: Individual license cost starts from $10 USD per user per month. However, if you fit certain criteria for example, if you are a teacher or a student or a contributor to open source project, then you are eligible for a free license. The professional license cost starts from $19 USD per user per month.
Limitations:
1. Code suggestions made by GitHub Copilot are somewhat similar to taking a piece of code from the internet, you would still need to tweak it to make sure it fits your requirements. Also, it could contain bugs, defects that would need fixing before you could use them. The code generated would also need to adhere to your organization's DevSecOps standards and practices and you might need to make further changes to the code.
2. Although Copilot is free for a certain group of individuals, for most of the developer folks out there, you will need to shell out at least $10 USD per month to leverage Copilot's services.
3. Although Copilot can learn from the files and project opened in your IDE as context, there is no direct way of supplying your organization's repositories beforehand to help train Copilot on your organization's best practices and standards.
4. Even though there are security and code reference checks available in GitHub Copilot, the responsibility of the code in the IDE whether it's with using the suggested code or changing the suggested code will be with the Developer and their organization.
AWS Code Whisperer:
AWS Code Whisperer, similar to GitHub Copilot is an IDE based AI pair programming tool. Unlike GitHub Copilot, AWS Code Whisperer was a little late to the game and is mostly playing catchup in terms of adoption as compared to GitHub Copilot. Code Whisperer claims it has been trained on internal AWS repositories as well as publicly available source code. However, unlike GitHub Copilot, Code Whisperer hasn't exactly listed out its sources for training data.
Features:
1. Programming Languages & IDE Support: AWS Code Whisperer supports a plethora of languages, including but not limiting to Python, C#, Javascript, C++, Ruby, and Rust. It also supports most of the popular IDEs such as vscode, JetBrains IntelliJ, Visual Studio (preview) and many more.
2. Code Generation and Suggestions: Code Whisperer looks at the English comments provided and also the surrounding code to infer and make appropriate code suggestions. Similar to GitHub Copilot, you can toggle between different code suggestions available to choose the appropriate one for you or discard the suggestions and continue writing code for better context. Unlike Copilot, Code Whisperer generates code suggestions line by line, which is helpful if you are that type of developer who prefers to write a small snippet first and then think about what your next code will be rather than generating everything at once and then trying to figure out what you need and what you don't.
3. Code Security: There are a couple of ways Code Whisperer helps you with the security aspect of generated code. It has got a proactive system which prevents it from suggesting code with potential security vulnerabilities, this way most of the code is scanned for security vulnerabilities before it gets to you. It can also detect security vulnerabilities in your IDE after the code suggestion has been accepted, this making it easier to detect vulnerabilities even at the IDE level.
4. Open Source License Consideration: There is a configuration setting in Code Whisperer available at the IDE and Administrator level to switch off code suggestions with references to known licensed open source repositories. This allows you to either filter the licensed open source code before it makes its way to your IDE as a suggestion or flag this licensed open source code at the IDE level, leaving it to the developer to take a call whether to keep or discard the code suggestion.
5. Customization: It is worth noting that AWS Code Whisperer gives you the ability to securely connect Code Whisperer to your organization's internal repositories making it aware of your internal libraries, APIs, architectural patterns and best practices making it more relevant for your organization's developers to use. This feature however is currently in preview mode and may have some limitations on usage.
6. Pricing: Code Whisperer is free for individual use available by creating a AWS builder Id. It does not require a credit card for you to sign up. For Code Whisperer professional, the price starts with $19 USD per user, per month and provides the administrator of your organization to setup and enable SSO for your employees.
Limitations:
1. Code Whisperer is new to the game as compared to GitHub Copilot, hence the community adoption for Whisperer is not yet there yet. Which means there is less help available in the community for any setup or configuration issues that you may run into.
2. AWS Code Whisperer has been trained on a lot of internal AWS data, which makes it easier to use it when you are developing against any AWS services. It has also been trained on a lot of publicly available source code, but developers in general have found it easier to use it for developing against AWS services. Although this doesn't exactly make Code Whisperer a bad option to use for other use cases, but maybe something you might need to consider.
3. For generating code through comments, Code Whisperer only supports English language at this point in time.
ChatGPT-3.5:
Although the original intent of ChatGPT-3.5 might never have been towards AI pair programming, it has gained a lot of popularity amongst the developer community to use either the ChatGPT-3.5 chatbot or ChatGPT-3.5 playground as an AI powered pair programming assistant.
Features:
1. Matured LLM: Although there are tools available which are specifically designed for software engineering, the fact that ChatGPT-3.5 has been trained on multiple open source repositories and has a LLM that has matured over time which makes it a viable choice to be considered for purposes of Software Engineering.
2. Programming Languages & IDE Support: Since it has been trained on multiple open source repositories, it has been exposed to a variety of different programming languages and hence supports all major the major ones including but not limited to Python, C#, Java, Go, Javascript, Rust and many more. As compared to GitHub copilot, it has been exposed to other public repositories such as GitLab, Bitbucket etc., which can result in better accuracy during code suggestions. Having said that, accuracy can also depend on other factors such as prompt provided, context etc. There are extensions available for popular IDEs, however all of those extensions have been developed and contributed to by individuals, hence they may not be the best choice for Enterprise settings. The best way of using ChatGPT for Software Engineering at this point in time remains through the ChatGPT Chatbot, APIs or playground.
3. Code Security: Without the IDE extensions, there is no inbuilt mechanism to ensure that the code provided is free from any security vulnerabilities. This means that you would need to rely on your organization's security practices to make sure that the code suggestions you are planning to use are free from security vulnerabilities.
4. Code Context: There are a few ways you could use ChatGPT for Software Engineering purposes. a) You could give some comments and ask it to generate code for you. b) You could give it an existing piece of code and ask it to perform tasks such as write unit tests, optimize the code or find bugs in it. c) You could use it to develop a new feature by incorporating the driver and navigator pair programming model in which you could switch between being a navigator (who provides direction for writing code) and driver (who is focused on writing and optimizing the code).
5. Open Source License Consideration: There are no internal checks in ChatGPT that will flag code references for suggested code which may be identical to code from licensed open source repositories. Which means the onus is on the developers to use the generated code with caution.
Limitations:
1. The biggest limitation or restrictions with using ChatGPT as your AI powered pair programming assistant is that without any IDE support, you will be switching between your browser and your IDE to incorporate code suggestions. This will take away from the overall developer experience.
2. Since ChatGPT wasn't designed to be used specifically for software engineering, there would be things such as Code Security, Static Code Analysis etc. that you would need to incorporate yourself and cannot rely on ChatGPT for the same.
3. There are certain enterprise wide policy control and configurations that are missing in ChatGPT but are provided by tools such as Copilot and Code Whisperer. These features will be crucial in an Enterprise setting.
Summary:
There are a few options available to use GenAI in the software engineering space. If you are a Medium to Large Enterprise then going with either GitHub Copilot or AWS Code Whisperer would make the most sense, since it would give you that Enterprise level security and policy control. If you are AWS heavy in your organization, then it would make sense to go for AWS Code Whisperer, else GitHub Copilot is a good fit for other use cases. For individuals just starting to explore this space and are mostly looking towards GenAI to help them with their personal pet project, going with something like ChatGPT playground or chatbot is a good option, especially if you don't wish to spend your own money but still get some level of productivity gain.
When you are using the output from any of the above mentioned tools, one thing you will need to keep in mind is that you will need to iterate over the suggestions till you end up with a workable code that suits your requirements. The number of iterations could vary from developer to developer, for example an experienced developer might be able to give more context to the assistant which might require them to have lesser iterations, whereas an entry level developer might not be able to give sufficient context and may require more iterations to make the generated code fit the requirement. This is somewhat similar to how some people are better at doing google search than others.
Also, the good news for developers is that none of these tools are branding themselves as something that can replace developers, they are simply what they call themselves, an assistant, a productivity tool that can help increase developer experience and productivity. And that's the best way to use these tools, to have a BALANCE between using your own developer skills and creativity and using these tools to help you accelerate.
Note: At the time of writing this article, there has been a new entry into the Software Engineering Gen AI space with Google announcing the launch of Duet AI for Developers. The announcement is very latest and the tool is in beta testing and hence not included as part of this article.
IT Transformation Professional using Technology, Agile Values and Coaching.
1yThanks for this Vishal, Very exhaustive coverage of how Gen AI can support software development.