GitHub Copilot Agent Now Looks More Promising Than it Did a Couple of Months Ago

1. GitHub Copilot in VS2022

I am developing a .NET8/C#/ASP.NET8/EF8 application in Visual Studio 2022. It is now around 123,000 lines of code (SLOC), of which 50,000 are the EF database-first model.

I have a GitHub Copilot Pro+ subscription. So far, the tool has been good for limited-scope tasks. I wanted to try the new GitHub Copilot Agent mode. Below are notes from my regular work.

2. Anecdotal Experience with a Real ASP.NET8 Project

Everything below was done with:

  • Visual Studio 2022, 17.14.4
  • GitHub Copilot (GHC). License Copilot Pro+
  • Agent mode, GPT-4o, GPT-4.1, Claude 3.7 Sonnet

2.1 Code review of Data-Access-Layer dll

  • The context is the DAL layer (.dll), which at the moment is ~62,000 SLOC of C# (and it will grow); 50,000 of that is the EF model (EF database-first).
  • That is a fairly limited-scope task: clear C# code, .NET8.
  • I wanted to do a code review and see what this AI thing would do.
  • (Run1-GHC (GPT-4o) ) My request COMMAND to GHC (GPT-4o) was (*)
    • I want you to do a Code Review of the project TBServerUI.Data.csproj. Identify possible software defects, and places for possible improvement of code.
  • The response was quick, and it said it “read 786 files” of the project. That sounded like a good start.
  • The reply was quite shallow. I expected it to understand at least a bit of what that .dll is doing, but it gave me suggestions like “avoid magic numbers/strings” in code and “refactor method because it has too many parameters”. I do things like “shorten string to max 30 characters” before I put it in the database, and the number 30 is OK there; it would just obfuscate the code if I were to define const THIRTY=30 or similar garbage. I do not need that. The problems I have are like 1000 times more complex, and the AI gives worthless suggestions.
  • My expectations were higher. No human can keep 786 files in memory, so my hope was that the AI machine would load it all and come up with smarter advice about things I cannot see because there are so many files. I actually know places where I wrote “dirty code” to make things work and left it there to refine later. GHC didn’t see that.
  • (Run2-GHC (GPT-4o) ) I ran the same command (*) to GHC (GPT-4o) several times, to give it a chance to improve its response. Funny thing: I told it to do the same thing all over again (same command), but it did NOT re-read all 786 files, just 2 of them. I have seen that before: GHC does not follow even full-sentence instructions, it does what it wants to do.
  • (Run3-GHC (GPT-4o) ) I needed to close all open files in the editor to make it re-read all 786 files. Why do I need to do that? I was giving an explicit, full-sentence command (*) to review the whole project.
  • Again, a pretty formal response, several pages of text, but no real substance.
  • (Run4-GHC (GPT-4.1)) I tried the same command (*) with GHC (GPT-4.1).
  • The answer was similar and formal, with a suggestion that if I want a “deeper review, I must ask for it specifically for a certain class/file”. There are many files; why do I need to ask file by file?
  • (Run5-GHC (Claude 3.7 Sonnet)) I tried the same command (*) with GHC (Claude 3.7 Sonnet).
  • This time, the response was quite slow. It said it read 786 files. Does slow mean it is doing something? I finally got a response.
  • The first thing is a funny one. The AI thing was reading MY OWN COMMENTS IN CODE, not the code itself, and was telling me I sound “insecure” in my comments. I do not need an opinion on my comments, I need an opinion on my code. It evaluates my code based on my comments. If I had put in a comment like “this is a great method,” would it think the method is great? Here is the quote from GHC: “The comment ‘hope this is enough to prevent SQL injection’ indicates uncertainty about the security measures. This should be thoroughly tested and validated.” If my company sees that, they will need to hire a junior guy; juniors are ALWAYS SURE they have covered all the possibilities, no matter what code they write. Funny, GHC is giving a psychological profile of the developer instead of looking into the real code to see if it works.
  • It picked on some interfaces in libraries that are marked as “obsolete”. I use them deliberately because the Project Owner wants me to keep everything on .NET8, and I would need to upgrade the project to .NET9 to use the latest interfaces. GHC is quite shallow here; following such comments would lead someone to a project that cannot be compiled.
  • OK, it found some properties that it sees as mutable. I only read from them, but yes, they are not marked as read-only, so in multithreading there could be problems… I didn’t see that before…
  • Some other minor comments… nothing I find useful… refactor this or that, split an interface into several smaller interfaces, all cosmetic code changes.
  • OK. Enough of code review for now.
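The one finding that was useful, properties that are only read but not marked read-only, can be illustrated with a minimal C# sketch. The class names MutableLimits and ReadOnlyLimits, the property MaxNameLength, and the Shorten helper are all hypothetical; the real DAL types are not shown here, and the 30-character limit is borrowed from the truncation example only for illustration.

```csharp
using System;

// Hypothetical types for illustration only; not the real DAL classes.
public sealed class MutableLimits
{
    // Public setter: any thread may change the value while another thread reads it.
    public int MaxNameLength { get; set; } = 30;
}

public sealed class ReadOnlyLimits
{
    // Get-only auto-property: can be assigned only in the constructor
    // (or a property initializer), so later reads always see the same value.
    public int MaxNameLength { get; }

    public ReadOnlyLimits(int maxNameLength)
    {
        if (maxNameLength < 0)
            throw new ArgumentOutOfRangeException(nameof(maxNameLength));
        MaxNameLength = maxNameLength;
    }

    // Shortens a string to at most MaxNameLength characters.
    public string Shorten(string value) =>
        value.Length <= MaxNameLength ? value : value.Substring(0, MaxNameLength);
}
```

With the get-only version, an assignment such as limits.MaxNameLength = 10 outside the constructor is rejected by the compiler, so the "only reads" intention is enforced rather than just assumed.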

2.2 Create a new method based on the existing

  • The task is to create a similar method based on a 200-line C# template method.
  • The task for GHC was to create a DB access method based on a similar one, just with the Contract-Account roles reversed. The original method is ~200 lines of C# code, but it uses EF Core and LINQ, plus some Data library classes, and references some DTO classes. Still, it is quite easy; I would need 15 min, but then I understand the DAL layer, which at the moment is ~62,000 SLOC (and it will grow), 50,000 of which is the EF model (EF database-first).

2.2.1 GHC (GPT-4o)

  • (Run1-GHC (GPT-4o) ) My request COMMAND to GHC (GPT-4o) was (**):
    • Based on method Accounts_AccountsForContractListDT in file #DbWork.Contracts.cs, create a similar method where Contract and Account roles are switched, called Contracts_ContractsForAccountListDT. Context is TBServerUI.Data project.
  • It ran for a full 4 minutes or so, automating VS2022: a source code window opened, it was entering text, then it tried a build, and the report showed no errors in the second build. It looked impressive; that automation of VS2022 made me feel for the first time that somebody else was present. Let us see what it made.
  • It created a method and added it to the proper file: not the file where the original Accounts_AccountsForContractListDT method is (DbWork.Accounts.cs), but DbWork.Contracts.cs.
  • I looked into the generated method. It looks pretty good. The truth is that I put a lot of work into the proper naming of all the classes in the project, so it could find the DTOs just by naming conventions. But still, it found the right DTO classes. I am not a machine/compiler, but the method looks good to me. I cannot really verify that all the details are completely right.
  • One of the problems I noticed before with Gen-AI is that it produces a lot of text, which is difficult and time-consuming to review properly. It has happened in the past that I accepted Gen-AI code with a small hidden bug that I did not notice.
  • So, the key LINQ/EF queries looked fine. It even figured out the German names from the database: the tables are KONTO and VERTRAG, not ACCOUNTS and CONTRACTS, and the primary keys are KONTO_NR, VERTRAG_NR, etc. It figured that out, and it looks like it is doing the right filtering.
  • It added some processing of the result that is specific to Contracts and is not symmetric to the Accounts method I pointed it to. It looks quite clever; it seems it found that somewhere in the project source code and inserted it here. Yes, the guess is right, it should be there. That is my own code copied here, though even I cannot fully review (I do not remember) whether all the details are fine. This AI thing, of course, does not know what “Ebics Users” is, but it saw the property “numberEbicsUsers”, found somewhere in the project a method that calculates that property, and assigned it. And the guess was good.
  • There is one Boolean flag that stayed unused, but that is a minor thing compared to all the rest. I will set that flag.
  • So, it looks like a good job. It compiles. No hallucinated properties/methods this time. Much better than what I have seen before from GHC.
  • Let’s see what other LLMs will do.
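To make the "switch the Contract and Account roles" task concrete, here is a toy sketch of what such a role switch looks like in LINQ. It is not the real method, which is ~200 lines and runs on EF Core. Only the names KONTO, VERTRAG, KONTO_NR and VERTRAG_NR come from the notes above; the record shapes, the KontoVertrag link record (an assumed many-to-many relation), and the class RoleSwitchSketch are hypothetical, and the queries run on in-memory collections so the example stays self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified shapes; the real EF model is not shown in this article.
public record Konto(int KONTO_NR, string Name);
public record Vertrag(int VERTRAG_NR, string Name);
// Assumed link between accounts and contracts (an illustration, not the real schema).
public record KontoVertrag(int KONTO_NR, int VERTRAG_NR);

public static class RoleSwitchSketch
{
    // "Template" direction: all accounts attached to one contract.
    public static List<Konto> AccountsForContract(
        IEnumerable<Konto> konten, IEnumerable<KontoVertrag> links, int vertragNr) =>
        (from link in links
         where link.VERTRAG_NR == vertragNr
         join konto in konten on link.KONTO_NR equals konto.KONTO_NR
         orderby konto.KONTO_NR
         select konto).ToList();

    // Role-switched direction: all contracts attached to one account.
    // Same structure, with the filter key and the join key exchanged.
    public static List<Vertrag> ContractsForAccount(
        IEnumerable<Vertrag> vertraege, IEnumerable<KontoVertrag> links, int kontoNr) =>
        (from link in links
         where link.KONTO_NR == kontoNr
         join vertrag in vertraege on link.VERTRAG_NR equals vertrag.VERTRAG_NR
         orderby vertrag.VERTRAG_NR
         select vertrag).ToList();
}
```

The two methods differ only in which key is filtered and which is joined. That symmetry is why the task is easy for someone who knows the model, and also why a small asymmetric detail (like the Contracts-specific post-processing mentioned above) is easy to overlook in review.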

2.2.2 GHC (GPT-4.1)

  • (Run2-GHC (GPT-4.1) ) My request COMMAND to GHC (GPT-4.1) was (**) again.
  • It started to work on something. Then, in the chat window, it loaded something like 6 files from the project. Then it said it was going to load 786 files of the project. Then it broke, with an error in VS2022:
    • {"error":{"message":"prompt token count of 64602 exceeds the limit of 64000","code":"model_max_prompt_tokens_exceeded"}}
  • It failed. Some GHC error. Let’s restart VS2022 and try all again.
  • (Run3-GHC (GPT-4.1) ) My request COMMAND to GHC (GPT-4.1) was (**) again.
  • It ran for a while. From the Chat dialog (which actually serves as a log window) it can be seen that it loaded a limited number of files. It got stuck: it couldn’t find/load some model C# file. It tried several times and failed. Then I manually navigated to the file (actually, I was just checking that the file was there), and it continued.
  • Then it proposed a plan of changes and stopped. I looked, but nothing had been generated. So I just typed “continue with your plan”, and it continued and did it. It created the code and built it, and the build was successful.
  • I looked into the solution, and it looked the same as the one above by GPT-4o. At least the key parts that I checked are the same. All comments from above apply here too.
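The token-limit error from Run2 is plain JSON, so it is easy to see how far over the limit the prompt was. The sketch below parses the payload with System.Text.Json and a regular expression. The class name TokenLimitError is hypothetical, and the message wording is taken from the one error quoted above; that Copilot always uses this exact wording is an assumption.

```csharp
using System.Text.Json;
using System.Text.RegularExpressions;

public static class TokenLimitError
{
    // Returns (prompt tokens, limit) extracted from the error message,
    // or null when the payload does not have the expected shape.
    public static (int PromptTokens, int Limit)? Parse(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("error", out JsonElement error) ||
            error.ValueKind != JsonValueKind.Object ||
            !error.TryGetProperty("message", out JsonElement message) ||
            message.ValueKind != JsonValueKind.String)
        {
            return null;
        }

        // Message format assumed from the error text quoted in this article.
        Match match = Regex.Match(
            message.GetString()!,
            @"prompt token count of (\d+) exceeds the limit of (\d+)");
        if (!match.Success) return null;

        return (int.Parse(match.Groups[1].Value), int.Parse(match.Groups[2].Value));
    }
}
```

For the GPT-4.1 error above, this yields 64602 and 64000, so the prompt was only 602 tokens over the limit, which is consistent with the agent trying to load the whole project into one request.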

2.2.3 GHC (Claude 3.7 Sonnet)

  • (Run4-GHC (Claude 3.7 Sonnet) ) My request COMMAND to GHC (Claude 3.7 Sonnet) was (**) again.
  • Execution failed after 5 minutes with the message:
    • {"error":{"message":"prompt token count of 95998 exceeds the limit of 90000","code":"model_max_prompt_tokens_exceeded"}}
  • It failed. Let’s restart VS2022 and try all again.
  • (Run5-GHC (Claude 3.7 Sonnet) ) My request COMMAND to GHC (Claude 3.7 Sonnet) was (**) again.
  • Execution failed after 5 minutes. The error was different:
    • [Conversations Information] [CopilotClient] Copilot Internal User response: Conversations.Abstractions.Auth.CopilotUserData; [Conversations Information] [CopilotModels] Environment variable COPILOT_USE_DEFAULTPROXY found: False; [Conversations Information] Copilot auth status: OK. Copilot badge status: Active
  • I checked. My internet connection looked fine. On this machine I have 190Mbps/19Mbps.
  • Let’s restart VS2022 and try all again.
  • (Run6-GHC (Claude 3.7 Sonnet) ) My request COMMAND to GHC (Claude 3.7 Sonnet) was (**) again.
  • Execution failed after 10 minutes. Error was:
    • [PersistedCopilotSessionRepository Error] Error saving updated session: MessagePack.MessagePackSerializationException: Failed to serialize Microsoft.VisualStudio.Copilot.CopilotInteraction value; [Conversations Information] [CopilotClient] Copilot Internal User response: Conversations.Abstractions.Auth.CopilotUserData; [Conversations Information] [CopilotModels] Environment variable COPILOT_USE_DEFAULTPROXY found: False; [Conversations Information] Copilot auth status: OK. Copilot badge status: Active

3. Conclusion

These are, of course, limited tests involving the generation of a single method based on a clear pattern. However, even with such straightforward tasks, GHC previously either failed entirely or produced code that required substantial manual correction.

So far, the improvements are dramatic compared to two months ago. Back then, the code generated by GHC often wouldn’t compile right away and suffered from odd syntax issues, like misplaced or mismatched brackets around code blocks. Even worse, it frequently included hallucinated method or property names that didn’t exist in the project. It did not look like a meaningful work process.

Now, the generated code has correct syntax, compiles right away, and the referenced properties/methods do exist. This time, it actually starts to look like a proper work process. GHC Agent automates VS2022, so on a psychological level, it actually feels like someone else has taken over control of your dev tools.

I’ll keep testing it and working with more complex scenarios, but the GitHub Copilot Agent mode already feels like a significant step forward in quality.
