GBK encoding caused CodeQL to detect code written in Java/Kotlin, but it was unable to process any of it #18527
Comments
👋 @Weijin-wj I'm glad you found a workaround! Let me circle back to the internal team to see if UTF-8 encoding is a known requirement or if this is a bug we need to solve. Even if it is the former, we would definitely need to improve how this is reported.
I have tried extracting a few repositories using GBK encoding, but I am not able to reproduce what you describe. Could you please give an example of a repository that fails? Are you using CodeQL on the command line? If so, what command did you run? Alternatively, are you using it via GitHub Actions? If so, did you use CodeQL default setup or an advanced configuration with an explicit action YAML file?
@redsun82 @smowton Sorry, I can't provide the original code, but I reproduced the previous issue with the following code. I created the database from the command line, and the command is as follows:
I cannot tell whether the issue is caused by something I did or by something else. I haven't been working with CodeQL for very long, and there are still many areas I am not familiar with. I apologize if a mistake on my part has caused you any trouble.
If I download your Hello-Java-Sec-master.zip file and use that command I observe:
Could you provide the full log from the terminal and the content of the
Of course, I'm happy to provide it. I've put the terminal log and the log for generating the database into a zip file.
Thanks. I suspect your JAVA_HOME might be Java 8 or lower, which is causing us to use our minimal shipped JDK to run extraction, which in turn doesn't support many character encodings. Workaround: set your JAVA_HOME to a Java >= 9 that supports GBK encoding (likely, any non-minified JDK). As a proper fix I will revise the CodeQL JDK to support more charsets so this doesn't occur in future.
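For anyone wanting to check the workaround before re-running extraction, here is a minimal sketch that reports whether a given JVM can decode GBK. The class name `CharsetCheck` is made up for illustration; only the standard `java.nio.charset.Charset` API is used. It tells you about the JVM that runs it (for example, the one your JAVA_HOME points to), not about the JDK bundled with CodeQL.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of CodeQL: reports charset support
// for whichever JVM executes it.
public class CharsetCheck {
    // Returns true if this JVM supports the named charset.
    public static boolean supports(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown when the name is not a syntactically legal charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Show which JVM is actually being used.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        for (String name : new String[] {"UTF-8", "GBK"}) {
            System.out.println(name + " supported: " + supports(name));
        }
    }
}
```

If GBK is reported as unsupported, that JVM is probably a reduced runtime, and pointing JAVA_HOME at a full JDK as suggested above would be the thing to try.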
@smowton Thank you for clearing up my confusion. Without knowing the reason, I think I would have stayed confused for a long time. CodeQL is really a great tool that has helped me a lot. Thank you again for your response.
Hello, I encountered the following issue while creating a database using CodeQL:
Later I found out it was because the Maven project used GBK encoding, and the pom file is configured as follows:
I then changed GBK to UTF-8 in the pom file, successfully created the database, and the message above no longer appeared. Could you please explain why GBK encoding causes this issue? I feel that my solution is not elegant enough. Is there a better approach? Thank you.